I had a box with 32GB of RAM whose mysql just died (no logs, it just died). init (this is RHEL6) tried to restart it and I got this:
160104 8:55:55 InnoDB: Initializing buffer pool, size = 3.9G
160104 8:55:55 InnoDB: Error: cannot allocate 4194320384 bytes of
InnoDB: memory with malloc! Total allocated memory
InnoDB: by InnoDB 48401248 bytes. Operating system errno: 12
InnoDB: Check if you should increase the swap file or
InnoDB: ulimits of your operating system.
InnoDB: On FreeBSD check you have compiled the OS with
InnoDB: a big enough maximum process size.
InnoDB: Note that in most 32-bit computers the process
InnoDB: memory space is limited to 2 GB or 4 GB.
InnoDB: We keep retrying the allocation for 60 seconds...
The 4G buffer pool setting had been working for months, until just now.
I reduced it to 3G and mysql restarted OK.
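For the record, the change is just the usual innodb_buffer_pool_size tweak in my.cnf, roughly like this (stock RHEL6 path and section shown; adjust to taste), followed by a service mysqld restart:

  # /etc/my.cnf (usual RHEL6 location)
  [mysqld]
  # was 4G; dropped to 3G so mysqld could get the buffer pool allocated again
  innodb_buffer_pool_size = 3G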
Top shows:

Mem:  32827576k total,  2274940k used, 30552636k free,   165052k buffers
Swap:  2097148k total,   276068k used,  1821080k free,   413640k cached
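Since the error message suggests checking swap and ulimits, here's the quick way to see what the running mysqld is actually limited to (assuming the usual /proc layout, which RHEL6 has):

  # address-space and data-segment limits of the running mysqld
  grep -Ei 'address|data' /proc/$(pidof mysqld)/limits
  # shell defaults, for comparison
  ulimit -v
  ulimit -d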
Cacti (up until mysql crashed, wink) shows ps load going up right before the crash, but that is normal for this box, and RAM usage stayed constant. However, we did launch a long-running ps that spawns a lot of little ps's (maybe 100 at a time). This is all normal for this box, and it has never done this before.
I'm guessing RAM fragmentation:
# cat /proc/buddyinfo
Node 0, zone      DMA      2      1      1      1      1      0      1      0      1      1      3
Node 0, zone    DMA32   1815   1789   1553   1294   1007    688    415    212    103     67    477
Node 0, zone   Normal  32859  67280  77301  62304  45390  29754  19144  11069   4786    530    177

(anything else to check?)
Looks like tons of pages available at every size, assuming mysql allocates from the "Normal" zone.
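A couple of other things that seem worth checking, if anyone has better ideas (I believe the RHEL6 kernel exposes all of these):

  # free lists per order broken down by migrate type; more detail than buddyinfo
  cat /proc/pagetypeinfo
  # overall commit / free / slab picture
  grep -E 'Commit|MemFree|Slab' /proc/meminfo
  # overcommit policy in effect
  cat /proc/sys/vm/overcommit_memory /proc/sys/vm/overcommit_ratio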
Any ideas? My only thought right now is to reboot the server more often to clear fragmentation. Maybe newer kernels handle this "bug" a bit better, but I'm stuck on RHEL6 right now.
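Another option, if the RHEL6 kernel has compaction compiled in (I haven't verified that it does), would be to poke the kernel into compacting memory on demand instead of rebooting:

  # forces system-wide compaction (as root); the file only exists with CONFIG_COMPACTION
  echo 1 > /proc/sys/vm/compact_memory
  # then see whether the higher orders in the free lists recovered
  cat /proc/buddyinfo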
It looks like an ISP DNS blockage caused one of our ps's to fall behind and have around 5900 little sub-ps's pile up. Then the oom-killer triggered. The oom-killer is actually wonderful in this case, as it logged the state of everything to disk.
Jan 4 08:55:29 kernel: [5384435.328060] Node 0 DMA: 2*4kB 1*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15744kB
Jan 4 08:55:29 kernel: [5384435.329687] Node 0 DMA32: 14*4kB 43*8kB 62*16kB 35*32kB 7*64kB 16*128kB 67*256kB 21*512kB 2*1024kB 3*2048kB 20*4096kB = 123024kB
Jan 4 08:55:29 kernel: [5384435.331315] Node 0 Normal: 15448*4kB 66*8kB 7*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 62432kB
So it is fragmentation, or simple RAM exhaustion, due to runaway small ps's caused by blocked DNS. Time to rejig the app to handle DNS going down. :-)
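The rejig will probably look something like this rough sketch (the hostname and worker name below are made up, not our real ones): give the DNS check a hard timeout so a dead resolver can't hang things, and cap how many sub-ps's can be in flight at once.

  #!/bin/bash
  # rough sketch only: bail early if DNS is dead, and cap the sub-ps pile-up
  MAX_KIDS=100                     # roughly our normal burst size

  # hard 5-second timeout on a test lookup instead of letting it hang
  if ! timeout 5 getent hosts example.com >/dev/null 2>&1; then
      echo "DNS looks down, skipping this run" >&2
      exit 0
  fi

  for item in "$@"; do
      # never allow more than MAX_KIDS children at once
      while [ "$(jobs -rp | wc -l)" -ge "$MAX_KIDS" ]; do
          sleep 1
      done
      ./little-sub-ps "$item" &    # hypothetical worker standing in for the real one
  done
  wait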
I am surprised. Doesn't a current high-grade OS like Linux (maybe not Windows) routinely close the gaps in RAM by relocating running code and data? This is quite efficiently do-able in CPU architectures that use base/index/displacement memory addressing in the instruction set. This was introduced in the IBM 360 in 1964, carried forward into Intel x86, and probably used everywhere else since then.
With base/index/displacement memory addressing, running code and data can easily be relocated by copying them to the new location and changing the contents of a few base registers. That is unlike the older architectures that stored entire linear addresses in the instructions, which would require the whole program and its data to have their addresses adjusted/rewritten after relocation.
(Yes, if this sounds "academic", I have an M.Sc. (1975) in computer science.) :)
Hartmut W Sager - Tel +1-204-339-8331
On 2016-01-04 Hartmut W Sager wrote:
I am surprised. Doesn't a current high-grade OS like Linux (maybe not Windows) routinely close the gaps in RAM by relocating running code and data? This is quite efficiently do-able in CPU architectures
I just read an article about this. Supposedly modern Linux (not sure about RHEL6) does memory "defrag" (the kernel calls it compaction) only on an oom condition (going from vague memory). Discussions always crop up about having a kthread that can do periodic defrag outside of an oom condition, but I don't think this is being done at present.
Defrag on oom does sound like what I just saw today, because when the oom happened there was almost no contiguous RAM left, but when I checked the server a while later there were lots of bigger contiguous chunks, though still not enough for a 4G alloc.
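For anyone who wants to see this on their own box: the compact_* counters in /proc/vmstat (exact names vary a bit by kernel version) show how often the kernel has had to do this, and the build config says whether compaction is even there:

  # is compaction compiled in at all? (RHEL ships the kernel config in /boot)
  grep CONFIG_COMPACTION /boot/config-$(uname -r)
  # how often on-demand compaction has kicked in since boot
  grep ^compact_ /proc/vmstat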
It sounds like it was doing just a partial RAM defrag. Otherwise, there should be a single large contiguous block (hopefully > 4 GB) right after the defrag.
Hartmut W Sager - Tel +1-204-339-8331, +1-204-515-1701, +1-204-515-1700, +1-810-471-4600, +1-909-361-6005