Huh??? My load has been stuck at 27-ish for an hour now even after I killed all the CPU and mem hogs, and I'm stumped... How can idle be around 90% and load avg be 27? System feels a bit sluggish, but not load 27 sluggish! Quad core system. (And yes, "daily" is supposed to be high CPU, but it's niced out the wazoo.)
top - 16:20:12 up 31 days, 12:26, 58 users,  load average: 27.17, 27.20, 27.23
Tasks: 496 total,   1 running, 481 sleeping,  13 stopped,   1 zombie
%Cpu(s):  1.2 us,  0.8 sy,  8.4 ni, 89.5 id,  0.2 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  8174168 total,  2590468 free,  2260492 used,  3323208 buff/cache
KiB Swap:  8376316 total,  7449712 free,   926604 used.  5777204 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
11450 trevor    39  19  151372  52464  10000 S  33.4  0.6   0:12.71 daily
 2170 root      20   0  448728 132620  23652 S   3.0  1.6 644:53.37 Xorg
 4190 trevor    20   0  929896 105796  11352 S   2.6  1.3 146:56.66 gnome-terminal-
 2794 trevor    20   0  267904   7628   6020 S   1.0  0.1 394:39.23 gkrellm
 4492 trevor    20   0  773060  17256  12936 S   1.0  0.2  65:17.14 audacious
...
That's legit (at least potentially). Means there's 27 processes in the run queue. Most likely all sitting in IOWAIT on a shared resource, otherwise your system would be pretty sluggish. -Adam
On 16-03-05 04:23 PM, Trevor Cordes wrote:
[original message quoted in full, trimmed]
_______________________________________________
Roundtable mailing list
Roundtable@muug.mb.ca
http://www.muug.mb.ca/mailman/listinfo/roundtable
On 2016-03-05 Adam Thompson wrote:
> That's legit (at least potentially). Means there's 27 processes in the run queue. Most likely all sitting in IOWAIT on a shared [...]
> [...] 27.20, 27.23 Tasks: 496 total, 1 running, 481 sleeping, 13 stopped, [...]
Wouldn't "1 running" be "27 running" (or >1 anyhow) then? Or do IOWAITs show up in "sleeping"?
Any way to confirm this with another stat tool (not top)?
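For what it's worth, on Linux the load average counts tasks in uninterruptible sleep (state D, typically blocked on disk or NFS I/O) along with runnable ones (state R), which is how load can be high while the CPUs sit idle. Below is a minimal Python sketch of a non-top check. It assumes a procps-style ps, and the function names are made up for illustration.

```python
import subprocess
from collections import Counter


def ps_snapshot() -> str:
    """Return one 'STATE COMMAND' line per process, without a header.

    Assumes a procps-style ps; an empty header ('stat=') suppresses the header row.
    """
    result = subprocess.run(
        ["ps", "-e", "-o", "stat=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def count_states(ps_output: str) -> Counter:
    """Count processes by the first letter of their state code (R, S, D, T, Z, ...)."""
    counts = Counter()
    for line in ps_output.splitlines():
        fields = line.split(None, 1)
        if fields:
            # 'Ss', 'R+', 'D<' and so on: only the first letter is the state itself.
            counts[fields[0][0]] += 1
    return counts


def load_contributors(counts: Counter) -> int:
    """Runnable (R) plus uninterruptible-sleep (D) tasks, the ones Linux counts toward load."""
    return counts["R"] + counts["D"]
```

Calling load_contributors(count_states(ps_snapshot())) on the affected box should give a number in the neighbourhood of the load average if blocked tasks are the cause; a large D count with a tiny R count would point at I/O rather than CPU.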
Simple ps -ef showed me the problem:
# ps -ef | grep Quickbak | wc
     28     308    2569
I have a "quick backup" that runs once an hour and it's currently running 28 times (should be just once, for about 5s!). Doh... so there's your "waiting on resource". Now to troubleshoot that...
FWIW, I've found pstree(1) to be highly useful in detecting things like this because by default it summarizes identical processes. -Adam
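In the same spirit as pstree's summarizing, identical command names can be counted in a few lines of Python. This is only a sketch: it assumes a procps-style ps and that the job shows up under its own name in the comm column (Linux truncates comm to 15 characters), and the helper names are made up for illustration.

```python
import subprocess
from collections import Counter


def ps_commands() -> str:
    """Return one command name per process, without a header (procps-style ps assumed)."""
    result = subprocess.run(
        ["ps", "-e", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def repeated_commands(ps_output: str, threshold: int = 2) -> list:
    """Return (name, count) pairs for commands running at least `threshold` times,
    most frequent first."""
    names = Counter(
        line.strip() for line in ps_output.splitlines() if line.strip()
    )
    return [(name, n) for name, n in names.most_common() if n >= threshold]
```

Something like repeated_commands(ps_commands(), threshold=10) would make a pile-up of one job stand out. Unlike ps -ef | grep, matching on exact command names does not count the grep process itself.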
On 16-03-05 04:34 PM, Trevor Cordes wrote:
[previous message quoted in full, trimmed]
OK, MESSED UP!!! WTF?
on workstation with load 27:
ll /data/Quickbak/Todo
<freeze>

on nfs file server that houses /data:
ll /data/Quickbak/Todo
<shows directory properly>
Huh?
I killed the ~26 Quickbaks. Load is dropping like a lead balloon.
Now on ws with load 27:
ll /data/Quickbak/Todo
<shows directory properly>
WTF? Something doing a simple cp -fa src /data/Quickbak/Todo on the ws completely fubar'd that dir for read access by any process on the client? Very, very odd. Methinks it's reboot time.
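Separately from the NFS oddity, one common way to keep an hourly job from stacking up when a run hangs is a non-blocking lock taken at startup. This is not something the thread itself tried. It is a minimal Python sketch that assumes the lock file lives on a local filesystem (not on the NFS mount that is hanging), and try_lock and the example path are hypothetical names.

```python
import fcntl


def try_lock(lock_path: str):
    """Try to take an exclusive, non-blocking flock on lock_path.

    Returns the open file object (keep it open for the life of the job),
    or None if another run already holds the lock.
    """
    fh = open(lock_path, "a")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the holder.
        fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        fh.close()
        return None
    return fh
```

A job would call something like lock = try_lock("/var/tmp/quickbak.lock") first and exit right away if it gets None, so a stuck run leaves one blocked process behind instead of one more every hour. The kernel drops the lock when the holding process exits, so no cleanup step is needed.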