[RndTbl] Load average under RHEL 6.x systems?
Gilbert E. Detillieux
gedetil at cs.umanitoba.ca
Fri May 11 11:33:02 CDT 2012
On 2012-04-11 17:00, I wrote:
> After upgrading many of our systems, both workstations and servers, from
> CentOS 5.x to Scientific Linux 6.x, I'm seeing higher load averages on
> idle systems than I used to. Under EL5, loads would drop to zero and
> pretty much stay there most of the time for idle systems. Under EL6,
> the load might drop down to 0.1, but doesn't stay there for very long,
> and even on seemingly idle systems, I see loads at or near 1 (sometimes
> even higher than 1 on some of our servers). It's also intermittent, with
> load averages dropping and climbing on fairly short intervals (of a few
> minutes or so).
Problem solved (at long last)!...
It turns out the problem was with "hald" polling the CD/DVD-ROM drive
every two seconds. I had previously dismissed that as the potential
problem, given that this seemed to be no different than the way hald
worked under EL5 systems.
> Running top, iotop, ftop, iftop, etc. doesn't really point to any major
> culprits. I've even run PowerTop, and implemented some of its suggested
> improvements, but that didn't make a difference on load.
My bad... PowerTop had indeed recommended I disable polling in hald,
but I wasn't sure I wanted to disable that feature, particularly on the
workstations (not really needed on the servers, though). Also, as I
said above, I didn't think this was any different than in EL5, but
apparently it is.
Also, hald-addon-storage (the sub-process that does the polling) wasn't
sticking around long enough to show a big CPU load in "top",
particularly with the default 3-second update delay, but when I dropped
the delay to half a second, I saw it show up briefly every once in a
while. (I was also seeing the irqbalance process show up, and
mistakenly thought it might be the culprit. This seemed to make sense
at the time, since I was seeing higher loads on our 16-core servers than
the dual-core workstations, but that was a red herring.)
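A process that is only busy in short bursts is easy to miss at top's default refresh rate. Since the Linux load average counts tasks that are runnable or in uninterruptible sleep, one option is to take frequent snapshots and count how often a given command is seen in state R or D. The following is a minimal sketch, assuming a procps-style `ps`; the function name `sample_busy` and its defaults are made up for illustration, and a fractional delay assumes a `sleep` that accepts one (GNU coreutils does).

```shell
#!/bin/sh
# Interactive equivalent of the faster refresh: top -d 0.5

# sample_busy PATTERN [COUNT] [DELAY]
# Hypothetical helper: take COUNT process snapshots, DELAY seconds apart, and
# print in how many of them a command matching PATTERN (an awk regex) was in
# state R (runnable) or D (uninterruptible sleep).
sample_busy() {
    pattern=$1
    count=${2:-20}
    delay=${3:-0.5}
    hits=0
    i=0
    while [ "$i" -lt "$count" ]; do
        # state= and comm= print the one-letter state and command name, no header
        if ps -e -o state=,comm= |
            awk -v p="$pattern" '$1 ~ /^[RD]/ && $2 ~ p { found = 1 }
                                 END { exit !found }'; then
            hits=$((hits + 1))
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "$hits/$count"
}
```

For example, `sample_busy 'hald-addon|irqbalance' 40 0.25` would report how many of 40 quarter-second snapshots caught either process busy. Note that `comm` is typically truncated to 15 characters, so hald-addon-storage shows up as hald-addon-stor.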
> Just wondering if anyone else has seen similar behaviour with hosts
> running Red Hat and/or Fedora distributions? Would moving to the
> "tickless" kernel have anything to do with it? (I.e. does it somehow
> affect the way load averages are calculated?)
Still not sure if the new kernel makes a difference or not, but there
must be something different about the way hald-addon-storage interacts
with it to do the polling in EL6, compared to EL5. (Or have they just
made the polling more aggressive, by reducing the interval?)
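As background to the question of how load averages are calculated: the Linux kernel samples the number of runnable and uninterruptible tasks roughly every 5 seconds and folds each sample into exponentially damped averages. Because only the sample instants count, a task that is busy briefly but regularly can register if it tends to be active at those instants. The sketch below illustrates the 1-minute average only; it uses floating point rather than the kernel's fixed-point arithmetic, the function name `simulate_load` is hypothetical, and it does not by itself explain the EL5/EL6 difference.

```shell
#!/bin/sh
# simulate_load SAMPLE...
# Each SAMPLE is the number of runnable/uninterruptible tasks seen at one
# 5-second sampling instant. Prints the resulting 1-minute load average,
# starting from a load of zero.
simulate_load() {
    printf '%s\n' "$@" | awk '
        BEGIN { decay = exp(-5 / 60); load = 0 }   # 5 s step, 60 s time constant
        { load = load * decay + $1 * (1 - decay) } # exponentially damped average
        END { printf "%.2f\n", load }'
}
```

Twelve consecutive samples of 1 (one task caught busy at every sample for a minute) give `simulate_load 1 1 1 1 1 1 1 1 1 1 1 1` a result of 0.63, i.e. 1 - 1/e, while a single such sample only moves the average to 0.08.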
> Or is it some system service that can be shut down? (If it is, it's not
> creating an obvious load on its own, that top or ftop would show, but it
> may be affecting something in the kernel...)
As you can see from the attached graph of the load average, disabling
polling on the CD-ROM drive yesterday afternoon seems to have made all
the difference. Here's the command PowerTop recommended:
hal-disable-polling --device /dev/cdrom
(Device name may vary.) The beauty of this, compared to disabling
polling for all storage devices, is that you can disable it on a
per-device basis and keep polling enabled, e.g. for USB devices that might get
inserted.
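To apply the same change to several drives, or to undo it later, the command can be wrapped in a small helper. This is only a sketch: the function name `set_polling` and the `DRY_RUN` variable are invented here, and the `--enable-polling` option is taken from the hal-disable-polling man page as best recalled, so check `hal-disable-polling --help` on your system before relying on it.

```shell
#!/bin/sh
# set_polling ACTION DEVICE...
# Hypothetical wrapper around hal-disable-polling. ACTION is "disable" or
# "enable". By default it only prints the commands it would run; set
# DRY_RUN=0 to execute them (this needs root).
set_polling() {
    action=$1
    shift
    case $action in
        disable) flag= ;;
        enable)  flag=--enable-polling ;;   # assumed option, see note above
        *) echo "usage: set_polling disable|enable DEVICE..." >&2
           return 2 ;;
    esac
    for dev in "$@"; do
        if [ "${DRY_RUN:-1}" = 0 ]; then
            hal-disable-polling --device "$dev" ${flag:+"$flag"}
        else
            printf '%s\n' "hal-disable-polling --device $dev${flag:+ $flag}"
        fi
    done
}
```

Running `set_polling disable /dev/cdrom` prints the command shown above without executing it; `DRY_RUN=0 set_polling disable /dev/cdrom` would actually run it. Device names are examples and may differ on your hosts.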
--
Gilbert E. Detillieux E-mail: <gedetil at muug.mb.ca>
Manitoba UNIX User Group Web: http://www.muug.mb.ca/
PO Box 130 St-Boniface Phone: (204)474-8161
Winnipeg MB CANADA R2H 3B4 Fax: (204)474-7609
Attachment: mrtg-load-day.png (image/png, 3837 bytes)
URL: <http://www.muug.mb.ca/pipermail/roundtable/attachments/20120511/96c8467e/attachment.png>