Re: [RndTbl] Load average under RHEL 6.x systems?

11 May 2012


On 2012-04-11 17:00, I wrote:
...
> After upgrading many of our systems, both workstations and servers, from
> CentOS 5.x to Scientific Linux 6.x, I'm seeing higher load averages on
> idle systems than I used to. Under EL5, loads would drop to zero and
> pretty much stay there most of the time for idle systems. Under EL6,
> the load might drop down to 0.1, but doesn't stay there for very long,
> and even on seemingly idle systems, I see loads at or near 1 (sometimes
> even higher than 1 on some of our servers). It's also intermittent, with
> load averages dropping and climbing on fairly short intervals (of a few
> minutes or so).

Problem solved (at long last)!...
It turns out the problem was with "hald" polling the CD/DVD-ROM drive 
every two seconds.  I had previously dismissed that as the potential 
problem, given that this seemed to be no different than the way hald 
worked under EL5 systems.

...
> Running top, iotop, ftop, iftop, etc. doesn't really point to any major
> culprits. I've even run PowerTop, and implemented some of its suggested
> improvements, but that didn't make a difference on load.

My bad...  PowerTop had indeed recommended I disable polling in hald, 
but I wasn't sure I wanted to disable that feature, particularly on the 
workstations (not really needed on the servers, though).  Also, as I 
said above, I didn't think this was any different than in EL5, but 
apparently it is.
Also, hald-addon-storage (the sub-process that does the polling) wasn't 
sticking around long enough to show a big CPU load in "top", 
particularly with the default 3 second update delay, but when I dropped 
the delay to 1/2 a second, I was seeing it show up briefly every once in 
a while.  (I was also seeing the irqbalance process show up, and 
mistakenly thought it might be the culprit.  This seemed to make sense 
at the time, since I was seeing higher loads on our 16-core servers than 
the dual-core workstations, but that was a red herring.)
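
For anyone wanting to catch this sort of brief, periodic CPU user without
squinting at top, here is a minimal Python sketch of the same idea: take
snapshots of per-process CPU ticks from /proc at a short interval, and
print whichever commands used CPU in between.  It is only an illustration,
not a tool mentioned anywhere above; the function names (parse_stat,
cpu_ticks_by_command, busiest, watch) are made up for the example, and it
assumes a Linux /proc with the stat layout described in proc(5).  The
kernel truncates the command name in stat to 15 characters, so
hald-addon-storage would be expected to show up as "hald-addon-stor".

```python
import os
import time


def parse_stat(line):
    """Return (command name, CPU ticks used) from one /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces,
    # so split on the last ')' rather than on whitespace.
    head, _, tail = line.rpartition(")")
    comm = head.split("(", 1)[1]
    fields = tail.split()
    # fields[0] is the state (field 3 in proc(5)); utime and stime are
    # fields 14 and 15, i.e. indexes 11 and 12 here.
    return comm, int(fields[11]) + int(fields[12])


def cpu_ticks_by_command(proc_root="/proc"):
    """Sum CPU ticks per command name over all processes currently visible."""
    totals = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                comm, ticks = parse_stat(f.read())
        except (OSError, IndexError, ValueError):
            continue  # process exited, or the line could not be parsed
        totals[comm] = totals.get(comm, 0) + ticks
    return totals


def busiest(before, after, top_n=5):
    """Return up to top_n (command, tick delta) pairs between two snapshots."""
    deltas = {name: after[name] - before.get(name, 0) for name in after}
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    return [(name, d) for name, d in ranked if d > 0][:top_n]


def watch(interval=0.5, rounds=20):
    """Print the busiest commands once per interval (Linux only)."""
    previous = cpu_ticks_by_command()
    for _ in range(rounds):
        time.sleep(interval)
        current = cpu_ticks_by_command()
        print(busiest(previous, current))
        previous = current
```

Calling watch(interval=0.5) prints one list every half second.  A command
that only burns CPU for an instant every couple of seconds should turn up
in some of those lists, even though it never stays near the top of a
3-second top display.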

...
> Just wondering if anyone else has seen similar behaviour with hosts
> running Red Hat and/or Fedora distributions? Would moving to the
> "tickless" kernel have anything to do with it? (I.e. does it somehow
> affect the way load averages are calculated?)

Still not sure if the new kernel makes a difference or not, but there 
must be something different about the way hald-addon-storage interacts 
with it to do the polling in EL6, compared to EL5.  (Or have they just 
made the polling more aggressive, by reducing the interval?)

...
> Or is it some system service that can be shut down? (If it is, it's not
> creating an obvious load on its own, that top or ftop would show, but it
> may be affecting something in the kernel...)

As you can see by the attached graph of the load average, disabling 
polling on the CD-ROM drive yesterday afternoon seems to have made all 
the difference.  Here's the command PowerTop recommended:
    hal-disable-polling --device /dev/cdrom
(Device name may vary.)  The beauty of this, compared to disabling 
polling for all storage devices, is that you can disable it on a 
per-device basis, and keep polling enabled for other devices, e.g. for 
USB devices that might get inserted.
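
To check the effect on your own systems, it helps to record the load
average for a while before and after disabling polling.  Here is a
minimal Python sketch, assuming a Linux /proc/loadavg; it is not what
produced the attached graph, and the function names, the CSV layout and
the loadavg.csv file name are all made up for the example.

```python
import time


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def format_sample(timestamp, loads):
    """Format one CSV row: epoch seconds followed by the three load averages."""
    return "%d,%.2f,%.2f,%.2f" % ((int(timestamp),) + tuple(loads))


def log_loadavg(path="loadavg.csv", interval=60, samples=1440):
    """Append one row per interval to path (defaults: every minute for a day)."""
    with open(path, "a") as out:
        for _ in range(samples):
            with open("/proc/loadavg") as f:
                loads = parse_loadavg(f.read())
            out.write(format_sample(time.time(), loads) + "\n")
            out.flush()  # keep the file current if the run is interrupted
            time.sleep(interval)
```

Run log_loadavg() for a day or so, disable polling on the drive part-way
through, and plot the resulting file with whatever tool you prefer; any
drop in the idle load should be visible at the point where polling was
turned off.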
-- 
Gilbert E. Detillieux		E-mail: gedetil@muug.mb.ca
Manitoba UNIX User Group	Web:	http://www.muug.mb.ca/
PO Box 130 St-Boniface		Phone:  (204)474-8161
Winnipeg MB CANADA  R2H 3B4	Fax:    (204)474-7609
