On 25/04/2012 5:49 PM, Mike Pfaiffer wrote:
I'm somewhat hesitant to reply to this thread since, compared to you guys, I'm an amateur. However, the folks at the CLL lab in Winnipeg and I have noticed similar behaviour in standalone machines not connected to an NTP server. Clearly this isn't the same problem you're experiencing, but it looks close. The reason it looks this way in the standalone machines is that the internal battery used to maintain the settings is running low on power. I phrased it this way because the same behaviour occasionally appears in Macs as well as PCs. The solution is to replace the battery. However, we can put off replacing it if we connect the machine to an NTP server. As the charge goes down, the results are similar to what you are seeing. Eventually it gets to the point where the battery can't maintain ANY settings.
Considering we deal with OLD (but mostly useful) machines at the CLL, I am inclined to look at hardware rather than software as the major source of problems.
I doubt this will be useful to you, but it is best to check out all possibilities, starting with the simple stuff first.
Thanks, Mike. Some of the comments I found when researching this online suggested a weak battery as a possible cause as well, so it's a possibility I'll explore tomorrow. I would have thought that NTP would compensate for this, but I guess it will only compensate so much. I had also been under the impression that the kernel maintained the system clock internally, separately from the hardware clock, but I'm now getting the impression that current Linux systems tend to synchronize these dynamically. It's a 4.5-year-old machine, so it could well be due for a new battery.
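To put a number on "only so much": ntpd can only correct a frequency error of up to 500 parts per million, so a clock drifting faster than that will keep falling out of sync no matter how well the servers respond. The sketch below is a rough way to estimate the drift rate from two offset readings taken some time apart (for example, the offset reported by `ntpdate -q` or `ntpq -p`, noted by hand). The function names are hypothetical, and the readings in the example are made-up illustration values, not measurements from this server.

```python
# Rough drift estimate from two clock-offset readings.
# Function names are hypothetical; the 500 ppm figure is ntpd's
# frequency-correction limit.

NTPD_MAX_FREQ_PPM = 500.0


def drift_ppm(offset_start_s, offset_end_s, elapsed_s):
    """Return the drift rate in parts per million.

    offset_start_s / offset_end_s: clock offset (seconds) at the first
    and second reading; elapsed_s: seconds between the two readings.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return (offset_end_s - offset_start_s) / elapsed_s * 1e6


def ntpd_can_compensate(ppm, limit_ppm=NTPD_MAX_FREQ_PPM):
    """True if the drift magnitude is within ntpd's correction range."""
    return abs(ppm) <= limit_ppm


if __name__ == "__main__":
    # Illustration only: a clock that gains 60 seconds over one day.
    ppm = drift_ppm(0.0, 60.0, 86400)
    print(f"drift: {ppm:.1f} ppm, within ntpd range: {ntpd_can_compensate(ppm)}")
```

A clock gaining or losing more than about 43 seconds a day (500 ppm of 86400 seconds) is outside what ntpd can discipline, which would point toward a hardware or kernel timekeeping problem rather than an NTP configuration issue.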
Other possible causes that have been implicated in clock drift are Xen-related issues and NIC problems. I had installed Xen on the server that's giving me problems now, and I had switched to a non-Xen kernel in March, but hadn't removed all the Xen-related packages and configurations, so it's possible that some of that was causing problems. I've removed all the Xen and virtualisation packages and libraries, and I'm going to reboot again in the wee hours to see if that helps. If not, I'll see about replacing the battery. Failing all that, I'll see about putting in a new network card to replace the onboard NIC, as I've read about someone who solved an NTP clock drift problem by doing just that. The trouble with the hardware solutions is that they'll require downtime on our mail and web server during the day when I'm there.
Gilles