I have another system (not the same one with RTC problems). It loses about 1s every 1m in system time!! ntp is running, configured the same way I have 20 other boxes that work just fine.
Right now the system thinks it's 6:38pm when the actual time is 8:14pm. It has lost that much time since I last ntpdate'd on Dec 7!
ntpq shows some weird values:
# ntpq
ntpq> lpeers
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 ntp1.sscgateway 209.87.233.53    3 u    5   64  377   46.645  5757133 6180.03
 time.netspectru 132.246.11.228   3 u   62   64  377   40.248  5755927 6235.07
 backup.relay.mt 132.246.11.227   3 u   14   64  377   40.161  5756937 6165.72
 74.3.161.36     140.142.16.34    2 u   46   64  377   49.971  5756303 6236.14
However, the above indicates ntp is finding peers that it can sync with.
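As a rough sanity check on those numbers: the ntpq offset column is in milliseconds, so 5757133 is about 96 minutes, which matches the gap between 6:38pm and 8:14pm. The sketch below turns that into an average drift rate. The year (2012) and the exact "now" timestamp are assumptions taken from the message date and the times quoted above. For comparison, ntpd can only discipline frequency errors up to about 500 PPM, so a clock this far off is well outside what it can slew.

```python
from datetime import datetime

# Values taken from the post. The year and the "now" timestamp are
# assumptions based on the message date and the stated wall-clock time.
last_step = datetime(2012, 12, 7, 14, 38, 39)  # clock_step logged by ntpd
now = datetime(2012, 12, 10, 20, 14, 0)        # actual time when ntpq was run
offset_ms = 5757133                            # ntpq "offset" column, in milliseconds


def drift_ppm(offset_ms: float, elapsed_s: float) -> float:
    """Average frequency error in parts per million over elapsed_s seconds."""
    return (offset_ms / 1000.0) / elapsed_s * 1e6


elapsed_s = (now - last_step).total_seconds()
ppm = drift_ppm(offset_ms, elapsed_s)
secs_lost_per_min = ppm * 60 / 1e6

print(f"offset: {offset_ms / 1000 / 60:.1f} minutes")
print(f"drift:  {ppm:.0f} PPM, about {secs_lost_per_min:.2f} s lost per minute")
```

That works out to a bit over one second per minute, in line with the "about 1s every 1m" estimate.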
The last messages in /var/log/messages about ntp are from the restart:
Dec 7 14:38:32 firewall ntpd[25512]: peers refreshed
Dec 7 14:38:32 firewall ntpd[25512]: Listening on routing socket on fd #26 for interface updates
Dec 7 14:38:33 firewall ntpd[25512]: format error frequency file /var/lib/ntp/drift
Dec 7 14:38:33 firewall ntpd[25512]: 0.0.0.0 c016 06 restart
Dec 7 14:38:33 firewall ntpd[25512]: 0.0.0.0 c012 02 freq_set kernel 0.000 PPM
Dec 7 14:38:33 firewall ntpd[25512]: 0.0.0.0 c011 01 freq_not_set
Dec 7 14:38:39 firewall ntpd[25512]: 0.0.0.0 c61c 0c clock_step +5.921205 s
Dec 7 14:38:45 firewall ntpd[25512]: 0.0.0.0 c614 04 freq_mode
Dec 7 14:38:46 firewall ntpd[25512]: 0.0.0.0 c618 08 no_sys_peer
Dec 7 14:42:07 firewall ntpd[25512]: 0.0.0.0 c628 08 no_sys_peer
This seems to match up with what my other systems look like. Though the kernel 0.000 PPM is a bit weird. no_sys_peer seems entirely normal though it sounds bad.
I also don't get why ntp isn't barfing out like it normally does after it goes out more than 1000s.
Any ideas on a) fixes for this board losing time, and b) getting ntp to do what it's supposed to?
I have wiped out the /var/lib/ntp/drift file as a precaution, in case the stored drift was wrong (it was left over from a previous board). Ntp doesn't seem to be updating this file anymore, and I have verified it's ntp:ntp 644.
In all my years I've never seen such a chronically chronologically challenged chronometer.
On 2012-12-10 Trevor Cordes wrote:
I have another system (not the same one with RTC problems). It loses about 1s every 1m in system time!!
I think I solved my own problem, after spending another hour on it :-(
A clue that I hadn't noticed (or noted in my post) is that ntpstat showed "unsynchronised" instead of the normal: synchronised to NTP server (208.73.56.29) at stratum 3
Also, the ntpq output indicated in its own obtuse way that it wasn't sync'd because there was no server with a * to the left of it.
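The "no * to the left" check can be scripted. This is a minimal sketch, assuming the peers billboard has already been captured as text (for example from `ntpq -pn`). The function name and the synced sample line are hypothetical; the unsynced sample is copied from the output earlier in the thread.

```python
from typing import List


def selected_peers(ntpq_output: str) -> List[str]:
    """Return the remotes that ntpq tags with '*' (the current system peer)."""
    peers = []
    for line in ntpq_output.splitlines():
        if line.startswith("*"):
            fields = line[1:].split()
            if fields:
                peers.append(fields[0])
    return peers


# Copied from the post: no line starts with '*', so nothing is selected.
UNSYNCED = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 ntp1.sscgateway 209.87.233.53    3 u    5   64  377   46.645  5757133 6180.03
 74.3.161.36     140.142.16.34    2 u   46   64  377   49.971  5756303 6236.14
"""

# Hypothetical healthy line, made up for illustration only.
SYNCED = "*74.3.161.36     140.142.16.34    2 u   46   64  377   49.971    0.512   1.204\n"

print(selected_peers(UNSYNCED))
print(selected_peers(SYNCED))
```

An empty list means ntpd has not picked a system peer, which is the same condition ntpstat reports as "unsynchronised".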
No matter what I did, I couldn't get ntp to show "synchronised". I would even run ntpdate immediately followed by service ntp start and it wouldn't sync. And I confirmed ntpdate was indeed setting the clock perfectly. It made no sense! I knew it was *not* an ntp config error since it's config'd the same way as 20 other good boxes.
Anyhow, I started playing with kernel time sources and noticed this CPU (Pentium D) supports HPET.
  cat /sys/devices/system/clocksource/clocksource0/available_clocksource
  tsc hpet acpi_pm
Current setting was tsc
So I changed it to hpet:
  echo hpet >> /sys/devices/system/clocksource/clocksource0/current_clocksource
Then ntpdate again
Then start up ntpd
BOOM! Instantly ntpd syncs to a source and ntpstat shows sync'd! WTF? Oh well, at least it seems fixed. Been 10 mins now and no time loss.
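For anyone who wants to script the same check and switch, here is a minimal sketch of the sysfs steps above. The function names and the `base` parameter are hypothetical (the parameter is only there so the code can be tried against a scratch directory); writing to the real sysfs file needs root, and the change does not survive a reboot.

```python
from pathlib import Path
from typing import List

# Same sysfs directory used in the commands above.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def available_clocksources(base: Path = CLOCKSOURCE_DIR) -> List[str]:
    """List the clocksources the kernel offers, e.g. ['tsc', 'hpet', 'acpi_pm']."""
    return (base / "available_clocksource").read_text().split()


def current_clocksource(base: Path = CLOCKSOURCE_DIR) -> str:
    """Return the clocksource currently in use."""
    return (base / "current_clocksource").read_text().strip()


def set_clocksource(name: str, base: Path = CLOCKSOURCE_DIR) -> None:
    """Switch clocksource, refusing names the kernel does not list."""
    if name not in available_clocksources(base):
        raise ValueError(f"{name!r} is not an available clocksource")
    (base / "current_clocksource").write_text(name + "\n")


if __name__ == "__main__" and CLOCKSOURCE_DIR.exists():
    # Read-only by default: just report what the kernel is using.
    print("available:", available_clocksources())
    print("current:  ", current_clocksource())
```

Calling `set_clocksource("hpet")` as root would be the equivalent of the echo command, followed by ntpdate and an ntpd restart as described.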
To get grub2 to change the kernel line so my hpet change survives reboot was another story... Grub2 looks nothing like grub! Can you say "over-engineered"?
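One common way to do this with grub2 is to add `clocksource=hpet` (a standard kernel boot parameter) to the GRUB_CMDLINE_LINUX line in /etc/default/grub and then regenerate the config with grub2-mkconfig (grub-mkconfig on some distributions). The sketch below only shows the text edit. The function name and the sample file contents are hypothetical, and it assumes the variable is written as a single double-quoted line.

```python
import re


def add_kernel_arg(grub_defaults: str, arg: str) -> str:
    """Append arg to GRUB_CMDLINE_LINUX="..." unless it is already present."""

    def patch(match: "re.Match") -> str:
        args = match.group(2).split()
        if arg not in args:
            args.append(arg)
        return match.group(1) + '"' + " ".join(args) + '"'

    # Matches GRUB_CMDLINE_LINUX= only, not GRUB_CMDLINE_LINUX_DEFAULT=.
    return re.sub(
        r'^(GRUB_CMDLINE_LINUX=)"([^"\n]*)"',
        patch,
        grub_defaults,
        flags=re.MULTILINE,
    )


# Hypothetical /etc/default/grub contents, for illustration only.
SAMPLE = 'GRUB_TIMEOUT=5\nGRUB_CMDLINE_LINUX="rhgb quiet"\n'

print(add_kernel_arg(SAMPLE, "clocksource=hpet"))
```

After writing the result back, the generated grub.cfg still has to be rebuilt before the parameter shows up on the kernel line.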
On 12/11/2012 02:35 PM, Trevor Cordes wrote:
To get grub2 to change the kernel line so my hpet change survives reboot was another story... Grub2 looks nothing like grub! Can you say "over-engineered"?
Rather than changing the grub2 configuration, I think you can make that clocksource change permanent by adding the following line to /etc/sysctl.conf:
devices.system.clocksource.clocksource0.current_clocksource = "hpet"
There's a similar issue, apparently, under Xen, as described in this post:
http://forums.epicgames.com/threads/629267-Negative-Delta-Time-Linux-Xen-Pro...
On 2012-12-11 Gilles Detillieux wrote:
Rather than changing the grub2 configuration, I think you can make that clocksource change permanent by adding the following line to /etc/sysctl.conf:
devices.system.clocksource.clocksource0.current_clocksource = "hpet"
I never could find docs on sysctl.conf and whether it applies to /proc or /sys or both and/or why we even split up /proc and /sys! All the stuff in my sysctl.conf is for /proc only. I didn't know you could specify /sys stuff and/or how to do it.
I'm sure it makes sense to someone and is documented somewhere...
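For what it's worth, the sysctl(8) man page says the available parameters are those listed under /proc/sys, so as far as I can tell sysctl.conf keys never reach /sys. The sketch below shows how a dotted key is mapped to a file; the function name is hypothetical, and it ignores the alternative slash-separated key form. If that is right, the clocksource key suggested above would be looked up under /proc/sys/devices/..., which is a different tree from /sys/devices/..., so the setting would have to be made persistent some other way (for example the clocksource= kernel boot parameter).

```python
from pathlib import PurePosixPath


def sysctl_key_to_path(key: str) -> PurePosixPath:
    """Map a dotted sysctl key to the /proc/sys file it refers to."""
    return PurePosixPath("/proc/sys").joinpath(*key.split("."))


# A typical sysctl.conf key lands under /proc/sys as expected.
print(sysctl_key_to_path("net.ipv4.ip_forward"))

# The suggested clocksource key also lands under /proc/sys, not /sys.
print(sysctl_key_to_path(
    "devices.system.clocksource.clocksource0.current_clocksource"
))
```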
PS the system in question has kept perfect time since the last post and even updated its drift file (-16).