ntpdc 'sysinfo' output inconsistent with ntpq -p output

Post by Jan Ceulee » Wed, 26 Jul 2006 01:02:35



Just to avoid it being missed: a similar issue exists with ntptrace,
which is a perl script that relies on ntpq -n -c rv being consistent
across the trace.

An example of the contrary:

# ntptrace localhost ; ntpq -p localhost
localhost: stratum 1, offset 0.000087, synch distance 0.461839, refid '
'
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 LOCAL(0)        .LOCL.          10 l   52   64  377    0.000    0.000   0.001
 GENERIC(0)      .DCFa.           0 l  976   64    0    0.000   -2.266   0.001
+cerber.obs.coe. 145.238.110.68   3 u  386 1024  377   67.058    0.866   0.244
*chronos.zedat.f .GPS.            1 u  363 1024  377   31.280    0.905   0.440
-salukes.opensou 185.55.101.136   2 u  324 1024  377   14.834    2.596   1.284
-217.71.122.144  80.190.252.238   3 u  327 1024  377   15.051   -4.640   3.759
+ntp1.belbone.be 195.13.23.250    2 u  337 1024  377   10.421    1.627   0.385
-time.ijs.si     193.2.4.2        2 u  331 1024  377   50.810    0.121   0.045
-bear.zoo.bt.co. 194.81.227.227   2 u  327 1024  375   23.634    5.755   1.546
 skr03.xperim.be 192.168.1.1      2 u  379 1024  377    0.760   -3.400   5.686


This shows that ntptrace (and hence ntpq -n -c rv localhost) is
inconsistent with ntpq -p localhost. This is perfectly reproducible on
my system. Note that in this case the refclock last produced a valid
sample 976 seconds ago.
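
A minimal sketch of the cross-check being described, in Python: find the peer that ntpq -p tags with '*' and derive the stratum the system variables ought to report (the sys_peer's stratum plus one). The function names are made up for illustration, the sample is a cut-down copy of the billboard above, and the whitespace split assumes the refid column contains no spaces, which may not hold once a refid is garbled.

```python
# Hypothetical helper names; the sample rows are copied from the billboard above.
SAMPLE_BILLBOARD = """\
 LOCAL(0)        .LOCL.          10 l   52   64  377    0.000    0.000   0.001
 GENERIC(0)      .DCFa.           0 l  976   64    0    0.000   -2.266   0.001
+cerber.obs.coe. 145.238.110.68   3 u  386 1024  377   67.058    0.866   0.244
*chronos.zedat.f .GPS.            1 u  363 1024  377   31.280    0.905   0.440
"""


def selected_peer(billboard):
    """Return (name, stratum) of the row tagged '*', or None if there is none."""
    for line in billboard.splitlines():
        if line.startswith("*"):
            fields = line[1:].split()
            return fields[0], int(fields[2])
    return None


def expected_system_stratum(billboard):
    """A host synchronised to a stratum-n peer should itself report n + 1."""
    peer = selected_peer(billboard)
    return None if peer is None else peer[1] + 1


if __name__ == "__main__":
    print(selected_peer(SAMPLE_BILLBOARD))
    print(expected_system_stratum(SAMPLE_BILLBOARD))
```

For the billboard above the expected system stratum is 2, whereas ntptrace reported stratum 1.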

Note also that this issue causes ntptrace to incorrectly output the
refid. In this particular case what is output seems to be a line feed,
but sometimes it's a bit messier than that (e.g. Ctrl-E, causing my ssh
client to output its ID string (PuTTY)).

Cheers, Jan
 
 
 

Post by Richard B. » Wed, 26 Jul 2006 01:17:10


Ntpd can select a synchronization source very quickly, especially if you
use "iburst" on your server lines in ntp.conf.

It can take far longer for ntpd to bring your local clock into close
agreement with UTC where "close" means < 20 milliseconds.

You can reduce the time required by using the -g option when you start
ntpd. That will cause ntpd to set the clock unconditionally to
something close to the correct time. Ntpd will still require some time
to adjust the speed of your local clock so that it neither gains nor
loses time.

A "cold" start (no drift file) of ntpd can take many hours to reach a
stable state with both the correct time and the correct frequency
adjustment. A warm start is faster but can still require twenty or
thirty minutes to beat your local clock into submission. :-)

You can probably see similar results from ntpq and ntpdc in far less
time than thirty minutes. As nearly as we could tell, you had allowed
less than two minutes since startup.

 
 
 

Post by Harlan Ste » Wed, 26 Jul 2006 07:44:43

We could use a maintainer for ntptrace...

H
 
 
 

Post by Harlan Ste » Wed, 26 Jul 2006 07:48:13

>>> In article < XXXX@XXXXX.COM >, XXXX@XXXXX.COM (Tripathi, Anurag) writes:

Anurag> Here is ntpq -crv output: ---- assID=0 status=c624 sync_alarm,
Anurag> sync_ntp, 2 events, event_peer/strat_chg, version="ntpd 4.2.0a"?,
Anurag> processor="i686", system="Linux/2.6.13.4-ws-symbol", leap=11,
Anurag> stratum=16, precision=-20, rootdelay=0.000, rootdispersion=10.965,
Anurag> peer=17460, refid=INIT, reftime=00000000.00000000 Thu, Feb 7 2036
Anurag> 11:58:16.000, poll=6, clock=0xc86f5eb1.a257a786, state=3,
Anurag> offset=9.887, frequency=0.000, noise=3.628, jitter=2.328,
Anurag> stability=0.000 ------

You are in state 3, which means you are not sync'd. Therefore, in this
case, ntpdc seems to be stating correct info.
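
The status word at the start of the rv output packs the leap indicator, clock source, event count and last event code into 16 bits (the system status word layout from RFC 1305, Appendix B). A small Python sketch that unpacks it; the function name and dictionary keys are made up for illustration.

```python
def decode_system_status(status_hex):
    """Split the 16-bit system status word into its four fields."""
    word = int(status_hex, 16)
    return {
        "leap": (word >> 14) & 0x3,         # top 2 bits: leap indicator
        "source": (word >> 8) & 0x3F,       # next 6 bits: clock source
        "event_count": (word >> 4) & 0xF,   # next 4 bits: events since last read
        "event_code": word & 0xF,           # low 4 bits: most recent event
    }


if __name__ == "__main__":
    print(decode_system_status("c624"))
```

status=c624 decodes to leap 3 (the alarm condition, matching leap=11), source 6, 2 events and event code 4, which agrees with the text ntpq printed: sync_alarm, sync_ntp, 2 events, event_peer/strat_chg.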

Just because ntpq shows that you are sync'd to the remote system does not
mean that your ntpd is ready to announce to other systems that it is stable.

Anurag> As suggested earlier after 30mins 'ntpdc sysinfo' starts showing the
Anurag> correct information i.e. 'synched'.

By this time I bet ntpq -crv will show that your ntpd is in state 4, which
is sync'd.

Anurag> But isn't 30mins too long a delay for show an information which ntpq
Anurag> declares within seconds?

Just to be thorough, I believe you are seeing the difference between "ntpd
is not in state 4 but knows who to believe" and "ntpd is in state 4 and
knows who to believe".

H
 
 
 

Post by Jan Ceulee » Wed, 26 Jul 2006 23:57:27

Danny Mayer wrote:


Danny,

Referring back to the example in my earlier message, the following was
the only line output by 'ntptrace localhost' (because it thought that
localhost was at stratum 1 and therefore synchronised to its refclock):

localhost: stratum 1, offset 0.000087, synch distance 0.461839, refid '
'

The refid is supposed to be displayed in single quotes, which here seems
to be an LF (newline). It displays like that even in an 80-column
window; this is not a result of newsgroup word-wrapping.

This shows that the ntpq -n -c rv output had not yet caught on to the
fact that the refclock had not yielded a valid sample for 976 seconds
and that another server has been selected as the syspeer:

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 LOCAL(0)        .LOCL.          10 l   52   64  377    0.000    0.000   0.001
 GENERIC(0)      .DCFa.           0 l  976   64    0    0.000   -2.266   0.001
+cerber.obs.coe. 145.238.110.68   3 u  386 1024  377   67.058    0.866   0.244
*chronos.zedat.f .GPS.            1 u  363 1024  377   31.280    0.905   0.440
-salukes.opensou 185.55.101.136   2 u  324 1024  377   14.834    2.596   1.284
-217.71.122.144  80.190.252.238   3 u  327 1024  377   15.051   -4.640   3.759
+ntp1.belbone.be 195.13.23.250    2 u  337 1024  377   10.421    1.627   0.385
-time.ijs.si     193.2.4.2        2 u  331 1024  377   50.810    0.121   0.045
-bear.zoo.bt.co. 194.81.227.227   2 u  327 1024  375   23.634    5.755   1.546
 skr03.xperim.be 192.168.1.1      2 u  379 1024  377    0.760   -3.400   5.686


As I said, this is reproducible. I've just run the same couple of
commands again (ntptrace localhost; ntpq -p localhost) and this is the
output:

localhost: stratum 1, offset 0.000497, synch distance 0.449679, refid '

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 LOCAL(0)        .LOCL.          10 l   24   64  377    0.000    0.000   0.001
 GENERIC(0)      .DCFa.           0 l  113   64  164    0.000  1262304 1262304
-roxane.home-dn. 130.149.17.8     2 u  779 1024  377   13.128    4.907   1.264
-195.244.96.13   235.190.183.255  2 u  924 1024  377   17.820   40.431  20.218
*n3.surbl.org    193.0.4.7        2 u  787 1024  377   10.977    1.743   2.962
-host-213-189-18 98.221.144.0     2 u  778 1024  377   13.678   17.895  10.933
+tilia.zsx.hu    192.53.103.108   2 u  774 1024  377   36.650    0.513   0.541
-ntp1.belbone.be 195.13.23.250    2 u  770 1024  377    9.894    2.941   2.934
-time.ijs.si     193.2.4.2        2 u  780 1024  377   50.661    4.039   1.605
+bear.zoo.bt.co. 130.149.17.8     2 u  789 1024  377   24.860    0.708   2.960
-skr03.xperim.be 192.168.15.6     5 u  779 1024  377    0.741  -19.073  58.959

Note that the IP address of the current sys_peer is 213.222.11.222
(which is what gets output in binary as the refid).

One more data point: here is the output of 'ntpq -p -c rv localhost'
which ntptrace turned into the above output:

assID=0 status=06f4 leap_none, sync_ntp, 15 events, event_peer/strat_chg,
version="ntpd XXXX@XXXXX.COM Sun Jul 9 11:18:35 UTC 2006 (1)",
processor="i686", system="
 
 
 

Post by David L. M » Thu, 27 Jul 2006 03:58:47

Richard,

At cold start (no frequency file) the daemon sets the clock at the first
update, then waits 15 minutes, measures and corrects the frequency
(usually to within 1 PPM) and assumes normal operation. This saves the
many hours otherwise needed to converge the frequency to within 1 PPM,
especially if the intrinsic frequency error is large, like 200 PPM.
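
To put rough numbers on the training period: an uncorrected frequency error of 200 PPM accumulates about 180 ms of offset over the 15-minute wait, and dividing the accumulated offset by the interval recovers the frequency error. The Python below is only that arithmetic under a constant-drift assumption, not ntpd's actual discipline algorithm; the function names are illustrative.

```python
def accumulated_offset(freq_error_ppm, interval_s):
    """Offset in seconds gained over interval_s at a constant frequency error."""
    return freq_error_ppm * 1e-6 * interval_s


def estimated_frequency_ppm(offset_s, interval_s):
    """Frequency error in PPM implied by offset_s accumulated over interval_s."""
    return offset_s / interval_s * 1e6


if __name__ == "__main__":
    interval = 15 * 60  # the 15-minute wait described above, in seconds
    drift = accumulated_offset(200, interval)
    print(f"{drift * 1000:.0f} ms after {interval} s")
    print(f"{estimated_frequency_ppm(drift, interval):.0f} PPM")
```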

Until the frequency is corrected, the machine can have serious offset
and frequency errors that would drive dependent clients nuts. Up to now,
the leap bits and stratum were not initialized until the frequency was
corrected. However, the clock was set and local clients were free to
believe or not believe the clock. After review, the external behavior is
only mildly worse than without the initial training period, so the leap
bits and stratum are now set at the first update. The billboards should
now be consistent with the Principle of Least Astonishment.

Having revealed thus, the billboards on both ntpq and ntpdc should be
the same, since they are derived from the same data. However, ntpdc has
not been properly maintained for many years and there could well be bugs
in that program. If ntpq and ntpdc disagree, use ntpq.

If the kernel discipline is available and the frequency file is not
present, the kernel offset and frequency should be set to zero (ntptime
-o 0 -f 0) first. This is the approved way to restart from scratch.

Dave

Richard B. Gilbert wrote:
 
 
 

Post by Jan Ceulee » Sat, 29 Jul 2006 02:09:14


Does anyone else experience this behaviour? I thought reporting it here
would cause it not to be missed (along with the original problem
reported by Tripathi Anurag). This thread appears to be petering out
without any unequivocal statements on this...

I don't want to open a bug report unless this really is a bug, rather
than an artefact of my particular system (but I will if asked).
 
 
 

Post by Harlan Ste » Sat, 29 Jul 2006 05:41:12

There is no maintainer for ntptrace, so bugs get fixed there when somebody
feels like working on it.

The "default" maintainer is me, and I have over 100 bugs in my queue.
I work on the bugs that are blocking the "upcoming" release when I am not
working on release issues. And I'm still a volunteer and do this work in my
spare time.

H
 
 
 

Post by Ronan Floo » Sat, 29 Jul 2006 22:22:17

On Tue, 25 Jul 2006 16:57:27 +0200,



It's not a problem with ntpq either, looking at that. If ntpd is
reporting itself as stratum-1, the refid should be the text tag of
its refclock. If ntpd has switched sys_peer from the refclock to
a server, it shouldn't be reporting itself as stratum-1 ...

--
Ronan Flood < XXXX@XXXXX.COM >
working for but not speaking for
Network Services, University of London Computer Centre
(which means: don't bother ULCC if I've said something you don't like)
 
 
 

Post by Jan Ceulee » Sun, 30 Jul 2006 03:45:59

Harlan,



This is worrying: you do not seem to have caught on to the fact that this
issue is caused by ntpq -n -c rv being inconsistent with ntpq -p.

Let me say that again: ntpq is inconsistent with ntpq.

The ntptrace issue is just a symptom.

I suspect that this is another form of the problem mentioned in the
subject of this thread.


Thank you (x1000). Really!

But this is also worrying: from what you say the ntpd project is in a
constant state of fire-fighting. Releases are driven by a lack of
blocking issues, rather than by a vision of what new or improved
functionality is needed that outweighs the risks of deployment.

I would call upon the ICT industry to lend a hand, since time
synchronisation is, frankly, too important to the industry to be left to
volunteers who are very understandably forced to compromise.

My employer is a bit busy merging right now, or I'd try and get some
support going. Has anyone tried talking to industry federations?

Cheers, Jan
 
 
 

Post by Harlan Ste » Sun, 30 Jul 2006 08:25:09

I was responding to an item in that thread that discussed ntptrace.

At least, I thought so...

As for the ntpdc sysinfo vs. ntpq -p issue, I thought I carefully described
my observation of one of the detailed reports earlier - in that case ntpq
was reporting it was sync'd to the remote server but it was still in state
3, and the sysinfo command showed that the local machine was not yet sync'd.

H
 
 
 

Post by Ronan Floo » Tue, 08 Aug 2006 23:04:03


I can confirm this problem with ntpd-4.2.2, as I'm seeing this on one of
our GPS receivers which is having intermittent reception at the moment.
Here are ntpq -np outputs I happen to have logged from one of its peers
at consecutive two-minute intervals this morning:

     remote           refid      st t when poll reach   delay   offset  jitter

+193.62.22.74    .GPS.            1 u   20   64  377    5.780    0.087   0.055

-193.62.22.74    .<C1>>^Vb.       1 u   12   64  377    5.780    0.087   0.043

+193.62.22.74    .<C1>>^Vb.       1 u    2   64  377    5.834    0.049   0.031

-193.62.22.74    192.36.134.17    2 u   57   64  376    5.834    0.049   0.044

(<C1>>^Vb is textified ASCII of 193.62.22.98, another peer)
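
A quick Python check of that mapping, assuming the viewer shows control bytes as ^X and bytes of 0x80 and above in <XX> hex form (which is what the log above looks like); the function name is hypothetical.

```python
import ipaddress


def textify_refid(ip):
    """Render the four bytes of an IPv4 refid as a text viewer might show them."""
    out = []
    for b in ipaddress.IPv4Address(ip).packed:
        if b < 0x20:
            out.append("^" + chr(b + 0x40))   # control byte, caret notation
        elif b == 0x7F:
            out.append("^?")                  # DEL
        elif b < 0x7F:
            out.append(chr(b))                # printable ASCII, shown as-is
        else:
            out.append(f"<{b:02X}>")          # high byte, shown as hex
    return "".join(out)


if __name__ == "__main__":
    print(textify_refid("193.62.22.98"))
```

193 is 0xC1, 62 is '>', 22 is Ctrl-V and 98 is 'b', so 193.62.22.98 comes out as <C1>>^Vb, matching the refid column in the two middle samples.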

Time to log a bug?

--
Ronan Flood < XXXX@XXXXX.COM >
working for but not speaking for
Network Services, University of London Computer Centre
(which means: don't bother ULCC if I've said something you don't like)