
Conference help::decnet-osi_for_vms

Title:DECnet/OSI for OpenVMS
Moderator:TUXEDO::FONSECA
Created:Fri Feb 22 1991
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:3990
Total number of notes:19027

3975.0. "time-provider and DTSSservers on LAN" by NNTPD::"fruehwirth@mail.dec.com" (Martin Fruehwirth) Tue May 27 1997 16:21

Xposted in DTSS #623 and DECNET-OSI_for_VMS

hi,

i'm having trouble solving a DTSS problem.

config:

four DTSSservers:
one DTSSserver equipped with a time-provider,
and three DTSSservers without one, all on the same LAN.

problem:

the three DTSSservers synchronize with each other, but all
three together drift away from the DTSSserver with the time-provider.



all four DTSSservers are configured as NONCOURIER (no WAN at all).

the time of the DTSSserver with the time-provider is correct -
        - checked against the PTT telephone time service.

the idea is to have a working DTSSserver environment in the case
that the time-provider hardware breaks.

there is one comment from my customer which i don't understand
and can't verify:
the whole configuration is said to have worked fine until the change from
winter time to daylight-saving time last march.

what they have done as a workaround is configure a
server/clerk environment, where the only DTSSserver is the one with
the time-provider.

question:

is the desired config valid?
what could be wrong?

thanks in advance for your help
martin

[Posted by WWW Notes gateway]
T.R     Title     User     Personal Name     Date     Lines
3975.1. "Yes, something is strange..." by STKHLM::WEBJORN Wed May 28 1997 15:18, 19 lines
    
    This has also been observed at DAGAB, where time is provided from a
    GPS receiver. When the receiver drops satellites, all servers drift
    away; when the correct time comes back, the server quickly leaves
    the other gang of three, so that the times do not intersect.
    
    They cannot follow the abrupt change from bad consensus time
    to good provider time.
    
    We have been unable to manipulate the server setting so that
    the problem goes away, without setting unreasonable numbers
    on the non-provider servers.
    
    
    Very interested to hear what happens with your case...
    
    
    Gullik
    
3975.2. "your solution/workaround?" by UTOPIE::FRUEHWIRTH_M Wed May 28 1997 21:07, 11 lines
hi Gullik, 
   
>    We have been unable to manipulate the server setting so that
>    the problem goes away, without setting unreasonable numbers
>    on the non-provider servers.
    
what was (is) your solution?
having one server (with time-provider) and all the others as clerks?

best regards
martin
3975.3. "re:This has also been observed at DAGAB..." by TWICK::PETTENGILL (mulp) Thu May 29 1997 03:38, 18 lines
I'd like to see the server trace files used to generate graphs for a case
when this happens.

The only way that this can happen is that the clocks on the servers are
drifting faster than the spec'd inaccuracy.

This is based on the theory behind DTSS.

And the theory is backed up by many years of managing a set of time servers
where the external time provider occasionally fails.  There were periods
where the global time servers were afu and reporting times that were wildly
offset from true with huge inaccuracies and other times where a number
of the global time servers were faulty.  At no time have I seen the local
servers fail to intersect with the server(s) with external time providers.

In fact, since my long-time time provider is the dialup service ACTS,
my TP enables and disables itself dynamically, so the TP is coming and
going frequently.
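
The claim above (intervals can only fail to intersect if a clock drifts
faster than its spec'd inaccuracy allows) can be illustrated with a minimal
sketch. This is not DTSS code: the function names, the 0.2 s base inaccuracy
and the drift figures are all assumptions chosen for illustration. It models
one reference server whose clock is correct and one free-running server whose
announced inaccuracy grows at the spec'd drift rate.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    lo: float  # earliest possible true time, in seconds
    hi: float  # latest possible true time, in seconds


def intersects(a: Interval, b: Interval) -> bool:
    """Two closed intervals overlap if each starts before the other ends."""
    return a.lo <= b.hi and b.lo <= a.hi


def still_intersects(actual_drift, spec_drift, elapsed, base_inaccuracy=0.2):
    """Hypothetical model: does a free-running server still intersect the
    reference server `elapsed` seconds after both were last in sync?

    actual_drift: how fast the clock really drifts (seconds per second)
    spec_drift:   the drift the server assumes when growing its inaccuracy
    """
    true_time = float(elapsed)
    # Reference server (with TP): clock is correct, interval is +/- base.
    reference = Interval(true_time - base_inaccuracy,
                         true_time + base_inaccuracy)
    # Free-running server: the clock has moved by actual_drift * elapsed,
    # while the announced inaccuracy has grown by spec_drift * elapsed.
    clock = true_time + actual_drift * elapsed
    inaccuracy = base_inaccuracy + spec_drift * elapsed
    local = Interval(clock - inaccuracy, clock + inaccuracy)
    return intersects(reference, local)


six_hours = 6 * 3600
print(still_intersects(40e-6, 50e-6, six_hours))   # drift within spec: True
print(still_intersects(200e-6, 50e-6, six_hours))  # drift beyond spec: False
```

As long as the real drift stays at or below the assumed drift, the growing
interval always contains the true time, so it cannot miss the reference
interval. Only a clock that drifts faster than assumed can escape.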
3975.4. by COMEUP::SIMMONDS (loose canon) Fri May 30 1997 03:29, 5 lines
    I agree with .3  .. mayhaps the Inaccuracy of the Server with TP attached
    never gets reduced when the local TP device 'recovers' after failure..
    (programming error in the DTSS$PROVIDER being used)
    
    John.
3975.5. "Observed behaviour not intuitive." by STKHLM::WEBJORN Fri May 30 1997 09:15, 40 lines
    re: .2
    
    We have not been able to solve this properly yet.
    
    The customer wants 4 time servers to be able to take one down without
    getting below 'servers required = 3' default.
    
    That way other departments can run 'stock' decnet+
    
    Clocks on the machines also seem to drift excessively. This requires 
    the customer to schedule a 'sync dtss set clock true' frequently,
    so that if a server is rebooted, the hardware clock will be within
    a reasonable accuracy. The provider gives a time +/- 6 ms at the server.
    
    Announcing accuracy with +/- 200 ms gives too small an interval.
    This means that even if the customer knows time is +/- 6 ms,
    some *very much larger* inaccuracy must be used (10-20 SECONDS).
    
    Otherwise, the drift when GPS time is not available becomes
    large enough that servers will not intersect, and hence not resync.
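
    A back-of-envelope sketch of why the small interval fails and the
    large one does not. This is only arithmetic, not DTSS behaviour: it
    assumes the announced inaccuracies stay fixed and that the two clocks
    drift apart at a constant relative rate. The 100 ppm figure is a
    made-up example value, not a measurement from this site.

```python
def seconds_until_disjoint(inaccuracy_a, inaccuracy_b, relative_drift):
    """Time until two fixed-width intervals stop overlapping.

    inaccuracy_a, inaccuracy_b: half-widths of the two intervals (seconds)
    relative_drift: rate at which the two clocks move apart (s per s)

    The intervals overlap while the clock offset is no larger than the
    sum of the two half-widths, so the limit is reached after
    (a + b) / drift seconds.
    """
    if relative_drift <= 0:
        return float("inf")  # clocks never move apart in this model
    return (inaccuracy_a + inaccuracy_b) / relative_drift


drift = 100e-6  # hypothetical 100 ppm relative drift

# Both sides announce +/- 200 ms.
print(seconds_until_disjoint(0.2, 0.2, drift) / 3600)    # about 1.1 hours

# Both sides announce +/- 10 s.
print(seconds_until_disjoint(10.0, 10.0, drift) / 3600)  # about 55.6 hours
```

    Under these assumptions a GPS outage of a few hours is enough to
    separate +/- 200 ms intervals, while +/- 10 s intervals survive for
    days.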
    
    We thought that if the server WITH provider announced time with
    200 ms, and the other 3 servers were set up with a larger
    inaccuracy, the gang-of-three would sync up when they found
    out that the 'good' server was back, and weight its time
    estimate higher, pulling all 3 back in line.
    
    ( Does a 'better' accuracy weigh better in the calculations ??? )
    
    The TP program switches between 2 inaccuracies depending on whether
    the GPS 'REAL' time flag is set. Currently good accuracy is set at
    200 ms and bad accuracy ( GPS freewheeling ) is set at +/- 5 s.
    
    If and when the good server dropped out, all four would drift SLOWLY
    due to the 'mass' of the four servers. We don't understand why this
    strategy fails.
    
    Gullik
3975.6. "Ok, let's see if we can diagnose the problem here..." by TWICK::PETTENGILL (mulp) Sat May 31 1997 00:34, 35 lines
What you are describing is counter to my experience.

I've written a number of notes about how DTSS works in the DTSS conference
and the DTSS documentation seems pretty good to me, so let's get some
data for me to look at.

$run SYS$COMMON:[SYSHLP.EXAMPLES.DTSS]DTSS$GRAPH will give you some
instructions, but I will make my request specific:

Issue the following command on the system with the TP and at least one
server, preferably all of them:

  $mc ncl set dtss synch trace true

This will start a log file, although it will take a while before it
appears.

Ideally, this would run for say six hours, then the GPS TP is disabled
for 6-12 hours, and then the GPS TP is re-enabled.  Even more ideal,
the problem you describe would have occurred.

During this time, there should be no "$mc ncl set dtss" commands issued.

At the end of at least 24 hours, send me the sys$manager:dtss$inacc.log
file from each of the servers, being sure to label each log file according
to the server that it came from.  I will also need the Ethernet addresses
of each system.

For a bonus, you would include a DECnet event log for all the DTSS server
systems.

Or alternately, you could send me the event logging for all the DTSS servers.
I've lost the command file that I've used previously, so I'll create a
new set of NCL commands to collect the DTSS events from all the systems
in one log file and post them later.
3975.7. "power down the timeproviderclock itself?" by UTOPIE::FRUEHWIRTH_M Tue Jun 03 1997 16:10, 18 lines
re .-1

>Ideally, this would run for say six hours, then the GPS TP is disabled
>for 6-12 hours, and then the GPS TP is re-enabled.  Even more ideal,
>the problem you describe would have occurred.

Should we manually power down the timeproviderclock itself,
or does it mean that i should disable the whole DTSS on the DTSSserver
which has the timeproviderclock attached?


>Or alternately, you could send me ...

I will supply you with the requested information as soon as possible;
it depends on the customer ...

best regards
martin
3975.8. by PISGAH::PETTENGILL (mulp) Wed Jun 04 1997 04:28, 10 lines
    >Should we manually power down the timeproviderclock itself,
    >or does it mean that i should disable the whole DTSS on the DTSSserver
    >which has the timeproviderclock attached?
    
    
    Just disconnect or disable the timeprovider software or hardware.
    
    One of the points of doing this is to see how this system's time
    behaves relative to the other systems in the LAN.  Disabling the
    server would defeat the whole purpose.
3975.9. by COMEUP::SIMMONDS (loose canon) Wed Jun 04 1997 07:46, 14 lines
    Re: .5
    
|    ( Does a 'better' accuracy weigh better in the calculations ??? )
    
    Certainly, provided the interval with low inaccuracy intersects with
    intervals from a majority of other Servers.
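
    A rough sketch of that idea, assuming a Marzullo-style interval
    intersection; the actual DTSS computation may differ in detail. The
    function name and the sample intervals are made up for illustration.
    It finds the region covered by the largest number of intervals, as
    seen by a Server without a TP.

```python
def best_intersection(intervals):
    """Return (lo, hi, count): the region covered by the most intervals.

    intervals: list of (lo, hi) pairs, one per server, in seconds.
    """
    edges = []
    for lo, hi in intervals:
        edges.append((lo, -1))  # -1 marks a start; sorts before an end
        edges.append((hi, +1))  # +1 marks an end
    edges.sort()

    best = count = 0
    best_lo = best_hi = None
    for i, (offset, kind) in enumerate(edges):
        count -= kind  # a start raises the overlap count, an end lowers it
        if count > best:
            best = count
            best_lo = offset
            best_hi = edges[i + 1][0]  # region runs to the next edge
    return best_lo, best_hi, best


gang_of_three = [(9.0, 11.0), (9.5, 11.5), (8.8, 10.8)]

# A narrow TP interval that intersects the others narrows the result.
print(best_intersection(gang_of_three + [(9.9, 10.1)]))   # (9.9, 10.1, 4)

# A narrow TP interval that misses the others is simply left out.
print(best_intersection(gang_of_three + [(20.0, 20.2)]))  # (9.5, 10.8, 3)
```

    In the first case the low-inaccuracy interval dominates the result;
    in the second it has no influence at all, however small its
    inaccuracy is.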
    
    Btw, a Server with a TP will not query other Servers' times when it
    synchronizes.. its Inaccuracy is under your TP program's control.
    
    You're lucky to receive a helping hand from Mike..  resolution
    shouldn't be too far off once he gets your synch. trace...
    
    John.
3975.10. "traces -> next week" by UTOPIE::FRUEHWIRTH_M Wed Jun 04 1997 09:30, 9 lines
>    You're lucky to receive a helping hand from Mike..  resolution
>    shouldn't be too far off once he gets your synch. trace...
    
i will supply you with the trace files during the next week
(my customer is low on manpower due to the holiday season).

best regards
martin