
Conference azur::mcc

Title:DECmcc user notes file. Does not replace IPMT.
Notice:Use IPMT for problems. Newsletter location in note 6187
Moderator:TAEC::BEROUD
Created:Mon Aug 21 1989
Last Modified:Wed Jun 04 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:6497
Total number of notes:27359

1787.0. "Alarming questions..." by DUNDEE::CLEARY (A deviant having fun...") Mon Nov 11 1991 06:58

    I have two questions about alarms.  I am using BMS V1.1 with the
    patches from this conference and am using workarounds like adding a
    dummy dynamic domain to avoid the notification disabled mess.
    
    The first problem I have is that once an alarm is triggered it appears
    to trigger continuously rather than once each polling interval.  The
    batch queue fills at a great rate of knots with alarm jobs.  The alarm
    contains AT EVERY 0:10:0 as the interval.  Has anyone else seen this or
    am I being really stupid ?
    
    The second problem is to do with polling for reachability of DECnet
    phase IV nodes.  I have a rule which asks a routing node if the state
    of the target node is reachable.  This is based on the sample rule
    supplied in the kit.  When I enable all the alarms (using the batch job
    hack) I exhaust the logical links available on the router and finish up
    with half the alarms disabled.  Does anyone have a workaround for this
    one ?
    
    desperately seeking solace,
    -mark
T.R  Title  User  Personal Name  Date  Lines
1787.1  Use the Event collection capability  TOOK::CAREY  Tue Nov 12 1991 11:59  15
    
    Set up DECmcc on your node as an event sink for the reachability changes
    or adjacency down events from the routing node.
    
    Reachability takes a long time to change on a large network; on a LAN,
    adjacency down events from a router give you much quicker response.
    
    Look in the DECnet Phase IV Use manual and use the sample EVL startup
    command file (something like SYS$STARTUP:MCC_STARTUP_DNA4_EVL.COM) for
    some hints on setting up MCC as the event repository and dumping events
    from the routing node into your DECmcc.
    
    This will free up the logical links and give you more rapid response.
    
    
1787.2  Need more to help identify the problem  TOOK::ORENSTEIN  Tue Nov 12 1991 17:13  9
    >> The first problem I have is that once an alarm is triggered it appears
    >> to trigger continuously rather than once each polling interval.
                                                                    
    I have never seen this happen.  Could you please set up your
    environment the way you see this happening, use the LOG sample
    command procedures on all your rules, trigger your alarm, then
    send me the logs you accumulate and also a DIR/DAT of the logs.
    
    aud...
1787.3  Progress..  DUNDEE::CLEARY  A deviant having fun..."  Wed Nov 13 1991 21:14  48
    re .1
    
    I modified all the rules to start at different times - they are
    currently staggered by 30 seconds.  This is not too bad for V1.1 where
    you have to create individual rules but not viable for V1.2 where wild
    card entities can be used.  Do we need to solve this problem or warn
    people about it?  A solution would be preferable since a burst of
    alarms all at once will cause performance problems on the management
    system.  I can just see someone creating a generic rule to check the
    reachability of all 1000 nodes in our area.
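
    The staggering described above can be sketched in a few lines.  This
    is only an illustration of the arithmetic, not DECmcc syntax: the
    function name, the rule names and the 10 minute polling interval are
    assumptions made for the example.  Offsets wrap around within one
    polling interval, so 1000 rules end up spread over 20 start slots
    instead of firing all at once.

```python
from datetime import timedelta


def staggered_offsets(rule_names,
                      step=timedelta(seconds=30),
                      interval=timedelta(minutes=10)):
    """Assign each rule a start offset, wrapping within one polling interval.

    Hypothetical helper: it only computes offsets and does not talk to DECmcc.
    """
    # Number of distinct start slots that fit into one polling interval.
    slots = max(1, interval // step)
    return {name: step * (i % slots) for i, name in enumerate(rule_names)}


if __name__ == "__main__":
    # Illustrative rule names only.
    rules = [f"reach_node_{n}" for n in range(1, 6)]
    for name, offset in staggered_offsets(rules).items():
        print(name, offset)
```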
    
    I have played around with using phase IV events and will probably go
    that way eventually.  This is a LAN only network so reachability and
    adjacency down will be about as fast as each other.  
    
    Re .2
    
    I made substantial progress yesterday.  The creation dates for the
    rules all appear to be 11 days 9 hours plus or minus a few minutes in
    the past.  I don't know how MCC gets confused about the time but it
    appears to be the source of the problem.  A simple rule with no
    schedule defined which will always evaluate true (snmp hub6 synoptics
    s3000chassis s3chassisfanstatus=OK) should run from NOW till FOREVER
    every 15:00 minutes.  It runs about 3 times each second.
    
    MCC_TDF was defined as "+10" which is the correct offset.  I played around
    with this and found values like "+0-10:0:0" caused major havoc.  MCC
    couldn't get at the MIR at all.  Eventually I used "+10:0:0" and
    suddenly alarms work as expected.  I'm mystified, but then I expect
    nothing less from MCC :-).  I guess KITINSTAL.COM should check the
    syntax of the TDF if getting it wrong is going to cause such havoc.
    
    Some history.
    
    This system was running DECdts T1.0 but I disabled that and rebooted
    before trying the above.  That didn't help.  Earlier still, I also had
    incredible trouble getting DNS to work.  It had been installed
    previously and I wanted to change the node name and address so I tried
    to re-install the name server.  This went well except that the IVP
    failed with an `unable to talk to server' type error.  No further
    error status or explanation was given.  I gather that this can be caused by
    time going backwards but I don't know how that could happen in a new
    installation.  Eventually I deleted all the DNS$*.* files, re-installed
    the client files by copying them from another system then installed the
    server and things started to work.  This took three days of screwing
    around to fix.  Par for the course with DECdns :-(
    
    -mark
1787.4  Please use the correct syntax for specifying MCC_TDF  TOOK::GUERTIN  Don't fight fire with flames  Thu Nov 14 1991 08:52  12
    I believe for V1.1 the "true" MCC_TDF syntax is
    		"[+|-][dd ]hh:mm"
    
    			dd=days
    			hh=hours
    			mm=minutes
    
    Please use "10:00" or "+10:00"
    
    Without the colon, "+10" may be interpreted as 10 days.
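
    A strict check of that syntax might look like the sketch below.  It
    is a hypothetical validator written for illustration, not the code
    DECmcc uses: the regular expression and the name parse_tdf are
    assumptions.  The point is that a value without the colon, such as
    "+10", is rejected outright instead of being quietly read as days.

```python
import re
from datetime import timedelta

# Assumed pattern for "[+|-][dd ]hh:mm": optional sign, optional days, hh:mm.
_TDF = re.compile(r"([+-])?(?:(\d{1,2}) )?(\d{1,2}):(\d{2})")


def parse_tdf(text):
    """Return the offset as a timedelta, or raise ValueError on bad syntax."""
    m = _TDF.fullmatch(text)
    if m is None:
        raise ValueError(f"TDF {text!r} does not match [+|-][dd ]hh:mm")
    sign, days, hours, minutes = m.groups()
    delta = timedelta(days=int(days or 0), hours=int(hours), minutes=int(minutes))
    return -delta if sign == "-" else delta
```

    With this sketch, parse_tdf("+10:00") and parse_tdf("10:00") both
    give a 10 hour offset, while parse_tdf("+10") raises ValueError.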
    
    -Matt.
1787.5  yes, but...  DUNDEE::CLEARY  A deviant having fun..."  Fri Nov 15 1991 22:41  21
    re .4
    
    That's also the conclusion I came to, but I guess I was too subtle
    about the implications.
    
    Unless I am missing something fundamental, there must be a bug in the
    way DECmcc handles time.  It should not matter what value the tdf has
    as long as it is constant.  DECmcc's internal notion of time is
    presumably UTC and the TDF is used to convert to and from local system
    time.  If the result of converting a system time to UTC, adding a 10
    minute offset then converting back to system time is off by 11 days and
    9 hours then there is either a bug in the UTC routines or I am way off
    base.  
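
    The argument can be checked with a little arithmetic.  The sketch
    below assumes the simple model described above (local time minus TDF
    gives UTC, and the same TDF is added back afterwards); the function
    name is made up for the example and says nothing about how DECmcc is
    actually implemented.  Under that model the TDF cancels out, so even
    a nonsensical 10 day value yields the same next run time.

```python
from datetime import datetime, timedelta


def next_run_local(local_now, tdf, interval):
    """Convert local time to UTC, add the polling interval, convert back."""
    utc_now = local_now - tdf       # local -> UTC
    utc_next = utc_now + interval   # schedule the next evaluation in UTC
    return utc_next + tdf           # UTC -> local


if __name__ == "__main__":
    now = datetime(1991, 11, 15, 22, 41)
    interval = timedelta(minutes=10)
    for tdf in (timedelta(hours=10), timedelta(days=10)):
        # The TDF cancels, so the result does not depend on its value.
        print(tdf, next_run_local(now, tdf, interval))
```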
    
    I know 10 days is a nonsensical TDF but it is easily obtained by
    accident.  If the consequences of getting it wrong are so severe, then
    either the bug in the time routines needs to be fixed (before it bites
    somewhere else) or there should be a reasonableness check on the TDF to
    hide the bug.
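
    A reasonableness check of the kind suggested above could be as small
    as the sketch below.  The 14 hour limit is an assumption chosen for
    the example (civil time zone offsets stay within roughly that
    magnitude), and the names are hypothetical.

```python
from datetime import timedelta

# Assumed upper bound on a plausible time differential factor.
MAX_TDF = timedelta(hours=14)


def tdf_is_reasonable(tdf, limit=MAX_TDF):
    """Return True if the offset magnitude is within the assumed limit."""
    return abs(tdf) <= limit
```

    Here a 10 hour offset passes, and a 10 day offset is flagged before
    it can do any damage.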
    
    -mark
1787.6  Agreed. We plan on validating what the user enters  TOOK::GUERTIN  Don't fight fire with flames  Mon Nov 18 1991 13:39  4
    In the V1.2 kit, if the user enters an invalid TDF, we bring it to the
    user's attention immediately.
    
    -Matt.
1787.7  Thanks.  DUNDEE::CLEARY  A deviant having fun..."  Tue Nov 19 1991 01:19  0