
Conference azur::mcc

Title: DECmcc user notes file. Does not replace IPMT.
Notice: Use IPMT for problems. Newsletter location in note 6187
Moderator: TAEC::BEROUD
Created: Mon Aug 21 1989
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 6497
Total number of notes: 27359

5775.0. "ip-polling and bridge errors, start ethernet failed" by CSC32::J_WIELAND () Tue Dec 07 1993 13:25

Problem 1)

Customer randomly sees bridge alarms from his DEC bridges, approximately 100
per minute, that indicate "start ethernet failed".  The customer checks his
local ethernet and finds no errors via netstat -i.  When he clicks on the
bridge in question, he can see that both ethernet interfaces on the bridge
are up and running fine.  He says that this symptom will move from one bridge
to another in his network, so it does not seem to be related to a specific
bridge.  (Note: customer does have multi-port bridges, i.e. 620's, but this
    also occurs on this DEC bridge 100.  Having customer ensure he has
    terminators/loopbacks on his unused ports, but we know the 100 has
    both its ports in use.  Any suggestions?)  PS: his rule set says to
    check the bridges every 20 minutes.

Problem 2)

Using the standard IP poller, the customer starts a periodic poll of the IP
nodes in his network.  He finds that, randomly, the IP polls to his SynOptics
300 fiber optic concentrators will fail.  He can immediately ping these
concentrators and they will respond correctly.  He has also placed a sniffer
on the LAN and can see the response from the concentrator, and it looks fine.
The customer has tried changing the polling timer from 30 seconds to 90 seconds,
and also from 1 retry to 6 retries, without success.  A related problem is that
these alarms come in as 'critical', and he would like to change the severity
of this alarm but does not know how.    He finds that the polling does not
    retry the SynOptics concentrator according to the sniffer, probably
    because it saw a valid response to the ICMP message.  Is there a break
    somewhere between the poller and the alarm notification process?
    (Note: the devices polled are 60 SynOptics, 10 DEC bridges, 1 Cisco,
    and one SPARCstation.  The SPARCstation has the SynOptics management
    software on it and has never reported any down conditions when polling
    the SynOptics.  Thoughts anyone?)
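
To make the question about a break between the poller and the alarm path concrete, here is a minimal sketch of a generic poll-with-retry loop. It is not the IP Poller's actual logic, which is not described in this note; `poll_with_retries` and `send_probe` are hypothetical names used only for illustration.

```python
from typing import Callable, Optional


def poll_with_retries(send_probe: Callable[[float], Optional[float]],
                      timeout_s: float, retries: int) -> dict:
    """Probe a node once, then retry only while no reply arrives in time.

    send_probe(timeout_s) is a hypothetical callable: it returns the
    round-trip time in seconds, or None if nothing came back in time.
    """
    attempts = 0
    for _ in range(1 + retries):
        attempts += 1
        rtt = send_probe(timeout_s)
        if rtt is not None:
            # A valid reply ends the cycle; no further probes are sent.
            return {"reachable": True, "attempts": attempts, "rtt": rtt}
    # Every attempt timed out: only now would a "down" alarm be justified.
    return {"reachable": False, "attempts": attempts, "rtt": None}
```

In a loop of this shape, a single request followed by a valid reply and no retry (what the sniffer shows) ends with the node marked reachable. If an alarm is still raised in that situation, the loss would have to happen after the reply reached the polling host, which is the kind of break being asked about.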
    


I told the customer that I would review this call with Pat and Johnny, and have
someone give him a call back in the morning.


T.R  Title  User  Personal Name  Date  Lines
5775.1. "File a CLD (or 2)" by RACER::dave (Ahh, but fortunately, I have the key to escape reality.) Tue Dec 07 1993 19:36, 4 lines
If you expect any support, file 2 (TWO) CLDs, one for each problem.

Also, please DO NOT put a "pointer" to the notes file in the CLD,
but put the real information there.
5775.2. "Some hints" by BIKINI::KRAUSE (European NewProductEngineer for MCC) Wed Dec 08 1993 08:14, 20 lines
>per minute that indicate "start ethernet failed".  The customer checkes his

This message means that the BRIDGE AM could not register the RBMS
protocol with the Ethernet driver on the MCC machine. This could happen
if two alarm rules try to access the RBMS protocol at the same time. I
don't know exactly how this case is handled in the BRIDGE AM, but to
prove my theory you could try using _one_ wildcard rule instead of
separate alarm rules for the bridges. A wildcard rule will 'serialize'
the access. 
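
A toy sketch of the theory above, assuming the protocol behaves like a resource that only one user can hold at a time. `ExclusiveProtocol` and `check_bridges_serially` are made-up names and do not reflect how the BRIDGE AM or the Ethernet driver is actually implemented.

```python
import threading


class ExclusiveProtocol:
    """Toy stand-in for a protocol that only one user may hold at a time."""

    def __init__(self) -> None:
        self._in_use = threading.Lock()

    def start(self) -> bool:
        # Non-blocking: a second overlapping caller fails instead of waiting,
        # which is the assumed origin of a "start failed" style error.
        return self._in_use.acquire(blocking=False)

    def stop(self) -> None:
        self._in_use.release()


def check_bridges_serially(protocol: ExclusiveProtocol, bridges: list) -> list:
    """One rule visiting the bridges one after another (the wildcard case)."""
    failures = []
    for bridge in bridges:
        if not protocol.start():
            failures.append(bridge)
            continue
        try:
            pass  # the per-bridge query would go here
        finally:
            protocol.stop()
    return failures
```

Two overlapping `start()` calls show the failure mode: the first returns True and the second False until `stop()` is called. The serial loop never overlaps with itself, so it reports no failures, which is the effect the single wildcard rule is expected to have.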

>nodes in his network.  He finds that randomly, the ip polls to his Sinoptics
>300 fiber optic concentrators, will fail.  He can immediately ping these

I remember a problem with IP Poller when polling more than 50 IP nodes
or some such. It was described somewhere in this notes conference...

You see, I'm stabbing in the dark. Opening CLDs surely is the best (and
only) way to get real answers. 

*Robert
5775.3. "Thanks for the info. Did escalate." by CSC32::P_TOMARO () Wed Dec 08 1993 13:37, 6 lines
    Thanks for the best-effort info.  I have escalated the call.  I did
    reference this note but also included the problem descriptions and
    pointers to supporting documentation.  The supporting files (alarms,
    rules, ...) should be coming through the system soon.
    Pat.