
Conference azur::mcc

Title: DECmcc user notes file. Does not replace IPMT.
Notice: Use IPMT for problems. Newsletter location in note 6187
Moderator: TAEC::BEROUD
Created: Mon Aug 21 1989
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 6497
Total number of notes: 27359

2565.0. "DECmcc v1.1, Translan AM, alarms and exception rule firing..." by GO4GUT::NASHT (Trevor Nash, CSC, SYDNEY) Sun Mar 15 1992 23:39

Hi,

A customer here is getting his exception alarm rules firing (intermittently)
when polling TL C at 1 minute intervals in the following diagram. The rules
fire with the exception messages:
"Cannot communicate with target"
or
"Communication with the target has been interrupted"

His configuration looks like this:

	--------------------------------+-----------  Seg D
			       		TL D
				       /|
				      / | 2Mb - ST back up
		======================	|
		|			|
		|			TL C
		|		    ----+-------+----   Seg C
		|				MetroWave
		|				|
		|				|
	2Mb	|				MetroWave
	ST_F	|		    ----+-------+----   Seg B
		|			MetroWave
		|			|
		|			|
		TL A			MetroWave
	--------+-----------------------+----
			Seg A


The batch file contains alarm rules for both the Translans and the Lanbridge
150s (in the MetroWaves).

The customer has found three things that appear to make the problem go away:
1) disconnecting one of the MetroWaves, so the path between TL C-TL D goes
   into STP_F, i.e. the MetroWaves do NOT pass the management packets
2) splitting the alarm rules batch job into two - one for the Translan and
   one for the Lanbridge 150.
3) increasing working set for the alarm rule batch job.

Can anybody tell me how many times the TL AM will poll, and at what interval,
before the above exception rule fires, producing the above messages?

Can the TLs accept multiple, simultaneous polls, e.g. if the Exporter is running
and collecting information?

Any comments on the 'fixes' apparently found by the customer?

Thanks for any replies,

Trevor
T.R	Title	User	Personal Name	Date	Lines
2565.1. "Busy TransLAN or shaky link." by TOOK::MCPHERSON (Save a tree: kill an ISO working group.) Mon Mar 16 1992 00:39 (33 lines)

'Cannot communicate with target' means pretty much that.   The AM couldn't
communicate within the acceptable timeout period.   There's nothing that
increasing WS for the alarm rules batch would do to help this, so item 3 is
probably a red herring.

If there was a problem with contention for the ethernet device, you'd see a
_different_ error.   (I assume that this might have been why the customer split
the batch jobs...)   

'Communication with the target has been interrupted' means that the AM *had*
actually established a connection and gotten at least a *partial* response from
the LANbridge, but something happened.

I would closely examine all pieces of the link between the management station
and the TransLAN. Suspect any box between the two. Also watch the stability of
the Spanning Tree very carefully.  If the MetroWave link goes marginal, it's
possible that the Spanning Tree will reconfig and *that* could cause you some
problems, too.

TransLANs (like pretty much all bridges that I know of) can only service one
management request at a time.  If the TransLAN is busy servicing a poll, then a
pending poll will have to try again.   If I remember correctly, the TransLAN AM
will retry 3 times before it gives up waiting for a response, returning "Cannot
communicate with target."
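
As a rough illustration of the retry behaviour described above, here is a
minimal Python sketch. It is not the TransLAN AM's code: the function names,
the fake bridge, and the reading of "retry 3 times" as three attempts in total
are all assumptions made for the example.

```python
def poll_with_retries(send_request, max_tries=3):
    """Attempt a management request up to max_tries times.

    send_request() is a hypothetical callable that returns a response,
    or None if the timeout expired without an answer.
    Returns (response, attempt_number) on success.
    """
    for attempt in range(1, max_tries + 1):
        response = send_request()
        if response is not None:
            return response, attempt
    # All attempts timed out: this is where the exception rule would fire.
    raise ConnectionError("Cannot communicate with target")


def make_busy_bridge(busy_count):
    """Hypothetical bridge that ignores the first busy_count requests,
    standing in for a box that services one request at a time."""
    state = {"remaining": busy_count}

    def send_request():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            return None  # timed out: bridge was busy with something else
        return "counters"

    return send_request


if __name__ == "__main__":
    print(poll_with_retries(make_busy_bridge(2)))
    try:
        poll_with_retries(make_busy_bridge(3))
    except ConnectionError as exc:
        print("exception:", exc)
```

With the bridge busy for two attempts the third one succeeds; busy for three,
the poll gives up with the "Cannot communicate with target" exception.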

Also, note that TransLANs forward packets above all else.  If there are packets
that need to be filtered/routed/whatever, then management directives will be
ignored until it can service them.   Thus, a very busy TransLAN will have
trouble responding to management directives sometimes.


/doug

2565.2. "Translan AM - poll interval?" by GO4GUT::NASHT (Trevor Nash, CSC, SYDNEY) Mon Mar 16 1992 21:05 (17 lines)

Doug,

Thanks for the reply.

The STP is not running - all data packets are being passed normally when this
problem occurs.

Seg C also has barely any devices on it - just several terminal servers, the TL
and the Metrowave.

Do you know at what intervals the 3 tries occur? If the boxes are polled at
1-minute intervals by multiple alarm rules, and the timeout period is 20
seconds, then this could create a problem...
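
To spell out the arithmetic behind that worry, a small Python sketch follows.
The 20-second timeout is only the hypothetical figure from the question, not a
confirmed TransLAN AM value, and the function names are made up for the
example.

```python
def worst_case_poll_seconds(tries, timeout_s):
    """Longest time one poll can take if every attempt times out."""
    return tries * timeout_s


def overlaps_next_poll(tries, timeout_s, poll_interval_s):
    """True if a fully timed-out poll is still running when the next
    scheduled poll of the same target is due."""
    return worst_case_poll_seconds(tries, timeout_s) >= poll_interval_s


if __name__ == "__main__":
    # Assumed figures: 3 tries, 20 s timeout, 60 s polling interval.
    print(worst_case_poll_seconds(3, 20))
    print(overlaps_next_poll(3, 20, 60))
```

Under those assumed figures a failing poll takes the whole 60 seconds, so it
would run straight into the next 1-minute poll of the same box.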

Regards,

Trevor