
Conference netcad::hub_mgnt

Title: DEChub/HUBwatch/PROBEwatch CONFERENCE
Notice: Firmware -2, Doc -3, Power -4, HW kits -5, firm load -6&7
Moderator: NETCAD::COLELLADT
Created: Wed Nov 13 1991
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 4455
Total number of notes: 16761

3323.0. "DR900TM RMON alarm with S/W V2.0" by IAMOSI::DANIEL () Tue Mar 05 1996 11:25

--------------------------------------------------------------------------------
DANIEL JEYACHANDRAN          <F/W V2.0 problem.>               05-MAR-1996 14:44
--------------------------------------------------------------------------------

	HUBWATCH 4.1.1 & MAM module  V4.0.2
	DR900TM S/W V2.0.0, H/W V3, RO v04
                      ^
                      |
                      |

	HP OpenView reporting "RMON rising alarm exceeded 1" etc.

	Hubwatch reports, "RPT900 Missing data LED14  PCOM LED program
						LED15 No information." etc
	I have not been able to find any docs on hubwatch messages.
	
	I found the following entry in STARS database indicating a probable 
	bug in V2.0 S/W of DR900TM. Any fix for this problem yet?


	Thanks in advance.

	Daniel.

	CSC, Sydney.
------------------------------------------------------------------------------
{Elev} problems with repeater after micro code upgrade getting traps


COPYRIGHT (c) 1988, 1993 by Digital Equipment Corporation.
ALL RIGHTS RESERVED. No distribution except as provided under contract.

PRODUCT or COMPONENT:  DECrepeater 900TM

OP/SYS:  OPENVMSVAX

VERSION INFORMATION:

Operating System Version(s): OSF1 V3.2
Layered Product/Component Version(s): {include all relevant version numbers}
Polycenter netview 3.1B
Hubwatch V3.1

SOURCE:   Digital Equipment Corporation


SYMPTOM:

Customer has seen a problem with traps occurring since upgrading several
DECrepeater 900TMs from V1.1.0 to V2.0.0. The customer noted that some of
the upgraded repeaters are not having the problem. Customer is launching
HUBwatch standalone and is seeing the traps attached below via POLYCENTER
NetView. His version of HUBwatch does not have "alarms" in the applications
pull-down window, so it appears that it cannot be set up to log the traps.
Customer is seeing the traps hundreds of times a day on the affected
repeaters. Some of the repeaters having the problem have hundreds of PCs
connected, and some have only a device connected that monitors a remote
site's UPS.
The customer clicked on one of the affected modules and read the status
screen to me:

status: enabled
Health Text: 0 ports are not operational, 0 ports are auto partitioned, 11
media are not available.
Health Text Changes: 414
Partitioned Ports: 0
Media Unavailable Ports: 11
Transmit Collisions: 335000 (uptime of 35 days on a busy network per customer).

Customer can think of nothing that distinguishes the DECrepeater 900TMs that
are having the problem from those that are not.

Below is the information received from the customer via FAX, which is
everything the customer has on the traps. There may be a few inaccuracies
because a few words were not readable:


 A RMON falling alarm repeater mau repeater information
repeater mau total media unavailable 0
fell below threshold 1; value = 21: (sample type = 2)
specific = 2
enterprise= rmon 1.3.6.1.2.1.16

A RMON rising alarm: repeater repeater information repeater health text
changes 0

exceeded threshold 1; value 40 (sample type =2; alarm index =4)
specific: 1
generic : 6
catagory: threshold events
enterprise: rmon 1.3.6.1.2.1.16
source: agent (A)
hostname: rep2.shost.ksc.com
severity: critical


RMON rising alarm
repeater extensions, repeater basic package, repeater repeater information 5.0
exceeded threshold 1; value 83 (sample type =2; alarm index = 5)

specific: 1
generic : 6
catagory: threshold events
enterprise: rmon 1.3.6.1.2.1.16
source: agent (A)
hostname: rep2.elrio.ksc.com
severity: critical

MTI.fddi.ksc.co  N elrio2.elrio.ksc.com
reported different link address than obtained from mti.fddi.ksc.com by snmp


specific: 58982401 (hex: 3840001)
generic : 6
catagory: mode configuration events
enterprise: netview 1.3.6.1.4.1.2.6.3.1
source: network (N)
hostname: net1.fddi.ksc.com
severity: indeterminate

I have talked to Frank Levesque regarding the problem. Frank has agreed to
look into what the traps are indicating and whether or not POLYCENTER
NetView could be a factor.

Customer has been somewhat difficult to work with and does not understand
why we are asking questions regarding version and topology information. I
have explained to him that we can not find information regarding what the
traps are telling us so we therefore can not tell him how to resolve the
issue.

The customer wants information on what the traps mean and how to correct
them, I have reviewed RFC 1157 (SNMP) and did not see them in there.



DIGITAL RESPONSE:

This problem has been reported to Engineering.


WORKAROUND:

{workaround}


ANALYSIS:

{cause}


T.R     Title  User               Personal Name  Date                   Lines
3323.1         NETCAD::GALLAGHER                 Tue Mar 05 1996 12:31  135
>Customer has seen a problem with traps occurring since upgrading several
>DECrepeater 900tm's from V1.1.0 to V2.0.0. 

Why is this a problem?  If customers don't want to get traps, they should
not provide trap sinks (trap destination IP addresses).  

Traps don't always report "bad" things.  Sometimes they're just informational,
like the trap below:

>A RMON falling alarm repeater mau repeater information
>repeater mau total media unavailable 0
>fell below threshold 1; value = 21: (sample type = 2)
>specific = 2
>enterprise= rmon 1.3.6.1.2.1.16

Definition:

>  8: erptrMauTotalMediaUnavailable         One or more media have become
>     1.3.6.1.4.1.36.2.18.11.5.1.1.5.1.1.0  available or unavailable.

This usually means that someone plugged in a cable, or removed a cable.

>A RMON rising alarm: repeater repeater information repeater health text
>changes 0
>
>exceeded threshold 1; value 40 (sample type =2; alarm index =4)
>specific: 1
>generic : 6
>catagory: threshold events
>enterprise: rmon 1.3.6.1.2.1.16
>source: agent (A)

Traps are also sent when changes to rptrHealthText occur.   The alarmed object
is in the DEC Private "Extended Repeater MIB".  Its definition is:

>erptrHealthTextChanges OBJECT-TYPE
>    SYNTAX  Counter
>    ACCESS  read-only
>    STATUS  mandatory
>    DESCRIPTION
>            "This counter increments each time the rptrHealthText object
>            defined in RFC 1516 is modified."
>    REFERENCE
>        "Reference RFC 1516 repeater MIB"
>    ::= { erptrRptrInfo 4 }

And the repeater MIB's rptrHealthText object is defined as:

>   rptrHealthText OBJECT-TYPE
>       SYNTAX    DisplayString (SIZE (0..255))
>       ACCESS    read-only
>       STATUS    mandatory
>       DESCRIPTION
>               "The health text object is a text string that
>               provides information relevant to the operational
>               state of the repeater.  Agents may use this string
>               to provide detailed information on current
>               failures, including how they were detected, and/or
>               instructions for problem resolution.  The contents
>               are agent-specific."
>       REFERENCE
>               "Reference IEEE 802.3 Rptr Mgt, 19.2.3.2,
>               aRepeaterHealthText."
>       ::= { rptrRptrInfo 3 }

This basically means that rptrHealthText is used to report anything
deemed "interesting" by the repeater implementation.  The trap is meant
to alert network managers to look at the health text.

>specific: 58982401 (hex: 3840001)
>generic : 6
>catagory: mode configuration events
>enterprise: netview 1.3.6.1.4.1.2.6.3.1
>source: network (N)
>hostname: net1.fddi.ksc.com
>severity: indeterminate

I'm not sure what this is, but it looks like it's coming from a host
rather than a repeater.  Can you confirm this?
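
One way to separate the repeater traps from the NetView-generated event is
to look at the enterprise OID and the generic/specific trap numbers. The
sketch below is only an illustration: classify_trap is a hypothetical helper,
not part of HUBwatch or NetView. The two enterprise OIDs are copied from the
traps quoted in this topic; the rising/falling numbering (specific 1 and 2)
and generic 6 = enterpriseSpecific come from RFC 1757 and RFC 1157.

```python
# Enterprise OIDs exactly as they appear in the traps quoted in this topic.
RMON_ENTERPRISE = "1.3.6.1.2.1.16"
NETVIEW_ENTERPRISE = "1.3.6.1.4.1.2.6.3.1"

# RFC 1757 defines two traps under the rmon enterprise.
RMON_SPECIFIC = {1: "risingAlarm", 2: "fallingAlarm"}

# SNMPv1 generic-trap value meaning "enterprise specific" (RFC 1157).
ENTERPRISE_SPECIFIC = 6


def classify_trap(enterprise, generic, specific):
    """Return a short label for an SNMPv1 trap (hypothetical helper)."""
    if generic != ENTERPRISE_SPECIFIC:
        return "generic trap %d" % generic
    if enterprise == RMON_ENTERPRISE:
        # Sent by an RMON agent, e.g. the repeater itself.
        return "RMON " + RMON_SPECIFIC.get(specific, "unknown (%d)" % specific)
    if enterprise == NETVIEW_ENTERPRISE:
        # Generated by the management station, not by the repeater.
        return "NetView event 0x%X" % specific
    return "unknown enterprise " + enterprise


print(classify_trap(RMON_ENTERPRISE, 6, 1))
print(classify_trap(RMON_ENTERPRISE, 6, 2))
print(classify_trap(NETVIEW_ENTERPRISE, 6, 58982401))
```

The last call uses the specific number from the quoted NetView trap; printing
it in hex gives the same 3840001 shown in the trap text.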

I've attached a list of the objects on repeaters which are alarmed.

>The customer wants information on what the traps mean and how to correct
>them, I have reviewed RFC 1157 (SNMP) and did not see them in there.

RFC 1757 describes RMON and contains definitions for the RMON rising and
falling event traps.

						-Shawn
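
As a rough illustration of why a counter such as erptrHealthTextChanges can
produce alternating rising and falling traps, here is a simplified model of
the RFC 1757 alarm rules. It assumes delta sampling (sample type 2 in the
quoted traps), rising and falling thresholds both set to 1 as the trap text
suggests, and a startup alarm of risingOrFallingAlarm. The function names and
the counter readings are made up for the example, and counter wrap is ignored.

```python
def delta_samples(counter_readings):
    """Differences between successive counter readings (deltaValue sampling)."""
    return [b - a for a, b in zip(counter_readings, counter_readings[1:])]


def rmon_alarm_events(samples, rising, falling):
    """Simplified RFC 1757 rising/falling alarm logic with hysteresis."""
    events = []
    rising_armed = True
    falling_armed = True
    prev = None  # no previous sample yet (startup case)
    for i, value in enumerate(samples):
        crossed_up = value >= rising and (prev is None or prev < rising)
        crossed_down = value <= falling and (prev is None or prev > falling)
        if crossed_up and rising_armed:
            events.append((i, "rising"))
            rising_armed = False
        elif crossed_down and falling_armed:
            events.append((i, "falling"))
            falling_armed = False
        # A rising alarm is re-armed once the sample reaches the falling
        # threshold, and a falling alarm once it reaches the rising threshold.
        if value <= falling:
            rising_armed = True
        if value >= rising:
            falling_armed = True
        prev = value
    return events


# Hypothetical counter readings, one per sampling interval.
readings = [400, 400, 402, 402, 405, 405]
deltas = delta_samples(readings)
print(deltas)
print(rmon_alarm_events(deltas, rising=1, falling=1))
```

In this model, every interval in which the counter moves after a quiet
interval yields a rising event, and every quiet interval after that yields a
falling event, so a port whose state flaps regularly can generate traps many
times a day without anything being broken.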

-------------------------------------------------------------------------


REPEATER/PORTswitch (see Matrix below):

  DEFAULT ALARMS (NAME & OBJECTID)         TRIGGER OF EVENT
  --------------------------------         -----------------
  1: pcomEsysNVRAMavailableOctets          There is no more memory for 
     1.3.6.1.4.1.36.2.18.11.2.7.6.0        nonvolatile parameters.

  2: rptrTotalPartitionedPorts             One or more ports has been
     1.3.6.1.2.1.22.1.1.6.0                autopartitioned, or a port that 
                                           was previously autopartitioned
                                           is now operational.

  3: erptrHealthTextChanges                The module's operational state
     1.3.6.1.4.1.36.2.18.11.5.1.1.1.1.4.0  has changed.

  4: erptrTotalPortEvents                  The total number of times a port
     1.3.6.1.4.1.36.2.18.11.5.1.1.1.1.5.0  has become nonoperational, 
                                           autopartitioned, or unavailable.

  5: erptrTotalRptrErrors                  The total number of errors for
     1.3.6.1.4.1.36.2.18.11.5.1.1.1.1.6.0  this module.

  6: erptrDprTotalStateChange              The module's link state change has
     1.3.6.1.4.1.36.2.18.11.5.1.1.3.1.1.0  occurred while using redundant-link
                                           configuration.

  7: erptrSecurityRptrSecurityViolation    A security violation has occurred
     1.3.6.1.4.1.36.2.18.11.5.1.1.4.1.1.0  on one or more ports.

  8: erptrMauTotalMediaUnavailable         One or more media have become
     1.3.6.1.4.1.36.2.18.11.5.1.1.5.1.1.0  available or unavailable.

  9: erptrSecurityRptrSecurityViolation    A security violation has occurred
     1.3.6.1.4.1.36.2.18.11.5.1.1.4.1.1.0  on one or more ports.

 10: erptrMauTotalMediaUnavailable         One or more media have become
     1.3.6.1.4.1.36.2.18.11.5.1.1.5.1.1.0  available or unavailable.



  * indicates the module supports the alarm
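
The list above can be turned into a small lookup table for matching the
alarmed variable in a trap against the default alarms. This is only a sketch:
the OIDs and names are copied from the list, while the dictionary and the
alarm_name helper are hypothetical and not part of any DEC tool.

```python
# Default alarm objects from the list above, keyed by instance OID.
DEFAULT_ALARMS = {
    "1.3.6.1.4.1.36.2.18.11.2.7.6.0": "pcomEsysNVRAMavailableOctets",
    "1.3.6.1.2.1.22.1.1.6.0": "rptrTotalPartitionedPorts",
    "1.3.6.1.4.1.36.2.18.11.5.1.1.1.1.4.0": "erptrHealthTextChanges",
    "1.3.6.1.4.1.36.2.18.11.5.1.1.1.1.5.0": "erptrTotalPortEvents",
    "1.3.6.1.4.1.36.2.18.11.5.1.1.1.1.6.0": "erptrTotalRptrErrors",
    "1.3.6.1.4.1.36.2.18.11.5.1.1.3.1.1.0": "erptrDprTotalStateChange",
    "1.3.6.1.4.1.36.2.18.11.5.1.1.4.1.1.0": "erptrSecurityRptrSecurityViolation",
    "1.3.6.1.4.1.36.2.18.11.5.1.1.5.1.1.0": "erptrMauTotalMediaUnavailable",
}


def alarm_name(oid):
    """Map an alarmed variable's OID to its name, or None if not listed."""
    # Accept OIDs written with or without a leading dot.
    return DEFAULT_ALARMS.get(oid.lstrip("."))


print(alarm_name(".1.3.6.1.4.1.36.2.18.11.5.1.1.5.1.1.0"))
print(alarm_name("1.3.6.1.2.1.1.3.0"))
```

The first lookup is the object behind the "media unavailable" trap discussed
earlier; the second OID is not in the list, so the helper returns None.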

3323.2         IAMOSI::DANIEL                    Fri Mar 08 1996 04:04  18
	Hi Shawn,

	Thanks for your quick reply. The attachment I sent was not from my
	customer.

	My cust log did not reveal any details like the one attached, but 
	only a single line summary 
	"RMON rising alarm exceeded 1"
		"	"
 		"	"
	RMON falling alarm below 1
	
	etc.

	I have asked him to give me a detailed print-out of it.

	Daniel.
	
3323.3         NETCAD::MILLBRANDT answer mam     Fri Mar 08 1996 13:36  9
from .0 -

>	HUBWATCH 4.1.1 & MAM module  V4.0.2
>	DR900TM S/W V2.0.0, H/W V3, RO v04

You should be running V4.1 of the MAM with 4.1 HUBwatch.

	Dotsie