
Conference netcad::hub_mgnt

Title:DEChub/HUBwatch/PROBEwatch CONFERENCE
Notice:Firmware -2, Doc -3, Power -4, HW kits -5, firm load -6&7
Moderator:NETCAD::COLELLADT
Created:Wed Nov 13 1991
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:4455
Total number of notes:16761

3353.0. "except vo=08 sr=2700 pc=00601aa4: failed" by BERFS4::NORD () Tue Mar 12 1996 05:15


	Good morning, good evening and something between,

	I saw a problem at a customer site and wasn't able to locate the cause:

	DMHUB-MA, DS900EF, DC900MX, DR90T+ and two DS90L+. The display of the
	agent shows: Except VO=08 SR=2700 PC=00601AA4: FAILED.

	All connected devices have power; the network and selftest LEDs are on.

	I wanted to see whether this problem is permanent or not, so I powered
	the backplane down and up again. The display then shows:

		102, ... , 306, 307, Except VO=08 SR=2700
				     PC=00601AA4: FAILED

	After powering down, I disconnected the large modules and powered up
	again: the same message appears.

	Fact:	The large modules didn't get any power (no LED comes on).
		The small modules do get power; their selftest and network
		appear to be OK (all LEDs are green).

	Can someone explain what that error is? Why didn't the large modules
	get any power? Are the small modules OK, and is the customer able to
	work with the small modules?

	Next:	What do I have to change, the agent or the backplane?

	Greetings from

	Wolfgang Nord
	MCS, Berlin, Germany

T.R   Title   User (Personal Name)   Date   Lines
3353.1   NETCAD::MILLBRANDT (answer mam)   Tue Mar 12 1996 16:43   21 lines
Hello -

The hub failed during Test 307, Interrupt Priority.  If you tell
us which hub version you are running, we can look up where the PC
is.  
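
For readers who want to pull the fields out of such a display line before a PC lookup is available, here is a minimal Python sketch. It rests on an assumption this thread does not confirm: that VO is a vector offset and SR a status register in the Motorola 68000-family layout. The function name `decode_exception` is hypothetical.

```python
import re

# Matches "VO=08 SR=2700 PC=00601AA4"; the "=" is optional because the
# display text is sometimes transcribed without it (e.g. "SR2700").
_FIELD = re.compile(r"\b(VO|SR|PC)\s*=?\s*([0-9A-Fa-f]+)", re.IGNORECASE)

def decode_exception(line):
    """Split an 'Except'/'Catch' line into numeric fields.

    ASSUMPTION: SR uses the 68000-family layout (bit 13 = supervisor,
    bits 10-8 = interrupt mask) and VO is a vector offset (vector = VO / 4).
    """
    fields = {name.upper(): int(value, 16) for name, value in _FIELD.findall(line)}
    missing = {"VO", "SR", "PC"} - fields.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    sr = fields["SR"]
    return {
        "vector": fields["VO"] // 4,
        "supervisor": bool(sr & 0x2000),
        "interrupt_mask": (sr >> 8) & 0x7,
        "pc": fields["PC"],
    }

info = decode_exception("Except VO=08 SR=2700 PC=00601AA4: FAILED")
```

If the 68000-family assumption holds, vector 2 would be a bus error taken in supervisor mode with all interrupt levels masked. The PC still has to be looked up against the specific firmware version.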

Because it is a crash and not a reported selftest failure, you
may have a hub firmware problem. V4.0.0 and V4.0.2 of the hub 
had a problem that would cause a selftest failure with certain 
components.  If your customer is running one of these, upgrade 
to V4.1.1 or V4.1.0.  Of course, if you never get out of selftest
you can't upgrade, so you must replace the hub card.

90 modules are always powered up if there is power in the hub.
That is a legacy of the DEChub 90.  900 modules enter into a
power arbitration with the hub firmware to determine if there
is sufficient power available for all modules.  If the hub has
failed, this arbitration will not occur and no 900 module will
be (fully) powered up.
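
The power rules above can be summarised with a toy model. This is a sketch of the behaviour as described in this reply, not the hub firmware's actual algorithm; the function name, the wattage figures, and the grant-in-list-order policy are all assumptions made for illustration.

```python
def powered_modules(modules, hub_agent_running, budget_watts):
    """Return the names of modules that end up (fully) powered.

    modules: list of (name, family, watts) with family "90" or "900".
    ASSUMPTION: 900 modules are granted power in list order until the
    budget is spent, and 90 modules are not charged against the budget;
    the real arbitration policy is not described in this thread.
    """
    powered = []
    remaining = budget_watts
    for name, family, watts in modules:
        if family == "90":
            # 90 modules draw power whenever the hub has power at all.
            powered.append(name)
        elif hub_agent_running and watts <= remaining:
            # 900 modules need a grant from the running hub firmware.
            powered.append(name)
            remaining -= watts
    return powered

# Hypothetical wattages for the configuration in .0
config = [("DS900EF", "900", 40), ("DC900MX", "900", 30),
          ("DR90T+", "90", 10), ("DS90L+", "90", 10)]
after_hub_crash = powered_modules(config, hub_agent_running=False, budget_watts=100)
```

With the hub agent not running, only the two 90-family modules appear in the result, which matches the symptom reported in .0.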

	Dotsie
3353.2   NPSS::WADE (Network Systems Support)   Fri Mar 15 1996 15:59   39 lines
Wolfgang,

   >      
   >      	DMHUB-MA, DS900EF, DC900MX, DR90T+ and two DS90L+, the display of the
   >      	agent writes: Except VO=08 SR2700 PC=00601AA4: FAILED.

      By "the display of the agent writes", I assume you are referring to 
      the error log.
  
      What is the version of Management Agent Module (MAM) firmware running on 
      this DMHUB?  Should be 4.1.x.
                                
    .
    .
    .
    
   >      	Can someone explain, what that error is?, why didn't the large modules
   >      	get any power, are the small modules ok and is the customer able to
   >      	work with the small modules?
   >      

    As mentioned in .1:
    "V4.0.0 and V4.0.2 of the hub
    had a problem that would cause a selftest failure with certain
    components.  If your customer is running one of these, upgrade
    to V4.1.1 or V4.1.0.  Of course, if you never get out of selftest
    you can't upgrade, so you must replace the hub card."  Here the
    hub card is the Management Agent Module (MAM).
    
       If they are running down-rev MAM firmware, they need to upgrade to 
       4.1.x.

   >      	Next:	What do I have to change, the agent or the backplane?
   
    You need to replace the MAM if it does not complete selftest.
    
    Bill
    
3353.3   "< Same Error Entry with V4.1.0>"   ZUR01::ACKERMANN   Tue Apr 23 1996 12:38   67 lines
    
    
            Hello all together,
    
            I have exactly the same Problem as reported in Note 3353.0.
            At a big Customer Site, the MAM is  n o t  accessible any
            more, neither from Hubwatch nor from Netview.
            The Hub itself is working. The Modules are accessible
            with Netview.
            The Setup Port of the HUB also hangs.
    
            The LCD Display shows:
            ----------------------
    
            Except V0-08 SR=2700
            PC 00601AA4: Failed
    
            The Error Log of the MAM shows:
            -------------------------------
    
    
            Entry        = 6115
            Time Stamp   = 0 1326000
            Reset Count  = 10
            Catch VO=008 SR=2200 PC=4087AE F=87AE000
    
            Entry        = 6114
            Time Stamp   = 0 42305500
            Reset Count  = 9
            Catch VO=008 SR=2200 PC=4087AE F=87AE000
    
            Entry        = 6113
            Time Stamp   = 0 1434059900
            Reset Count  = 8
            Catch VO=008 SR=2200 PC=4087AE F=87AE000
    
            Action: The Customer replaced the MAM Module; it then
            worked for about 5-10 Minutes before the MAM Module
            hung again with the above Error.
    
       
            Config:
            ------
    
            MAM Module: V4.1.0
    
            Slot    Module                  Version
    
            1       Packetprobe 90          V2.6
            2       DECserver 900 TM        NAS 1.5 BL95-33
            3       Empty
            4       Portswitch 900TP        V2.1.0
            5       "                       "
            6       "                       "
            7       DECswitch 900EF         V1.5.2
            8       DECconcentrator 900MX   V3.1.1
    
            The Module which provides IP Services is Slot 8, the Conc900MX.
            So, what could the Problem be: a HW Problem of the Backplane,
            the IP Service Module filling up the Memory of the MAM
            Module, or just a Bug in the MAM SW or Code?
    
    
            Thanks for any Help,
                                                                     
            Daniel MCS Switzerland
     
3353.4   NETCAD::DOODY (Michael Doody)   Tue Apr 23 1996 14:03   31 lines
Daniel,

It is unclear what is happening. You say the MAM hangs after 5-10
minutes, but this should not be possible. There is a watchdog 
timer which will reset the MAM if it hangs. 
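
For readers unfamiliar with the mechanism, a watchdog can be sketched as below. This is a generic toy model in Python, assuming an abstract tick counter; the MAM's real watchdog interval and reset path are not described in this thread, and the class name is hypothetical.

```python
class Watchdog:
    """Toy watchdog: resets the system if it is not kicked in time.

    ASSUMPTION: the timeout is counted in abstract ticks; nothing here
    reflects the MAM's actual hardware or firmware.
    """

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.ticks_since_kick = 0
        self.reset_count = 0

    def kick(self):
        # Healthy firmware calls this regularly from its main loop.
        self.ticks_since_kick = 0

    def tick(self):
        # Driven by a timer that keeps running even if the firmware hangs.
        self.ticks_since_kick += 1
        if self.ticks_since_kick >= self.timeout_ticks:
            self.reset_count += 1      # force a reset of the system
            self.ticks_since_kick = 0  # counting restarts after the reset
```

In this model, firmware that stops calling `kick()` is reset every `timeout_ticks` ticks, so a true hang should show up as repeated resets rather than a permanently frozen unit.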

Possibly, the MAM has crashed due to the PC=4087AE
error shown in the error log, but then when it is rebooting after the 
crash it then hangs during self-test at the PC=00601AA4 error. 
It is possible the PC=4087AE error is causing the PC=00601AA4.

Some information:  Thanks for providing the MAM firmware version.
Without it, we can tell you nothing. 

The PC=00601AA4 error is confusing. It is possibly a self-test
diagnostic failure, or a firmware bug in diagnostics. 

The PC=4087AE error looks very much like a MAM firmware bug. It is related
to backplane communication between the MAM and the modules. Is there
a lot of traffic going through the hub? Is there a lot of management
traffic (like using the polling feature of Hubwatch)?

You could try using the DECswitch in slot 7 as the IP services module
to see if the problem goes away.

It is strange that the hub was up for 160 days without a problem and then
suddenly it happens often. What has changed with this hub recently?  Added
modules? Upgraded any firmware (modules or MAM)? New management / monitoring
software? Change which module does IP services?  Any clues will help.
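
The uptime figure can be cross-checked against the error log in .3. The sketch below assumes the second number on each "Time Stamp" line counts hundredths of a second since the last reset; that unit is an assumption, not something the log states. `uptime_days` is a hypothetical helper.

```python
def uptime_days(time_stamp):
    """Convert a 'Time Stamp' value such as '0 1434059900' to days.

    ASSUMPTION: the second field is in hundredths of a second and the
    first field (0 in every entry shown) can be ignored.
    """
    _, ticks = time_stamp.split()
    seconds = int(ticks) / 100
    return seconds / 86400

# Entries copied from the error log in .3, oldest first
for entry, stamp in [(6113, "0 1434059900"), (6114, "0 42305500"), (6115, "0 1326000")]:
    print(entry, round(uptime_days(stamp), 2))
```

Under that assumption the oldest entry works out to roughly 166 days of uptime before the first crash, the next to about 5 days, and the latest to under 4 hours, so the interval between resets is shrinking.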

-Mike
3353.5   "Thanks and a Question"   ZUR01::ACKERMANN   Wed Apr 24 1996 15:10   17 lines
    
    Hello Michael,
    
    Thanks for your fast response. I could not reach the responsible
    person at the customer site today to ask all the questions. He
    will be there tomorrow.
    But as far as I know, he has enabled traps on the MAM and
    the modules.
    You suggested trying slot 7, the DECswitch 900EF, as the IP services
    module. Is it possible for traps to overfill the memory of the MAM,
    for example when they cannot be sent to the management station, and
    for that to cause the MAM to hang?
    
    Thanks and regards,
    
    Daniel
3353.6   "Nope, SNMP trap messages aren't the problem."   NETCAD::GALLAGHER   Wed Apr 24 1996 16:47   11 lines
>    Is it possible, that it can overfill the Memory of the MAM for Exampl 
>    with Traps, when the Traps can not be send to the MGNT Station and
>    these would cause the MAM to hang?
 
Nope.  This is not possible.  The MAM does not store SNMP Trap messages.
When it's time to send an SNMP trap, the trap message is encoded and passed
to UDP, IP, and the data-link.  If the data-link is down, the message is
dropped.  If it's up, then the message is sent.  If there's no management
station around to rcv the trap messages, then the message just falls off
the end of the Ether.
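
The send-and-forget behaviour described above can be illustrated with a short Python sketch. It is not the MAM's code: the function name, the `link_up` flag standing in for the data-link state, and the pre-encoded payload are assumptions, and real SNMP encoding is left out.

```python
import socket

def send_trap(payload, manager_addr, link_up=True):
    """Send one already-encoded trap datagram, storing nothing.

    ASSUMPTION: payload is pre-encoded bytes and link_up stands in for
    the data-link state. Returns True if the datagram was handed to
    the network stack, False if it was dropped.
    """
    if not link_up:
        return False  # link down: the message is simply dropped
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # UDP is connectionless: no acknowledgement is awaited, so a
        # missing management station goes unnoticed by the sender.
        sock.sendto(payload, manager_addr)
        return True
    except OSError:
        return False  # could not send: drop, do not queue
    finally:
        sock.close()
```

Because nothing is queued or retried in either branch, undeliverable traps cannot accumulate in memory in this model.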
						-Shawn