Conference netcad::hub_mgnt

Title:DEChub/HUBwatch/PROBEwatch CONFERENCE
Notice:Firmware -2, Doc -3, Power -4, HW kits -5, firm load -6&7
Moderator:NETCAD::COLELLADT
Created:Wed Nov 13 1991
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:4455
Total number of notes:16761

2847.0. "Help with Errorlog from DEChub 900MS crash" by CSC32::R_BUCK (Have been assimilated) Mon Oct 09 1995 16:39

    The following (edited) error log information was faxed to me by a
    field service engineer.  The DEChub 900MS lost all of its configuration
    information (backplane connections, etc.), but it did retain its IP
    configuration.
    
    Engineer believes that since re-seating the DECswitch 900EF module the
    customer has not had any other unexpected hub crashes.  Does not know
    of any event that could be related to the original crash.  Engineer 
    states that all modules, and the DEChub 900MS itself, are running with
    recent firmware, i.e., the 4.0 kit.
    
    Basic questions: what does this error log information point to?  How
    are the entries interpreted?  Is there any reference and/or formal
    documentation concerning the (infamous) crash codes on the DEChub?
    Are there any known problems that this matches?
    
    I believe the bottom line for the Field Engineer is that he just needs
    to know more about the crash so that he can give the customer a
    reasonable explanation.  Also he wants to be sure that he does not
    start swapping hardware if the problem is specific to firmware. 
    
    Thanks 
    Randall Buck
    
    --------------------------------------------------------------------
DEChub 900 MultiSwitch

                    DUMP ERROR LOG
                    Current Reset Count: 17

                    Entry = 14
                    TimeStamp =00
                    Reset Count =15
                    Stop Thrash; Cleared Nonvolatile Data.

    Dump another entry [Y]/N? y

                    Entry =13
                    Time Stamp = 0 3500
                    Reset Count =14
                    Catch VO=07C SR=2004 PC=41B608

    Dump another entry [Y]/N? y

                    Entry =12
                    Time Stamp =0 17700
                    Reset Count =13
                    Catch VO=07C SR=2004 PC=41B60E

    Dump another entry [Y]/N? y

                     Entry =11
                     TimeStamp =0 7100
                     Reset Count =12
                     Catch VO=07C SR=2004 PC=41B2F0

    Dump another entry [Y]/N? y

                     Entry =10
                     Time Stamp =0 136300
                     Reset Count =11
                     Catch VO=07C SR=2009 PC=41B2BC

    Dump another entry [Y]/N? y

                     Entry =9
                     Time Stamp =0 10600
                     ResetCount =10
                     Catch VO=07C SR=2004 PC=41B2B8

    Dump another entry [Y]/N? y

              No more Error Log entries.
                          Press Return for Main Menu...

DECswitch 900EF - Slot 8

=================================================================
               DUMP ERROR LOG
             Current Reset Count: 9
=================================================================

Entry# =2
EntryStatus =0 [0=valid,1=write_error,2=Invalid,3=empty,4=crc_error
Entry Id =10
Firmware Rev =1.5
Reset Count =6
Timestamp = 0 2B D131
Write Count =5
FRU Mask =0
Test ID = DEAD
Error Data = SR=2000 PC=0303516A Error Code=00002008 ProcCsr=556D
Registers = D0=05000220 D1=00000002 D2=00000002 D3=00000002
            D4=00000000 D5=00000000 D6=00000000 D7=0000FFFF
            A0=00002094 A1=030555DC A2=00000003 A3=05000012
            A4=00058D5A A5=05000016 A6=0004B650 A7=0004B5CC

Dump another entry [Y]/N? Y

Entry# =1
EntryStatus =0 [0=valid,1=write_error,2=Invalid,3=empty,4=crc_error
Entry Id =11
Firmware Rev =1.5
Reset Count =6
Timestamp = 0 1F 68BD
Write Count =5
FRU Mask =0
Test ID =0
Error Data =SR=00000000 PC=00000000 ErrorCode=00000000
Registers =Phy1Csr = 00000000 ElmBase     =00000000 MacBase =00000000
           CamCsr   =00000000 CamData15_00=00000000 PmCsr   =00000000
        CamData31_16=00000000 CamData47_32=00000000 PortDataA =00000000
           RtosTimer=00000000 RtosTimerVal=00000000 PortDataB =00000000
         i68k68kInt =00000000 i68k68kMask =00000000 Dmaint    =00000000
         i68kForceInt=00000000 DmaMask    =00000000 HostData  =00000000
         HostInt0Mask =00000000 HostInit0 =00000000 PortStatus=00000000
         PortCtrlMask=00000000 HostDmaMask=00000000 PortCtrlInt=00000000
         FmcControl=00000000 FmcStatus=00000000 FmcInt=00000000


Dump another entry [Y]/N? y

Entry# =0
EntryStatus =0 [0=valid,1=write_error,2=Invalid,3=empty,4=crc_error
Entry Id =11
Firmware Rev =1.5
Reset Count =6
Timestamp = 0 14 9B41
Write Count =5
FRU Mask =0
Test ID =0
Error Data =SR=00000000 PC=00000000 ErrorCode=00000000
Registers =Phy1Csr = 00000000 ElmBase     =00000000 MacBase =00000000
           CamCsr   =00000000 CamData15_00=00000000 PmCsr   =00000000
        CamData31_16=00000000 CamData47_32=00000000 PortDataA =00000000
           RtosTimer=00000000 RtosTimerVal=00000000 PortDataB =00000000
         i68k68kInt =00000000 i68k68kMask =00000000 Dmaint    =00000000
         i68kForceInt=00000000 DmaMask    =00000000 HostData  =00000000
         HostInt0Mask =00000000 HostInit0 =00000000 PortStatus=00000000
         PortCtrlMask=00000000 HostDmaMask=00000000 PortCtrlInt=00000000
         FmcControl=00000000 FmcStatus=00000000 FmcInt=00000000

Dump another entry [Y]/N? y

No more Error Log entries
    
2847.1. by NETCAD::DOODY (Michael Doody) Mon Oct 09 1995 17:41, 75 lines

    Looks to me like the Hub's backplane configuration got corrupted 
    somehow, badly enough to cause the hub to crash several times until 
    it erased its configuration to stop itself from thrashing. 
    
    It does not look like a hardware problem; it's probably a firmware
    problem related to the Hub's backplane management. Many bugs related
    to backplane configuration were fixed in the latest MAM V4.1. I
    suggest you have them upgrade.
    
    
                    DUMP ERROR LOG
                    Current Reset Count: 17

    
    
    This entry means the hub has crashed too many times in a row and it 
    has erased its configuration. This includes the IP address, etc.  So
    you were misinformed on this point (maybe someone set the IP address
    again afterwards):
    
                    Entry = 14
                    TimeStamp =00
                    Reset Count =15
                    Stop Thrash; Cleared Nonvolatile Data.

    Dump another entry [Y]/N? y

    
    This entry means the hub crashed at 35 seconds uptime, in the routine 
    at PC=41B608. Which routine this is depends on the _exact_ MAM firmware
    version. But the address is definitely backplane management related:
    
    
                    Entry =13
                    Time Stamp = 0 3500
                    Reset Count =14
                    Catch VO=07C SR=2004 PC=41B608

    Dump another entry [Y]/N? y
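
    For anyone who wants to pull the numbers out of such an entry by
    script, here is a minimal Python sketch.  It assumes the second
    number of the Time Stamp field counts hundredths of a second (which
    is what makes 3500 come out as the 35 seconds quoted above) and that
    VO, SR and PC are hexadecimal.  The function name and the returned
    keys are made up for illustration; the first Time Stamp number is
    passed through uninterpreted.

```python
import re

# One "Catch" entry, copied from the hub dump above.
ENTRY = """\
Entry =13
Time Stamp = 0 3500
Reset Count =14
Catch VO=07C SR=2004 PC=41B608
"""

def decode_catch_entry(text):
    """Extract the numeric fields from one 'Catch' error log entry.

    Assumption: the second Time Stamp number is in 1/100 s units.
    """
    ts = re.search(r"Time\s*Stamp\s*=\s*(\d+)\s+(\d+)", text)
    reset = re.search(r"Reset\s*Count\s*=\s*(\d+)", text)
    catch = re.search(
        r"Catch\s+VO=([0-9A-Fa-f]+)\s+SR=([0-9A-Fa-f]+)\s+PC=([0-9A-Fa-f]+)",
        text,
    )
    if not (ts and reset and catch):
        raise ValueError("not a complete Catch entry")
    return {
        "reset_count": int(reset.group(1)),
        "timestamp_high": int(ts.group(1)),        # meaning not documented here
        "uptime_seconds": int(ts.group(2)) / 100,  # assumed 1/100 s ticks
        "vo": int(catch.group(1), 16),
        "sr": int(catch.group(2), 16),
        "pc": int(catch.group(3), 16),
    }

if __name__ == "__main__":
    fields = decode_catch_entry(ENTRY)
    print(f"reset {fields['reset_count']}: "
          f"{fields['uptime_seconds']:.0f} s uptime, PC={fields['pc']:06X}")
```

    The PC value only identifies a routine when it is looked up against
    the map for the exact MAM firmware version that was running.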

    The rest of the entries are similar; the hub is crashing due to the
    corrupted configuration:
    
    
                    Entry =12
                    Time Stamp =0 17700
                    Reset Count =13
                    Catch VO=07C SR=2004 PC=41B60E

    Dump another entry [Y]/N? y

                     Entry =11
                     TimeStamp =0 7100
                     Reset Count =12
                     Catch VO=07C SR=2004 PC=41B2F0

    Dump another entry [Y]/N? y

                     Entry =10
                     Time Stamp =0 136300
                     Reset Count =11
                     Catch VO=07C SR=2009 PC=41B2BC

    Dump another entry [Y]/N? y

                     Entry =9
                     Time Stamp =0 10600
                     ResetCount =10
                     Catch VO=07C SR=2004 PC=41B2B8

    Dump another entry [Y]/N? y

              No more Error Log entries.
                          Press Return for Main Menu...
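
    The crash-loop pattern in this dump can be checked with a few lines
    of Python.  This is only a sketch: the tuples are typed in by hand
    from the entries above, the function name is made up, and the
    timestamps are again assumed to be in hundredths of a second.  The
    number of crashes in a row that makes the firmware clear its
    nonvolatile data is not stated here, so the code only counts the
    run and does not compare it to any threshold.

```python
# (entry, reset count, timestamp ticks) for the five "Catch" entries above.
CATCHES = [
    (9, 10, 10600),
    (10, 11, 136300),
    (11, 12, 7100),
    (12, 13, 17700),
    (13, 14, 3500),
]

def consecutive_crashes(catches):
    """Length of the run of back-to-back reset counts ending at the newest entry."""
    counts = sorted(reset for _, reset, _ in catches)
    if not counts:
        return 0
    run = 1
    # Walk backwards from the newest entry while reset counts differ by one.
    for prev, cur in zip(reversed(counts[:-1]), reversed(counts[1:])):
        if cur - prev != 1:
            break
        run += 1
    return run

if __name__ == "__main__":
    for entry, reset, ticks in CATCHES:
        # Assumed 1/100 s ticks, as with the 35 second example.
        print(f"entry {entry}: reset {reset}, crashed after {ticks / 100:.0f} s")
    print("crashes in a row:", consecutive_crashes(CATCHES))
```

    With the entries above, every reset from 10 through 14 ended in a
    crash, and the next entry (Reset Count 15) is the Stop Thrash.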

2847.2. "Appreciate the explanation" by CSC32::R_BUCK (Have been assimilated) Mon Oct 09 1995 20:58, 7 lines

    Michael,
    
    Thanks for taking the time to explain the entries.  Will pass the
    information along to the field engineer along with the suggestion to 
    go ahead and upgrade to the latest firmware.
                                               
    Randall