
Conference netcad::hub_mgnt

Title:DEChub/HUBwatch/PROBEwatch CONFERENCE
Notice:Firmware -2, Doc -3, Power -4, HW kits -5, firm load -6&7
Moderator:NETCAD::COLELLADT
Created:Wed Nov 13 1991
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:4455
Total number of notes:16761

3135.0. "DECswitch crash 2380" by COMICS::REYNOLDS (Mad Dogs and Englishmen) Tue Jan 09 1996 09:23

    
    Hello,
    
    
    		does anyone know of a problem whereby a DECswitch 900EF
    repeatedly crashes (after a few seconds) when port 3 is assigned to
    the backplane?
    In this case, we have swapped the DECswitch hardware and got the same 
    result, but only when connected to a particular DEChub900. Obviously a
    hardware problem with the DEChub900 is now suspected but it looks 
    like this is sweating out a bug in the DECswitch code.
    Interestingly, the bridge will crash just with port 3 as a backplane
    'stub' and NOT connected to a channel. If it did require a channel
    connection to cause the crash, you could guess at a shorted segment
    problem.
    
    The bridge is running:   h/w 1/2  ROM 0.4 and s/w 1.5.2
    It always crashes with code 2380.
    
    Is this crash code meaningful and should the bridge crash under
    any such circumstance?
    
    (I'll post result of the backplane swap)
    
    
    John Reynolds, UK CSC, Comms.
     
T.R  Title  User  Personal Name  Date  Lines
3135.1. "Some suggestions...." by NETCAD::BATTERSBY Tue Jan 09 1996 15:33 (29 lines)

    Well, I don't recall ever seeing this type of behavior.
    There are several things you could do to help diagnose the 
    crash. 
    After creating a crash, allow the module to power back up,
    but don't try to connect port 3 to the backplane. 
    
    Next re-direct to the EF console and verify that there are/aren't 
    other error codes along with the 2380 error code.
    
    Also for further clarification, what rev of MAM firmware does
    the HUB have that the crash occurs in?
    
    Also, what rev of HUBwatch is being used to manage the HUB & the EF?
    
    The 2380 error code can probably be attributed to the EF's watchdog
    timer expiring as a result of the firmware not "kicking" it. This
    though may be a second order effect of what is really causing the
    crash.
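
For readers unfamiliar with the mechanism, here is a minimal sketch of how a watchdog timer behaves when firmware stops "kicking" it. It is a toy model only: the class, the tick-based timing, and the timeout value are all assumptions made for illustration and say nothing about how the EF's firmware or hardware actually implement it.

```python
class Watchdog:
    """Toy watchdog: expires if it is not kicked within timeout_ticks ticks."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.remaining = timeout_ticks
        self.expired = False

    def kick(self):
        # Healthy firmware calls this periodically to restart the countdown.
        if not self.expired:
            self.remaining = self.timeout_ticks

    def tick(self):
        # Called once per timer tick; counts down toward expiry.
        if self.expired:
            return
        self.remaining -= 1
        if self.remaining <= 0:
            self.expired = True  # real hardware would force a reset here


def run(firmware_steps, timeout_ticks=3):
    """firmware_steps: iterable of booleans, True if the firmware kicked on that tick.

    Returns the tick index at which the watchdog expired, or None if it never did.
    """
    wd = Watchdog(timeout_ticks)
    for i, kicked in enumerate(firmware_steps):
        if kicked:
            wd.kick()
        wd.tick()
        if wd.expired:
            return i
    return None


if __name__ == "__main__":
    print(run([True] * 10))                               # firmware keeps kicking
    print(run([True, True, False, False, False, False]))  # firmware hangs after tick 1
```

The point of the sketch is that the expiry only says the firmware stopped kicking; whatever made it stop (a hang, a tight loop, a fault handler) is the real cause and has to be found elsewhere.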
    
    The only other thing that comes to my mind is the possibility that
    the module (with its latest rev firmware), is running in a HUB with
    an older rev firmware, and/or being managed by old rev HUBwatch FW.
    There have been instances where it has been seen that a module with
    its latest rev FW in a HUB with old FW or being managed by an older
    rev of HUBwatch will cause weird problems to occur. That's why I
    asked about the rev of MAM FW and rev of HUBwatch being used.
    
    Bob
    
3135.2. "I'm that man!!!" by CHEFS::DAVIES_J Wed Jan 10 1996 14:44 (41 lines)

    I'm the engineer involved in all of this. I was hoping to swap the
    backplane today but the customer was not happy about upsetting a disaster
    tolerant cluster. Should be no problem but you still have to listen to
    the customer. This is now scheduled for next Wednesday.
    
    I've also ordered a MAM module just in case it could be that device. 
    Also, in talking to a contractor who works for Digital and another
    company, I found that he has exactly the same problem, but is at the
    stage of swapping EF switches. No doubt we are chasing a bug here. I
    was using the DOS version V4.1 of HUBwatch to configure the hub. The
    rev level of the hub is 4.0.4.
    
    After pointing port 3 of the switch to the backplane the switch reboots
    and then continuously reboots. I'm not sure how many times it does this
    but it looks as if eventually it does come up and stay up, hence the
    error log below. Otherwise, what I have done in the past is a reset
    on the hub, which clears it.
    
    Here is the full error log entry for the EF switch after it
    successfully gets booted.
    
    Entry Station = 0
    entry id=10
    Firm rev =1.5
    Reset count =31
    timestamp = 0 0 9
    Write count =12
    FRU mask = 0
    Test Id= DEAD
    Error data = sr=2409 pc=030343b2 Error code= 00002380 Proc csr= 776d
    Reg= do=00000001 , d1=00000005 , d2= 00000107, d3= 00000002
         d4= 00000000,  d5=00000000, d6=00000000 , d7=0000ffff
         a0=00051147 a1=00002090, a2=00058da6., a3=03089b58
         a4=00058d8a a5=03055cdc, a6=0004b644, a7=0004b620
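
To make a dump like the one above easier to compare against other crashes, the fields can be pulled out mechanically. The following is a minimal sketch, not a DEC tool: the regular expression, the function names, and the status-register decoding are all assumptions. In particular, the d0-d7/a0-a7/sr/pc register names look like a Motorola 68000-family CPU, so `decode_sr_68k` uses that family's status register layout; nothing in this thread confirms the processor type.

```python
import re

# Error data copied verbatim from the log entry above ("do" is presumably d0).
LOG = """\
Error data = sr=2409 pc=030343b2 Error code= 00002380 Proc csr= 776d
Reg= do=00000001 , d1=00000005 , d2= 00000107, d3= 00000002
     d4= 00000000,  d5=00000000, d6=00000000 , d7=0000ffff
     a0=00051147 a1=00002090, a2=00058da6., a3=03089b58
     a4=00058d8a a5=03055cdc, a6=0004b644, a7=0004b620
"""

# A field name (one or more words), then '=', then a hexadecimal value.
FIELD = re.compile(r"([A-Za-z][A-Za-z0-9 ]*?)\s*=\s*([0-9A-Fa-f]+)\b")


def parse_fields(text):
    """Return {field name: integer value} for every name=hex pair found."""
    return {name: int(value, 16) for name, value in FIELD.findall(text)}


def decode_sr_68k(sr):
    """Decode a status register, ASSUMING the 68000-family bit layout."""
    flag_bits = ((4, "X"), (3, "N"), (2, "Z"), (1, "V"), (0, "C"))
    return {
        "supervisor": bool(sr & 0x2000),   # bit 13: supervisor state
        "interrupt_mask": (sr >> 8) & 0x7,  # bits 10-8: interrupt priority mask
        "flags": "".join(n for bit, n in flag_bits if sr & (1 << bit)),
    }


if __name__ == "__main__":
    fields = parse_fields(LOG)
    print(hex(fields["Error code"]), hex(fields["pc"]))
    print(decode_sr_68k(fields["sr"]))
```

If the 68000-family assumption holds, sr=2409 would mean the crash was recorded in supervisor state with the interrupt mask at level 4, but that reading should be checked against the module's actual processor documentation.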
     
    
    
    
    	John Davies
    	Network Services 
    	Bristol UK
3135.3. "" by NETCAD::DOODY (Michael Doody) Wed Jan 10 1996 15:37 (9 lines)

    John,
    
    It probably would be worth upgrading the Hub Manager (MAM) to v4.1 -
    DMHUB410.bin. This should be a fairly painless thing to try since
    you already have V4.1 of hubwatch. I'm really not sure that it will 
    help, but there is a reasonable chance that it will, assuming the
    hardware is not to blame.
    
    -Mike
3135.4. "alas poor 900-EF I knew him well." by CHEFS::ATTWOOL_J (Endoftheworld and I feelfine) Mon Jan 15 1996 10:19 (28 lines)
    
    
    In the words of Shakespeare: I am that contractor, who works at Digital
    as mentioned in -.2.
    
    I've received the third replacement 900-EF module here at DEC and have 
    tried it out in our test Multi-switch. The 900-EF, upon command 
    (port 3 to the back, please!), reboots/resets itself.
    
    The configuration is:
    Option                  Firmware Level
    -----------------------------------------
    Multi-Switch            V4.1.0
    900-EF                  V1.5.2
    Hubwatch                V4.1.1
    
    Both the Multi-Switch and EF firmware were from gatekeeper.
    
    I'm on my fourth 900-EF. The first had a "real" H/W fault. The second
    replacement: took it out of the box, it rattled and a component fell
    out. The third showed the symptoms in this note (CSC looked and advised
    replacement). The fourth, this one, is showing the same symptoms again.
    
    I'll be trying to test it out on another hub later on today. I'll post
    the results if it goes ahead.
    
    Justyn
    
3135.5. "Show us the way oh great one!!!" by CHEFS::DAVIES_J Mon Jan 15 1996 15:05 (9 lines)

    I was going to try Multiswitch v4.1 as suggested but since Justyn has
    also got the same problem with that version it seems a little
    pointless.
    
    	Where do we go from here  except for swapping out everything and
    hoping we can find a combination that works???
    
                                                                        
    	John Davies
3135.6. "" by NPSS::WADE (Network Systems Support) Mon Jan 15 1996 17:17 (17 lines)

    Are there any other error logs in the EF or MAM? The error log
    listed in this note shows the EF as being up for 90 msec.
    
    Could you describe the hub configuration, module/slot?  Have you tried
    moving the EF to another slot?  
    
    If this problem is reproducing with a combination of backplane and EF
    I'd like to see both shipped back to eng for analysis.  You can send it
    to me at -
       
    	Digital Equipment
    	550 King St
    	Littleton, MA 01460
    
    	Att: Bill Wade  LKG1-2/G10