Conference netcad::hub_mgnt

Title: DEChub/HUBwatch/PROBEwatch CONFERENCE
Notice: Firmware -2, Doc -3, Power -4, HW kits -5, firm load -6&7
Moderator: NETCAD::COLELLADT
Created: Wed Nov 13 1991
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 4455
Total number of notes: 16761

2532.0. "900MX and power failures" by CSC32::MACGREGOR (Colorado: the TRUE mid-west) Mon Jul 17 1995 16:19

    
    I understand that there is a problem with the 900MX (2.0.0) and 
    power failures when it is connected to a DH900 running 3.1.0 that
    results in having to rebuild the configuration.
    
    Does this problem still exist in DH900 4.x?  Is there a workaround?
    
    This is for a customer.
    
    Marc MacGregor
    Network Support Team
    DTN 592-4361
    
T.R  Title  User  Personal Name  Date  Lines
2532.1. by NETCAD::DOODY (Michael Doody) Mon Jul 17 1995 16:54, 6 lines
    900MX? You mean DECconcentrator 900MX?  or DECbridge 900MX?
    
    What do you mean by power failure? When power is lost to the Hub or 
    when the module is removed while the Hub is running?
    
    md
2532.2. "Hardware is DECbridge 900MX (and maybe others..)" by CSC32::R_BUCK (Have been assimilated) Mon Jul 17 1995 19:11, 20 lines
    Tracking a similar problem to the one reported by Marc.  My customer
    states that his DECbridge 900MX (aka DECswitch 900EF) lost all
    configuration characteristics when it was powered off for a period of
    time, in his case about 45 minutes.  In fact, the entire DEChub 900MS
    that it was in was shut down for this period of time.
    
    Customer also said he had a DECbridge 900FP experience the same
    behavior; however, I believe the actual hardware must be a DECrepeater
    900FP, since I am not aware of any bridge module by the name given.
    Another person on our team stated that he was aware of the problem with
    DECbridge 900MX modules losing configuration information if they are
    powered off for a period of time.  He recalls it was something like 10
    minutes for this to occur.  I was led to believe that this problem had
    been elevated and a firmware fix was in the works.  So far I cannot
    locate anything that looks like a formal elevation.
    
    So, any confirmation or other reports of 900 series modules losing
    configuration information when powered off for a period of time?
    
    Randall Buck
2532.3. "We need more accurate information of hardware & firmware please...." by NETCAD::BATTERSBY Mon Jul 17 1995 19:40, 10 lines
    >I understand that there is a problem with the 900MX (2.0.0) and
    >power failures when it is connected to a DH900 running 3.1.0 that
    >results in having to rebuild the configuration.
    
    Marc, could you help us understand what you mean by "900MX (2.0.0)"?
    Is this the firmware rev (that you are told) of the DECbridge 900MX?
    If so, there is no such rev 2.0.0 of the DECbridge 900MX *or* of the
    DECswitch 900EF.
    
    Bob
2532.4. "What does the customer mean by Configuration?" by NETCAD::BATTERSBY Mon Jul 17 1995 19:46, 10 lines
    Furthermore....I suspect we need to clear up a possible
    discrepancy as to what is meant by "configuration characteristics".
    Is the customer referring to his HUBwatch-created configuration,
    or is he referring to parameters stored in NVRAM?
    HUBwatch-created configurations are not stored in the DECbridge 900MX
    (aka DECswitch 900EF); they are stored in the MAM. I realize
    terminology can get confusing and misunderstood over the phone etc.
    
    Bob
2532.5. by CSC32::MACGREGOR (Colorado: the TRUE mid-west) Mon Jul 17 1995 19:49, 23 lines
    
    Bob,
    
    I suppose the 2.0.0 could be a software version.  Unfortunately I'm not
    too familiar with the HUB products and I took the information to be
    accurate.  I was told that the DECbridge 900MX was running firmware
    2.0.0 (I guess not 8^)
    
    Upon further conversation with some coworkers, there appear to be at
    least three examples of this problem.  One of the key pieces of
    information seems to be that the problem does not show up during
    quick power downs of the backplane.  However, during longer outages
    (such as when power for the company is out for an hour because of a
    storm) this problem occurs.
    
    I understand that this was a known problem with "non-latest"
    firmware/software.  I was trying to find out if it was still a problem
    with the latest.  It appears that it might still be an issue.  However,
    it could be that the people on this team are all making the same
    mistake, which is why I'm trying to verify it here.
    
    Marc
    
2532.6. "*seems* to be module NVRAM" by CSC32::R_BUCK (Have been assimilated) Mon Jul 17 1995 20:03, 15 lines
    RE .4
    
    As I understood my customer (a different one than Marc's), what was lost
    was the configuration of the ports, i.e. whether they were pointed to the
    back or front.  Basically it *seemed* like it reverted to factory defaults
    as far as port configuration.  Will need to talk with the customer to
    get further clarification.
    
    To the best of my knowledge the DEChub 900MS is running version 3.x
    firmware.  Customer is using HUBwatch for Windows V3.1.  Not absolutely
    sure of the firmware version for the DECbridge 900MX.  Customer did
    state that all other modules came back configured correctly.  LANs he
    had created within the backplane still existed.
    
    Randall Buck
2532.7. "This" by NETCAD::BATTERSBY Mon Jul 17 1995 20:12, 15 lines
    Marc, yes this does sound like an old problem where the HUB
    MAM module would lose its configuration of all or certain modules
    when powered off for some time.
    I don't remember which rev cleared up this problem (I think it was
    perhaps fixed with V3.1 of the DEChub 900 Multiswitch firmware).
    So I'm sure all DEChub 900 Multiswitch firmware revs of V3.1 or
    newer have this problem fixed. A concerted effort should be made
    to get such customers to upgrade their HUBs and HUB modules to
    the latest possible code so that the problem does not recur,
    and so that they can take advantage of all the newest features.
    
    In short, this is no longer a problem and hasn't been such for quite
    a while.
    
    Bob
2532.8. by CSC32::MACGREGOR (Colorado: the TRUE mid-west) Mon Jul 17 1995 23:09, 14 lines
    
    Bob,
    
    I got hold of the customer (FE) and have more accurate information.
    Sorry about the confusion.  The 900MX is a concentrator, not a bridge.  He
    also saw the problem on the 900FP concentrator and the 900TM repeater.
    The DEChub 900 Multiswitch firmware is V3.1.0.  Given this information,
    it appears that something doesn't make sense.
    
    In any case, the customer will install the latest kit (see note 2) and
    see if that solves the problem.
    
    Marc
    
2532.9. "Find out specifically what is meant by "lost configs"" by NETCAD::BATTERSBY Tue Jul 18 1995 13:48, 15 lines
    Marc, I spoke to one of the 900MX firmware developers and he
    mentioned that there was a 60 day upgrade after the 2.0.0
    wave 2 release. This has a rev of V2.8.0. If the customer still
    has qualms about doing the full wave 3 upgrade (which would be
    more desirable), they could upgrade the concentrator to 2.8.0
    and observe the subsequent behavior. He also mentioned that
    there were no known problems with configurations (backplane
    lan-hopping) being lost with wave 2. He also suggested probing the
    customer a little to determine what the customer means by "lost
    configurations". Are they referring to a lost IP address, for example,
    or are they referring to HUB backplane configs? Depending on what is
    being "lost", the finger would then point either towards the Concentrator
    or towards the HUB (MAM) as being suspect.
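
    A minimal sketch of that triage, assuming the storage locations
    described in .4 (the HUBwatch-created backplane configuration lives in
    the MAM, while a module's own parameters live in that module's NVRAM).
    The setting names and the function are hypothetical, made up for
    illustration; they are not part of HUBwatch or any DEChub firmware.

```python
# Hypothetical triage helper. The category names and the mapping are
# assumptions drawn from this thread, not from any DEChub tool or firmware.
MAM = "HUB (MAM)"
MODULE = "module NVRAM"

# Where each kind of setting is believed to be stored, per reply .4.
STORED_IN = {
    "backplane_lan_connection": MAM,
    "hubwatch_lan_definition": MAM,
    "module_ip_address": MODULE,
}


def suspects(lost_items):
    """Return the set of components implicated by the settings that were lost."""
    unknown = [item for item in lost_items if item not in STORED_IN]
    if unknown:
        # Refuse to guess: the customer has to say what was actually lost.
        raise ValueError(f"unclassified settings: {unknown}")
    return {STORED_IN[item] for item in lost_items}


# Example: only backplane connections went missing.
print(suspects(["backplane_lan_connection"]))
```

    If both kinds of setting are reported lost, the result names both
    components, which is a hint that the report needs more detail before
    anyone chases firmware.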
    
    Bob
2532.10. by NETCAD::DOODY (Michael Doody) Tue Jul 18 1995 13:59, 10 lines
    The MAM team is not aware of any issues related to the Hub being
    powered off for extended periods. The Hub does not have a hardware
    clock, and there is no difference between being powered off for 1
    second or 1 month. We are probably looking at a coincidence here.
    
    The fact that there is some sort of configuration loss is troubling but
    like Bob said, it is difficult to figure out what exactly is being
    lost.
    
    -Mike
2532.11. by JULIET::LEE_CA Thu Jul 20 1995 01:18, 83 lines
    reg .10
    
     Let me try to clear up what is being lost. I have one of the sites in
    question, actually the one that started this note. On this site my hub
    looks like this.
    
        Hub is at 3.1.0
        900MX is at sw 2.0.0
        The rest I'm unsure of without revisiting the site.

        Lans within the hub are as follows

            Name            Type
            --------------------------------
            Thinwire        Ethernet
            FDDI            FDDI
            Computer room   Ethernet
            Floor1          Ethernet
            Floor2          Ethernet
            Floor3          Ethernet
            Floor4          Ethernet
    
    
        Slot 1  900MX concentrator ports A&B to the back. Connected
                to the FDDI lan.
        Slot 2  900FP all ports configured as redundant pairs. The first
                five pairs are in use and directed to the backplane
                as pulldown 1-5. The thinwire/ports 11&12 pulldown is
                connected to the thinwire lan in the hub. Pairs 1-5
                are connected as follows
                        1 to ethernet lan floor2
                        2 to ethernet lan floor3
                        3 to ethernet lan floor4
                        4&5 to ethernet lan floor1
        Slot 3  DECbrouter 90 point to point bridge to two remote sites
        Slot 4  900TM connected in backplane to thinwire and computer
                room lan.
        Slot 5  DECbrouter 90 Frame relay bridge no routing
        Slot 6  DECserver 900
        Slot 7  DECbrouter 90 point to point bridge to two remote sites
        Slot 8  900EF All ports except 6 are directed to the backplane
                port 1 is connected to the FDDI lan
                port 2 is connected to the computer room lan
                port 3 is connected to the floor2 lan
                port 4 is connected to the floor3 lan
                port 5 is connected to the floor4 lan
                port 6 is out the front to an existing novell lan
                port 7 is connected to the floor1 lan.
    
    What was lost after power off.
    
Slot 1: On the 900MX concentrator the A&B ports were directed to
        the front and obviously not connected to the FDDI lan.
Fix:    Redirect ports A&B to the back and reattach them to the FDDI lan.

Slot 2: On the 900FP the redundant pairs were not connected to any lans
        within the hub.
Fix:    Reconnect pulldowns 1-5 to their respective lans.
        Note: the port configs were correct as far as redundancy goes.
              The thinwire lan was connected as it should be.

Slot 4: The green flexible lan pulldown was not connected.
Fix:    Reconnect green lan to the computer room lan.
        Note: the thinwire lan was connected properly.
    
    
    What was not lost after power off:
    
    	The hub itself retained all lans, its IP addr, and name put in 
    	through hubwatch.            	
    
    	The 900EF came back and reconnected properly to all lans.
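
    The lost and retained items above can be summarised as a comparison of
    the intended backplane connections against what was seen after the
    power-off. The sketch below is illustrative only: the dictionary layout
    and the function name are assumptions, and only the slot, port and lan
    names come from this note. Slot 8 ports 3-7 are left out for brevity.

```python
# Illustrative only: data layout and function name are assumptions.
# Slot, port and lan names are taken from the description in this note.
EXPECTED = {
    ("slot 1 900MX", "ports A&B"): "FDDI",
    ("slot 2 900FP", "pair 1"): "Floor2",
    ("slot 2 900FP", "pair 2"): "Floor3",
    ("slot 2 900FP", "pair 3"): "Floor4",
    ("slot 2 900FP", "pair 4"): "Floor1",
    ("slot 2 900FP", "pair 5"): "Floor1",
    ("slot 2 900FP", "thinwire"): "Thinwire",
    ("slot 4 900TM", "thinwire"): "Thinwire",
    ("slot 4 900TM", "green flexible lan"): "Computer room",
    ("slot 8 900EF", "port 1"): "FDDI",
    ("slot 8 900EF", "port 2"): "Computer room",
}

# Connections still present after the long power-off.
OBSERVED = {
    ("slot 2 900FP", "thinwire"): "Thinwire",
    ("slot 4 900TM", "thinwire"): "Thinwire",
    ("slot 8 900EF", "port 1"): "FDDI",
    ("slot 8 900EF", "port 2"): "Computer room",
}


def lost_connections(expected, observed):
    """List (slot, port, lan) entries whose expected lan is no longer connected."""
    return sorted(
        (slot, port, lan)
        for (slot, port), lan in expected.items()
        if observed.get((slot, port)) != lan
    )


for slot, port, lan in lost_connections(EXPECTED, OBSERVED):
    print(f"{slot}: {port} no longer connected to {lan}")
```

    Every entry that survives is either a thinwire connection or belongs
    to the 900EF, which is the pattern described in the two lists above.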
    
    This has happened twice to this hub in the past 8-10 months. I have powered
    it off and on 3 or 4 times with no problems, but when I did this test
    I didn't leave the hub powered off for any length of time. The customer
    has left it off for 1 hour or more both times this has happened.
    
    
    			Hope this helps and makes sense
    
    				Carey Lee  dtn: 550-0275
                                     
2532.12. by NETCAD::DOODY (Michael Doody) Thu Jul 20 1995 21:53, 55 lines
    Carey, Thanks for the clear & detailed description. 
    
    What you describe here is likely either a MAM firmware problem or 
    a problem with the timing of power-downs.
    
    If there was a problem with the hardware, you would have lost all the
    configuration including the IP address and the slot 8 config. Instead
    what you see is some of the LAN interconnect configuration missing. So
    it has nothing to do with how long the hub was powered off.
    
    Here are some possibilities:
        - Power was removed just after some configuration change was made
          to the Hub. You must wait at least 15 seconds after making a
          configuration change to the hub before removing power. This
          can include changes to lan interconnect with hubwatch, removing/
          inserting modules, module resets etc.
    
    
    	- Problem with MAM firmware. Upgrade.
    
        - Problem with MAM interaction with that 2.0.0 concentrator. 2.0.0
          is old even for a 3.1 MAM. The latest concentrator firmware is
          3.0.n, and the latest MAM is v4.0.n.
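
    The 15-second rule in the first possibility can be sketched as a small
    check. Only the 15-second figure comes from this reply; the function
    names are hypothetical, and how an operator would learn the time of the
    last configuration change is left as an assumption.

```python
from datetime import datetime, timedelta

# 15 seconds is the wait quoted in this reply; the rest is illustrative.
SETTLE_TIME = timedelta(seconds=15)


def safe_to_power_down(last_change, now):
    """True once at least SETTLE_TIME has passed since the last config change."""
    return now - last_change >= SETTLE_TIME


def seconds_to_wait(last_change, now):
    """Seconds still to wait before removing power (0.0 if already safe)."""
    remaining = SETTLE_TIME - (now - last_change)
    return max(0.0, remaining.total_seconds())


# Example: a lan interconnect change was made 10 seconds ago.
changed = datetime(1995, 7, 20, 21, 0, 0)
now = changed + timedelta(seconds=10)
print(safe_to_power_down(changed, now), seconds_to_wait(changed, now))
```

    A change here means anything in the list above: a hubwatch lan
    interconnect change, a module insertion or removal, or a module reset.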
    
    
    I tried duplicating your problem with the same configuration of modules
    & lans but could not reproduce it with either 3.1 or 4.0 of the MAM
    firmware. I only went back to 2.8.0 for the concentrator because that's
    the oldest of what's conveniently available to me.
    
    
    
    
    Some random notes:
    
>Slot 1: On the 900mx concentrator the A&B ports were directed to
>    	 the front and obviously not connected to the FDDI lan. 
>Fix:	 Redirect ports A&B to the back and reattach them to the FDDI lan.
    
    	MAM lost lan interconnect for this slot.
    
>Slot 2: On the 900FP the redundant pairs were not connected to any lans
>    	within the hub.
>Fix:	reconnect pulldowns 1-5 to their respective lans.
>    	Note: the port configs were correct as far as redundancy goes.
>    	      the thinwire lan was connected as it should be.
    
    	Not surprising since the port groupings are stored in the repeater
    	not the MAM. Thinwire connection is default when config lost.
    
    
>Slot 4: the green flexible lan pulldown was not connected.
>Fix:	reconnect green lan to the computer room lan.
>	Note: the thinwire lan was connected properly
    	
    	Thinwire is default.
2532.13. by JULIET::LEE_CA Thu Jul 20 1995 23:28, 12 lines
    reg .12
    
    The configuration has not changed for days on this hub so a change
    within the last 15 seconds is out as a possibility.
    
    For now I'm recommending that the customer upgrade to 4.0 HW and download
    all the latest firmware, and we'll take it from there. They would like a
    definite answer that 4.0 will fix the problem. I told them that it's
    likely but could not be guaranteed.
    
    
    			Carey Lee
2532.14. "Yes, .12 was well done" by NETCAD::BATTERSBY Fri Jul 21 1995 13:27, 8 lines
    I too must voice appreciation for writing such a concise and detailed 
    description in reply .12. If more people would take the little extra
    time to include and describe exactly what they did & observed as shown 
    in .12, it would go a long way towards quicker resolution/explanation 
    of what might be wrong. 
    Thanks Carey, that was well done.
    
    Bob
2532.15. "Another big Thank You to Carey" by CSC32::R_BUCK (Have been assimilated) Tue Jul 25 1995 17:39, 8 lines
    Let me add my thanks also for the time and effort Carey put forth to
    provide such detail.  
    
    Should Marc or I do a formal elevation for this situation?  Or is this
    communication sufficient for now?
    
    Thanks
    Randall Buck
2532.16. by JULIET::LEE_CA Thu Jul 27 1995 15:35, 12 lines
    reg .15
    
      If you're asking me whether this should be escalated, I think we're OK
    for now. If the problem does come back after the customer goes to v4.0
    and loads all the new code, WE WILL HAVE A POLITICAL NIGHTMARE ON OUR
    HANDS.
    
      I think the real question should be to the people who support these
    products: if your gut feeling is that this problem may not be resolved,
    then by all means let's escalate and fix it.
    
    			Carey Lee