
Conference netcad::hub_mgnt

Title:DEChub/HUBwatch/PROBEwatch CONFERENCE
Notice:Firmware -2, Doc -3, Power -4, HW kits -5, firm load -6&7
Moderator:NETCAD::COLELLADT
Created:Wed Nov 13 1991
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:4455
Total number of notes:16761

1786.0. "Fault-Tolerant 10BaseFL FOT ??" by MSDOA::REED (John Reed @CBO, DTN:367-6463, KB4FFE, SouthEast) Thu Dec 15 1994 15:56

    Hello.
    
    I have configured a large site full of Alphas on DECConcentrator900,
    and PCs off of DECswitch900EF, and a few old DECbridge 620s.   This
    has worked nicely for my customer for a year.  But his VAX file server
    for the PCs is a 4000 with an ISA-0 Ethernet and a QNA Ethernet.  He
    lost a DECBridge900mx a few weeks ago, when it started rebooting for no
    good reason.  I upgraded the firmware, and it seemed to help.   
    
    He is now concerned about redundancy for his Pathworks physical
    connection to the server.   When that DECbridge rebooted, all of his
    Pathworks users got knocked off the LAN.   He has asked how to connect
    the file server redundantly to the FDDI ring.  I told him that the best
    method was a direct fiber link to a DECRepeater900FP
    port-pair.   He has two DEChub900s that could receive a DEFMM card
    and have spare ports of the DECbridge900mx switched to the backplane.
    
    I could feed a fiber from a master port, and a fiber from a slave port
    to the VAX's ISA-0 ethernet port, and attach a fault-tolerant FOT to
    that VAX Ethernet port.  (He is using the QNA-0 port for cluster
    traffic only).
    
    
    I called Anixter, and they sell a fault-tolerant optical transceiver
    that speaks 10BaseFL, from a vendor called MiLAN Technology.  It has
    two modular ports that can each be configured for either 10BaseT or
    10BaseFL.  It will use the active port unless it is "disrupted", and
    then the backup port will take over. 
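
    The active/backup behavior described above can be sketched as a toy
    model.  This is only an illustration, not MiLAN's actual logic: the
    class and method names are hypothetical, and the choice to stay on the
    backup port after the primary recovers is an assumption that the
    vendor description does not confirm.

```python
from dataclasses import dataclass


@dataclass
class FaultTolerantTransceiver:
    """Toy model of a two-port transceiver with one active and one standby port."""

    active: str = "primary"

    def update(self, primary_link_ok: bool, backup_link_ok: bool) -> str:
        """Return the port that carries traffic after a link-status change."""
        link_ok = {"primary": primary_link_ok, "backup": backup_link_ok}
        standby = "backup" if self.active == "primary" else "primary"
        # Switch only when the active port is disrupted and the standby is usable.
        # Assumption: no automatic revert once the original port comes back.
        if not link_ok[self.active] and link_ok[standby]:
            self.active = standby
        return self.active
```

    Only one port is ever active in this model, which is why the standby
    repeater port should see no transmissions from the server.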
    
    Is it possible for me to attach this transceiver, with two 10BaseFL
    inserts, to a fiber from each DEChub's DECrepeater?  I know that the
    transceiver will probably perform the switching, and the two repeater
    ports will probably keep running, since the standby port shouldn't
    transmit.  I expect that the STP from the bridge on the backup port will
    not enable that port until 45 seconds after the failure of the first
    bridge or fiber, but I expect this solution to be better than a single
    link.
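
    For a rough sense of where a figure like 45 seconds comes from, the
    sketch below estimates spanning tree recovery time from the timers.
    It assumes the IEEE 802.1D default values (max age 20 s, forward
    delay 15 s); nothing in this note confirms which timers or which
    spanning tree mode the DECbridges here actually use, and the function
    name is made up for illustration.

```python
# IEEE 802.1D default timer values, in seconds.
# Assumption: the bridges in this note may be configured differently.
MAX_AGE = 20
FORWARD_DELAY = 15


def stp_recovery_seconds(failure_detected_locally: bool,
                         max_age: int = MAX_AGE,
                         forward_delay: int = FORWARD_DELAY) -> int:
    """Estimate how long a blocked port takes to start forwarding."""
    # The port spends one forward-delay interval listening and one learning.
    transition = 2 * forward_delay
    if failure_detected_locally:
        return transition
    # Otherwise the stale root information must age out first.
    return max_age + transition
```

    With the default timers this gives a window of 30 to 50 seconds, so
    an outage of about 45 seconds is in the expected range.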
    
    We saw a chart at Network Academy that described the protocols used by
    Digital's redundant Ethernet fiber pairs.  Do we have a planned
    fault-tolerant FOT that will operate with the same protocol used by the
    DECrepeater900FP?   I would prefer to sell DEC product, but any port
    in a storm...
    
    JR
    
T.R  Title  User  Personal Name  Date  Lines
1786.1. by KAOFS::S_HYNDMAN (Acronym Decoder Ring Architect) Thu Dec 15 1994 20:46 (11 lines)
    
    
    	I'm not really clear on what you're trying to do: have backup
    connections to bypass the bridge, or provide redundant connections for 
    the server?  If it is the latter, why not go DAS FDDI and multi-home the 
    server on the ring?
    
    	Cabletron also makes redundant fiber transceivers.
    
    
    Scott  

1786.2. "It's a VAX 4500 Pathworks Server" by MSDOA::REED (John Reed @CBO, DTN:367-6463, KB4FFE, SouthEast) Fri Dec 16 1994 12:17 (23 lines)
    The 4000 series VAXes have Q-bus FDDI controllers as the only available
    option, and the customer feels that the throughput of the ISA Ethernet
    will be faster than a Q-bus attached FDDI controller.
    
    I need to have a way to reach this file server if the DEChub900 near it
    decides to crash.   The customer has experienced several hub crashes
    (it's on a UPS, has three DECCon, and one DECbridge, with three power
    supply modules) and each time the hub reboots, he loses the
    connections to his file server.   He wants a way to keep the PCs
    running through the hub crashes.  
    
    The PCs are connected to Ethernets, on various other DEChub-mounted
    DECbridge900s.   I believe that the fault-tolerant Ethernet FOT
    attached to his ISA-0 Ethernet, with the primary fiber port fed to a
    repeater module in one hub and the backup fiber port fed to a module
    in a different hub, will work, as long as the repeater modules have
    ANOTHER working node on their PORT GROUP.  This will keep the spanning
    tree from shutting down either port, and allow the FOT to choose the 
    proper path to enable.   The customer would like to eventually put an
    FDDI PC file server on the ring.  But I think that this will be a good
    starting point.
    
    JR

1786.3. by NETCAD::SLAWRENCE Fri Dec 16 1994 15:23 (8 lines)
    
    Ahh hah!  The hub is crashing?  It shouldn't be, so let's look at
    that...
    
    What are the firmware revs for the Hub and all modules?
    
    Are there error log entries? 
    
1786.4. "I HOPE the crashing has stopped..." by MSDOA::REED (John Reed @CBO, (803) 781-9571 NIS Networker) Mon Dec 19 1994 12:13 (31 lines)
    The crashing appears to have stopped after we upgraded to the most
    recent revisions.  (It hasn't occurred for a week now, and it used to be
    several times a day).   They used to have DECcon FM 2.0.0, DECBridge900
    version 1.2.1, and HUBmanager v3.0.0.   It ran wonderfully, until the
    imaging application on the Alphas came online.  They have an Alpha
    farm with Kubota(tm) graphics accelerators and funny little
    transmitters on top of their screens.  They wear 3-D glasses, and do
    molecular modelling.  The images spin around, suspended in the air in
    front of your monitor.  If you wear the glasses, and turn out the
    lights, it would make a great lava lamp at a 60's party...   They are a
    medical research and design firm, with a lobby full of patent grants
    and awards.
    
    They have since upgraded to v2.8.0 on the Conc, 1.4.0 on the Bridge, and
    3.1.0 on the HUB managers.  They feel the problem was traffic related,
    and they think the DECbridge900 "couldn't keep up with the traffic." 
    The customer's MIS department suffered a lot of grief during the
    period when the hubs were rebooting.  The MIS staff doesn't want this
    to occur again, and they see how the link to their file server is a
    single point of failure.   (For that matter, having a single file
    server is also troublesome).   So, they are planning additional fault
    tolerance.   They like the FDDI, and the speed, and the way that it
    wraps around outages.  We are trying to add to their comfort level
    about the bridges, and give them some redundancy.  Ethernet with STP
    will not bypass a fault as quickly as FDDI does (STP typically takes
    45 seconds), so their LAT and Pathworks disks might time out during a
    hub crash.  But I hope to create a config where the users can get back
    on quickly.
    
    JR
     
    
1786.5. "The crashing should never have started..." by NETCAD::SLAWRENCE Mon Dec 19 1994 20:30 (41 lines)
          
    I don't know how much comfort it will add, but here's more data, for
    what it's worth:
    
    The crash you saw is very well understood here; in fact, it took out
    our file servers here in DEChub Engineering before we ever released the
    bridge to field test.  
    
    The original problem was in the bridge, and was - in a way - traffic 
    related (your customer was right).  It started with a bug in the IP 
    fragmentation code in the bridge that occurred only if two IP packets 
    requiring fragmentation arrived from the FDDI _very_ close together, 
    such that they were both queued together in the bridge (this is a 
    very narrow window).  It took a little while, but a few of these
    crashed the bridge.  Combine that with some bugs in the hub manager,
    which did not cope well with modules that crashed too frequently, and
    you end up with an unstable hub.  (Without this bug the bridges keep
    up just fine, by the way)
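
    For readers unfamiliar with what the bridge has to do here, the
    sketch below shows ordinary IPv4 fragmentation of a datagram that
    arrives from FDDI and is too large for Ethernet.  It does not
    reproduce the bug or the bridge's code; the function name is
    hypothetical, and the sizes assume a 4352-byte FDDI IP MTU, a
    1500-byte Ethernet MTU, and a 20-byte IP header with no options.

```python
from typing import List, Tuple

IPV4_HEADER_LEN = 20  # bytes, assuming no IP options


def fragment(total_len: int, mtu: int) -> List[Tuple[int, int, bool]]:
    """Split one IPv4 datagram into fragments.

    Each entry is (offset in 8-byte units, payload bytes, more-fragments flag).
    """
    payload = total_len - IPV4_HEADER_LEN
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    max_chunk = (mtu - IPV4_HEADER_LEN) // 8 * 8
    if payload <= 0 or max_chunk <= 0:
        raise ValueError("datagram or MTU too small")
    fragments = []
    sent = 0
    while sent < payload:
        chunk = min(max_chunk, payload - sent)
        more = sent + chunk < payload
        fragments.append((sent // 8, chunk, more))
        sent += chunk
    return fragments
```

    A full-size FDDI datagram becomes three Ethernet frames under these
    assumptions, so two such datagrams arriving back to back means six
    frames' worth of fragmentation work queued at once.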
    
    The good news is that all of the above are fixed in the latest
    releases.
    
    The bad news (for your customer) is that the bugs have been fixed for
    quite a while now, and they didn't get the fix.
    
    We have spent a great deal of energy here on trying to create a set of
    mechanisms that ensures that the latest releases of all our firmware are
    available to the field and (where possible) directly to the customer -
    but it does no good if you don't check them.  What your customer had
    was the very first field release of firmware - almost certain to have
    at least some minor problems (in this case, unfortunately, it was
    fairly serious for them because their Alphas were so fast).
    
    We _cannot_ guarantee that you will get the latest release of firmware
    when hardware is delivered to you.  You should _never_ assume that it
    is up to date.
    
    We have Internet and Easynet archives for the latest firmware, and
    mailing lists that you and your customers can subscribe to for release
    notices.  Pointers to both are in the owners manuals and/or the release
    notes.
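
    A check like the one urged above could be as simple as comparing
    installed revisions against the current release list.  The sketch
    below is only an illustration: the function names and dictionary keys
    are made up, and the revision numbers are the ones quoted in 1786.4,
    not an authoritative release list.

```python
from typing import Dict, List, Tuple


def parse_rev(rev: str) -> Tuple[int, ...]:
    """Turn a revision string such as 'v3.0.0' into (3, 0, 0) for comparison."""
    return tuple(int(part) for part in rev.lstrip("vV").split("."))


def outdated(installed: Dict[str, str], latest: Dict[str, str]) -> List[str]:
    """Return the names of modules whose installed revision is older than latest."""
    return sorted(
        name
        for name, rev in installed.items()
        if name in latest and parse_rev(rev) < parse_rev(latest[name])
    )


# Revisions as reported in 1786.4 (before and after the upgrade).
installed = {"DECconcentrator": "2.0.0", "DECbridge900": "1.2.1", "HUBmanager": "v3.0.0"}
upgraded = {"DECconcentrator": "v2.8.0", "DECbridge900": "1.4.0", "HUBmanager": "3.1.0"}
stale_modules = outdated(installed, upgraded)
```

    Comparing tuples of integers rather than raw strings keeps a
    revision like 1.10.0 from sorting below 1.4.0.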