Conference netcad::hub_mgnt

Title:DEChub/HUBwatch/PROBEwatch CONFERENCE
Notice:Firmware -2, Doc -3, Power -4, HW kits -5, firm load -6&7
Moderator:NETCAD::COLELLADT
Created:Wed Nov 13 1991
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:4455
Total number of notes:16761

805.0. "I've done a terrible thing ..." by JAYJAY::KORNS () Tue Mar 08 1994 19:07

I have done a terrible thing. I sold some Digital network
equipment into an "off-base" account. This account has 
Apple Macintoshes, a Sun SparcStation and a Cisco router.

To hook all this together, we configured 4 DEChub90s 
(2 double hubs), a pair of DECbridge90FLs to interconnect
the two double hubs and a bunch of DECrepeater90Ts. This
should all be "unpack-plugin-and-go" type equipment, right?
I keep being told, and mistakenly believing, that I don't
have to sell a VAX or Ultrix system along with each DEC
module. There should be no need to filter traffic or otherwise
manage the bridges/repeaters, so we didn't need an agent or
SNMP manager.

I believe we now have a problem with the DECbridge90FLs
and we need to get to the console of the DB90s to examine
things and possibly adjust the BRIDGE AGE and/or FLOOD parameters.
The customer reports intermittent "inability" to communicate
with a node on one side. My current theory is that the bridge
on that side has aged out its entry for that MAC address and
isn't forwarding packets for it.

I am looking for a MOP Console Carrier (MOP-CC) implementation for
Macintosh or, if I can't find that, MOP-CC for DOS or Windows.
And if I can't find either of those, for a Sun SparcStation. 
MOP-CC is the protocol used by our DECservers and some
bridges for simple remote management. It provides a simple
console I/O character stream to be encapsulated over Ethernet
so that a device without a serial console can be accessed.
On OpenVMS this is implemented by the NCP CONNECT command. On
Ultrix (and OSF/1?) it is implemented by the "ccr" (sp?) 
command. 

Having_a_bad_day, Dave,

PS: I promise never to do this again. I will only sell DEChubs
into accounts with VAX 9000s :-) 
T.R  Title  User  Personal Name  Date  Lines
805.1. "Alternative???" by CGOS01::DMARLOWE (Have you been HUBbed lately?) Tue Mar 08 1994 22:29, 11 lines
    What are the protocols on the network?  Remember that DECnet and LAT
    nodes multicast every now and then, so they are maintained in the
    address table.  TCP/IP nodes, however, do not, so their entries can go
    away from time to time.  Other protocols I'm not sure about.
    
    You could also replace the DB90FLs with DEFARs and still probably
    not break the Ethernet 5-4-3 rule.  With 90Ts in the hubs and 2
    double hubs, you will have 5 segments and 4 repeaters maximum and
    all should be fine.  Just a thought.
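
    As a rough illustration of the 5-4-3 check mentioned above, here is a
    minimal Python sketch.  The function name and the populated-segment
    count are assumptions made for illustration; the note itself only
    gives 5 segments and 4 repeaters for the worst-case path.

```python
def within_5_4_3(segments: int, repeaters: int, populated_segments: int) -> bool:
    """Check one end-to-end path against the Ethernet 5-4-3 rule.

    At most 5 segments, joined by at most 4 repeaters, with at most 3 of
    those segments populated (carrying stations).
    """
    return segments <= 5 and repeaters <= 4 and populated_segments <= 3


# Worst-case path from the note: 5 segments and 4 repeaters.
# The populated-segment count (2) is a hypothetical value.
print(within_5_4_3(segments=5, repeaters=4, populated_segments=2))
```

    The check applies per path between two stations, so a real audit
    would run it over the longest path in the configuration.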

    dave
805.2. "" by QUIVER::SLAWRENCE () Wed Mar 09 1994 14:40, 18 lines
    
    One easy way to get a MOP requestor is to bring in a DECserver90L+.
    
    I'm not sure I buy your theory, however; it requires that the node be
    completely silent until the address ages out.  That would be a pretty
    long time for most systems to be quiet.  
    
    Is there some pattern to the problem?  
    
    	Is it the same nodes each time? 
    
    	Does the problem occur with all protocols?
    
    There has been quite a bit of discussion of bad Ethernet clocks in
    various adapters, a problem which usually manifests when the packet size
    is large.  The DECrepeater90T is pretty strict about this and will
    'corrupt' long packets coming from nodes with bad clocks.  For more on
    this you could look in the UPSAR::ETHERNET conference.
805.3. "more on WGB90 aging" by QUIVER::SLAWRENCE () Wed Mar 09 1994 14:45, 5 lines
    The default bridge age time is 900 seconds (displayed as '450*2');
    that's 15 minutes, which is a pretty long time to be completely quiet,
    even for IP.  It is also long enough that most IP end nodes will have
    aged out the ARP cache entry for that address, causing a broadcast ARP
    request to the node and a unicast response from it.
805.4. "keep selling to those non-DEC accounts..." by NAC::FORREST () Wed Mar 09 1994 18:02, 15 lines
	Dave, regarding your half-joking dig about selling into a non-DEC 
	account...

	I know we aren't perfect yet in playing in non-DEC environments, 
	but this problem could have been easily avoided by selling them 
	a DECagent 90 for SNMP management. Even if they didn't want to 
	buy HUBwatch for Windows at $495, or even if they didn't have an 
	SNMP manager of any kind, they could MOP to the bridge from the 
	agent console, just like Scott was suggesting you could do with 
	a DECserver 90L+.

	Maybe someday we'll get around to doing a bridge for the DEChub 
	90 that has integral SNMP...   for now we make the DECbridge 90 
	SNMP manageable via the DECagent 90.
805.5. "spanning tree wars" by NYOS01::PLUNKETT () Fri Mar 11 1994 18:51, 8 lines
    One thing that you might check is that the Cisco is not doing
    bridging.  If it is, the spanning tree algorithm might be flipping
    back and forth between IEEE and DEC spanning tree.  We found a
    problem with this when we put together a network of DEWBRs and
    DECbridge 90s.  The fighting protocols problem manifested itself in
    symptoms that sound like those in the base note.
    
    -Craig
805.6. "" by QUIVER::SLAWRENCE () Fri Mar 11 1994 19:05, 5 lines
    If the Cisco (or a DECbrouter90*, which runs Cisco code) is bridging,
    set it to IEEE, _NOT_ DEC.  The Cisco implementation does not include
    the algorithm used in DEC bridges for the coexistence of the two
    protocols on the same LAN, and the effect is as described in .-1: the
    DEC bridges thrash back and forth between IEEE and DEC.
805.7. "Question still stands" by BUDDIE::KORNS () Tue Mar 15 1994 23:30, 26 lines
    RE: .4
    
    I'm aware of how a DECagent90 could help us. My "half-joking dig" 
    still applies in my mind. This network is so simple (or should be)
    that management isn't (shouldn't be) required. All this stuff should
    have plugged in and worked. This is a Macintosh shop and we don't
    have HUBwatch for that platform anyway.
    
    The customer contends the "age" default was the problem. If a TCP/IP
    host is in fact quiet when not used (which several people have told me is
    correct), we have a problem whenever we sell into this environment with
    DB90s. 
    
    My original question still stands:
    
    	1) Do we have a MOP-CC for Macintosh, DOS or Sun laying around?
    
    ...and I'll add this one
    
    	2) Can we get a future DB90 product that fixes this problem, like
    	a full bridge instead of a leaf bridge, or a large default age
    	timer? (I'll leave the fix to engineering, but the problem is
    	real).
    
    Dave,
    
805.8. "" by NAC::FORREST () Wed Mar 16 1994 12:02, 19 lines
	re .7
	> 1) Do we have a MOP-CC for Macintosh, DOS or Sun laying around?
   
	Not to my knowledge. However, I'm sure there are SNMP managers 
	available for those platforms that could fix this problem, but then 
	you still have to sell them a DECagent 90.
 
    
	>...and I'll add this one
    
    	>2) Can we get a future DB90 product that fixes this problem, like
    	>a full bridge instead of a leaf bridge, or a large default age
    	>timer? (I'll leave the fix to engineering, but the problem is
    	>real).
    
	A full bridge wouldn't fix this problem. An infinite default age 
	would fix it, but the larger you make the age, the slower the bridge 
	adapts to changes, which isn't good either.
805.9. "ARP vs Bridge Aging" by QUIVER::SLAWRENCE () Wed Mar 16 1994 13:39, 16 lines
    
    > The customer contends the "age" default was the problem. If a TCP/IP
    > host is in fact quiet when not used (which several people have told
    > me is correct), we have problem whenever we sell into this environment
    > with DB90s.
    
    It is possible that is the problem.  If so, the problem could exist
    with _any_ bridge, since the 'problem' is the relationship between the
    address aging in the bridge and the address aging in the ARP cache of
    the system trying to send to the quiet node.  If the ARP cache entry
    lasts longer than the bridge forwarding entry, then the end system will
    not send the ARP that would have refreshed the bridge's entry.  15
    minutes is a pretty long time for an ARP entry to be held (but Ultrix
    and UCX hold them even longer - which has been a problem with the hub
    when you change IP service modules).
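
    The timing relationship described above can be sketched in Python.
    This is a simplified model, not DB90 or ARP code: the function name
    is hypothetical, it assumes the sender's ARP entry was last refreshed
    when the quiet node last transmitted, and the 4-hour ARP timeout in
    the example is an illustrative value, not a measured default.

```python
def blackout_possible(bridge_age_s: float, arp_timeout_s: float, silent_s: float) -> bool:
    """Return True if frames to a quiet node can be dropped at the bridge.

    bridge_age_s  -- bridge forwarding-entry age limit, in seconds
    arp_timeout_s -- ARP cache lifetime on the sending system, in seconds
    silent_s      -- how long the destination node has sent nothing
    """
    # The bridge has forgotten the node once it is silent past the age limit.
    entry_aged_out = silent_s > bridge_age_s
    # The sender still unicasts (no ARP broadcast) while its cache entry lives.
    sender_skips_arp = arp_timeout_s > silent_s
    return entry_aged_out and sender_skips_arp


# Default bridge age from this topic (900 s), node silent for one hour.
print(blackout_possible(900, 4 * 3600, 3600))  # long ARP lifetime
print(blackout_possible(900, 600, 3600))       # ARP lifetime shorter than bridge age
```

    In the second case the sender has to ARP again before sending, and
    the node's reply lets the bridge relearn the address.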
    
805.10. ""leaf" vs. "full" bridge ..." by JAYJAY::KORNS () Mon Mar 21 1994 18:23, 97 lines
When the DB90s came out, there was discussion about their special
operation as leaf bridges. If I remember all this correctly, DB90[FL]s
behave differently when presented with a packet whose destination is not
in their "address tables". Someone correct me if I'm wrong, but I believe this 
"leaf" behavior is the "basic" reason for the problem, not the "age" issue.

A DB90's address table IS NOT a list of every address known in
the network, as a normal bridge would maintain. A normal bridge has a
table of every address it has detected (on any port) and a value for 
which port it saw the address on. The address table in a DB90[FL] is 
the list of addresses the DB90[FL] has seen on its workgroup side ONLY.
Addresses seen/heard on the backbone side are not learned and/or
added to its address table, since the table is only for those in the
workgroup. This table holds 200 entries in the DB90[FL]. 

If a DB90 receives a packet on its backbone port, it DOES NOT forward
or flood the packet onto the workgroup port unless the destination
address is in its address table (which is a WG table). If a station
address in the WG addr table has aged out, packets for that address
are not forwarded/flooded.

If a normal bridge receives a packet on its backbone port (or any port)
and does not have an entry for the destination station in its address
table, the bridge floods the packet on all other ports. (RE: Bridge and
Extended LAN Manual Reference EK-DEBAM-HR-003, section 2.1.3.1).

That says the following configuration is likely to have problems:

     +------+                                       +------+
     |TCP/IP|     +------+            +------+      |TCP/IP|
     | Host |     | DB90 |------------| DB90 |      | Host |
     +------+     +------+            +------+      +------+
         |   LAN-A    |                   |    LAN-B    |
   =======================            ======================

(NOTE: there are also dozens of Macintoshes on the LANs as well but they
do not participate in the problem since they beep msgs on a regular basis.)

... and this is precisely the configuration in the case I have been
talking about. The TCP/IP system on LAN-A is a Cisco router. The
TCP/IP system on LAN-B is a Sun SparcStation. Once the DB90[FL] on
LAN-B ages out the Sun system's address, packets from the Cisco do 
not get through to the Sun system, since the LAN-B DB90 does not
forward packets onto the workgroup unless the destination is in
the forwarding address database (ie; a WG address DB). I contend that 
if these were full bridges, this would not occur, since the rule is
for a bridge to flood a packet if the destination address IS NOT in 
the address table.
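
The two forwarding rules contrasted above can be sketched in Python. This
is an illustrative model only, not DB90 firmware: the class names, port
names and station labels are hypothetical, broadcast/multicast handling is
ignored, and the leaf bridge's workgroup-to-backbone rule is an assumption
(the text above only describes the backbone-to-workgroup direction).

```python
PORTS = ("backbone", "workgroup")


class FullBridge:
    """Learning bridge: learns sources on every port, floods unknown destinations."""

    def __init__(self):
        self.table = {}  # MAC address -> port it was last seen on

    def receive(self, src, dst, in_port):
        """Return the list of ports the frame is sent out on."""
        self.table[src] = in_port
        out_port = self.table.get(dst)
        if out_port is None:
            # Unknown destination: flood on all other ports.
            return [p for p in PORTS if p != in_port]
        # Known destination: forward, or filter if it is on the arrival port.
        return [] if out_port == in_port else [out_port]

    def age_out(self, mac):
        self.table.pop(mac, None)


class LeafBridge:
    """Leaf bridge as described in this note: learns workgroup addresses only."""

    def __init__(self):
        self.workgroup = set()  # addresses seen on the workgroup side

    def receive(self, src, dst, in_port):
        if in_port == "workgroup":
            self.workgroup.add(src)
            # Assumption: local traffic is filtered, the rest goes to the backbone.
            return [] if dst in self.workgroup else ["backbone"]
        # From the backbone: forward only to known workgroup addresses.
        return ["workgroup"] if dst in self.workgroup else []

    def age_out(self, mac):
        self.workgroup.discard(mac)


SUN, CISCO = "sun-mac", "cisco-mac"  # hypothetical station labels

for bridge in (FullBridge(), LeafBridge()):
    bridge.receive(SUN, CISCO, "workgroup")  # the Sun talks once and is learned
    bridge.age_out(SUN)                      # the Sun stays quiet, entry ages out
    print(type(bridge).__name__, bridge.receive(CISCO, SUN, "backbone"))
```

In this model the full bridge floods the Cisco's frame onto the workgroup
after the entry ages out, while the leaf bridge returns an empty port list,
which is the failure described in this note.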

Admittedly there are ways around this. One is manually flushing the ARP
cache in the Cisco (causing the Cisco to ARP broadcast next time) and
setting the BRIDGE AGE to 0 (which assumes the Sun is seen by the LAN-B
DB90 at least once, which might not be the case after a bridge reboot).
Another possibility is to configure each TCP/IP node to have an ARP cache
age shorter than the bridges'. This would cause them to ARP broadcast
before the address aged, keeping the bridges from aging the entries.
I do not even know if this is possible and/or how to do it if it is
possible. I do not consider it a solution anyway. The goal is to get
our products to plug-and-play in normal, everyday environments with
minimal setup/management (hasn't that always been our goal?).

The above configuration (ie; TCP/IP hosts) seems to be a very normal 
configuration these days. 

-----
We are shipping products whose default behavior is flaky in this
environment.
-----

The predominant answers I am hearing suggest selling SNMP management to 
fix the problem. I think this is missing the core issue. Can we produce 
a DB90 that eliminates this fundamental limitation?

Dave, 

805.11. "" by QUIVER::SLAWRENCE () Mon Mar 21 1994 19:10, 9 lines
    While it is true that TCP & IP don't have the 'keep-alive' messages
    many Digital protocols use, it is pretty rare (in my experience) for a
    system to be completely quiet for as much as 15 minutes.  Sun machines
    at least used to come configured to do quite a bit of broadcasting
    (timed, rwho, ruptime).  
    
    What makes the customer so sure that it was the bridge aging that was
    the problem?  I'm not saying they're wrong, I'd just hate for them to
    think they've solved a problem when it might just be masked.
805.12. "RE: 805.11" by BUDDIE::davesmac.auo.dec.com::Korns (like on your feet, but with a "K") Wed May 04 1994 04:16, 51 lines
The SparcStation on LAN-B sits idle all night after school gets out (this
is an elementary school :-). It has been configured to do very little
other than be a local DNS name server. You wouldn't want elementary
school teachers to have to use UNIX would you? :-)

Anyway, at night, teachers can dial into a local SLIP network run by the
University which is all behind the Cisco on LAN-A. When they attempt to 
telnet or ping the SparcStation, they can't. Our strong theory is it's
been aged out by the LAN-B DECbridge90. As a first attempt to bang away
at the problem, they got with the University (which thinks Digital network
gear sucks by the way) and they logged into the Cisco and did a "flush ARP
cache" (syntax??). Presto, it starts working. For a while. Until the Sparc
ages out again. Final solution was setting the age to infinity. BTW: the University
used the Cisco to temporarily enable bridging to the main campus, found a
dusty old VAX with NCP on it, did a MOP CONNECT command to both DECbridges
and set the AGE parameter to infinity; turned off bridging on the Cisco and 
probably cursed our name all the way home. 

I haven't gone back and analyzed the Cisco defaults for ARP cache lifetime 
and all that stuff, but I am convinced by the various symptoms that we 
understand the problem. Unfortunately, one consultant at the elementary school
also understands the problem and so does the University. As anyone with experience
selling in a small town knows, news travels fast. The Soap-Opera Digest headlines
would read "Digital bridges do not work with TCP/IP, the world's most popular
networking protocol !!!"

We know this isn't true, but it's hard to stop rumors, especially when it
takes a blackboard and an IP degree to explain the "symptom". There is no way
we can safely sell or configure around the problem. 

If I were to write the product requirements today, for the DECbridge90
follow-on, they would read like this:

    - 10/10 full bridge, 4096 address support (2048?, 1024?)
    - add an IP routing option for good measure, I've run into other
      situations calling for a low-cost, 10/10 IP router where you're
      connecting buildings which must be different subnets
    - add SNMP management
    - put a serial console port on it (I forgot to add that the consultant
        spent some amount of time trying to attach a serial console
        to the AUI connector on the front of the DECbridge90FL since
        nothing else was plugged into it (using fiber) and since the
        cisco has a serial port, surely this other thing'ie on the
        DECbridge90FL must be a serial port too (a little humor))
    - allow telnet (and MOP-CC) connections to the console logic
    - you have permission to raise the price (versus DECbridge90) by 10-20%
    - ship next week :-)

Contact me if other questions.

Dave
805.13. "It doesn't forward unknown destinations..." by DPDMAI::DAVIES (Mark, SCA Area Network Consultant) Thu May 05 1994 12:59, 23 lines
    I am a little late into this note...
    
    Dave mentions that the DB90's "leaf" status means that it will not
    forward a packet to the workgroup side if the address is not in the
    workgroup cache.
    
    The question is:  Is this true?
    
    If it is, then the DB90 is not a bridge, it is something which Digital
    sells and wants people to believe that it is a bridge.  A bridge that
    does not automatically forward unknown destination datagrams to all
    ports is useless and dangerous.
    
    This type of a bridge implementation smacks of the same engineering
    weirdness as the non-DEC repeaters out in the world that can not pass
    10Mbps (Many notes have Digital folks saying this is ludicrous).
    
    I have sold these bridges and I have mistakenly thought that they
    functioned like a standard bridge, ie, forwarding unknown destination
    datagrams.  If they don't, I will not sell another.
    
    Mark
    
805.14. "Yes, it is a bridge...." by LEVERS::BATTERSBY () Thu May 05 1994 13:21, 16 lines
    Hmmmm I may be misinterpreting your question, but the DB90 *is*
    a bridge.
    The DB90 has a feature where when there are fewer than 200 stations
    on the Work Group side of the LAN (opposite the side which is connected
    to the Backbone), the DECbridge automatically prevents unnecessary
    traffic from being transmitted from one LAN to the other.
    When there are more than 200 stations in the work group, the DB90
    enters Flood Mode, which reduces the effectiveness of the traffic
    isolation but allows connectivity across the bridge. 
    Now in your example, the destination address is not known
    to exist on the workgroup side, thus the bridge has learned that
    the address is not there, so why should it forward that datagram to
    that side if it doesn't exist there? That's what a bridge is supposed 
    to do, from my experience.
    
    Bob
805.15. "RE: 805.14" by BUDDIE::davesmac.auo.dec.com::Korns (like on your feet, but with a "K") Thu May 05 1994 14:35, 26 lines
I'm getting the feeling people are trying to deny a problem exists. I'm
also getting very tired of monitoring this note. I guess I should
extract all the replies in this note and formally elevate the issue
since this is getting nowhere (I admit this forum isn't a formal 
product requirements system but I haven't even seen any acknowledgement
of the problem yet).

Re: .13 & .14

What Mark is saying is "this thing is not a bridge according to IEEE 802
and/or DEC Spanning Tree". It has a "special behaviour" in its
decision to forward to the workgroup side which is SCREWING UP in very 
typical network configurations.  

Yeah, it's kinda-like a bridge because it's clearly not a repeater and it's 
clearly not a router but that's like saying a duck with one leg is a duck.
It's not a chicken and it's not a frog, so it has to be a duck. (sorry if this
analogy sucks). This duck can't walk :-)

Anybody else out there have any comments before I EXTRACT 805.* and send to 
engineering management?

Dave,



805.16. "Let's verify functionality" by DPDMAI::DAVIES (Mark, SCA Area Network Consultant) Wed May 11 1994 15:12, 20 lines
    re: .14
    
    The workgroup side has fewer than 200 entries.  The system in question
    has been aged out of the DB90 cache.  Now to make this system on the
    workgroup side known "again" it will have to be manually entered or
    this system will have to transmit a packet.
    
    This system is a BIND server.  It is waiting for someone to request
    service.  No one can find it because the DB90 has aged it out of its
    cache.  It is not going to transmit a packet until someone requests
    information from it, but no one can find it, etc, etc...
    
    This is a BIG problem.
    
    Does engineering agree that this is how the DB90 functions?
    
    Regards,
    
    Mark
    
805.17. "The rationale behind decision." by CGOS01::DMARLOWE (Have you been HUBbed lately?) Thu May 12 1994 04:26, 20 lines
    I'm not from engineering but that is exactly how the DB90 works.
    For DECnet and LAT nodes this is not a problem as they transmit
    every 15 and 60 seconds respectively.  If the node ages out of the
    table (default 900 seconds) then that node must send something so
    it gets put back in the table.  I don't know how large you can set
    the age out timer.  Maybe someone can tell you.
    
    Unless there were a way to expand the address table such that it
    could record addresses on both sides, you won't get much better
    behavior from the DB90.  If engineering did it differently but with
    the 200 node address limit then virtually every packet on the backbone
    would be sent into the hub under the assumption every address has
    aged out so should be forwarded into the workgroup, just in case.
    It then becomes a store and always forward bridge and you would
    get no traffic filtering.

    Would there be any way to set up a procedure that would make the
    server send out a packet every 10 minutes or so?
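
    One way such a procedure could look is a small script on the server
    that transmits a datagram on a timer, since any frame the server
    sends lets the bridge relearn its address.  The Python sketch below
    is only an illustration under stated assumptions: the function names
    are hypothetical, and the choice of a UDP broadcast to the discard
    port (9) and the payload are arbitrary.

```python
import socket
import time

BRIDGE_AGE_S = 900  # default bridge age quoted in this topic
INTERVAL_S = 600    # "every 10 minutes or so", below the bridge age


def make_sender():
    """Create a UDP socket that is allowed to send broadcast datagrams."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock


def send_keepalive(sock, dest=("255.255.255.255", 9), payload=b"keepalive"):
    """Send one small datagram; returns the number of bytes sent."""
    return sock.sendto(payload, dest)


def run_forever(interval_s=INTERVAL_S):
    """Transmit one datagram per interval until the process is stopped."""
    sock = make_sender()
    while True:
        send_keepalive(sock)
        time.sleep(interval_s)
```

    Calling run_forever() from a startup script would keep the server
    transmitting more often than the 900-second default age.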
    
    dave
805.18. "It's broken & We hear you, but..." by QUIVER::SLAWRENCE () Thu May 12 1994 12:18, 15 lines
    
    It is correct that the DECbridge 90 does not forward to the workgroup
    packets with unknown DAs; this is because it has no address table for
    the backbone port and assumes that all addresses in the workgroup will
    be known (that is, that they will not remain silent long enough to age
    out).
    
    Engineering knows that this is broken, but it cannot be fixed with the
    current hardware.
    
    If you believe that we need to build a replacement product that does
    all this correctly, you should express that belief to product
    management, supported by _NUMBERS_.  Engineering has been trying to
    find money to do this and it keeps not making the cut.
    
805.19. "Unlimited/infinity aging timer value" by DPDMAI::DAVIES (Mark, SCA Area Network Consultant) Thu May 12 1994 14:21, 10 lines
    A fix that would work immediately and provide a good solution for
    existing hardware is to allow an "unlimited/infinity value to be
    applied to manually entered cache values", thus preventing certain
    cache addresses from ever being aged out.
    
    This would work well and fix problems when the product is used in a
    non-DEC environment, ie, not DECnet or LAT protocols.
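
    The proposal above amounts to an address table in which manually
    entered addresses are exempt from aging.  A minimal Python sketch of
    that idea follows; it models the suggestion only and is not a
    description of what the DB90 supports.  The class and method names
    are hypothetical.

```python
class AgingTable:
    """Address table where learned entries age out but static entries do not."""

    def __init__(self, age_s=900.0):
        self.age_s = age_s   # age limit for learned entries, in seconds
        self._learned = {}   # MAC address -> time it was last seen
        self._static = set() # manually entered addresses, never aged

    def learn(self, mac, now):
        """Record that a frame from this address was seen at time `now`."""
        self._learned[mac] = now

    def add_static(self, mac):
        """Manually enter an address with an unlimited lifetime."""
        self._static.add(mac)

    def knows(self, mac, now):
        """Return True if the address is static or was seen recently enough."""
        if mac in self._static:
            return True
        last_seen = self._learned.get(mac)
        return last_seen is not None and now - last_seen <= self.age_s


table = AgingTable(age_s=900)
table.learn("sun-mac", now=0)            # hypothetical station label
print(table.knows("sun-mac", now=1000))  # learned entry, past the age limit
table.add_static("sun-mac")
print(table.knows("sun-mac", now=1000))  # static entry, exempt from aging
```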
    
    Mark