
Conference turris::digital_unix

Title:	DIGITAL UNIX(FORMERLY KNOWN AS DEC OSF/1)
Notice:	Welcome to the Digital UNIX Conference
Moderator:	SMURF::DENHAM

Created:	Thu Mar 16 1995
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	10068
Total number of notes:	35879

8760.0. "Token ring: ring recovery messages" by INDYX::ram (Ram Rao, PBPGINFWMY) Fri Feb 07 1997 15:59

I have a customer running V4.0A on a 4100 with a PCI Token Ring Adapter
PBXNP-AA.  Their kern.log file is being flooded with messages of the
following type:
	test4100: vmunix: tra0: ring status: ring recovery
(Their machine is named test4100).  Once in a while, connections to
the machine are dropped.  Other notes in this Notesfile
seem to suggest that there are some token-ring patches available,
but I have not found them in the public areas on oskits or guru.
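
To put a number on "flooded", the messages can be counted per interface. The following is a minimal sketch, assuming only the message format quoted above; the function name is hypothetical, and any timestamp or host prefix that the real kern.log adds is simply ignored.

```python
import re
from collections import Counter

# Matches the message format quoted in the note, e.g.
#   test4100: vmunix: tra0: ring status: ring recovery
# Anything before the interface name (timestamp, host) is ignored.
RING_STATUS = re.compile(r"(\w+): ring status: (.+?)\s*$")


def count_ring_status(lines):
    """Return a Counter keyed by (interface, status text)."""
    counts = Counter()
    for line in lines:
        match = RING_STATUS.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "test4100: vmunix: tra0: ring status: ring recovery",
        "test4100: vmunix: tra0: ring status: ring recovery",
        "test4100: vmunix: some unrelated kernel message",
    ]
    for (ifname, status), n in count_ring_status(sample).items():
        print(f"{ifname}: {status}: {n}")
```

Passing an open file object instead of the sample list works the same way, since the function only iterates over lines.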

Customer is willing to upgrade to V4.0B if this will solve the problem
or a V4.0B patch exists.  (The V4.0B patches seem to be more up to date
than those for V4.0A.)

Thanks,

Ram

T.R	Title	User	Personal Name	Date	Lines
8760.1	Get the patch	SMURF::GILLUM	Kirt Gillum	`Mon Feb 10 1997 13:20`	10
	There is a patch for running on the 4100 (a TI errata implementation). Contact Bob Spear (spear@zk3.dec.com) for a pointer to the patch. Typically "ring recovery" occurs every time a node enters or exits the ring. Also, if you look at the counters, you'll probably see a lot of beaconing (indicative of a bad connection). However, start with the latest patched driver.
8760.2	still errors after patch	DYOSW5::WILDER	Does virtual reality get swapped?	`Wed Mar 05 1997 10:09`	17
	I am also working with the customer listed in the base note. The customer is now at V4.0B. We have applied the patch provided by Bob Spear. They are STILL getting a lot of "ring recovery" errors. After all this, the system crashed after running for about 9 days. The crash-data showed numerous ring recovery errors at the time of the crash. To our knowledge, nodes are not entering or exiting the ring anywhere NEAR as often as the ring recoveries occur. Any ideas? Should we ignore this? Thanks, /jim
8760.3		SMURF::GILLUM	Kirt Gillum	`Thu Mar 06 1997 18:06`	8
	Ring recoveries are not a big deal... Perhaps you should try a different cable or MAU port for the adapter. Also, look at the counters and see if anything abnormal jumps out at you (netstat -s -Itra0). Crashing is a big deal. Why are you crashing? Is the token ring driver on the stack when the system crashes?
8760.4	Recovery errors concern them	NETRIX::"nancy@csc.cxo.dec.com"	Nancy Flavell	`Fri Mar 21 1997 15:05`	34
	Kirt,
	We appreciate your help with this problem. Since installing the latest Digital UNIX patches (including the one you mentioned from Bob Spear), the system has not crashed again. It seems that there is not even a core dump or crash-data left to analyze on the system, which has been running for eight days since the patches were applied.
	However, they still get so many of the "ring recovery" messages that they are quite concerned that the Digital equipment or software has some kind of problem. Their viewpoint is that there are only three systems on the token ring (IBM and HP are the other two), none of which is being added to or removed from the ring, and only one of which, the Alpha, reports the ring recoveries.
	We are working to get the additional information you requested, like the netstat -s output. Meanwhile, may we confirm whether the sole cause of the "ring recovery" messages is supposed to be nodes physically being removed from or added to the ring, please? I will also mention that the hardware has been replaced, with no change in the symptoms.
	Nancy Flavell
	Digital UNIX Network Support Specialist
	Customer Support Center, Colorado
	[Posted by WWW Notes gateway]
8760.5		SMURF::GILLUM	Kirt Gillum	`Fri Mar 21 1997 17:04`	17
	None of the other systems display the message, probably because they ignore the event. I think I'll change the driver to ignore it as well. It would save customers from getting concerned.
	From the TI TMS380 Second Generation Token Ring User's Guide...
	RING_RECOVERY: This bit is set to one when the adapter observes claim token MAC frames on the ring. The adapter may be transmitting the claim token frames. This bit is reset when a ring purge frame is received or transmitted.
	So the ring monitor probably has detected an error condition and sent around a claim frame. Typically this happens when a node joins or exits the ring, but it can also occur for several other reasons (like not seeing the token within the token rotation time).
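
The set/reset behaviour in the quoted User's Guide text can be illustrated with a toy model. This is only a sketch of that one paragraph, not the driver code or the TMS380 interface; the class name and the frame labels ("claim_token", "ring_purge") are hypothetical.

```python
class RingStatusModel:
    """Toy model of the RING_RECOVERY bit as described in the quoted text."""

    def __init__(self):
        self.ring_recovery = False
        self.recoveries = 0  # number of times the bit went from 0 to 1

    def observe(self, frame):
        """Update the bit for one observed MAC frame type.

        `frame` is a hypothetical label: "claim_token", "ring_purge",
        or anything else (which leaves the bit unchanged).
        """
        if frame == "claim_token":
            if not self.ring_recovery:
                self.recoveries += 1
            self.ring_recovery = True
        elif frame == "ring_purge":
            self.ring_recovery = False
        return self.ring_recovery


if __name__ == "__main__":
    model = RingStatusModel()
    for frame in ["claim_token", "claim_token", "ring_purge", "token", "claim_token"]:
        print(frame, "->", model.observe(frame))
    print("recoveries:", model.recoveries)
```

In this model, consecutive claim token frames with no ring purge in between count as a single recovery, because only the transition of the bit from 0 to 1 is counted.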