
Conference azur::mcc

Title: DECmcc user notes file. Does not replace IPMT.
Notice: Use IPMT for problems. Newsletter location in note 6187
Moderator: TAEC::BEROUD
Created: Mon Aug 21 1989
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 6497
Total number of notes: 27359

912.0. "Crash with Wave1?" by PARZVL::KENNEDY (Give me your watch & I'll tell you) Fri Apr 12 1991 19:04

We've recently upgraded our system to DECnet/VAX Extensions V1.0 FT (Wave1). 
It's running VMS V5.4-1 and DECmcc BMS V1.1 + Toolkit + FDA.  The MCC kit is
SSB (7 Mar) and was installed just before upgrading to Wave1.

We've been able to get the system to crash reliably when executing the following
commands:

MCC> show node phz5g8 routing circuit csmacd-0 adjacency rtg$008e all attr
MCC>  show node phz5g8 routing destination node * all attr
MCC> show node phz5g8 routing destination node 08-00-2b-0d-ce-03 all att

The information comes back, but the system crashes with:
----------------------------------------------------------------------
System crash information
------------------------
Time of system crash:  9-APR-1991 10:24:48.78


Version of system: VAX/VMS VERSION V5.4-1

System Version Major ID/Minor ID: 1/0

CPU bugcheck codes:
        CPU 00 -- SSRVEXCEPT, Unexpected system service exception
------------------------------------------------------------------------

Has this been heard of before?  I tried to enter a QAR, but NACQAR doesn't seem
to have a QAR_INTERNAL account.  A dump of one occurrence has been saved.

_Mek

912.1. "QAR on PHVQAR" by MARVIN::COBB (Graham R. Cobb (Wide Area Comms.), REO2-G/H9, 830-3917) Mon Apr 15 1991 17:57 (12 lines)

Enter Wave  1  (but  not MCC) QARs on node PHVQAR::.  Also there is a Wave 1
notes  conference on MARVIN::DECNET-VAX_EXTENSIONS (press KP7 to add to your
notebook).

Did you try the commands before installing Wave 1?

This is  almost  certainly  a Wave 1 problem (possibly in addition to an MCC
problem):  MCC  *shouldn't*  be  able  to  crash the system! By the way, you
should  mention what sort of system PHZ5G8 is -- I presume it is a WANrouter
(OSIrouter)? Have you tried issuing the commands from NCL?

Graham

912.2. "Wave1 group thinks it's DECmcc" by PARZVL::KENNEDY (Give me your watch & I'll tell you) Tue Apr 16 1991 12:01 (29 lines)

I did also enter this in PHVQAR, the response was that this was not likely to 
be a Wave1 problem.  We tried NCL and the same command did not crash the system.

Here's the response to the Wave1 QAR - any comments from the DECmcc group?  I 
do have a crash dump available.

From:	MARVIN::GILLOTT "Mark Gillott, >>> RKG, 831-3172 <<<  16-Apr-1991 1056"   16-APR-1991 05:56:08.91
To:	PARZVL::KENNEDY
CC:	GILLOTT
Subj:	QAR 00249 in the DNVEXT_US database has been CLOSED


If you are sure that the system crash is related to the issuing of the
"show node x ..." command to DECmcc (and it certainly looks like this is
the case), then this is not a Wave 1 problem.

At present ALL communication between a VMS host and an OSIrouter uses
standard Phase IV DECnet (we don't yet have a Phase V VMS NSP implementation
- unless you are actually using Wave 2?).  Consequently the problem MUST be
related to DECmcc - as you have indicated, issuing the same command to NCL
works perfectly.

Now it may be that DECmcc is using the logical link in a non-standard way,
which in turn is exposing a problem with the NSP/Session Control
implementation on the OSIrouter (or possibly exposing a problem with the
Phase IV DECnet-VMS implementation).  If you can prove that this is the
case, you should enter a QAR into the OSIrouter QAR database
(WANLAD::OSI500_V10_QAR).

912.3. "Looking for common factors" by TOOK::KOHLS (Ruth Kohls) Tue Apr 16 1991 13:47 (16 lines)

I'm looking for common factors between this and the crash mentioned in
passing in note 886.2, since the same message appears.  I really don't
know what is relevant, so don't try and read anything into the questions!

So, what version of DNS are you using, and where is the DNS server?
Did you keep a log of all these installations, and/or did you see ANYthing
"unusual"?  Is this the first time and first version of DECmcc you've 
installed? (Was your system "clean" of all MCC stuff before the installation?)
Did you set up your namespace as documented in the DECmcc installation manual?

I'm passing this note on to any DECmcc development people I think might be 
able to help. I do agree that MCC by itself ought not to be able to crash
anything--I think it's a bad combination of factors, and we all need to
find out what the factors are.

Ruth

912.4. by TOOK::J_HALPIN Tue Apr 16 1991 20:46 (31 lines)
    
    
    
    	Well, this is a bizarre bug. I can reproduce MaryEllen's crash
    on her system (VMS 5.4-1 & Wave 1) and on my workstation (VMS 5.4-1 &
    Wave 2 but doing Phase IV style connects). The crash does not occur on
    VMS V5.3 or VMS V5.4-1 systems running DECnet/VAX Phase IV.
    
    	This is the command that will do it every time:
    
    MCC> SHOW NODE PHZ5G8 ROUTING CIRCUIT CSMACD-0 ADJACENCY RTG$007F -
    MCC_> ALL ATTRIBUTES
    
    NOTE: The ADJACENCY instance name is irrelevant, the crash happens on
    any adjacency.
    
    The DNA5 AM correctly returns the IDENTIFIER and STATUS partitions
    (the only two defined for this entity), then there is a long pause
    followed by the system crash. On the non-WAVE # systems, the same
    long pause is there before the MCC prompt is returned.
    
    I can issue individual requests for the IDENTIFIER and STATUS
    partitions without any problems. So the crash is definitely related
    to an ALL ATTRIBUTES request.
    
    Could this have something to do with the REFERENCE Partition???
    
    JimH
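
The isolation step described in .4 (request each partition on its own and see which one misbehaves) can be sketched as below. This is only an illustration: `find_failing_partitions` and `simulated_request` are hypothetical names, and the simulated outcome simply mirrors what reply .6 reports for ALL REFERENCE. On a real system each check would be an MCC command issued by hand, since a failure takes the machine down.

```python
from typing import Callable, Iterable, List


def find_failing_partitions(
    partitions: Iterable[str],
    request_ok: Callable[[str], bool],
) -> List[str]:
    """Return the partitions whose individual request does not complete cleanly."""
    failing = []
    for name in partitions:
        # One request per partition, so a failure points at a single partition.
        if not request_ok(name):
            failing.append(name)
    return failing


def simulated_request(partition: str) -> bool:
    # Hypothetical stand-in for issuing "SHOW ... ALL <partition>" manually.
    # Assumption: only REFERENCE fails, as reported later in this topic.
    return partition != "REFERENCE"


suspects = find_failing_partitions(
    ["IDENTIFIER", "STATUS", "REFERENCE"], simulated_request
)
print(suspects)
```

Testing partitions one at a time is slower than a single ALL ATTRIBUTES request, but it narrows the failure to one partition per attempt.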
    
    
    
912.5Maybe an MCC/DNS interaction problem?TOOK::GUERTINI do this for a living -- reallyWed Apr 17 1991 12:087
    RE:.4
    
    If the problem is with the REFERENCE attributes then that would imply a
    problem with the DNS Clerk.  Can you try the same command with ALL
    REFERENCE?
    
    -Matt.

912.6. "SHOW ALL REFERENCE will do it!" by PARZVL::KENNEDY (Give me your watch & I'll tell you) Fri Apr 19 1991 12:02 (5 lines)

Matt,

SHOW ALL REFERENCE will cause the crash.

_Mek

912.7. "There is a patch for a DNS caused crash for Vax/VMS" by COOKIE::KITTELL (Richard - Architected Info Mgmt) Fri Apr 19 1991 13:38 (13 lines)

This may not be relevant to Wave * systems, but we just crashed a VMS V5.4
system. I was involved in the crash analysis because my process was active,
running MCC_MAIN.

The crash was "Unexpected system service exception" and the site of the 
faulting instruction was traced to DNS$SHARE.

At that point our system managers went "Aha! we've heard of a patch that
fixes a problem with DNS getting an error returning from a system service."
I was unshackled and allowed to leave the dungeon where people who crash
the production cluster are confined until they confess their sins. :-)

912.8. "The patch seems to do the trick" by TOOK::GUERTIN (I do this for a living -- really) Fri Apr 19 1991 17:33 (18 lines)

    Thanks, Richard.
    
    This morning I ran SDA on the crash dump, and it is definitely crashing
    in DNS$SHARE with an ACCVIO on address 00000008.  The system I was looking
    at is VMS V5.4-1. The ACCVIO ended up generating a SSRVEXCEPT.  I'm
    sure we're all seeing the same thing.  I found the patch in the
    DNS_PROGRAMMERs notes file, note 120.3 (hit KP7).  
    
    I just tried the patch on a system which was crashing on a
    SHOW NODE .... ALL REFERENCE and it stopped crashing after the patch!
    (The date of the SYS$SHARE:DNS$SHARE.EXE image was 8-OCT-1990.)
    So, I'm going to assume (based on this empirical data) that this is
    a DNS bug, and not investigate this one any further.
    
    BTW: Has anyone seen this crash on anything other than a DNA5 NODE command?
    
    -Matt.   
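
A fault address as small as 00000008 is the pattern one would usually expect when code reads a field at offset 8 through a zero (null) base address. The thread does not identify the actual DNS$SHARE data structure, so the layout below is purely hypothetical and only shows the arithmetic.

```python
import ctypes


class HypotheticalBlock(ctypes.Structure):
    # Hypothetical layout, NOT the real DNS$SHARE structure.
    _fields_ = [
        ("flags", ctypes.c_uint32),   # offset 0
        ("length", ctypes.c_uint32),  # offset 4
        ("status", ctypes.c_uint32),  # offset 8
    ]


def field_address(base: int, field: str) -> int:
    """Address that would be touched when reading `field` from a block at `base`."""
    return base + getattr(HypotheticalBlock, field).offset


# A zero base address stands in for a pointer that was never filled in.
fault = field_address(0, "status")
print(f"{fault:08X}")  # prints 00000008
```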
    
912.9. "DNS patch worked for us!" by PARZVL::KENNEDY (Give me your watch & I'll tell you) Thu Apr 25 1991 12:16 (6 lines)

Sorry for the delay, just wanted to confirm that the DNS patch fixed our 
problems.

Thanks to Jim Halpin and everyone else who got this nailed down so quickly.

_Mek