
Conference ssdevo::hsj40_product

Title: HSJ30/40 Product Conference
Moderator: SSDEVO::EDMONDS
Created: Tue Jul 13 1993
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 1264
Total number of notes: 4958

1240.0. "Dual-redundant HSJ's disagree on drive status." by KERNEL::CLARK (STRUGGLING AGAINST GRAVITY...) Wed Apr 23 1997 08:57

    Problem passed to me by field engineer:-
    
    HSJ40 based disk (RZ28M) is part of a shadow set.
    
    The HSJ40 is one of a dual-redundant pair, both running HSOF 2.7-0J
    
    On one HSJ40, the drive is operating OK.
    On the other HSJ it shows as "misconfigured".
    
    As far as can be determined, the drive changed from "OK" to
    "misconfigured" at some time after it was known to be operating OK.
    
    The first indication of a problem came when the amber light on the
    drive started to flash.
    
    There are no errors in the cluster-wide errorlog to indicate that the
    drive had a problem.
    
    This is the second occurrence of this type. The last time was on a
    different drive and different HSJ pair.
    
    At present it is not possible to delete and re-add the drive (it
    refuses to delete).
    
    I found some note references which may explain this restriction:-
    Topic 617.0 in Conference vmsnotes_v12
    DECnotes Reply 32.1 in Conference hsj40_product on ssdevo
    DECnotes Topic 3118.0 in Conference ask_ssag on ssag
    
    Has anybody else observed this problem?
    
    				Dave Clark
1240.1. by GIDDAY::HOBBS (Andy Hobbs. Sydney CSC. -730 5964) Wed Apr 23 1997 23:14, 17 lines
    
     Dave,
    
    I've had this a few times and I've issued a CLEAR UNKNOWN from the
    dubiously-informed HSJ40, even while VMS was using the target unit
    via the other controller - it fixed the problem without negatively
    affecting the operating system.
    
    Anything which causes the HSJ40 pair to disagree with each other
    scares the heck out of me.
    
    Is it possible to RESTART the one with the problem?
    
    Are you running with WB Cache? (Get 2.7-1, if so - saves Ronald
    telling you ;^)
    
    Andy/.
1240.2. "Yup I like to keep people up to date with patches..." by SSDEVO::RMCLEAN Fri Apr 25 1997 18:32, 10 lines
Aww... You ruined my fun! ;-.]

Yup, I believe there were times with V2.7 where the two differed because
the state info was not transferred between the HSJs.  As .-1 says, it
really is not important because the correct info is on the running
controller.  This was especially true of things like sizes, because the
backup controller was unable to get the drives for long enough to find out
what their state was.

I think you will find that this is improved with V3.1.
1240.3. "Removeables are worse..." by SSDEVO::FIALA (Me, I'm just a recycler.) Wed May 28 1997 13:02, 9 lines
The state "appears" different more often with removable media
(tapes, CDs, opticals). As long as the online controller
has the perceived correct state, it's of no concern.
The problem is that the controller checking the unit state
may or may not be the one mastering the device.

This would! be a problem if the unit could be online
to both controllers and accessed simultaneously, as on the Massbus.
But this isn't the Massbus [although I wonder about SCSI]...
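
For anyone who wants the reasoning in .2 and .3 spelled out, here is a
toy sketch in Python. It is not HSOF code and does not talk to a
controller. The names ControllerView and authoritative_state, and the
labels HSJ_A and HSJ_B, are made up for illustration. It only models the
rule described above: the state reported by the controller that has the
unit online is the one that counts, and a differing report from the
other controller is flagged rather than trusted.

```python
from dataclasses import dataclass


@dataclass
class ControllerView:
    """One controller's report on a unit (hypothetical model, not an HSOF structure)."""
    name: str              # illustrative controller label
    unit_state: str        # state this controller reports for the unit
    has_unit_online: bool  # True if this controller is the one mastering the unit


def authoritative_state(views):
    """Return (state, disagree).

    state is the unit state reported by the single controller that has
    the unit online. disagree is True when the reports are not all the same.
    """
    online = [v for v in views if v.has_unit_online]
    if len(online) != 1:
        # The rule only makes sense when exactly one controller masters the unit.
        raise ValueError("expected exactly one controller to have the unit online")
    distinct_states = {v.unit_state for v in views}
    return online[0].unit_state, len(distinct_states) > 1


views = [
    ControllerView("HSJ_A", "OK", True),
    ControllerView("HSJ_B", "misconfigured", False),
]
state, disagree = authoritative_state(views)
print(state, disagree)
```

With the situation from .0 (one controller says OK and has the unit
online, the other says "misconfigured"), the sketch reports the online
controller's state and notes the disagreement.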