
Conference ssdevo::hsz40_product

Title:HSZ40 Product Conference
Moderator:SSDEVO::EDMONDS
Created:Mon Apr 11 1994
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:902
Total number of notes:3319

878.0. "HSZ50 decoding of failing disk through uerf" by NNTPD::"SilvaJ@mail.dec.com" (Jorge Silva (XIP)) Thu May 15 1997 07:48

We had a problem at a customer site that has stripesets with 7 disks.
Through uerf we were getting messages with the following CAM string: Hard
Error detected. Running uerf without options shows which SCSI bus,
target ID and LUN is failing.

I saw in note 278 that there is a way of knowing which port, target and LUN is
reporting errors. Nevertheless, the information in that note pertains to HSZ40
controllers. So my questions are:
1) Are the byte positions the same in the HSZ50 case?
2) Is there a document that can help diagnose failing disks by
interpreting the frame output of the uerf command?

Jorge Silva :-)8
[Posted by WWW Notes gateway]
T.R  Title  User  Personal Name  Date  Lines
878.1  dont use uerf  LEXS01::GINGER  Ron Ginger  Thu May 15 1997 11:35  4
    Stop using uerf. Get DECevent. DECevent will report more than you ever
    thought you wanted to know about HSZ errors.
    
    
878.2  SSDEVO::T_GONZALES  Fri May 16 1997 16:48  14
    Even with DECevent, a "hard error detected" entry will still not give the
    specific nature of the problem. Usually, the only way to find out
    what caused the hard error is to have a terminal connected to the
    HSZ that has the unit online at the time of the error. Using something
    like POLYCENTER Console Manager is the only way to trap these hard
    error conditions.  One cause of this type of error is
    a command timeout on a device between the HSZ and the device, followed
    by the HSZ doing a bus device reset on the SCSI bus that contains the
    device.  You may also want to get the latest cam_disk.o module for
    UNIX.  The latest patches, April 97, have enhanced error logging
    to capture the initial error that caused the hard error.
    
    Were there any other entries in the error log, immediately before the
    hard error?
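
    If the log has already been turned into a list of (timestamp, text)
    pairs, checking what was logged just before each hard error could be
    sketched roughly as below. This is only an illustration: the entry
    layout, the function name, the 60-second window and the sample
    messages are all assumptions, and it does not parse real uerf or
    DECevent output.

```python
from datetime import datetime, timedelta


def entries_before_hard_error(entries, window_seconds=60,
                              marker="Hard Error detected"):
    """For each entry containing `marker`, collect the earlier entries
    logged within `window_seconds` before it.

    `entries` is a hypothetical, already-parsed list of
    (timestamp, text) tuples sorted by timestamp.
    """
    results = []
    for i, (ts, text) in enumerate(entries):
        if marker.lower() not in text.lower():
            continue
        cutoff = ts - timedelta(seconds=window_seconds)
        # Keep only earlier entries that fall inside the time window.
        preceding = [e for e in entries[:i] if e[0] >= cutoff]
        results.append(((ts, text), preceding))
    return results


# Made-up sample entries, only to show the call; not real log text.
sample = [
    (datetime(1997, 5, 15, 7, 0, 0), "unrelated event"),
    (datetime(1997, 5, 15, 7, 40, 10), "command timeout"),
    (datetime(1997, 5, 15, 7, 40, 12), "bus device reset"),
    (datetime(1997, 5, 15, 7, 40, 15), "Hard Error detected"),
]

for hard_error, preceding in entries_before_hard_error(sample):
    print(hard_error[1], "preceded by", [text for _, text in preceding])
```

    A wider window catches slower sequences, such as a timeout followed
    some seconds later by a reset, at the cost of pulling in unrelated
    entries.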