Conference ssdevo::hsz40_product

Title:HSZ40 Product Conference
Moderator:SSDEVO::EDMONDS
Created:Mon Apr 11 1994
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:902
Total number of notes:3319

888.0. "Mysterious HSZ40 reset and error message" by GIDDAY::HIRSHMAN (Hugged your Webmeister today?) Tue May 27 1997 09:13

    A customer has logged the following call on her HSZ40.  Can anyone 
    suggest why the HSZ40 might have performed a reset, and explain
    what the quoted message means?



FROM:     Trish Thomas
          TAFE NSW
PH:       02 9950 1719
FAX:      02 9950 1601

PROBLEM:  Disk error on ALPHA 2100, serial no. AY53411647

DESCRIPTION:
Machine Type    AlphaServer 2100A 5/300 running VMS V6.2-1H3 in a VMScluster.
                (HW Ver=04C700000000000000000018, SID=80000000, XSID=00000000)
Serial no.      AY53411647
Disk            RZ28M,RZ29B

PROBLEM:
1  All disks on the machine clocked one error early this morning. These disks
   are connected to an HSZ40 controller on a Storageworks shelf. It looks like
   the controller has reset itself and we don't know why.
   I have included an extract from DIAGNOSE & FMU.

2  We recently upgraded the firmware on the HSZ40 controller and initialized
   one of the disks with SAVE_CONFIGURATION. We are now getting this message
   on the HSZ40:
   "NVPM OEM information component initialized to default settings"
   Can you explain what this message means?



                ***************************************
                            DIAGNOSE


Logging OS                        1. OpenVMS
System Architecture               2. Alpha
OS version                           V6.2-1H2
Event sequence number         27566.
Timestamp of occurrence              23-MAY-1997 00:34:16
Time since reboot                    7 Day(s) 11:17:24
Host name                            IRIVE2

System Model                         AlphaServer 2100 4/233

Entry type                        1. Device Error


---- Device Profile ----
Unit                                 IRIVE2$DKB202
Product Name                         HSZ40  SCSI to SCSI Ctrl

-- Driver Supplied Info -
Device Firmware Revision             V31Z
VMS SCSI Error Type               5. Extended Sense Data from Device
SCSI ID                         x02
SCSI LUN                        x00
SCSI SUBLUN                     x02
Port Status               x00000001  Success
Command Opcode                  x0A  Write (6 byte)
Command Data
                                x0E
                                x14
                                xD1
                                x10
                                x00

SCSI Status                     x02  Check Condition
Remaining Byte Length           160.

------- HSZ Data -------
Instance Code             x03F40064  Device services had to reset the port to
                                     clear a bad condition. Note that in this
                                     instance the Associated Target, Associated
                                     ASC, and Associated ASCQ fields are
                                     undefined.

                                     Component ID =   Device Services.
                                     Event Number =   x000000F4
                                    Repair Action =   x00000000
                                     NR Threshold =   x00000064
Template Type                   x41  Device Services Non-Transfer Error.
Template Flags                  x00  HCE =   0, Event did not occur during Host
                                             Command Execution.
Ctrl Serial #                              ZG61201027
Ctrl Software Revision               V31Z
RAIDSET State                   x00  NORMAL. All members present and
                                     reconstructed, IF LUN is configured as a
                                     RAIDSET.

Error Code                      x70  Current Error
Sense Key                       x06  Unit Attention
ASC & ASCQ                    xD203  ASC  =   x00D2
                                     ASCQ =   x0003
                                     Device services had to reset the bus.

Associated Port                 x01
Associated Target               x04
Associated ASC                  x00
Associated ASCQ                 x00

----- Software Info -----
UCB$x_ERTCNT                     16. Retries Remaining
UCB$x_ERTMAX                     16. Retries Allowable
IRP$Q_IOSB                x0000000000000000
UCB$x_STS                 x08021810  Online
                                     Software Valid
                                     Unload At Dismount
                                     Volume is Valid on the local node
                                     Unit supports the Extended Function bit
IRP$L_PID                 x824E8730  Requestor "PID"
IRP$x_BOFF                      512. Byte Page Offset
IRP$x_BCNT                     8192. Transfer Size In Byte(s)
UCB$x_ERRCNT                      1. Errors This Unit
UCB$L_OPCNT                 1060333. QIO's This Unit
ORB$L_OWNER               x00010004  Owners UIC
UCB$L_DEVCHAR1            x1C4D4008  Directory Structured
                                     File Oriented
                                     Sharable
                                     Available
                                     Mounted
                                     Error Logging
                                     Capable of Input
                                     Capable of Output
                                     Random Access


                **************************************
                        Fault Management Utility

 describe instance 031A4002
 Instance Code: 031A4002 Description:
  Command timeout.
 Reporting Component: 3.(03) Description:
  Device Services
 Reporting component's event number: 26.(1A)
 Event Threshold: 2.(02) Classification:
  HARD. Failure of a component that affects controller performance or
  precludes access to a device connected to the controller is indicated.


Regards,


Trish Thomas
    
888.1. by SSDEVO::T_GONZALES Tue May 27 1997 14:12 (6 lines)

    You are getting a command timeout on a device on the HSZ port. The
    FMU information should have included port and target information,
    although I didn't see it in your insert. Any time a command timeout
    occurs on the HSZ port side, the HSZ resets that port, and usually
    all units on that port report the reset. The error log information
    is reporting that event. I recommend you replace that device.
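
    For comparing the two instance codes quoted in .0 (03F40064 from the
    errorlog and 031A4002 from FMU), here is a minimal sketch that splits
    an instance code into four one-byte fields. The layout (component ID,
    event number, repair action, NR threshold, high byte to low byte) is
    an assumption inferred from the DIAGNOSE and FMU output above, not
    taken from HSZ40 documentation, and the function and type names are
    hypothetical.

```python
from typing import NamedTuple


class InstanceCode(NamedTuple):
    """Fields of a 32-bit instance code (layout inferred from the log in .0)."""
    component_id: int   # high byte, e.g. 0x03 = Device Services in the log
    event_number: int   # reporting component's event number
    repair_action: int  # repair action code
    nr_threshold: int   # low byte, shown as "NR Threshold" / "Event Threshold"


def decode_instance_code(code: int) -> InstanceCode:
    """Split a 32-bit instance code into its four assumed one-byte fields."""
    if not 0 <= code <= 0xFFFFFFFF:
        raise ValueError("instance code must fit in 32 bits")
    return InstanceCode(
        component_id=(code >> 24) & 0xFF,
        event_number=(code >> 16) & 0xFF,
        repair_action=(code >> 8) & 0xFF,
        nr_threshold=code & 0xFF,
    )


if __name__ == "__main__":
    # The two codes that appear in the customer's DIAGNOSE and FMU extracts.
    for raw in (0x03F40064, 0x031A4002):
        f = decode_instance_code(raw)
        print(
            f"{raw:08X}: component={f.component_id:02X} "
            f"event={f.event_number:02X} "
            f"repair={f.repair_action:02X} "
            f"threshold={f.nr_threshold:02X}"
        )
```

    Under that assumed layout, both codes come from component 03 (Device
    Services) but carry different event numbers: F4 for the port reset in
    the errorlog entry and 1A for the command timeout described by FMU.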

888.2. "still wondering about error message" by GIDDAY::HIRSHMAN (Hugged your Webmeister today?) Mon Jun 02 1997 04:32 (10 lines)

    There was no command timeout error and/or port target ID info in the
    errorlog, so I'm not sure where that leaves me.
    
    Also, do we need to do anything about the "NVPM OEM information
    component initialized to default settings" error (informational??)
    message or should the customer just do a CLEAR CLI and ignore it?  What
    does this message mean, anyway?
    
    
    -Bret