
Conference ssdevo::hsj40_product

Title:HSJ30/40 Product Conference
Moderator:SSDEVO::EDMONDS
Created:Tue Jul 13 1993
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1264
Total number of notes:4958

1219.0. "HSJ50 LF 42452080,02950100,02970100?" by UTRTSC::VISSER () Tue Mar 18 1997 12:29

Any ideas?

Configuration:
	ONE VAX 7840 with 2x CIXCD (HSJ005/6 connected to one CIXCD/SC)
				   (HSJ015/16 connected to other CIXCD/SC)
	Dual Redundant HSJ50's: "HSJ005" and "HSJ006"  <- Problem-set
				"HSJ015" and "HSJ016"
	OpenVMS V6.2

Summary:
	"HSJ005" fails with Lastfail 
	     15-MAR-1997 at 09:02:34
		02970100: Invalid DCA state detected in init_failover.

	"HSJ006" fails with Lastfail
	     15-MAR-1997 at 09:02:34
		42452080: Ci_isr found that yaci hardware had a bus 
			  timeout error
	     15-MAR-1997 at 09:02:38
		02950100: Invalid DCA state detected in start_crashover.

VMS Errorlog: only Path-failures and VC Closures of HSJ005/6 were found.

SYMPTOMS:

	1. HSJ005/6 fail with errors, as supplied here.

	2. At the same time:
	   Disks (both JBOD's and STRIPESETs) are Host Based Shadowed between
	   HSJ005/6 and HSJ015/16. There was no problem with the JBOD-based
	   Shadowsets (members on HSJ015/16 were kept as single member in the
	   HBVS-sets). (Members from HSJ005/6 were all removed from
   	   shadowsets).
	   The STRIPESETs (one set shown) reported (approx. 5 minutes after
	   Mount verification started on DSA1160):
	    - $1$DUA60  (HSJ006, HSJ005) has been removed from shadow set
	    - $1$DUA160 (HSJ016, HSJ015) has been removed from shadow set
	    - Mount verification has aborted for DSA1160
	    - DSA1160: contains zero working members.

	   We were NOT able to find any reason WHY $1$DUA160, which is on the
	   other HSJ pair, was removed. Looks like something on the VMS side?

Any ideas?
    			Jan Visser
    
----------------
Details FMU:

"HSJ005":
=========
Copyright Digital Equipment Corporation 1993, 1996. All rights reserved.
HSJ50 Firmware version V50J-2, Hardware version  B01

Last fail code: 02970100

run fmu
           Fault Management Utility
show last all

Last Failure Entry: 1.
 Flags: 000FFF01

 Template: 1.(01)
 Description: Last Failure Event

 Occurred on 15-MAR-1997 at 09:02:34

 Power On Time:
 0. Years, 56. Days, 13. Hours, 3. Minutes, 8. Seconds

 Controller Model: HSJ50
 Serial Number: ZG63100472 Hardware Version:  B01(0B)

 Controller Identifier:
  Unique Device Number: 000963100472 Model: 45.(2D) Class: 1.(01)

 Firmware Version:  V50J (50)

 Node Name: "HSJ005" CI Node Number: 5.(05)

 Instance Code: 0102030A
 Description: 

  An unrecoverable firmware inconsistency was detected or an intentional
  restart or shutdown of controller operation was requested.

 Reporting Component: 1.(01)
 Description: 
  
Executive Services

 Reporting component's event number: 2.(02)

 Event Threshold: 10.(0A) Classification:

  SOFT. An unexpected condition detected by a controller firmware component
  (e.g., protocol violations, host buffer access errors, internal
  inconsistencies, uninterpreted device errors, etc.) or an intentional
  restart or shutdown of controller operation is indicated.

 Last Failure Code: 02970100
 (No Last Failure Parameters)

 Last Failure Code: 02970100
 Description: 

  Invalid DCA state detected in init_failover.

 Reporting Component: 2.(02)
 Description: 
  
Value Added Services

 Reporting component's event number: 151.(97)

 Restart Type: 0.(00)
 Description: 
Full firmware restart
------------------------
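
The FMU output above prints each code next to its decoded fields, so the byte layout can be read off directly: in Last Failure Code 02970100 the top byte is the reporting component (02) and the next byte is the event number (97 hex = 151), and in Instance Code 0102030A the last byte matches the event threshold (0A = 10). A minimal Python sketch of that split follows. The function names are hypothetical, and the bytes that the dump does not explain (the low word of the last failure code, the third byte of the instance code) are returned raw rather than interpreted.

```python
def split_code(code_hex):
    """Split an 8-digit hex code into four bytes, most significant first."""
    value = int(code_hex, 16)
    return [(value >> shift) & 0xFF for shift in (24, 16, 8, 0)]


def decode_last_failure(code_hex):
    """Decode only the fields the FMU dump itself confirms."""
    component, event, byte1, byte0 = split_code(code_hex)
    return {
        "component": component,             # e.g. 2 = Value Added Services
        "event": event,                     # reporting component's event number
        "low_word": (byte1 << 8) | byte0,   # meaning not shown in the dump; kept raw
    }


def decode_instance(code_hex):
    """Decode an instance code; the third byte is left uninterpreted."""
    component, event, byte1, threshold = split_code(code_hex)
    return {
        "component": component,
        "event": event,
        "byte1": byte1,                     # assumption-free: kept raw
        "threshold": threshold,             # matches "Event Threshold" in the dump
    }


if __name__ == "__main__":
    print(decode_last_failure("02970100"))
    print(decode_instance("0102030A"))
```

This only restates what FMU already prints; it may be handy when all that is at hand is the bare "Last fail code" line from the controller banner.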

"HSJ006":
=========
Copyright Digital Equipment Corporation 1993, 1996. All rights reserved.
HSJ50 Firmware version V50J-2, Hardware version  B01

Last fail code: 02950100

run fmu

           Fault Management Utility
show last all


Last Failure Entry: 4.
 Flags: 000FF300

 Template: 1.(01)
 Description: Last Failure Event

 Power On Time:		           NOTE: This one is 4 seconds AFTER first-LF 
 0. Years, 53. Days, 7. Hours, 9. Minutes, 46. Seconds

 Controller Model: HSJ50
 Serial Number: ZG63300576 Hardware Version:  B01(0B)

 Controller Identifier:
  Unique Device Number: 000963300576 Model: 45.(2D) Class: 1.(01)

 Firmware Version:
 V50J(50)

 Node Name: "HSJ006" CI Node Number: 6.(06)

 Instance Code: 0102030A
 Description: 

  An unrecoverable firmware inconsistency was detected or an intentional
  restart or shutdown of controller operation was requested.

 Reporting Component: 1.(01)
 Description: 
  
Executive Services

 Reporting component's event number: 2.(02)

 Event Threshold: 10.(0A) Classification:

  SOFT. An unexpected condition detected by a controller firmware component
  (e.g., protocol violations, host buffer access errors, internal
  inconsistencies, uninterpreted device errors, etc.) or an intentional
  restart or shutdown of controller operation is indicated.

 Last Failure Code: 02950100
 (No Last Failure Parameters)

 Last Failure Code: 02950100
 Description: 

  Invalid DCA state detected in start_crashover.

 Reporting Component: 2.(02)
 Description: 

Value Added Services

 Reporting component's event number: 149.(95)

 Restart Type: 0.(00)
 Description: 
Full firmware restart
------------

Last Failure Entry: 3.
 Flags: 000FF381

 Template: 1.(01)
 Description: Last Failure Event

 Occurred on 15-MAR-1997 at 09:02:34

 Power On Time:
 0. Years, 53. Days, 7. Hours, 9. Minutes, 42. Seconds

 Controller Model: HSJ50
 Serial Number: ZG63300576 Hardware Version:  B01(0B)

 Controller Identifier:
  Unique Device Number: 000963300576 Model: 45.(2D) Class: 1.(01)

 Firmware Version:
 V50J(50)

 Node Name: "HSJ006" CI Node Number: 6.(06)

 Instance Code: 01010302
 Description: 

  An unrecoverable hardware detected fault occurred.

 Reporting Component: 1.(01)
 Description: 

  
Executive Services

 Reporting component's event number: 1.(01)

 Event Threshold: 2.(02) Classification:

  HARD. Failure of a component that affects controller performance or
  precludes access to a device connected to the controller is indicated.

 Last Failure Code: 42452080
 (No Last Failure Parameters)

 Last Failure Code: 42452080
 Description: 

  Ci_isr found that yaci hardware had a bus timeout error

 Reporting Component: 66.(42)
 Description: 

  
Host Interconnect Port Services

 Reporting component's event number: 69.(45)

 Restart Type: 0.(00)
 Description: 
Full firmware restart
---------------
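
Entry 4 on HSJ006 carries no "Occurred on" line, so its time in the summary (09:02:38) comes from comparing the Power On Time of Entries 3 and 4. The sketch below redoes that arithmetic in Python. The function name is hypothetical, and a non-zero "Years" field is rejected because the dump does not say how the controller counts a year.

```python
import re
from datetime import datetime, timedelta


def parse_power_on(line):
    """Parse an FMU 'Power On Time' line into a timedelta."""
    years, days, hours, minutes, seconds = (
        int(n) for n in re.findall(r"(\d+)\.", line)
    )
    if years:
        # Assumption: the length of a "year" is unknown, so do not guess.
        raise ValueError("non-zero Years field is not handled in this sketch")
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


# Power On Time lines copied from HSJ006 Entries 3 and 4.
entry3 = parse_power_on("0. Years, 53. Days, 7. Hours, 9. Minutes, 42. Seconds")
entry4 = parse_power_on("0. Years, 53. Days, 7. Hours, 9. Minutes, 46. Seconds")

# Entry 3 is stamped 15-MAR-1997 at 09:02:34 in the dump.
entry3_time = datetime(1997, 3, 15, 9, 2, 34)
entry4_time = entry3_time + (entry4 - entry3)

if __name__ == "__main__":
    print(entry4 - entry3)
    print(entry4_time)
```

Under these assumptions the second last-failure on HSJ006 falls 4 seconds after the first, which agrees with the timestamps given in the summary.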
