
Conference decwet::advfs_support

Title: AdvFS Support/Info/Questions Notefile
Notice: note 187 is Freq Asked Questions; note 7 is support policy
Moderator: DECWET::DADDAMIO
Created: Wed Jun 02 1993
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 1077
Total number of notes: 4417

1015.0. "AdvFS panic on init 0 (with ASE 1.3 running)" by NETRIX::"faidherb@tsc.bro.dec.com" (Th FAIDHERBE) Tue Mar 11 1997 05:03

Hi all (this note will be posted in both the advfs and ase conferences),

I am working on an ASE call (see the problem in note 1920 in smurt::ase).
I think that reservation conflict problem may be a start/stop script problem
(but that is another matter).

But I have tried some strange things:

At the office, we have two OSF3.2g systems running ASE 1.3.

I relocated the service to the second system and used init 0 to shut down
the first.

While shutting down, the first system reported:
advfs I/O error: setId 0x33120b87.0002edd5.fffffffe.0000  tag 0xfffffff7.0000u
    page 42
            vd 1  blk 2944  blkCnt 16
            write error = 5

    bs_osf_complete: metadata write failed
    AdvFS Domain Panic; Domain nmcs-dom1 Id 0x33120b87.0002edd5
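
For readers comparing several such console messages, here is a minimal Python sketch that pulls the numeric fields out of the text above and looks up the error number. The function name `parse_advfs_io_error` is hypothetical, and the errno lookup uses the tables of whatever system runs the script; 5 is conventionally EIO on UNIX systems, but that is an assumption about the machine that logged the message.

```python
import errno
import os
import re

# Sample console text copied from the note above.
CONSOLE = """\
advfs I/O error: setId 0x33120b87.0002edd5.fffffffe.0000  tag 0xfffffff7.0000u
    page 42
            vd 1  blk 2944  blkCnt 16
            write error = 5
"""

def parse_advfs_io_error(text):
    """Return the numeric fields of an 'advfs I/O error' console message.

    Hypothetical helper: the field names are taken from the message text,
    not from any AdvFS documentation.
    """
    fields = {}
    for key in ("page", "vd", "blk", "blkCnt"):
        match = re.search(r"\b%s\s+(\d+)" % key, text)
        if match:
            fields[key] = int(match.group(1))
    match = re.search(r"(read|write) error = (\d+)", text)
    if match:
        fields["op"] = match.group(1)
        fields["errno"] = int(match.group(2))
    return fields

info = parse_advfs_io_error(CONSOLE)
print(info)
# Symbolic name and text for the error number on the local system.
print(errno.errorcode.get(info["errno"]), os.strerror(info["errno"]))
```

If the error number does map to EIO, the write was rejected at the I/O level, which fits with looking for a device-level cause rather than an AdvFS-internal one.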

I rebooted the first system and relocated the service back to it.
Afterwards, I relocated the service once again from the first system to the
second and ran shutdown -h now on the first.
This time I did not get any AdvFS metadata write errors.

Does anybody have an explanation for why AdvFS reports metadata write errors
when the system is stopped by init 0?
Init 0 executes the scripts in the /sbin/rc0.d directory. Is it possible that
this kind of error is caused by an ordering problem in the
/sbin/rc0.d/K..... scripts?
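
To check the ordering question, one could list the kill scripts in the order they would run. The Python sketch below assumes the usual System V convention that the rc0 script walks the K* entries in lexical order (as a shell glob would); that is an assumption, not something confirmed for this system. The script names in `sample` are made up for illustration.

```python
import os

def kill_scripts_in_order(names):
    """Return the K* entries sorted lexically.

    Assumption: the rc0 script runs K* entries in shell-glob (lexical) order.
    """
    return sorted(name for name in names if name.startswith("K"))

def list_rc0(directory="/sbin/rc0.d"):
    """List the kill scripts of a real directory in assumed execution order."""
    return kill_scripts_in_order(os.listdir(directory))

# Hypothetical entries, only to show the ordering; not real script names.
sample = ["K45svc_b", "K08svc_a", "S10ignored", "K45svc_a"]
for name in kill_scripts_in_order(sample):
    print(name)
```

Running `list_rc0()` on each cluster member and comparing where the ASE and file system related scripts fall would show whether the disks can be released before AdvFS has finished writing.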

Kind regards,

+---++---++---++---++---++---++---+ TM  Digital Equipment Belgium
|   ||   ||   ||   ||   ||   ||   |   Multivendor Customer Services
| d || i || g || i || t || a || l |         Thierry FAIDHERBE 
|   ||   ||   ||   ||   ||   ||   |      DIGITAL Unix Support Team
+---++---++---++---++---++---++---+  Email FAIDHERB@TSC.BRO.DEC.COM 
            Phone : +32 2 729 77 44  Fax : +32 2 729 77 65
           With DIGITAL Unix, ... You get what you pay for ...





[Posted by WWW Notes gateway]
1015.1 by NETRIX::"jchang@wasted.zk3.dec.com" (Janice Chang) Tue Mar 11 1997 09:15 (18 lines)
Hello, thanks for your note.

Are you able to reproduce the problem as stated?  Also, does the problem occur
when you use init 0 to shut down the second system (having relocated the
service to the first)?

Sometimes the "bs_osf_complete: metadata write failed" error originates
from a hardware problem.  Do you show any such errors in the uerf logs?

Thanks,
Janice




1015.2 by NETRIX::"faidherb@tsc.bro.dec.com" (Th FAIDHERBE) Tue Mar 11 1997 09:57 (37 lines)
Hi Janice,

Thanks for the reply.

I tried at the office: init 0 to halt the system gives "metadata write failed";
                       shutdown gives no errors.

I have the daemon.log and uerf output for the two customer systems, but I
can also reproduce the customer problem at the office. It does not look
like a hardware problem, because I checked everything at the office and
found no hardware errors.

Summary of my tests at the office:

sys 1       sys 2         Command used on sys2   Result
                          to halt system
=============================================================================
NFSservice                on sys2: shutdown      No metadata write failed

NFSservice                on sys2: init 0        METADATA WRITE FAILED  <<---

            NFSservice    on sys2: shutdown      No metadata write failed

            NFSservice    on sys2: init 0        No metadata write failed


So I found (and I can reproduce) that METADATA WRITE FAILED
is displayed ONLY if the ASE service is on the other member and only if
init 0 was used to halt the system.
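
The four observations can be written down as data and checked against that rule. This is only a sketch in Python: the tuple layout and the function name `predicted_failure` are made up for illustration, and the rows are my reading of the summary table.

```python
# (member running the service, member being halted, command, failure seen)
OBSERVATIONS = [
    ("sys1", "sys2", "shutdown", False),
    ("sys1", "sys2", "init 0", True),
    ("sys2", "sys2", "shutdown", False),
    ("sys2", "sys2", "init 0", False),
]

def predicted_failure(service_on, halted, command):
    """Rule stated in the note: failure only when the service runs on the
    other member AND the system is halted with init 0."""
    return service_on != halted and command == "init 0"

for service_on, halted, command, seen in OBSERVATIONS:
    ok = predicted_failure(service_on, halted, command) == seen
    print(service_on, halted, command, "matches rule" if ok else "CONTRADICTS rule")
```

All four rows agree with the rule, so any further test that contradicts it would be worth reporting.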

I hope that helps.

Thierry


1015.3. "are you using HSZs?" by DOOSJE::HERTA (For something fulfilled this hour, loved, or endured) Tue Mar 11 1997 10:23 (13 lines)
Thierry,

I've been discussing this problem with a few other people.  One of them
suggested it may be a problem with HSZs timing out devices too late for ASE's
convenience.

You say you can reproduce the problem on your systems.  Do they include HSZ40s
or HSZ50s?

Do you have any accompanying SCSI_STAT_RESERVATION_CONFLICT messages in your
system error log?

Herta

1015.4. "SCSI_STAT_RESERVATION_CONFLICT .... !!!" by NETRIX::"faidherb@tsc.bro.dec.com" (Th FAIDHERBE) Tue Mar 11 1997 10:59 (97 lines)
Herta,

If I run uerf -R -o full | more, I get:

----- EVENT INFORMATION -----

EVENT CLASS                             ERROR EVENT
OS EVENT TYPE                  199.     CAM SCSI
SEQUENCE NUMBER                 10.
OPERATING SYSTEM                        DEC OSF/1
OCCURRED/LOGGED ON                      Tue Mar 11 15:47:07 1997
OCCURRED ON SYSTEM                      tscosf
SYSTEM ID                 x00020004     CPU TYPE:  DEC 3000
SYSTYPE                   x00000000

----- UNIT INFORMATION -----

CLASS                         x0000     DISK
SUBSYSTEM                     x0000     DISK
BUS #                         x0002
                              x00A8     LUN x0
                                        TARGET x5

----- CAM STRING -----

ROUTINE NAME                            cdisk_op_spin

----- CAM STRING -----

                                        Unit Reserved

----- CAM STRING -----

ERROR TYPE                              Information Message Detected
                                         _(recovered)

----- CAM STRING -----

DEVICE NAME                             DEC     RZ28B

----- CAM STRING -----

                                        Active CCB at time of error

----- CAM STRING -----

                                        CCB request completed with an error
ERROR - os_std, os_type = 11, std_type = 10


----- ENT_CCB_SCSIIO -----

*MY ADDR                  x09F82F28
CCB LENGTH                    x00C0
FUNC CODE            x01
CAM_STATUS                    x0004     CAM_REQ_CMP_ERR
PATH ID              2.
TARGET ID            5.
TARGET LUN           0.
CAM FLAGS                 x000004C0
                                        CAM_DIR_NONE
                                        CAM_SIM_QFRZDIS
*PDRV_PTR                 x09F82C28
*NEXT_CCB                 x00000000
*REQ_MAP                  x00000000
VOID (*CAM_CBFCNP)()      x00521BB0
*DATA_PTR                 x00000000
DXFER_LEN                 x00000000
*SENSE_PTR                x09F82C50
SENSE_LEN            x40
CDB_LEN              x06
SGLIST_CNT                    x0000
CAM_SCSI_STATUS               x0018     SCSI_STAT_RESERVATION_CONFLICT
SENSE_RESID          x00
RESID                     x00000000
CAM_CDB_IO           x000000000000000000000000
CAM_TIMEOUT               x00000014
MSGB_LEN                      x0000
VU_FLAGS                      x0000
TAG_ACTION           x00
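
As a cross-check of the labels uerf prints, the hex fields of the CCB dump can be decoded by hand. The Python sketch below parses lines of the form "NAME xHEX" and maps the SCSI status byte through a small table of standard SCSI status codes (0x18 is RESERVATION CONFLICT). The helper `parse_ccb_fields` is hypothetical, the table is only a subset, and the units of CAM_TIMEOUT are not stated in the log, so it is printed as a plain number.

```python
# Standard SCSI status byte values (subset).
SCSI_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",
}

# Lines copied from the ENT_CCB_SCSIIO section of the record above.
RECORD = """\
CAM_STATUS                    x0004     CAM_REQ_CMP_ERR
CAM_SCSI_STATUS               x0018     SCSI_STAT_RESERVATION_CONFLICT
CAM_TIMEOUT               x00000014
"""

def parse_ccb_fields(text):
    """Map field name -> integer for lines of the form 'NAME  xHEX [label]'."""
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[1].startswith("x"):
            fields[parts[0]] = int(parts[1][1:], 16)
    return fields

fields = parse_ccb_fields(RECORD)
status = fields["CAM_SCSI_STATUS"]
print(hex(status), SCSI_STATUS.get(status, "unknown"))
print("CAM_TIMEOUT =", fields["CAM_TIMEOUT"])
```

A reservation conflict status means the target refused the command because another initiator holds a reservation on it, which is consistent with the other ASE member owning the disk at that moment.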

///Thierry

Janice,

I called Herta. The only difference between Herta's problem and mine is
that Herta is using KZP..
But we have the same problem: metadata write failed when the service is on
the second member and we use init 0 on the first system. Afterwards,
a look at the binary.errlog file showed us SCSI_STAT_RESERVATION_CONFLICT.

///Thierry




1015.5 by NETRIX::"jchang@wasted.zk3.dec.com" (Janice Chang) Tue Mar 11 1997 12:18 (9 lines)
Hello  Thierry.  I have not been able to find previous cases dealing
with a similar problem.  I'll have to recommend that an IMPT case be
filed, since it doesn't look like there is a known solution.

Thanks,
Janice

1015.6. "please send crash-data file to CANASTA !!!" by HAN::HALLE (Volker Halle MCS @HAO DTN 863-5216) Wed Mar 12 1997 14:42 (17 lines)
    Thierry,
    
    PLEASE obtain the crash-data file and send it to the CANASTA Mail
    Server with the following command:
    
    # Mail -s "Diagnose case=advfs_1015 customer=notes_on_decwet" 
    			can_server@xocomp.enet.dec.com < crash-data.n
    
    This will run your crash-data file through CANASTA and, even if no
    solution is available, will make SURE that this FOOTPRINT is
    entered into the CANASTA case database.
    
    Thanks,
    
    Volker.
    
    PS: Please read note TURRIS::DIGITAL_UNIX 8919 for more info about CANASTA

1015.7. "NO CRASH-DATA ..." by NETRIX::"faidherb@tsc.bro.dec.com" (Th FAIDHERBE) Thu Mar 13 1997 01:55 (27 lines)
Hi,

I am %%% NOT %%% able to send crash data because there is %%% NO %%%
crash-data file ... and no vmcore or vmunix either.

In fact, while the system is shutting down, it displays on screen

advfs I/O error: setId 0x33120b87.0002edd5.fffffffe.0000  tag 0xfffffff7.0000u
    page 42
            vd 1  blk 2944  blkCnt 16
            write error = 5

    bs_osf_complete: metadata write failed
    AdvFS Domain Panic; Domain nmcs-dom1 Id 0x33120b87.0002edd5

more than 2 or 3 times, but there is NOTHING in /var/adm/crash.

The only thing I can give you is the binary.errlog file.
In this file, you can see the SCSI_STAT_RESERVATION_CONFLICT.

I spoke with one of my colleagues, and he told me that the problem may be
related to the order of the K... scripts in /sbin/rc0.d.

Thanks to all,
Thierry
