Conference smurf::ase

Title:ase
Moderator:SMURF::GROSSO
Created:Thu Jul 29 1993
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2114
Total number of notes:7347

1891.0. "Why these SCSI CAM errors on one node?" by DYOSW5::WILDER (Does virtual reality get swapped?) Thu Feb 20 1997 10:13

Environment:
2-node TruCluster Production Server V1.4, UNIX V4.0B, StorageWorks
800, 3 dual HSZ50s and 3 KZPSAs per node.
Currently, all services are running on node A. Node A's error log
is about 50MB in size and hasn't logged anything for the last few
days (even though node B has been rebooted several times). Node B's
error log is over 400MB in size, and node B is constantly logging SCSI
CAM errors. DECevent reports low-priority errors on all 3 busses. The
errors are all:
     module name string:   cdisk_op_spin
     generic string:       UNIT Reserved

What do these errors mean? Are they normal? These systems have been
running only since early January. Originally, we had a lot of hardware
problems with SCSI cables and one HSZ50. However, we believe all those
problems have been fixed since at least early February.

Any idea what these errors are? Anything we should do? These errors get
logged across several disks on all 3 busses. Every couple of seconds, we
get maybe a half dozen errors. What is going on?

Thanks,

Jim

    Cross posted in digital_unix
    
T.R  Title  User  Personal Name  Date  Lines
1891.1  "advfsd - use file disks.ignore"  ZUR01::VORBURGER (live and let live)  Fri Feb 21 1997 09:06  23 lines
1891.2  "Not all advfs"  DYOSW5::WILDER (Does virtual reality get swapped?)  Fri Feb 21 1997 10:06  11 lines
    Actually, only about 4 disks use advfs out of 100 disks. The rest are
    all LSM raw volumes. Some of the errors MAY be on the advfs disks, but
    many of the errors are on the LSM raw (drd) disks.
    
    While I can fix the advfs situation with the .1 reply, what about the
    others?
    
    Thanks,
    
    /jim
    
1891.3  "drds ARE logging errors"  DYOSW5::WILDER (Does virtual reality get swapped?)  Sat Feb 22 1997 13:30  6 lines
    I have confirmed that as soon as I move drds from one node to another,
    these SCSI CAM errors start showing up on the original node. It seems
    that ASE nodes log errors on drds that they see but do not directly
    own. Is this the way it should be? Is this a problem?
    
    /jim
1891.4  "explanation..."  BACHUS::DEVOS (Manu Devos DEC/SI Brussels 856-7539)  Mon Feb 24 1997 07:09  16 lines
Jim,

DECsafe or TruCluster software reserves the disks for the system
running the service. This is a true "hardware reservation" made on the
disk itself.

So, if you try to access disks reserved for system A from system B,
system B will immediately log an error, because it is trying to access
reserved disks.

If you simply run the command "# file /dev/rrz*c", you will get the same errors.

The advfsd daemon scans every disk it sees in /dev to update its GUI. Thus
it causes an error for each disk reserved by the other system. This has nothing
to do with whether the reserved disks contain an AdvFS file system.

Regards, Manu.
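
The behaviour described in .3 and .4 can be illustrated with a small toy model. This is only a sketch under stated assumptions, not how the CAM driver or ASE is implemented: the Disk class, the scan() helper, the node names and the rz16..rz19 device names are all made up for illustration. The only real-world detail used is the SCSI status value for RESERVATION CONFLICT (0x18).

```python
from dataclasses import dataclass
from typing import List, Optional

GOOD = 0x00                  # SCSI status: command completed
RESERVATION_CONFLICT = 0x18  # SCSI status: unit is reserved by another initiator


@dataclass
class Disk:
    """Hypothetical shared disk that remembers which host reserved it."""
    name: str
    reserved_by: Optional[str] = None

    def reserve(self, host: str) -> int:
        if self.reserved_by not in (None, host):
            return RESERVATION_CONFLICT
        self.reserved_by = host
        return GOOD

    def release(self, host: str) -> int:
        # In this toy model, a release from a non-owner is silently ignored.
        if self.reserved_by == host:
            self.reserved_by = None
        return GOOD

    def read(self, host: str) -> int:
        if self.reserved_by not in (None, host):
            return RESERVATION_CONFLICT
        return GOOD


def scan(host: str, disks: List[Disk]) -> List[str]:
    """Probe every disk once, as a daemon opening each device would.

    Returns the error-log lines that the probing host would record.
    """
    return [
        f"{host}: {d.name}: UNIT Reserved"
        for d in disks
        if d.read(host) == RESERVATION_CONFLICT
    ]


# Hypothetical device names; all services start on nodeA.
disks = [Disk(f"rz{n}") for n in range(16, 20)]
for d in disks:
    d.reserve("nodeA")

# nodeA owns the disks, so only nodeB's scan logs errors.
print(len(scan("nodeA", disks)), len(scan("nodeB", disks)))

# Relocate the services: nodeA releases, nodeB reserves.
for d in disks:
    d.release("nodeA")
    d.reserve("nodeB")

# Now the errors appear on the original node instead.
print(len(scan("nodeA", disks)), len(scan("nodeB", disks)))
```

In this model the errors follow whichever node does not hold the reservation, one per reserved disk per scan, regardless of what is stored on the disk. That matches the observation in .3 that the errors move to the original node when the drds are relocated.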
1891.5  "/var/opt/advfsd/disks.ignore doesn't work"  DYOSW5::WILDER (Does virtual reality get swapped?)  Mon Mar 03 1997 18:28  5 lines
    Well, /var/opt/advfsd/disks.ignore did not solve it. Could it
    really be /usr/opt/advfsd/disks.ignore?
    
    /jim
    	
1891.6  UTRUST::PILMEYER (Questions raise the doubt)  Tue Mar 04 1997 05:22  4 lines
    Have you read 1623? More specifically the fact that snmpd can also
    cause the errors...
    
    -Han
1891.7  "will try 1623 suggestion"  DYOSW5::WILDER (Does virtual reality get swapped?)  Tue Mar 04 1997 22:01  7 lines
    Well, I will try the suggestion in 1623 and look at snmpd.
    
    The customer is getting very frustrated over this. I hope there is a
    patch soon.
    
    /jim