Conference aosg::lsm

Title:LSM
Moderator:SMURF::SHIDERLY
Created:Mon Jan 17 1994
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:803
Total number of notes:2852

787.0. "Synching usr/var problem?" by NNTPD::"trenta@csc32.enet.dec.com" (Debbie Trenta) Tue Apr 29 1997 20:34

Hi,

I am not sure whether we have a problem, but something in the volprint
and volstat output I am seeing after an unclean shutdown has me confused.
The system is an 8400 running V4.0a Digital UNIX.

The system went down abnormally, so the LSM volumes need to be resynched.
However, it is unclear to me what is taking place.  On prior versions,
and in all prior cases I have seen, one of the plexes is always in a sync
state and WO state while being resynched.  In this case, however, both
plexes are in ENABLED/ACTIVE RW state and only the volume is in an
ENABLED/SYNC state.  What is going on?  volstat tells me it is reading
from and writing to both plexes.  Is this a problem?  If not, what is
taking place?  Has a new synching algorithm been introduced that I am
not aware of?

Thanks in advance for the explanation.

Debbie Trenta
Lucent/AT&T Platinum Services Support
Colorado CSC

The data follows:


Starting secondary cpu 5
LSM: Resynchronization of volume rootvol in group rootdg started.
/sbin/ufs_fsck -p /dev/rvol/rootdg/rootvol
/dev/rvol/rootdg/rootvol: 1913 files, 72516 used, 72499 free (115 frags, 9048 blocks, 0.1% fragmentation)
starting LSM
LSM: Resynchronization of volume vol-rz49h in group rootdg started.
LSM: Resynchronization of volume vol-rz49g in group rootdg started.
Checking local filesystems
/sbin/ufs_fsck -p
/dev/rvol/rootdg/rootvol: 1913 files, 72516 used, 72499 free (115 frags, 9048 blocks, 0.1% fragmentation)
/dev/rvol/rootdg/vol-rz49g: 8794 files, 192392 used, 1746318 free (1326 frags, 218124 blocks, 0.1% fragmentation)
LSM: Resynchronization of volume rootvol in group rootdg finished.
/dev/rvol/rootdg/vol-rz49h: 507 files, 5892 used, 1680586 free (554 frags, 210004 blocks, 0.0% fragmentation)
Mounting / (root)

.....
Binary error logger started
Starting ASE ...
Apr 29 13:45:42 sd9801 vmunix: LSM: Resynchronization of volume rootvol in group rootdg started.
Apr 29 13:45:42 sd9801 vmunix: LSM: Resynchronization of volume vol-rz49h in group rootdg started.
        ONC portmap service started
Apr 29 13:45:42 sd9801 vmunix: LSM: Resynchronization of volume vol-rz49g in group rootdg started.
        Initializing the ASE Availability Manager
Apr 29 13:45:43 sd9801 vmunix: LSM: Resynchronization of volume rootvol in group rootdg finished.
        ASE logger started (/usr/sbin/aselogger)
        ASE agent started (/usr/sbin/aseagent)
ASE member started
Setting kernel timezone variable


....

sd9801# volprint -ht
DG NAME         GROUP-ID
DM NAME         DEVICE       TYPE     PRIVLEN  PUBLEN   PUBPATH
V  NAME         USETYPE      KSTATE   STATE    LENGTH   READPOL  PREFPLEX
PL NAME         VOLUME       KSTATE   STATE    LENGTH   LAYOUT   ST-WIDTH MODE
SD NAME         PLEX         PLOFFS   DISKOFFS LENGTH   DISK-NAME    DEVICE

dg rootdg       862325304.1025.sd9801

dm rz49a        rz49a        nopriv   0        300000   /dev/rrz49a
dm rz49b        rz49b        nopriv   0        598976   /dev/rrz49b
dm rz49d        rz49d        simple   1024     0        /dev/rrz49d
dm rz49g        rz49g        nopriv   0        4000000  /dev/rrz49g
dm rz49h        rz49h        nopriv   0        3480080  /dev/rrz49h
dm rz65a        rz65a        nopriv   0        300000   /dev/rrz65a
dm rz65b        rz65b        nopriv   0        598976   /dev/rrz65b
dm rz65d        rz65d        simple   1024     0        /dev/rrz65d
dm rz65g        rz65g        nopriv   0        4000000  /dev/rrz65g
dm rz65h        rz65h        nopriv   0        3480080  /dev/rrz65h

v  rootvol      root         ENABLED  ACTIVE   300000   ROUND    -
pl rootvol-01   rootvol      ENABLED  ACTIVE   300000   CONCAT   -        RW
sd rz49a-01p    rootvol-01   0        0        16       rz49a        rz49a
sd rz49a-01     rootvol-01   16       16       299984   rz49a        rz49a
pl rootvol-02   rootvol      ENABLED  ACTIVE   300000   CONCAT   -        RW
sd rz65a-01p    rootvol-02   0        0        16       rz65a        rz65a
sd rz65a-01     rootvol-02   16       16       299984   rz65a        rz65a

v  swapvol      swap         ENABLED  ACTIVE   598976   ROUND    -
pl swapvol-01   swapvol      ENABLED  ACTIVE   598976   CONCAT   -        RW
sd rz49b-01     swapvol-01   0        0        598976   rz49b        rz49b
pl swapvol-02   swapvol      ENABLED  ACTIVE   598976   CONCAT   -        RW
sd rz65b-01     swapvol-02   0        0        598976   rz65b        rz65b

v  vol-rz49g    fsgen        ENABLED  SYNC     4000000  SELECT   -
pl vol-rz49g-01 vol-rz49g    ENABLED  ACTIVE   4000000  CONCAT   -        RW
sd rz49g-01     vol-rz49g-01 0        0        4000000  rz49g        rz49g
pl vol-rz49g-02 vol-rz49g    ENABLED  ACTIVE   4000000  CONCAT   -        RW
sd rz65g-01     vol-rz49g-02 0        0        4000000  rz65g        rz65g

v  vol-rz49h    fsgen        ENABLED  NEEDSYNC 3480080  SELECT   -
pl vol-rz49h-01 vol-rz49h    ENABLED  ACTIVE   3480080  CONCAT   -        RW
sd rz49h-01     vol-rz49h-01 0        0        3480080  rz49h        rz49h
pl vol-rz49h-02 vol-rz49h    ENABLED  ACTIVE   3480080  CONCAT   -        RW
sd rz65h-01     vol-rz49h-02 0        0        3480080  rz65h        rz65h
sd9801# ps ax | grep vol
    8 ??       I        0:00.50 vold -k -m boot
   27 ??       I        0:00.01 /sbin/volrecover -b -o iosize=64k -s
   40 ??       I        0:00.02 /etc/vol/type/fsgen/volume -U fsgen -g 862325304.1025.sd9801 -o iosize 64k -- resync vol-rz49g
   41 ??       U        0:08.09 /etc/vol/type/fsgen/volume -U fsgen -g 862325304.1025.sd9801 -o iosize 64k -- resync vol-rz49g
  576 ??       I        0:00.02 sh /usr/sbin/volwatch root
  586 ??       I        0:00.01 sh /usr/sbin/volwatch root
  587 ??       I        0:00.01 volnotify -f -w 15
  667 console  S  +     0:00.01 grep vol
sd9801# volstat -r
sd9801# volstat -d
                           OPERATIONS           BLOCKS        AVG TIME(ms)
TYP NAME                 READ     WRITE      READ     WRITE   READ   WRITE
dm  rz49a                   0         0         0         0    0.0     0.0
dm  rz49b                   0         0         0         0    0.0     0.0
dm  rz49d                   0         0         0         0    0.0     0.0
dm  rz49g                  28        29      3584      3712   18.0    33.5
dm  rz49h                   0         0         0         0    0.0     0.0
dm  rz65a                   0         0         0         0    0.0     0.0
dm  rz65b                   0         0         0         0    0.0     0.0
dm  rz65d                   0         0         0         0    0.0     0.0
dm  rz65g                  29        29      3712      3712   17.9    34.7
dm  rz65h                   0         0         0         0    0.0     0.0




[Posted by WWW Notes gateway]
787.1. "all is OK !" by BRSDVP::DEVOS (Manu Devos NSIS Brussels 856-7539) Tue Apr 29 1997 22:02 (16 lines)

    Hi Debbie,
    
    No, there is no new synch mechanism under the sun...
    
    When a system crashes, no plex of a volume is known to be good. So the
    synchronizing mechanism simply opens the volume in read/write-back
    mode. This means that the volume is opened as usual (round-robin or
    preferred-plex mode) and a read of the whole volume is started, which
    causes the data read from one plex to be written back to the other
    plex. When the volume read is finished, we are sure that the two
    plexes are the same. This has been the standard procedure since the
    beginning of LSM.
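
    As an illustration only (this is not LSM code), the read/write-back
    pass described above can be modelled in a few lines of Python. The
    function name, the list-of-blocks representation of a plex and the
    simple round-robin choice of the plex to read are all assumptions
    made for the sketch.

```python
def resync_read_writeback(plexes):
    """Toy model of a whole-volume read in read/write-back mode.

    plexes: list of equal-length lists, one list of block values per plex.
    Each block is read from one plex (round robin here) and the value read
    is written to the other plex(es), so all plexes agree afterwards.
    """
    nblocks = len(plexes[0])
    for blk in range(nblocks):
        src = blk % len(plexes)      # assumed round-robin read policy
        data = plexes[src][blk]      # whatever is read is taken as correct
        for i, plex in enumerate(plexes):
            if i != src:             # write back to the other plex(es) only
                plex[blk] = data
    return plexes


# Two plexes that disagree in blocks 1 and 2 after a crash.
plex_a = ["b0", "b1-new", "b2", "b3"]
plex_b = ["b0", "b1-old", "b2-new", "b3"]
resync_read_writeback([plex_a, plex_b])
print(plex_a == plex_b)
```

    Note that the pass does not try to work out which copy is newer; it
    only guarantees that the plexes are identical when it finishes.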
    
    So, don't worry, be happy !
    
    Manu.

787.2. "thanks :-) Can you answer another about PSL?" by CSC32::TRENTA Wed Apr 30 1997 15:29 (28 lines)

    Manu,
    
    Thanks for the reply.  So my understanding of what happens after a
    crash was obviously wrong.  I always thought writes were done to the
    1st plex first and then the 2nd, so that after a crash LSM did a read
    of the 1st and a write only to the 2nd.  (That type of sync obviously
    only happens when one of the plexes becomes disabled or inactive for
    some reason.)  So, for clarification's sake, are you saying that
    whichever plex it happens to read from is assumed correct, and that
    the data is then written back to itself and all other plexes?
    
    Makes sense - I just never realized that.  Pardon the ignorance.
    I guess I just never really looked at the synching that was done
    after a crash before.
    
    Could you please answer another question I have, about
    "Persistent State Logging"?  Which volumes does it know to synch
    upon reboot?  I know that the log keeps a record of the first write
    and last close to a volume.  So I am assuming this means that if a
    volume was active/enabled but never written to (even though it was
    in a R/W state), LSM knows that the volume does not have to be
    resynched after a crash.  Am I right in my understanding of how
    Persistent State Logging works?
    
    Thanks again Manu.  I appreciate the clarification.
    
    Debbie                                             

787.3. by BRSDVP::DEVOS (Manu Devos NSIS Brussels 856-7539) Thu May 01 1997 14:40 (38 lines)

    Hi Debbie,
    
>>     So then for clarification sake, are you saying that whatever plex it 
>>     happens to read from that plex is assumed correct and then a write 
>>     of that data is done back to itself and all other plexes?
    
    This is true when a volume appears at LSM startup with both plexes in
    the "ACTIVE" state, which is typical when a system has crashed.
    It is also the case when you start a newly created mirrored volume
    for the first time. You said above "... back to itself and
    all.."; no, there is no write back to the plex just read, only to the
    other. As you noticed, this method is NOT applied when ONE plex appears
    as STALE at LSM startup. In that case the stale plex is placed in
    Write Only mode (WO), and the volume is opened NORMALLY (i.e. not in
    rwback mode). Again a whole-volume read process is started, so that
    the other plex(es) are the source of the data copied to the WO (stale)
    plex. A STALE plex is a plex which has not been updated during the
    lifetime of the volume because it has been intentionally detached, or
    automatically detached due to an error on one of the subdisks (or
    disks) underneath it.
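
    To make the two cases concrete, here is a small Python sketch of the
    decision described above. It is a simplification, not LSM's logic:
    the function name and the mode labels are made up for the example,
    and only the two plex states mentioned in this note (ACTIVE and
    STALE) are modelled, although real LSM plexes have more states.

```python
def choose_resync(plex_states):
    """Pick a resync method for a mirrored volume at LSM startup.

    plex_states: dict mapping plex name -> "ACTIVE" or "STALE".
    Returns (mode, source_plexes, target_plexes).
    """
    active = [p for p, s in plex_states.items() if s == "ACTIVE"]
    stale = [p for p, s in plex_states.items() if s == "STALE"]
    if active and stale:
        # Stale plexes become write-only targets; the volume is opened
        # normally and the active plex(es) supply the data.
        return ("copy-to-stale", active, stale)
    if len(active) > 1:
        # No plex is known to be good: read/write-back mode, where any
        # plex may be read and the data goes to the other plex(es).
        return ("read-writeback", active, active)
    return ("none", active, [])


print(choose_resync({"vol-rz49g-01": "ACTIVE", "vol-rz49g-02": "ACTIVE"})[0])
print(choose_resync({"vol-rz49g-01": "ACTIVE", "vol-rz49g-02": "STALE"})[0])
```

    The first call corresponds to the crash case in 787.0 (both plexes
    ACTIVE); the second is a hypothetical state in which one plex had
    been detached.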
    
>>      Could you please explain another question I have then about
>>      "Persistent State Logging" ?   What volumes does it know to synch
>>      upon reboot?  Meaning I know that the log keeps a record of the 
>>      first write and last close to a volume.  So then I am assuming 
>>      this means that if a volume was active/enabled but never written
>>      to (even though it was in a R/W state) that LSM knows that the 
>>      volume does not have to be resynched after a crash.  Am I 
>>      right in my understanding of how Persistent State Logging works?
    
    You are right when you say that it knows whether that volume should
    be resynchronized. But PSL is also used in other situations like BCL,
    and also to store various states of the DM disks, plexes and volumes.
    I am responding from home, without any documentation, so maybe
    Engineering could complete this answer.
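
    The part of PSL discussed here (record the first write and the last
    close, then resync only volumes that were written and not cleanly
    closed) can be pictured with the toy Python model below. The class,
    its methods and the volume names are hypothetical; the real log
    format and its other uses are not modelled.

```python
class DirtyLog:
    """Toy model of a persistent 'volume may be out of sync' log."""

    def __init__(self):
        self.dirty = set()          # volumes written since their last close

    def first_write(self, volume):
        self.dirty.add(volume)      # logged before the first write proceeds

    def last_close(self, volume):
        self.dirty.discard(volume)  # clean close: plexes are consistent

    def volumes_to_resync(self, mirrored_volumes):
        # After a crash, only volumes still marked dirty need a resync.
        return [v for v in mirrored_volumes if v in self.dirty]


log = DirtyLog()
log.first_write("vol-written")      # open and written at crash time
log.first_write("vol-closed")
log.last_close("vol-closed")        # written, but closed before the crash
# "vol-idle" was open read/write but never written, so it was never logged.
print(log.volumes_to_resync(["vol-written", "vol-closed", "vol-idle"]))
```

    In this model only "vol-written" is reported as needing a resync.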
    
    Regards, Manu.