[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference cookie::sls

Title:Storage Library System
Moderator:COOKIE::REUTER
Created:Sun Oct 13 1991
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2270
Total number of notes:7850

2269.0. "RDF 4.1a server inactivity timer timeout explanation ?" by PRSSOS::FONDI () Fri Jun 06 1997 07:24

  RDEV 4.1a server inactivity timer timeout causes ABS remote backup to fail 

ABS OpenVMS Clients Server Configuration, all nodes are Alpha nodes.

   server ABS V2.1			remote nodes: 	Clients ABS V2.1
   VMS V6.2-1H3						VMS V6.2-1H3
   MDMS 2.8A						MDMS 2.8A
   RDEV 4.1a that serves a TL810 for remote nodes	RDEV 4.1a
   DECNET/OSI V6.2 ECO7			                DECNET/OSI V6.2 (no ECO)
 							or DECNET IV it depends
                                                        the client nodes.


Probleme: This customer uses ABS to save disks of each remote nodes to
********* the TL810 connected to the ABS server named LAVX06.
          Often remote backups from any remote nodes (phase 4 or 5)abnormaly 
          stops with
              %BACKUP-E-INVRECTYP, invalid record type in save set
THREAD #1:  %BACKUP-E-INVRECSIZ, invalid record size in save set
THREAD #1:  %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN199720130855.
THREAD #1:  -SYSTEM-F-DEVOFFLINE, device is not in configuration or not 
            available

 In the following  LAVX06 is the server node
                   LAVX08 is one remote node (DECnet/OSI V6.2 no ECO) 

 Looking at different log this seems to come from RDEV 4.1a server that 
 deallocates the tape_device served because it has (inactivity timer expired)
 freeing the tapedrive unloading it ... see RDV:079 [06/03-23:15:30.79]  in 
 SYS$SYSDEVICE:[SLS$FILES.TTI_RDEV]RDSERVER_LAVX06.LOG i posted below

 The RD client tries to reconnect but see the served tape device is already 
 freed =>thus disconnecting thus aborting the remote backup with 
        -SYSTEM-F-DEVOFFLINE, device is not in configuration or not available
 (see :79  [06/03-23:17:23.77] in SLS$ROOT:[TTI_RDEV]RDCLIENT_LAVX08.LOG;25
  and lavx08::ABS_LOG:X08RMS.LOG the log of the save request showing the error)

Questions:
**********
	1)What to do to increase the RDEV 4.1a server inactivity timeout ?   
	  (actually how many seconds is it ?)
        2)Could it be a DECNET problem performance, or DECNET/OSI not           
          up-to-date?
          (cutomer will install DECNET/OSI 6.3 ECO6 on every node)
        3)In the mean time i have discovered that there is a RDF V4.1D
          (Kit SLSFT01A028.A, that corrects crash problem caused by  	  
          RDCDRIVER_A62 on deallocating pages , as LAVXC08 sometimes crash like 
          this, we will install this patch (CFS. 50555) on both server and  
          clients nodes. 

   Thank you
   Henri


below are:
1) log of the save request   lavx08::ABS_LOG:X08RMS.LOG
2) RDserver log     SYS$SYSDEVICE:[SLS$FILES.TTI_RDEV]RDSERVER_LAVX06.LOG
3) RDclient log     SLS$ROOT:[TTI_RDEV]RDCLIENT_LAVX08.LOG;25
4) opcom message seen on LVXC08 at time of error  LAVX08_operator.log




1)
                *******************************************
                FICHIER DE LOG : lavx08::ABS_LOG:X08RMS.LOG
                *******************************************

$ vl=f$verify(0) ! SYLOGIN.COM - debut
Executing ABS LOGIN.COM
Completed execution of ABS LOGIN.COM
"@abs_system:coordinator.com 383531EA-C5F2-11D0-87FE-4C4156583036" Executing, o
------------------------------------------------------
---------------------------------------------------------------
Starting New Request at  3-JUN-1997 20:13:07.89
   Name:   X08RMS
   UID:    383531EA-C5F2-11D0-87FE-4C4156583036

COORDINATOR:  Attempting to allocate volume set AGA515...
COORDINATOR:  Mounting volume set member: AGA515 RVN 1
COORDINATOR:     (Selected drive LAVX08$RDEVA0:)
%MOUNT-I-MOUNTED, AGA515 mounted on _LAVX08$RDEVA0:
COORDINATOR:  Skipping LAVX08$RDEVA0: to End of Tape...
THREAD #1:

Operation #1 starting at  3-JUN-1997 20:17:02.81

   Data Movement Type:    INCREMENTAL_SAVE
   Incremental Level:     Level 6 Operation

   Object Set:
      Object Type:        VMS FILES
      Include List:       DKA100:
      Exclude List:

   Archive Information:
      Storage Class Name: X08RMS
      Saveset Location:   AGA515
      Saveset Name:       3JUN199720130855.

   Execution Environment:
      Name:               X08RMS_ENV
      Number of retries:  0
      Retry Interval:     0 minute(s)


THREAD #1:  $
THREAD #1:  SET NOON
THREAD #1:  $ version = F$EXTRACT(0,4,f$getsyi("VERSION"))
THREAD #1:  $ IF (VERSION .eqs. "V6.1") THEN $DEFINE BACKUP ABS$SYSTEM:ALTERNAT
THREAD #1:  $ DEFINE SYS$COM
THREAD #1:  MAND SYS$INPUT:
THREAD #1:  $ BACKUP DKA100:[000000...]* -
THREAD #1:  _$  -
THREAD #1:  _$ /LIST=_MBA176:/FULL -
THREAD #1:  _$ /RECORD -
THREAD #1:  _$ /IGNORE=(INTERLOCK) -
THREAD #1:  _$ /NOCRC/NOVERIFY/CRC -
THREAD #1:  _$  -
THREAD #1:  _$ /SINCE=" 2-JUN-1997 20:15:14.93" -
THREAD #1:  _$  -
THREAD #1:  _$ /MODIF -
THREAD #1:  _$ LAVX08$R
THREAD #1:  DEVA0:3JUN199720130855./SAVE -
THREAD #1:  _$ /STOR=V2SLS/NOASSIST -
THREAD #1:  _$ /EXACT_ORDER
THREAD #1:  HDR13JUN199720130855.AGA51500010003000100 97154 97154 000000DECVMSB
THREAD #1:  HDR13JUN199720130855.AGA51500010003000100 97154 97154 000000DECVMSB
THREAD #1:  HDR2F0819208192                     M             00               
THREAD #1:  HDR2F0819208192                     M             00               
THREAD #1:  %BACKUP-E-INVRECTYP, invalid record type in save set
THREAD #1:  %BACKUP-E-INVRECSIZ, invalid record size in save set
THREAD #1:  %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN199720130855.
THREAD #1:  -SYSTEM-F-DEVOFFLINE, device is not in configuration or not availab
THREAD #1:  $
COORDINATOR:  Retiring volume set AGA515 (due to fatal error during save)
COORDINATOR:  Dismounting volume set member: AGA515 RVN 1
THREAD #1:  Facility ABS:  ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate vo
THREAD #1:     Line = 1563, File = RESD$:[SRC]COORD_ARCHIVE_MANAGEMENT.C;1
THREAD #1:  Facility ABS:  ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate vo
THREAD #1:     Line = 1307, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
THREAD #1:  Facility ABS:  ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate vo
THREAD #1:     Line = 7971, File = RESD$:[SRC]SLS_SERVICES.C;1
THREAD #1:  Facility ABS:  ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate vo
THREAD #1:     Line = 2340, File = RESD$:[SRC]SLS_SERVICES.C;1
THREAD #1:  Facility ABS:  ABS_PLATFORM_SPECIFIC_ERROR, Platform-specific error
THREAD #1:     %SYSTEM-F-UNREACHABLE, remote node is not currently reachable
THREAD #1:  Failed to deallocate volume set
COORDINATOR:  Skipping LAVX08$RDEVA0: to End of Tape...
COORDINATOR:  Dismounting volume set member: AGA515 RVN 1
COORDINATOR:  Failed to access Storage Class
COORDINATOR:  Facility ABS:  ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR:     %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN19972013
COORDINATOR:  Facility ABS:  ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR:     Line = 1309, File = RESD$:[SRC]COORD_THREAD_MANAGEMENT.C;1
COORDINATOR:  Facility ABS:  ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR:     Line = 801, File = RESD$:[SRC]COORD_ARCHIVE_MANAGEMENT.C;1
COORDINATOR:  Facility ABS:  ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR:     Line = 704, File = RESD$:[SRC]COORD_ARCHIVE_MANAGEMENT.C;1
COORDINATOR:  Facility ABS:  ABS_ARCHIVE_ACCESS_FAILED, Failed to access Storag
COORDINATOR:     Line = 469, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
COORDINATOR:  Facility ABS:  ABS_SYSSKIPFILE_FAILED, SYS$QIOW with IO$_SKIPFILE
COORDINATOR:     Line = 1717, File = RESD$:[SRC]SLS_SERVICES.C;1
COORDINATOR:  Facility ABS:  ABS_SLS_VSET_DEALLOC_FAILED, Failed to deallocate 
COORDINATOR:     Line = 2340, File = RESD$:[SRC]SLS_SERVICES.C;1
COORDINATOR:  Facility ABS:  ABS_PLATFORM_SPECIFIC_ERROR, Platform-specific err
COORDINATOR:     %SYSTEM-F-UNREACHABLE, remote node is not currently reachable
COORDINATOR:  Facility ABS:  ABS_SYSSKIPFILE_FAILED, SYS$QIOW with IO$_SKIPFILE
COORDINATOR:     Line = 4819, File = RESD$:[SRC]SLS_SERVICES.C;1
COORDINATOR:  Facility ABS:  ABS_SYSSKIPFILE_FAILED, SYS$QIOW with IO$_SKIPFILE
COORDINATOR:     Line = 507, File = RESD$:[SRC]TAPE_OPERATIONS.C;1
COORDINATOR:  Facility ABS:  ABS_PLATFORM_SPECIFIC_ERROR, Platform-specific err
COORDINATOR:     %SYSTEM-F-DEVOFFLINE, device is not in configuration or not av
COORDINATOR:  Final status is Invalid archive file system specified
Facility ABS:  ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
   Line = 1277, File = RESD$:[SRC]COORDINATOR.C;1
Facility ABS:  ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
   Line = 1772, File = RESD$:[SRC]COORD_THREAD_MANAGEMENT.C;1
Facility ABS:  ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
   Line = 916, File = RESD$:[SRC]COORD_ARCHIVE_MANAGEMENT.C;1
Facility ABS:  ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
   Line = 811, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
Facility ABS:  ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
   Line = 1667, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
Facility ABS:  ABS_INVLD_FILE_SYSTEM, Invalid archive file system specified
   Line = 1895, File = RESD$:[SRC]ARCHIVE_FILE_SYSTEM.C;1
Facility ABS:  ABS_PLATFORM_SPECIFIC_ERROR, Platform-specific error in diag blo
   %LIB-E-ACTIMAGE, error activating image !AS
%SYSTEM-F-OPINCOMPL, operation is incomplete
  ABS          job terminated at  3-JUN-1997 23:19:12.23
<CR><LF>  Accounting information:
  Buffered I/O count:             957         Peak working set size:  12960
  Direct I/O count:               345         Peak page file size:    67680
  Page faults:                   1141         Mounted volumes:            2
  Charged CPU time:           0 00:00:03.29   Elapsed time:     0 03:06:16.09


2)
type SYS$SYSDEVICE:[SLS$FILES.TTI_RDEV]RDSERVER_LAVX06.LOG

                RDV:079 [06/03-20:11:23.19]  Client connecting
                RDV:079 [06/03-20:11:23.38] ( RCN) client DECnet addr   (get_cli
ent_addr() 0)
                RDV:079 [06/03-20:11:23.56]  ALLOCATE rqst        (from LAVX08::
ABS)
 lavx06$mkc500  RDV:079 [06/03-20:11:23.69]  Allocated
                RDV:080 [06/03-21:10:47.01]  Client connecting
                RDV:080 [06/03-21:10:47.26] ( RCN) client DECnet addr   (get_cli
ent_addr() 0)
                RDV:080 [06/03-21:10:47.40]  ALLOCATE rqst        (from LAVX09::
ABS)
....
 lavx06$mkc500  RDV:079 [06/03-23:15:30.79]  Deallocating         (inactivity ti
mer expired)
 lavx06$mkc500  RDV:079 [06/03-23:15:30.88] ( RCN) LL_Done              (chl 320
, _NET6623:)
 lavx06$mkc500  RDV:079 [06/03-23:15:30.88] ( RCN) Freeing              (chl 320
, _NET6623:)
 lavx06$mkc500  RDV:079 [06/03-23:15:30.89]  Unloading tape
 lavx06$mkc500  RDV:079 [06/03-23:15:30.89] ( RCN) Freeing              (chl 320
, _NET6623:)
 lavx06$mkc500  RDV:079 [06/03-23:15:31.04]  Deallocated


3)
LAVX08>TYPE SLS$ROOT:[TTI_RDEV]RDCLIENT_LAVX08.LOG;25

$ vl=f$verify(0) ! SYLOGIN.COM - debut
RDEV - VMS Remote Device Facility (Version V4.1a) - --RDclient-- Procedure
Copyright (c) 1990-1996 by Touch Technologies, Inc.

            :    [06/03-09:55:45.00] RDCLIENT startup     (V4.1)
            :    [06/03-09:55:45.06] network connect ready (_MBA52:)
            :    [06/03-09:55:45.25] OPCOM relay ready    (Pty=_FTA1:, Mbx=_MBA5
4:)
            :    [06/03-09:55:45.25] Compression DISABLED for this client ()
            :    [06/03-09:55:45.25] Startup complete     ()
LAVX06:: RDV:    [06/03-20:13:15.55] Connecting to server
            :79  [06/03-23:17:23.77] 23:17:23.77 80F60D80 NRP done w/error statu
s 20E4, iocount 59, flags 00050035
            :79  [06/03-23:17:23.78] 23:17:23.77 80F60D80 NRP error status 20E4
flags 00058035
            :79  [06/03-23:17:23.78] 23:17:23.77 Attempting reconnect.  Old: 80E
8AEC0 netucb  80E8AF84 netwcb
LAVX06:: RDV:79  [06/03-23:17:23.78] DECnet link error    (reconnect requested)
            :79  [06/03-23:17:23.80] 23:17:23.77 SEND_RCNCT_CMD ok
LAVX06:: RDV:79  [06/03-23:17:33.57] Attempting reconnect (attempt 1)
LAVX06:: RDV:79  [06/03-23:17:33.62] ...sending rcnct data
LAVX06:: RDV:79  [06/03-23:17:33.78] Served device already freed; server rejecte
d reconnect (%SYSTEM-F-ABORT, abort)
LAVX06:: RDV:79  [06/03-23:17:33.78] Client done with link to server
LAVX06:: RDV:79  [06/03-23:17:33.80] Disconnecting from server
LAVX06:: RDV:79  [06/03-23:17:33.85] Failed resp to pkt from server (%NONAME-W-N
OMSG, Message number 7FF9C9E0)
            :    [06/03-23:17:33.86] Disconnect request, bad sl ()


4)
%%%%%%%%%%%  OPCOM   3-JUN-1997 22:55:32.13  %%%%%%%%%%%
Logfile time stamp

%%%%%%%%%%%  OPCOM   3-JUN-1997 23:17:33.79  %%%%%%%%%%%
Message from user ABS on LAVX08
   ABS FATAL ERROR:  %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN19972
0130855.;
   X08RMS, thread 1   X08RMS, thread 1
   Include spec = (VMS FILES) DKA100:
   Last VMS FILES processed:  [BASES.A]DYNAMIQUE_I_1.DBF

%%%%%%%%%%%  OPCOM   3-JUN-1997 23:18:22.31  %%%%%%%%%%%
Message from user ABS on LAVX08
   ABS FATAL ERROR:  Failed to deallocate volume set
   X08RMS, thread 1   X08RMS, thread 1
   Include spec = (VMS FILES) DKA100:
   Last VMS FILES processed:  [BASES.A]DYNAMIQUE_I_1.DBF

%%%%%%%%%%%  OPCOM   3-JUN-1997 23:19:11.83  %%%%%%%%%%%
Message from user ABS on LAVX08
   ABS JOB SUMMARY:     X08RMS, thread 1
   %BACKUP-F-WRITEERR, error writing LAVX08$RDEVA0:[]3JUN199720130855.;
   Include spec = (VMS FILES) DKA100:
   Last VMS FILES processed:  [BASES.A]DYNAMIQUE_I_1.DBF

%%%%%%%%%%%  OPCOM   3-JUN-1997 23:20:03.49  %%%%%%%%%%%
Message from user ABS on LAVX08
ABS_CLEANUP:  Cleanup performed for aborted job
    Request Name:   X08RMS
    Scheduler Job:  11
    Deleted PID:    000000BD


T.RTitleUserPersonal
Name
DateLines