
Conference vmszoo::rms_openvms

Title:RMS asks, 'R U Journaled?'
Moderator:STAR::TSPEERUVEL
Created:Tue Mar 11 1986
Last Modified:Wed Jun 04 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:3031
Total number of notes:12302

3025.0. "Error while trying to recover" by BACHUS::LEEN (Jaak Leen, TP/IM Support Belgium 856-8738) Mon Apr 28 1997 12:03

          <<< MOVIES::DISK$SYSDATA:[NOTES$LIBRARY]DECDTM-VMS.NOTE;1 >>>
                                -< DECDTM-VMS >-
================================================================================
Note 352.0               Error while trying to recover.                  1 reply
BACHUS::LEEN "Jaak Leen, TP/IM Support Belgium 856-8738"  49 lines  25-APR-1997 09:26
--------------------------------------------------------------------------------
We had a strange problem at a customer's site a few days ago.

They have an application that uses recovery unit journaling, and
several RMS files participate in the transaction.

Now, whenever they open one of the files involved, they always get
the same error messages from OPCOM.

%%%%%%%%%%%  OPCOM  17-APR-1997 10:55:50.05  %%%%%%%%%%%
Message from user FIELD on AGBVR2
%RMSREC-F-OPRSERVER, error occurred during detached recovery unit recovery;
process ID (PID) 00011463

%%%%%%%%%%%  OPCOM  17-APR-1997 10:55:50.06  %%%%%%%%%%%
Message from user FIELD on AGBVR2
-RMSREC-F-FILE, file DISK$USER2:[IMPCON.MTRD3]WIPS.IMP;3

%%%%%%%%%%%  OPCOM  17-APR-1997 10:55:50.06  %%%%%%%%%%%
Message from user FIELD on AGBVR2
-RMSREC-F-INVDDTM, error occurred processing prepare record

%%%%%%%%%%%  OPCOM  17-APR-1997 10:55:50.07  %%%%%%%%%%%
Message from user FIELD on AGBVR2
-SYSTEM-F-NOSUCHPART, specified participant not found

%%%%%%%%%%%  OPCOM  17-APR-1997 10:55:50.07  %%%%%%%%%%%
Message from user FIELD on AGBVR2
-SYSTEM-W-DEVOFFLINE, device is not in configuration or not available
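
For anyone reading the chain above: each OPCOM entry carries one line in the
usual OpenVMS message layout, %FACILITY-L-IDENT, text. A leading '%' marks the
primary message and a leading '-' marks a message linked to it. What follows is
only a minimal Python sketch for splitting such lines into their parts. The
name parse_message and the regular expression are illustrative assumptions,
not part of any OpenVMS tool.

```python
import re

# Severity letters used in OpenVMS status messages.
SEVERITY = {
    "S": "success",
    "I": "informational",
    "W": "warning",
    "E": "error",
    "F": "fatal",
}

# '%' starts a primary message, '-' starts a linked (secondary) message.
MESSAGE_RE = re.compile(
    r"^[%-](?P<facility>[A-Z0-9$_]+)-(?P<severity>[SIWEF])-"
    r"(?P<ident>[A-Z0-9$_]+), (?P<text>.*)$"
)


def parse_message(line):
    """Return the parts of one message line, or None if it is not one."""
    stripped = line.strip()
    match = MESSAGE_RE.match(stripped)
    if match is None:
        return None
    parts = match.groupdict()
    parts["severity"] = SEVERITY[parts["severity"]]
    parts["primary"] = stripped.startswith("%")
    return parts


if __name__ == "__main__":
    chain = [
        "%RMSREC-F-OPRSERVER, error occurred during detached recovery unit recovery;",
        "-RMSREC-F-FILE, file DISK$USER2:[IMPCON.MTRD3]WIPS.IMP;3",
        "-RMSREC-F-INVDDTM, error occurred processing prepare record",
        "-SYSTEM-F-NOSUCHPART, specified participant not found",
        "-SYSTEM-W-DEVOFFLINE, device is not in configuration or not available",
    ]
    for entry in chain:
        print(parse_message(entry))
```

Read this way, the chain runs from the RMSREC primary message down to the two
SYSTEM statuses, and only the last one (DEVOFFLINE) is a warning rather than
a fatal error.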

I couldn't find anything that could explain the DEVOFFLINE. Via LMCP I found
an active transaction for that day that was COMMITTED. Because the problems
started around that time, I advised them to delete the journal file.


Record number 3 (00000003), 64 (0040) bytes
Transaction state (2):  COMMITTED
Transaction ID: B2ABC8AF-B6D7-11D0-8CCB-414742565232 (17-APR-1997 04:04:32.55)
DECdtm Services Log Format V1.1
Type ( 3): LOCAL RM         Log ID: 037100C0-0029-0003-7C23-000000000000
Name (22): "RMS$USER2.......*.D..."
     (0000 0144162A 00000000 00000032 52455355 24534D52)
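
The hexadecimal line under the Name field holds the same 22 bytes as the
quoted string. Dumps in this style print the lowest address on the right, so
the groups are read from right to left and the bytes inside each group are
reversed. The Python below is a minimal sketch of that decoding; the helper
names are made up for illustration, and unprintable bytes are shown as dots,
as in the dump itself.

```python
def decode_dump_line(hex_groups):
    """Turn a right-to-left hex dump line into bytes in memory order."""
    out = bytearray()
    # The rightmost group sits at the lowest address, so walk backwards.
    for group in reversed(hex_groups.split()):
        # Within a group the least significant byte is printed last.
        out += bytes.fromhex(group)[::-1]
    return bytes(out)


def printable(data):
    """Show printable ASCII as is and everything else as a dot."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)


raw = decode_dump_line("0000 0144162A 00000000 00000032 52455355 24534D52")
name = printable(raw)

if __name__ == "__main__":
    print(len(raw), name)
```

The first nine bytes decode to RMS$USER2, which matches the start of the Name
string shown in the record.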

The disk with label 'RMS$USER2' was online.

We saved the journal file and the transaction log. Any idea where to look next?

Thanks in advance,

Jaak
================================================================================
Note 352.1               Error while trying to recover.                   1 of 1
MOVIES::POTTER "http://www.vmse.edo.dec.com/~potter/"  7 lines  25-APR-1997 10:39
--------------------------------------------------------------------------------
Jaak,

I think this is more of an RMS-Journaling issue than DECdtm - have you 
asked the RMS folk?

regards,
//alan
T.R  Title  User  Personal Name  Date  Lines
3025.1  STAR::TSPEER  Tue Apr 29 1997 13:15  26 lines
Jaak,

RMS is failing detached recovery because one of its calls to DECdtm 
is failing -- RMS simply passes on this error.  I strongly suspect
that the DECdtm call is $GETDTI, which RMS uses during recovery to
request that DECdtm return the global status (committed or aborted) of
a transaction in which RMS was a resource manager.  NOSUCHPART from
DECdtm means that for some reason DECdtm fails to recognize RMS as
having been involved in the transaction.  While there could be
numerous theoretical reasons for this failure, including a bad $GETDTI
call from RMS, the fact that this appears to be specific to certain
files involved in a specific transaction makes me wonder whether
DECdtm is having trouble accessing its log information for the data it
must return in the $GETDTI call.  Could DEVOFFLINE be referring to the
device where the DECdtm transaction log was located?  Does *any*
attempt to access the affected file(s) -- e.g. a simple DCL OPEN --
cause the same error?  Is only one file involved, or do all files
involved in the transaction exhibit the same behavior?

Before bouncing this entirely back on the DECdtm people, you might
check the file SYS$MANAGER:RMSREC$SERVER_ERROR.LOG, which is
created/appended to whenever detached RMS recovery encounters a fatal
recovery error.  There is a (slim) chance it may contain additional
information.

Tom Speer
3025.2  MOVIES::POTTER "http://www.vmse.edo.dec.com/~potter/"  Wed May 07 1997 08:20  43 lines
>%%%%%%%%%%%  OPCOM  17-APR-1997 10:55:50.05  %%%%%%%%%%%
>Message from user FIELD on AGBVR2
>%RMSREC-F-OPRSERVER, error occurred during detached recovery unit recovery;
>process ID (PID) 00011463
>
>%%%%%%%%%%%  OPCOM  17-APR-1997 10:55:50.06  %%%%%%%%%%%
>Message from user FIELD on AGBVR2
>-RMSREC-F-FILE, file DISK$USER2:[IMPCON.MTRD3]WIPS.IMP;3
>
>%%%%%%%%%%%  OPCOM  17-APR-1997 10:55:50.06  %%%%%%%%%%%
>Message from user FIELD on AGBVR2
>-RMSREC-F-INVDDTM, error occurred processing prepare record
>
>%%%%%%%%%%%  OPCOM  17-APR-1997 10:55:50.07  %%%%%%%%%%%
>Message from user FIELD on AGBVR2
>-SYSTEM-F-NOSUCHPART, specified participant not found
>
>%%%%%%%%%%%  OPCOM  17-APR-1997 10:55:50.07  %%%%%%%%%%%
>Message from user FIELD on AGBVR2
>-SYSTEM-W-DEVOFFLINE, device is not in configuration or not available

Okay, let's see what we know here.  RMS-Journaling is trying to recover the
log, and is getting SS$_NOSUCHPART from the transaction manager.  That means
that DECdtm has given up knowledge of RMS-Journaling for this file being
involved in the transaction, either because it never was involved, because it
received a response from RMS-J saying that RMS-J was not going to ask about
the outcome of the transaction again, or because of a DECdtm bug.

No such DECdtm bug has been reported or observed previously.
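
To make the first two possibilities concrete, here is a toy Python model of a
transaction manager that tracks which participants it still knows about. It is
only a sketch under loud assumptions: every class and method name is
hypothetical, none of this is DECdtm or RMS code, and the real services are
not implemented this way as far as this thread shows.

```python
class NoSuchParticipant(Exception):
    """Toy stand-in for the SS$_NOSUCHPART status."""


class ToyTransactionManager:
    """Hypothetical model: remembers outcomes and known participants."""

    def __init__(self):
        # transaction id -> {"outcome": str, "participants": set of names}
        self._log = {}

    def join(self, tid, participant):
        entry = self._log.setdefault(
            tid, {"outcome": "ACTIVE", "participants": set()}
        )
        entry["participants"].add(participant)

    def commit(self, tid):
        self._log[tid]["outcome"] = "COMMITTED"

    def forget(self, tid, participant):
        # The participant says it will never ask about this outcome again.
        self._log[tid]["participants"].discard(participant)

    def outcome(self, tid, participant):
        entry = self._log.get(tid)
        if entry is None or participant not in entry["participants"]:
            raise NoSuchParticipant(f"{participant} not known for {tid}")
        return entry["outcome"]


if __name__ == "__main__":
    tm = ToyTransactionManager()
    tm.join("T1", "RM-A")
    tm.commit("T1")
    print(tm.outcome("T1", "RM-A"))   # participant still known

    tm.forget("T1", "RM-A")
    try:
        tm.outcome("T1", "RM-A")      # asked again after forgetting
    except NoSuchParticipant as exc:
        print("refused:", exc)
```

In this model a participant that never joined and one that asked again after
saying it was finished both get the same refusal, which is why the status
alone does not say which of the two cases applies.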

Can you explain to me what the last line of the sentence quoted below means?
Have you advised the customer to change any records in the DECdtm log, or
alter a DECdtm journal file in any way?

>I couldn't find anything that could explain the DEVOFFLINE. Via LMCP I found
>an active transaction for that day that was COMMITTED. Because the problems
>started around that time, I advised them to delete the journal file.


As for DEVOFFLINE, I have no idea where that is coming from...

regards,
//alan
3025.3  "A small window ..."  BACHUS::LEEN "Jaak Leen, TP/IM Support Belgium 856-8738"  Thu May 08 1997 15:53  12 lines
    Thanks for the explanation. No, I didn't advise the customer to
    change the DECdtm logs, but because I suspected the one transaction
    flagged as committed, I told the customer they could remove the RMS
    journal file so that RMS would not attempt the recovery.
    
    Is it possible that there's a small window between RMS telling DECdtm
    that it is no longer participating in the transaction and RMS actually
    deleting the file?
    
    Regards,
    
    Jaak
3025.4  STAR::TSPEER  Thu May 08 1997 17:09  18 lines
> 
>     Thanks for the explanation. No, I didn't advise the customer to
>     change the DECdtm logs, but because I suspected the one transaction
>     flagged as committed, I told the customer they could remove the RMS
>     journal file so that RMS would not attempt the recovery.

I hope you were absolutely sure before removing the RMS journal that
there were no uncommitted transactions in that journal; otherwise some
transactional updates may be lost.

>     Is it possible that there's a small window between RMS telling DECdtm
>     that it is no longer participating in the transaction and RMS actually
>     deleting the file?
    
I know of no windows in RMS journaling's use of DECdtm services which
can lead to the problem you seem to be describing.

Tom