
Conference clt::cobol

Title:VAX/DEC COBOL
Notice:Kit,doc,performance talk info-->DIR/KEY=KIT or DOC or PERF_TALK
Moderator:PACKED::BRAFFITT
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:3250
Total number of notes:13077

3210.0. "DEC COBOL cannot read all the records....." by CSC32::E_VAETH (Suffering from temporary brain cramp, stay tuned) Tue Mar 04 1997 15:22

@(#)DEC COBOL Compiler Driver V2.4-863
DEC Cobol RTL   V2.4-109        1+       5-aug-1996

A customer ftp'd a file from an IBM system to a DUNIX system, then used the dd
command to convert it from EBCDIC to ASCII:
  
    dd if=ebcdic.file of=ascii.file block=32400 conv=ebcdic

Note:
"the block or unblock option cannot be combined with ascii, ebcdic, or
 ibm.  Invalid combinations silently ignore all but the last mutually
 exclusive keyword."  -- don't think that should make a difference....
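
For reference, POSIX dd names each conversion after its target: conv=ascii
maps EBCDIC to ASCII, and conv=ebcdic maps ASCII to EBCDIC. The command above
is quoted as reported. A minimal round-trip sketch follows; the sample text
HELLO is made up for illustration.

```shell
# ASCII -> EBCDIC: the letter H (0x48 in ASCII) becomes 0xC8 in EBCDIC.
ebcdic_hex=$(printf 'H' | dd conv=ebcdic 2>/dev/null | od -An -tx1 | tr -d ' \n')
echo "EBCDIC byte for H: $ebcdic_hex"

# EBCDIC -> ASCII: conv=ascii undoes the conversion above.
roundtrip=$(printf 'HELLO' | dd conv=ebcdic 2>/dev/null | dd conv=ascii 2>/dev/null)
echo "round trip: $roundtrip"
```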

The file is blocked at 48 records, each 675 bytes.  He is finding that DEC COBOL
is not reading all the records in each block and that some are skipped.  The
original file is huge; however, he was able to reproduce the problem using a
smaller file (49 records).  DEC COBOL will read 48 records and think EOF has
been hit.  

C and Micro Focus COBOL will read all 49 records.  I have placed the programs
and files in a tar file so that the data file will not be modified in any way
when transferring to another system.  It can be pulled from node licris
(16.63.0.10) via anonymous ftp.

Thanks,

Elin

T.R  Title  User  Personal Name  Date  Lines
3210.1. "strange file => strange results" by WENDYL::BLATT Wed Mar 05 1997 18:43, 49 lines
I reproduced the problem and I found out a few things.

First of all, the input file does not match the file description.
The program describes the file as LINE SEQUENTIAL.  The file
has no linefeed record delimiters.

However, that may not be so terrible by itself. The file gets interpreted
as one large record that is read in chunks.  The rules for LINE SEQUENTIAL
say that if a record in the file is too big for the record buffer (based
on where the line delimiter is found, or not found), then as many characters
as will fit in the buffer are returned (675 here).  The next read starts
with the next character in the file. (This is unlike fixed-length files,
which truncate too-big records.)

In any case, we read all the "records" fine as pieces of one too-big record
until we come to the last one, where there is still no record terminator
at the end of the file. Then the RTL gets a little confused and the last
record is not returned.
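
The chunking rule and the end-of-file edge case can be imitated with standard
tools. This is only an analogy, not the RTL's logic: fold stands in for the
675-character record buffer, and the 49-record file is a made-up stand-in for
bill.dat. Counting only newline-terminated chunks gives one fewer than
counting the unterminated tail as well.

```shell
f=$(mktemp)
# Build 49 space-padded 675-byte records with no linefeeds (hypothetical data).
i=1
while [ "$i" -le 49 ]; do
  printf '%-675s' "REC$i"
  i=$((i + 1))
done > "$f"

# fold cuts the single long line into 675-byte chunks; the last chunk
# gets no newline because the input never supplies one.
terminated=$(($(fold -b -w 675 "$f" | wc -l)))        # newline-terminated only
all_chunks=$(fold -b -w 675 "$f" | grep -c '')        # includes the tail
echo "newline-terminated chunks: $terminated"
echo "chunks including the unterminated tail: $all_chunks"
rm -f "$f"
```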

So, maybe this is a bug. But the file is "bad", so it's hard to
say what should or should not happen with no linefeeds in it.

The C program is merely reading 675-byte chunks.  It does not depend
on record delimiters, so it can't fail to work.  I have no idea how or
why Micro Focus COBOL reads it.
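
Reading by byte count alone, as the C program does, can be sketched with dd.
The 49-record file below is a made-up stand-in for bill.dat; bs=675 with
skip=48 fetches the 49th record directly by offset, with no delimiters
involved.

```shell
f=$(mktemp)
# 49 space-padded 675-byte records, no linefeeds (hypothetical data).
i=1
while [ "$i" -le 49 ]; do
  printf '%-675s' "REC$i"
  i=$((i + 1))
done > "$f"

size=$(($(wc -c < "$f")))
records=$((size / 675))
leftover=$((size % 675))
echo "$size bytes = $records records of 675 bytes, $leftover left over"

# Skip 48 full records and read one: this is record 49.
last=$(dd if="$f" bs=675 skip=48 count=1 2>/dev/null | cut -c1-5)
echo "record 49 starts with: $last"
rm -f "$f"
```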

I would say we are in the realm of "undefined results".
But I think I know where the "fix" is in any case and I think I
can get it to do what is wanted.

BTW, emacs reads the file as one large record also. A CONTROL-N got
me to the end of the file.  VI gets:

   "bill.dat" Line(s) longer than 2048 characters were split. [Incomplete last line] \
   17 lines, 33091 characters

This proves the file is not a valid "LINE SEQUENTIAL" file which is
supposed to be a file suitable for (created by?) a system editor.
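
The numbers vi reports are consistent with a 49 x 675 byte file containing no
linefeeds. The check below assumes vi adds one newline at each 2048-character
split and none after the incomplete last line; that is an inference from the
message, not something taken from vi documentation.

```shell
size=$((49 * 675))                  # bytes of record data
lines=$(( (size + 2047) / 2048 ))   # pieces of at most 2048 characters
chars=$((size + lines - 1))         # one inserted newline per split
echo "$lines lines, $chars characters"
```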

I tried making the program describe 675-byte fixed length records,
but that didn't work either. I didn't investigate.

I wonder if there's a way for dd to produce a Unix stream filetype.
That would certainly be the best approach. Or look at other dd options
to produce something more usable.



3210.2. "thanks.." by CSC32::E_VAETH (Suffering from temporary brain cramp, stay tuned) Wed Mar 05 1997 20:48, 11 lines
It would be great to find some sort of workaround for this customer.  They were
originally reading it as a regular sequential file. The results were much worse.
Specifying LINE SEQUENTIAL at least kept the records right.  

I tried some different combinations of the dd command and none worked.  Do I
tell him he really needs to get a better file?? 

Thanks,


-elin
3210.3. "Good news!" by WENDYL::BLATT Thu Mar 06 1997 00:09, 32 lines
I found a way for dd to make it a real stream file.

    dd if=bill.dat of=bill.stlf.dat cbs=675 conv=unblock

This may have to be a separate pass over the file after
the initial ebcdic/ascii conversion, but it inserted
newline characters correctly.  It also trimmed trailing
spaces so it makes a more compact file.
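
A quick check of this on a made-up 49-record file (a stand-in for bill.dat,
built from space-padded records with no linefeeds):

```shell
src=$(mktemp); dst=$(mktemp)
# 49 space-padded 675-byte records, no linefeeds (hypothetical data).
i=1
while [ "$i" -le 49 ]; do
  printf '%-675s' "REC$i"
  i=$((i + 1))
done > "$src"

# conv=unblock treats each cbs-sized run as one record, drops its trailing
# spaces, and writes a newline after it.
dd if="$src" of="$dst" cbs=675 conv=unblock 2>/dev/null

lines=$(($(wc -l < "$dst")))
first=$(head -n 1 "$dst")
echo "$lines lines; first line is '$first'"
rm -f "$src" "$dst"
```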

I suspect our problem with the fixed-length file is the
odd record length of 675.  I tried to get dd to pad it out
(conv=sync bs=676), and it *DID* come out the
right size (33075 + 49), but the 49 NULs were
all at the END of the file.  duh.
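
The arithmetic explains the result: conv=sync pads only a short input block,
and with bs=676 the only short block is the last one. 33075 = 48 x 676 + 627,
so dd appends 676 - 627 = 49 NULs to the final block rather than one after
each record. A sketch on a made-up 49-record file:

```shell
src=$(mktemp); dst=$(mktemp)
# 49 space-padded 675-byte records, no linefeeds (hypothetical data).
i=1
while [ "$i" -le 49 ]; do
  printf '%-675s' "REC$i"
  i=$((i + 1))
done > "$src"

dd if="$src" of="$dst" bs=676 conv=sync 2>/dev/null

size=$(($(wc -c < "$dst")))
# Bytes remaining in the last 49 positions once NULs are deleted.
tail_non_nul=$(($(tail -c 49 "$dst" | tr -d '\000' | wc -c)))
# The first 33075 bytes should be the input, unchanged.
if head -c 33075 "$dst" | cmp -s - "$src"; then prefix=same; else prefix=different; fi
echo "$size bytes; non-NUL bytes in the last 49: $tail_non_nul; prefix $prefix"
rm -f "$src" "$dst"
```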

btw, the BLOCK CONTAINS clause has no effect on a line sequential
file.  

That reminds me, .0 said that on the huge file he had trouble
reading *some* of the records.  My analysis in .1 was
based on the "last" record in the file, but I am wondering
if it is going to be once every 49 records?  In your
dd example, you showed block=32400 (675x48). My dd man page
does not show that format for block.  I would like to
know for sure if that is pertinent.  I don't know what the
significance is of the 49th record I debugged, if it was
not that it was the last 'record' in the file.

I still would like to address the behavior of the 
"strange" file.

  
3210.4. "more info" by CSC32::E_VAETH (Suffering from temporary brain cramp, stay tuned) Thu Mar 06 1997 14:15, 4 lines
The customer is using the dd command to go from tape to disk.  He says he has to
do it that way since COBOL won't open an ebcdic file on tape???? I don't have a
tape attached to my system, so I cannot prove/disprove that, can you??

3210.5. "why not?" by WENDYL::BLATT Thu Mar 06 1997 15:08, 13 lines
> since COBOL won't open an ebcdic file on tape??

I wonder what he meant by "won't open"?  Did he get an open error or
was the resultant file not what was expected? 

I'm not very familiar with tape processing, but I am pretty sure this 
is doable.  CODE-SET IS EBCDIC comes to mind.  Now I see why the program
had a BLOCK clause and a LABEL clause.  Were the tape labels really
ANSI?   Is the odd-length record issue an issue on tape too?

In any case, what I suggested in .3 is an additional step after
the first dd is run.

3210.6. "as the tape turns.." by CSC32::E_VAETH (Suffering from temporary brain cramp, stay tuned) Thu Mar 06 1997 16:50, 9 lines
The customer says the workaround you found doesn't work for him.  It works great
for me based on what he sent me.  Investigation shows that they may not entirely
understand what to do.  He will need to do some more work on his end.

I managed to scare up a travelling tape drive.... will test the tape thing. I
can hardly believe that COBOL doesn't handle tapes.  He states that when he
tries to open the file on tape, he gets a cobrtl error.  He could not remember
what the exact error was.  

3210.7. by WENDYL::BLATT Thu Mar 06 1997 16:57, 2 lines
I'm thinking IBM labels vs. ANSI labels is a possibility.

3210.8. "fwiw" by WENDYL::BLATT Thu Mar 06 1997 17:41, 8 lines
I confirmed that the problem reading the fixed-length file was
due to it being an odd-length record.  i.e. I removed the padding
in the rtl and removed "LINE" (sequential) from the program and read
the 49 records cleanly.

I also confirmed that the odd-length padding does not occur for
tape files.
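
To see why one pad byte per odd-length record matters, compare record offsets
under the two layouts. This is arithmetic only. It assumes the padded layout
uses a 676-byte stride for 675-byte records, which is an inference from the
padding described above and not a statement of the RTL's actual on-disk
format.

```shell
reclen=675
stride=$((reclen + reclen % 2))     # record plus one pad byte when length is odd (assumed)
k=49
actual=$(( (k - 1) * reclen ))      # where record 49 starts in the unpadded file
expected=$(( (k - 1) * stride ))    # where a padded-layout reader would look for it
drift=$((expected - actual))
echo "record $k: actual offset $actual, expected offset $expected, drift $drift bytes"
```

Under that assumption the reader drifts one byte further per record, so by
record 49 it is looking 48 bytes past the real start.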

3210.9. "change in the rtl?" by CSC32::E_VAETH (Suffering from temporary brain cramp, stay tuned) Thu Mar 06 1997 22:48, 8 lines
    >>due to it being an odd-length record.  i.e. I removed the padding
    >>in the rtl and removed "LINE" (sequential) from the program and read
    
    Does this mean you are making a change to the cobrtl??
    
    Thanks,
    
    Elin
3210.10. "no" by PACKED::BLATT Fri Mar 07 1997 01:00, 6 lines
>    Does this mean you are making a change to the cobrtl??
 
whoops... nope. didn't mean to mislead you.  I just wanted to
see if that was the problem.  That was the easiest thing
for me to try.  

3210.11. "On the wishlist" by PACKED::BRAFFITT Fri Mar 07 1997 08:59, 11 lines
>    >>due to it being an odd-length record.  i.e. I removed the padding
>    >>in the rtl and removed "LINE" (sequential) from the program and read
>    
>    Does this mean you are making a change to the cobrtl??
    
    We still have on the wishlist adding an option for A/UNIX to suppress
    this OpenVMS-compatible padding byte for odd-length records (see WISH
    keyword on CLT::DEC_COBOL_IFT note 514 and this note).
    
    Continue to send mail to TEAMLK::NAZEER as you encounter customer
    situations which could make use of such an option on A/UNIX.
3210.12. by WENDYL::BLATT Mon Mar 10 1997 16:57, 12 lines
Notwithstanding whatever real problems are going on behind 
the scenes here (tape, dd...) I have changed the cobrtl to not
skip certain records in an alleged LINE SEQUENTIAL file (without LF's).

It works on the file supplied -- bill.dat

It has to go thru our testbed before I am fully confident in this change.

Interestingly, I had a real hard time with this file on VMS.
RMS complained bitterly with every attempt I made to get it
over to an OpenVMS disk.  I wanted to see if the OpenVMS code had
any related problems, but you just can't get here from there.