| >I suggested that a DCL loop would be the answer, but how do you keep
>DCL from comparing a file to itself?
Well, here are a few suggestions from my side. My idea would be to use two
nested loops, each calling F$SEARCH("*.*;*",id) with its own stream id. (You
might also use "*.TXT;*", or any other filespec, as long as you use the SAME
one in both streams.) In the outer loop, with stream id = 1, count the number
of files processed so far. In the inner loop, with stream id = 2, skip files
until you reach the same file as in stream 1, then process all remaining
files. This way you compare files like: 1-2, 1-3, 1-4, 1-5, 1-n, 2-3, 2-4,
2-n, etc. In pseudo, off the top of my head code that would be something like
below. Not tested, no guarantee, use at your own risk, etc...:-)
Success!
_Wim_
$ cnt1 = 0
$ fspec1=f$sea("",1)
$!
$loop1:
$ fspec1=f$sea("*.*;*",1)
$ if fspec1 .eqs. "" then goto done_all
$ cnt1 = cnt1 +1
$ cnt2 = 0
$ fspec2=f$sea("",2)
$!
$skip:
$ if cnt1 .gt. cnt2
$ then
$ fspec2=f$sea("*.*;*",2)
$ if fspec2 .eqs. "" then goto done_all   ! cannot happen?!
$ cnt2 = cnt2 + 1
$ goto skip
$ endif
$!
$! here fspec1 and fspec2 should be equal!! (At same file_in_progress)
$!
$loop2:
$!find next file in stream id 2
$ fspec2=f$sea("*.*;*",2)
$ if fspec2 .eqs. "" then goto done_loop2
$!
$! Do your compare, for instance:
$ diff/out=nl: 'fspec1 'fspec2
$! use $status to find out if there was a match, if so report this either
$! on screen or in a log file, (but put this log file in another directory!!)
$!
$ goto loop2
$done_loop2:
$!
$ goto loop1
$done_all:
$!
$! Do all remaining stuff here....
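The "use $status" step above could be sketched roughly as follows. This is an untested sketch and rests on one assumption: as far as I recall from HELP DIFFERENCES, the command exits with SUCCESS severity when the files are identical and INFORMATIONAL severity when they differ, which show up in $SEVERITY as 1 and 3. Please check that on your own system before relying on it. The symbols sev and stat are names invented for this illustration.

```dcl
$! Sketch only: classify one DIFFERENCES result by its severity.
$! fspec1 and fspec2 are assumed to hold the file specs returned by F$SEARCH.
$ set noon                         ! keep going if DIFFERENCES reports a warning or error
$ differences/output=nl: 'fspec1' 'fspec2'
$ sev  = $severity                 ! save both at once, later commands may overwrite them
$ stat = $status
$ if sev .eq. 1 then write sys$output "Identical: ", fspec1, " and ", fspec2
$ if sev .eq. 3 then write sys$output "Different: ", fspec1, " and ", fspec2
$ if sev .ne. 1 .and. sev .ne. 3 then write sys$output "DIFFERENCES problem, status ''stat'"
```

Testing the severity directly avoids writing a log file and searching it afterwards, so no log ends up in the directory tree being scanned.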
|
| Can you use the file size as a difference criterion? Sort the files into
subdirectories based on size and then run the procedure in .1 on each
subdirectory. This could improve run time a lot.
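A variation on the size idea that avoids the subdirectories is to compare the sizes inside the inner loop and skip the DIFFERENCES when they do not match. A minimal, untested sketch, assuming fspec1, fspec2 and the label loop2 from the procedure in .1, and assuming all files share the same record format (otherwise equal text can occupy a different number of bytes). The symbols eof1, eof2, ffb1 and ffb2 are names invented for this illustration.

```dcl
$! Sketch only: cheap size check before the expensive compare.
$ eof1 = f$file_attributes(fspec1,"EOF")   ! number of blocks in use
$ eof2 = f$file_attributes(fspec2,"EOF")
$ ffb1 = f$file_attributes(fspec1,"FFB")   ! first free byte in the last block
$ ffb2 = f$file_attributes(fspec2,"FFB")
$! Different block count or different end-of-file offset: not duplicates.
$ if eof1 .ne. eof2 .or. ffb1 .ne. ffb2 then goto loop2
$ differences/output=nl: 'fspec1' 'fspec2'
```

Only pairs of equal size reach the DIFFERENCES command, which should cut the number of full comparisons considerably when the sizes vary.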
If the files are all the same size and they are text can you write code to
just read them into core and then diff them, SMOP-style? N-million
file diffs sounds like an abuse I'd want my system to avoid!
How about making a copy of all the files? Then, in the procedure given in
.1, DELETE a file when you find it is a duplicate. When you are done
you have only unique files left.
Good luck,
Chuck
|
|
This is what I ended up with ... thanks for your help, Wim.
$ set verify
$ cnt1 = 0
$ fspec1=f$sea("",1)
$!
$ loop1:
$ fspec1=f$sea("[...]*.txt;*",1)
$ if fspec1 .eqs. "" then $goto done_all
$ cnt1 = cnt1 +1
$ cnt2 = 0
$ fspec2=f$sea("",2)
$!
$ skip:
$ if cnt1 .gt. cnt2
$ then
$ fspec2=f$sea("[...]*.txt;*",2)
$ if fspec2 .eqs. "" then $goto cannothappen
$ cnt2 = cnt2 + 1
$ goto skip
$ endif
$!
$! here fspec1 and fspec2 should be equal!! (At same file_in_progress)
$!
$loop2:
$!find next file in stream id 2
$ fspec2=f$sea("[...]*.txt;*",2)
$ if fspec2 .eqs. "" then $goto done_loop2
$!
$! Do your compare, for instance:
$ write sys$output ""
$ write sys$output fspec1
$ write sys$output fspec2
$ write sys$output ""
$!
$ diff/out = [slab.stanlog]stan.log 'fspec1 'fspec2
$! use $status to find out if there was a match, if so report this either
$! on screen or in a log file, (but put this log file in another directory!!)
$!
$ search [slab.stanlog]stan.log "Number of difference sections found: 0"
$ If $Status .ne. 1 Then $Goto difffile
$!
$ goto delfile
$ done_loop2:
$!
$ goto loop1
$ done_all:
$!
$! Do all remaining stuff here....
$ exit
$ delfile:
$ delete/noconfirm 'fspec2
$ goto loop2
$ exit
$ difffile:
$ delete/noconfirm [slab.stanlog]stan.log;*
$ goto loop2
$ exit
$ cannothappen:
$ write sys$output "Cannothappen"
$ exit
|