
Conference vmszoo::rms_openvms

Title: RMS asks, 'R U Journaled?'
Moderator: STAR::TSPEERUVEL
Created: Tue Mar 11 1986
Last Modified: Wed Jun 04 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 3031
Total number of notes: 12302

3029.0. "Duplicate key record is limited?" by EWBV05::KUSAKARI (CSC Tokyo) Tue May 13 1997 06:31

	Hello,

	I'd like to ask about secondary keys in ISAM files.
	As I understand it, the secondary key index format with duplicates
	is as below.
	
	Start ->	                                            End
       +----------------------------------------------------------------
       | Size,KEY,ptr,ptr... Size,KEY,ptr,ptr... Size,KEY,ptr,ptr...
       +----------------------------------------------------------------	

	Question:
	Since the size field is defined as 2 bytes, the entry for one KEY has
	a maximum length of 32767 (or 65535?). Does this mean the maximum
	number of records that one duplicate KEY can have is limited?

	Thanks Yasuo

	
3029.1. by EPS::VANDENHEUVEL (Hein) Tue May 13 1997 13:15
> Does this mean maximum records which one duplicate KEY has is limited?
    
    No. (I almost wish it did, though :-).
    
    Before you hit the maximum 16-bit size, you'll hit the absolute
    maximum bucket size of 63 blocks (just under 32KB). In practice, the
    bucket size for alternate keys is often chosen to be 12 blocks or so.
    This allows for roughly 1000 dups per bucket. Once a bucket fills up,
    RMS just creates a next bucket, repeating the same key and continuing
    the chain.
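
    A back-of-the-envelope sketch of that bucket arithmetic, in Python.
    The 512-byte figure is the standard OpenVMS disk block. The 6 bytes
    per duplicate pointer is an assumption, picked only because it
    reproduces the "roughly 1000 dups" figure for a 12-block bucket;
    bucket overhead and the key itself are ignored. The function names
    are made up for illustration.

```python
import math

BLOCK_BYTES = 512  # one OpenVMS disk block


def bucket_bytes(blocks: int) -> int:
    """Raw size of a bucket of the given number of blocks."""
    return blocks * BLOCK_BYTES


def dups_per_bucket(blocks: int, bytes_per_pointer: int = 6) -> int:
    """Rough count of duplicate pointers that fit in one bucket.

    ASSUMPTION: 6 bytes per pointer, no bucket overhead, no key bytes.
    """
    return bucket_bytes(blocks) // bytes_per_pointer


def buckets_for(dups: int, blocks: int, bytes_per_pointer: int = 6) -> int:
    """Rough number of chained buckets needed for one key value."""
    return math.ceil(dups / dups_per_bucket(blocks, bytes_per_pointer))


if __name__ == "__main__":
    print(bucket_bytes(63))            # the 63-block maximum, in bytes
    print(dups_per_bucket(12))         # estimate for a 12-block bucket
    print(buckets_for(1_500_000, 12))  # chain length for 1.5 million dups
```

    Under these assumptions the maximum bucket is 32256 bytes, which is
    already below what a 16-bit size field can describe, so the bucket
    size is the limit that is reached first.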
    
    A customer I visited two weeks ago had 5.5 million records in a file,
    with 1.5 million having the same alternate key. They had 1608 buckets
    in the duplicate chain for that key value. Adding a single record with
    that key value to the file took 20 seconds on a good day, 2 minutes
    on a bad day. :-( We flipped the 'null key value' bit on the file
    (through FDL, of course), converted, and a job that used to take
    36 hours finished in 7 hours and has the potential to finish in 2 hours.
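
    A toy model of why a long duplicate chain hurts, in Python. It
    assumes a new duplicate is stored at the end of the chain and that
    every bucket in the chain is visited to get there; that is an
    illustration of the behaviour described above, not RMS code, and
    DupChain and seconds_per_bucket are hypothetical names. For scale:
    1.5 million dups over 1608 buckets is about 933 per bucket, and
    20 seconds over 1608 buckets is about 12 ms per bucket.

```python
class DupChain:
    """Toy duplicate chain: a list of buckets, each holding pointers."""

    def __init__(self, capacity: int):
        self.capacity = capacity  # pointers per bucket (assumed fixed)
        self.buckets = [[]]       # the chain always has at least one bucket

    def append(self, pointer: int) -> int:
        """Add a pointer at the end; return the number of buckets touched."""
        visited = 0
        for bucket in self.buckets:  # walk from the first bucket to the last
            visited += 1
        if len(bucket) >= self.capacity:
            # Last bucket is full: chain a new bucket for the same key.
            bucket = []
            self.buckets.append(bucket)
            visited += 1
        bucket.append(pointer)
        return visited


def seconds_per_bucket(total_seconds: float, chain_length: int) -> float:
    """Average time per bucket if the whole chain is walked once."""
    return total_seconds / chain_length


if __name__ == "__main__":
    chain = DupChain(capacity=3)
    costs = [chain.append(n) for n in range(1, 12)]
    print(costs)                          # cost grows with chain length
    print(seconds_per_bucket(20, 1608))   # figures from the note above
```

    In this model the cost of one insert grows linearly with the number
    of buckets already chained for that key value, which is why taking
    the dominant value out of the index makes such a difference.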
    
    [Because it took so long, they sorted the added records by postal code
    to allow some early extracts to be done, whereas they could have
    presorted in primary key order, making the RMS caches much more
    effective.]
    
    You may want to use my SIDR program on the VMS freeware CD or from
    topic 5 in this conference to analyze an existing file with dups
    (try your own VMSmail file for starters)
    
    hth,
    	Hein.
    
3029.2. "Thanks, SIDR program is very helpful" by TKTVFS::KUSAKARI (CSC Tokyo) Mon May 19 1997 09:49