[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference ulysse::rdb_vms_competition

Title:DEC Rdb against the World
Moderator:HERON::GODFRIND
Created:Fri Jun 12 1987
Last Modified:Thu Feb 23 1995
Last Successful Update:Fri Jun 06 1997
Number of topics:1348
Total number of notes:5438

393.0. "Powerhouse 4GL and Rdb Hashed Indexes?" by MAIL::FARLOWS (Steve Farlow @STO - DTN 445-7212) Tue Aug 01 1989 20:37

    I am currently providing Rdb consulting to a large customer that is 
    using Rdb V3.0A, CDD/Plus V4.0, SQL and Powerhouse.  I know very little 
    about Powerhouse.  We just finished creating the database.  It is a 
    relatively large multifile database with some relations being sorted 
    and others being hashed.
    
    For some reason, the programmers cannot get Rdb to use the Hashed 
    Indexes when using a Powerhouse Quick screen.  When the call is modeled 
    in SQL it works perfectly.  So, the problem appears to be with the way 
    Powerhouse Quick screens are interacting with relations that have 
    hashed indexes.  
    
    I believe that Powerhouse tries to bypass the optimizer and do its own 
    optimization.  To override this feature of Powerhouse, we are trying to 
    use an ACCESS ... VIA ... USING statement.  It seems to work when there 
    are sorted indexes defined but not when there are hashed indexes.
    
    The relation that we are working with has been loaded with 16,000 
    records, 134 bytes per record.  When using the screen, all 6 fields in 
    the key are specified but when watching the optimization strategy Rdb 
    is sequentially searching the relation.
    
    We are currently trying to get help from Cognos on this.  The customer 
    is leaning toward eliminating the  use of hashed indexes, but that 
    would not provide the highest level of performance for the many look-up 
    tables that are used.
    
    Has anyone successfully used Powerhouse with Rdb Hashed Indexes?  Is 
    there any way to see if Rdb is receiving all the values for all of the 
    fields in the index?  Does anyone have any ideas on this?
    
    I am posting this note in the PICA::POWERHOUSE and NOVA::RDB30B and 
    BISTRO::RDB_VMS_COMPETITION notes files.
    
    Thanks for all help,
    
    Steve Farlow @STO
    MDVAX1::FARLOWS
T.RTitleUserPersonal
Name
DateLines
393.1Cross posting answer.ROWING::FEENANJay Feenan, Database Systems Devel.Wed Aug 02 1989 02:487
    this is due to powerhouse converting equalities to range retrievals
    
    (employee_id = 254 would be employee_id between 254 and 254)
    
    Hash indicies can only be used with equalities....contact powerhouse.
    
    -Jay
393.2Cross Posting ResponsesMAIL::FARLOWSSteve Farlow @STO - DTN 445-7212Thu Aug 03 1989 00:4587
    Thanks Jay for cross posting the main answer.  Below are the actual
    responses received - FYI.
    
    Steve Farlow
================================================================================
Note 16.1          Hashed Indexes and Powerhouse 4GL Problems             1 of 4
COOKIE::BERENSON "VAX Rdb/VMS Veteran"               13 lines   1-AUG-1989 11:23
                             -< Need Exact Match >-
--------------------------------------------------------------------------------

HASH indices are only used for exact match retrievals.  My guess is that
POWERHOUSE is not generating the query as an exact match.

You can $ASSIGN "SB" RDMS$DEBUG_FLAGS  to see (a) the BLR being
generated by POWERHOUSE and (b) the execution strategy chosen by the
optimizer.  Using the BLR you can tell if the problem is the way
POWERHOUSE generates the query or something more subtle in the optimization.

If you can obtain the results of a run with the debug flags turned on
and post it here, I'm sure someone with the necessary expertise will
take a look for you.

Hal
================================================================================
Note 16.2          Hashed Indexes and Powerhouse 4GL Problems             2 of 4
NOVA::FISHER "Rdb/VMS Dinosaur"                      17 lines   1-AUG-1989 12:19
                                  -< exactly >-
--------------------------------------------------------------------------------

    In fact previous notes in various places have discussed the fact that
    Powerhouse always does range retrievals as a lazy means of solving
    problems (well, maybe I added the word lazy, as such it is my opinion
    alone).  Thus, a query such as:
    
    FOR E IN EMPLOYEES WITH E.EMPLOYEE_ID = 1234 ...
    
    Is generated by Powerhouse as if it were
    
    FOR E IN EMPLOYEES WITH E.EMPLOYEE_ID BETWEEN 1234 AND 1234 ...
    
    Which always looks to the optimizer as if it were a range retrieval.
    
    The solution is for the customers to put some pressure on COGNOS
    to improve Powerhouse
    
    ed
================================================================================
Note 16.3          Hashed Indexes and Powerhouse 4GL Problems             3 of 4
BRILLO::BIRCH "Peter Birch, DTN 842-3297"             8 lines   2-AUG-1989 04:29
                          -< I didn't realise that! >-
--------------------------------------------------------------------------------

    While I must confess total ignorance on how the optimiser works,
    it surprises me that it doesn't recognise that query as an exact
    match. Would it be very hard to make it do so?                    
    
    Just asking
    
    PDB
                        
================================================================================
Note 16.4          Hashed Indexes and Powerhouse 4GL Problems             4 of 4
RICARD::GODFRIND "Sell the best, connect the rest"   20 lines   2-AUG-1989 05:18
                     -< Query arguments in host variables >-
--------------------------------------------------------------------------------

The optimizer is not aware of the actual VALUES used to perform a query.

A query like 

         FOR E IN EMPLOYEES WITH E.EMPLOYEE_ID BETWEEN 1234 AND 1234 ...

is actually processed like 

         FOR E IN EMPLOYEES WITH E.EMPLOYEE_ID BETWEEN var_1 AND var_2.

The reason for this is that the same query could be reused with different
values in variable 'var-1' and 'var_2'.

Now, of course, the optimizer could be smarter when and if the values are
actually specified as CONSTANTS in the request ...

He could be even smarter and recognize that both var_1 and var_2 may actually
be the same host variable (same address) 	;-)

/albert
393.3Poor optimizer hadn't a clue...WIBBIN::NOYCEBill Noyce, FORTRAN/PARALLELThu Aug 03 1989 20:192
    Wow!  No wonder Powerhouse users don't even benefit from SORTED
    indices as much as they should!