[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

798.0. "Algorithm for a non-math hack!" by KIM::RICE () Thu Dec 03 1987 16:03

    I am in need of an efficient hashing algorithm for a screen management
    package under developement. Being not of the mathmatical aptitude
    evident in this conference, verbose replies would be most appreciated.
    
    Here's the problem:
    	
    	Given numerous display regions defined by a set of cartesian
    	coordinates representing opposing corners of a rectangle, 
    	find each region which contains a given point.
    
    ie:
    
    	Region A: (0,0) , (10,10)
    	Region B: (3,3) , (12,12)
    	Region C: (8,8) , (10,10)
    
    	Given point (4,4) => Region A and Region B but not Region C. 
    
    Hope that makes sense.
    
    All help greatly appreciated!
    
    Rick Rice
    DTN 438-4136
    	
        
T.RTitleUserPersonal
Name
DateLines
798.1Why hash?SQM::HALLYBKhan or bust!Thu Dec 03 1987 16:3814
    I don't understand.  Hashing is used to find exact matches, not
    near misses.  Unless you are willing to create entries for every
    coordinate in your screen package, hashing won't get you anywhere.
    
    An alternative would be to define an array SC of <width> * <length>
    longwords, 32 bits each, and use the longwords as a bit mask.
    Following your example in .0, you would have SC(4,4) = 000...00011
    where bit 0=1 for Region A, bit 1=1 for region B, all other bits=0
    since (4,4) is not in any other regions.

    This would limit you to 32 regions.  Of course you could increase
    this to any number by using a 3rd dimension in SC...

      John
798.2SMURF::DIKEThu Dec 03 1987 19:1512
    You could build a tree with the property that the left child
    of a node contains regions that are completely to the left of the
    region contained in the right child.  The performance of this scheme
    depends heavily on what kind of regions you expect.  If you expect
    small, scattered regions, then this will work pretty well.  If there
    are large, overlapping regions, then it will amount to a linear
    search, which is another possibility.
    
    The X servers do this sort of thing a lot, and they use lists and
    search them linearly.
    					Jeff
    
798.3Correction to .0, I blew it again!KIM::RICEThu Dec 03 1987 22:5722
    Correction to .0:
    
    	Regions may not overlap so only one region may be found for
    a given set of points. ie:
    
    Region A: (0,0) , (4,4)
    Region B: (2,5) , (4,7)
    Region C: (5,0) , (10,2)
    
    Given (1,1) Find Region A but not Region B or C.
    
    You ask why I didn't say this in the first place, well
    "Ole DOPEY ME!"
    
    The approach in .1 sounds quite fast yet very costly in resources.
    Is there any way to reduce the array size knowing that only one
    region may be returned?
    
    Apologies for the screwup!
    
    Thanks,
    Rick
798.4SQM::HALLYBKhan or bust!Fri Dec 04 1987 00:0420
>    The approach in .1 sounds quite fast yet very costly in resources.
>    Is there any way to reduce the array size knowing that only one
>    region may be returned?

    Sure, just use one mapping SC(0:XMAX, 0:YMAX) where SC(X,Y)
    holds the value of the one region that owns that spot.
    SC could be a byte or word array if the number of regions
    is limited to 255 or 65535 respectively.
    
    Since I don't know the value of XMAX or YMAX it's a bit difficult
    to determine how efficient this approach is.  If, however, you're
    thinking of a screen say as large as a GPX then I estimate maybe
    200 characters by 60 lines = 12,000 bytes | words | longwords or a
    few dozen pages is all it takes.
    
    If on the other hand XMAX and YMAX are pixel-based, this approach
    may not be the best.  But you did say "screen package" and didn't
    supply coordinates way out there in pixeland, so I assume character-based.

      John
798.5MATRIX::ROTHMay you live in interesting timesFri Dec 04 1987 12:1434
    For practical purposes a simple linear search would suffice, I'd think.

    You could build a space cutting tree otherwise.  Recursively
    make a bounding box surrounding the set of regions.  Then divide
    it at the midpoint of its longest extent (X or Y).  Split the rectangles
    into 3 sets - those to the left of the line, those to the right, and
    those that cross.  Recursively apply this to the three sets, until
    each leaf node contains only a single rectangle (assuming none overlap.)
    Instead of midpoint, you could use the median for a somewhat better
    balanced tree.

    To search in log expected time, recursively probe the tree to the side of
    the midpoint cutting line containing the query point, and also probe
    the middle tree.  This prunes the other side of the tree from the search,
    and on average the middle tree will be quite small compared to the
    side trees.

    You could also skip probing a subtree if the query point is not
    in its bounding box - that may save some time in exchange for the
    extra checking to do at each level.

    Maintaining this structure dynamically is a problem though since you
    don't want to go thru the whole tree making process every time the
    input structure is changed.  There are ways of doing this dynamically.
    You could just add boxes to an existing tree and reconstruct only once
    in a while for reasonably low amortized cost.

    Note that this only wins for really large numbers of rectangles!

    If you've got fewer than a couple dozen or so, just do a linear search -
    the overhead of keeping recursive context will more than outweigh
    the gains.

    - Jim
798.6More InfoKIM::RICEFri Dec 04 1987 13:4214
    Sorry for the ambiguity. This should clear up the problem.
    
    1) The display is pixel based on a GPX
    2) The regions may not overlap
    3) The regions are generally small and scattered
    4) Average number of regions is around 25 but may be much greater
    5) The regions are generated dynamically
    6) The search will be executed for every mouse movement thus
       the algorithm must be very effective
    
    Hope this helps and THANKS for the replies thus far. 
    
    Rick
    
798.7MATRIX::ROTHMay you live in interesting timesWed Dec 09 1987 08:5234
    Is there a minimum size to the regions?

    If so, then a practical approach is to subdivide the screen into
    a rectangular array of cells small enough that no more than one
    region can fit inside each one.  Then each cell can intersect at most
    4 regions, and you could have the array of cells contain up to 4
    region ID's each.  Spacewise this is no problem even if the region ID's
    are longwords.

    To do a mouse motion query, identify the cell the mouse is in by
    dividing and truncating and check up to 4 of the regions overlapping
    the cell.  To add or remove a region loop thru the cells the region
    overlaps and modify their little 4 element overlapping region arrays.

    Another idea would be to use a signature table in conjunction with
    the bitmask idea mentioned in .1.

    "Hash" each region ID into a set of (say) 32 buckets, and let each
    screen cell contain a longword of 32 bits.  The 32 buckets will be
    listheads of regions hashing to that bit.

    To poll the mouse, classify which cell it's in and check for a hit
    on all regions on any flagged bucket listheads.  To add or remove a
    region, set or clear the bits in the cell array and add or remove the
    region from the signature table.

    The advantage of this approach is generality, since there is no
    limitation on the number of regions that can overlap each screen cell.

    In either of these, the cell array won't be too bad - even a simple
    32 by 32 array would have only 1024 entries but would cut down the average
    search times quite a lot...
    
    - Jim