[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

900.0. "Color Mapping. Any idea?" by TRN01::CARUSO (Angelo Caruso - SWAS Genoa - Italy) Fri Jul 08 1988 18:06

    This is really a graphics problem, but perhaps can excite ingenuity
    of somebody math oriented...
    
    The problem is mapping in an optimal way a lot of points (about
    one million at most) of a tri-dimensional space to a smaller set
    (typically 16 or 256) of points. Optimal could mean that the sum
    of the distance squared from each point to the point that approximate
    it is minimal.
    
    I don't understand if could be possible to find a direct solution,
    perhaps an heuristic algorithm is more feasible.
    
    Any idea around ?
T.RTitleUserPersonal
Name
DateLines
900.1CTCADM::ROTHIf you plant ice you'll harvest windFri Jul 08 1988 19:594
    I had mailed a routine to you some time ago, but your node does seem
    notoriously hard to reach.

    - Jim
900.2ZFC::DERAMOFor all you do, disk bugs for you.Fri Jul 08 1988 23:122
     Re .-1, is it short enough to post here?  Or maybe a
     few of us can forward it using NMAIL. :-)
900.3CLT::GILBERTTue Jul 12 1988 15:5226
    It sounds like an interesting problem.  I'd be tempted to try the
    following approach:

    Choose 16 (or 256) initially selected points as follows:

	Choose a point (from amoung the million) that is farthest from
	any initially selected point.  Repeat.

    Now 'relax' the selected points.

	For each of the 16 (256) points, determine the 'force' on it by
	summing the 'forces' from each of the points (amoung the million)
	that map to it.  Each of these forces is simply the distance
	(or distance squared) between the points, expressed as a vector
	pointing away from the 'selected' point.

	Use this total force to determine the direction (and amount)
	by which each selecetd point should move.

    Move the 16 (256) selected points, and redetermine to which selected
    point each (of the million) point is closest.

    Repeat the relaxation & movement steps.


    Surely, there's a better way.
900.4color map algorithmCTCADM::ROTHIf you plant ice you'll harvest windTue Jul 12 1988 19:3355
    The technique I actually used was to take robust statistics on
    the gamut of colors by building a KD tree of heirarchically nested
    bounding boxes in R3 around the image colors.

    When 16 (or 256) of these have been allocated, you can take the
    centroid of the points within each box as a first estimate of the
    color map.

    To quickly map image colors (a 512**2 image has 1/4 million pixels)
    to the nearest neighbors you can subdivide the color cube into 32**3
    subcubes, initially unoccupied.

    Then each image pixel is quantized into one of the subcubes (which gives
    resolution of 15 bits by itself), and a list of 'reachable colors'
    is created for that cube if one doesn't already exist for it.

    A color map entry is reachable from within a subcube if it is closer to
    some point within the cube than any other color.  The set of reachable
    colors includes the color that is closest to the center of the subcube,
    and in addition any other colors such that the plane bisecting the space
    between any pair of colors in the list intersects the subcube.  On account
    of the convexity of a cube this happens when not all vertices of the
    subcube are on the same side of the bisecting plane.

    The KD tree is used to select an initial set of candidates within a
    spherical shell around the center of the cube; this takes O(log n)
    to make the search.  Then the cutting plane is used to quickly reduce
    the tentative set to the true reachable points.  The spherical shell
    has only to be the thickness of the length of the major diagonal; this
    is easy to incorporate into the basic KD tree search.

    Statistically only 2 or 3 points are reachable from each subcube,
    so the technique is an economical way of doing the nearest neighbor
    search.

    Relaxation can optionally be used to find a local optimum for the color
    map.

    To do this, a pass thru the image is made and the centroid of all image
    colors that hit each color map entry is accumulated.  Then the color
    map entry is updated to that centroid and the process is repeated.
    This is referred to as the 'Basic Isodata Algorithm' by pattern
    recognition workers, but has been rediscovered many times.

    Diffusive dither is used on the final image mapping; it adds the
    benifits of halftoning to the process.  It simulates the heat (diffusion)
    equation by dispersing any mapping error on a pixel to the adjacent
    pixels below and to the side.  Some precautions have to be taken because
    the image colors don't necessarily lie in the convex hull of the
    assigned color map when doing this.

    Diffusive dither is very effective - even with only 16 colors images
    for the VT340 come out amazingly well.

    - Jim
900.5sounds like a job for ... cluster analysisZFC::DERAMOSupersedes all previous personal names.Tue Jul 12 1988 23:1718
>>    The problem is mapping in an optimal way a lot of points (about
>>    one million at most) of a tri-dimensional space to a smaller set
>>    (typically 16 or 256) of points. Optimal could mean that the sum
>>    of the distance squared from each point to the point that approximate
>>    it is minimal.
     
     This sounds a lot like something called "Cluster Analysis".
     It usually starts with the values of k variables for n
     objects and clusters them into groups.  One technique tries
     to minimize the sum of squared distance of each point from
     its cluster's centroid.  Another tries to find tree-like
     hierarchies like you would see on a chart of species
     evolution.  There are books (and statistical packages,
     probably not public domain, but who knows) available on the
     subject.
     
     Dan 
        
900.6oops, forgot to addZFC::DERAMOSupersedes all previous personal names.Tue Jul 12 1988 23:195
     I forgot to add that I don't think the methods I studied
     were intended for one million by three data sets.  (The
     three is okay, the one million is too big.)
     
     Dan
900.7ISTG::GOKHMANBoris the BearThu Aug 04 1988 21:1214
    Let me venture a guess, it IS a lot like a typical problem of pattern
    recognition (cluster analysis is one of the terms often used). If
    so, there are many algorithms to solve it. There has been a good
    deal of research in  Computational Geometry, devoted to just this
    subject, partitioning of the multidimensional point set into the
    "nearest neighbor" clusters. In 2d look for the term(and algorithms
    on) Voronoy Polygons.

    Look for publications by Shamis from CMU, he published proofs that
    such clustering may be done in NlogN time for a set of N points
    by "divide and conquer" algorithms.
    
    Boris
900.8CTCADM::ROTHIf you plant ice you'll harvest windSun Aug 07 1988 16:0836
    Re .-1

	It is indeed a cluster analysis problem.  I found pattern
    recognition textbooks to be helpful, and used some of the tools
    mentioned (kd trees, computational geometry, etc.) in the colormap
    routine.

	The following books give a good survey of useful results:

	Duda, R. O. and Hart, P. E.
	Pattern recognition and Scene Analysis
	Wiley, 1973

	Tou, J. T. and Gonzalez, R. C.
	Pattern Recognition Principles
	Addison Wesley, 1974

	For color map quantization proper, Heckberts SIGGRAPH paper is
    excellent.  In addition a recent ACM TOMS paper discusses a variance
    based refinement in the initial clustering phase that looks good.
    I experimented with something similar to their idea also.

	Heckbert, P.
	Color image quantization for frame buffer display
	July 1982 SIGGRAPH proceedings

	Wan, S. J., Wong, K. M. and Prusinkiewicz
	An algorithm for multidimensional data clustering
	ACM TOMS June 1988

    Clustering is important, but proper dithering and perceptually based
    error metrics are also very important.  It is not necessarily true that
    an algorithm that minimizes the mean-squared error will give the best
    looking picture.

    - Jim
900.9CTCADM::ROTHIf you plant ice you'll harvest windSun Aug 07 1988 16:1818
900.10Turbo Vector QuantizationMAMIE::BOTTOMSTue Oct 18 1988 14:1212
    If I understand the problem it is also referred to as Vector
    Quantization.  The approach suggested in the book on Image Processing
    by Gonzales and Wintz is to take the log of the histogram of points
    and then assign new vectors to the resulting histogram such that
    the greatest area under the curve is covered by at least one vector.
    Again, this does not cover the psycho-visual question totally but
    the log compression does help.  It also helps in dealing with the
    busy work of making sure multiple new vectors are not assigned to
    the same original vector.  Other variations of this approach use
    a
    "bin seeding" algorithm but it is essentially the same math with
    an easier algorithm to implement.