[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

310.0. "Wanted: Ancestor function for a" by MILOS::THIEL () Tue Jun 25 1985 03:59

This problem is a "real" problem that relates to VMS support for HSC-50
volume shadowing.

Given:  A tree where node i has an identifier IDi chosen from a uniform distribution
	0..ID-1
	A (large) constant C1 (practical values are in the range 2**32 <= C <= 2**512)
	Each node of the tree has a value Vi associated with it.  0 <= Vi < C

Wanted:
	1.  A function F that determines the Vi of a node given the ID of that node
	    and the value, Vp, associated with the parent node.
	    Vi = F (Vp, IDi)

	2.  A value. V0, to associate with the root of the tree.

	3.  A function C that operates on two values, Vi and Vj, and has the value:

		If node i is an ancestor of node j or if node j is an ancestor of
		node j,

			TRUE with probability 1-Pf
			FALSE with probability Pf

		Otherwise,

			FALSE with probability 1-Pt
			TRUE with probability Pt

		1 >> Pf >> Pt

		In other words, the function C may give the "wrong" answer, but should
		be biased toward incorrectly providing the value FALSE rather the value TRUE.

If a response contributes to the practical solution of the problem motivating this entry,
recognition will be given in the VMS source code.
T.RTitleUserPersonal
Name
DateLines
310.1LATOUR::AMARTINTue Jun 25 1985 12:3016
Have you considered an exact solution made by an exhaustive tree search? 
What are the expected and maximum number of tree nodes?  What is the expected
and maximum arity of an individual node?  Does the structure of the tree
change dynamically, or is it static for the lifetime of a particular boot
of a processor?  If it changes, how often (every time a file is opened vs.
every time an HSC50 comes online)?

Alternately, is there room in the representation of these nodes for two other
values between 0 and ID-1?  I believe that there is a method of quickly
determining the ancestor/descendant relationship in a tree by storing the
pre-order and post-order number in the tree nodes.  It might be amenable
to cheap updating of the tree structure by using floating point values so
that numbers can be inserted between others in the sequence without renumbering
the whole tree.  (I have to look all this up to be sure it exists).
				/AHM

310.2BEING::POSTPISCHILTue Jun 25 1985 20:2934
> A tree where node i has an identifier IDi chosen from a uniform distribution
> 	0..ID-1

Is there any pattern to the way the identifiers are assigned?
What values is ID likely to have?
Is it a binary tree, or does the tree have any other useful characteristics?

>	Each node of the tree has a value Vi associated with it.  0 <= Vi < C

Are we free to select these values, or are they already assigned?

> A function F that determines the Vi of a node given the ID of that node
> 	    and the value, Vp, associated with the parent node.
> 	    Vi = F (Vp, IDi)

Will other information be available, such as "left" or "right" (for a binary
tree) or the number of the child (for a general tree)? 

> A function C that operates on two values, Vi and Vj, and has the value:
>	If node i is an ancestor of node j or if node j is an ancestor of
>	node j,

Should that last node be node i, rather than node j?

>			TRUE with probability 1-Pf
>			FALSE with probability Pf

Would Pf = 0 be acceptable?

I think it would help if you explain something about the purpose of this
problem.


				-- edp
310.3LATOUR::AMARTINWed Jun 26 1985 01:2656
Yep, given extra information stored in the nodes, the relation
"x is an ancestor of y" can be determined in constant time.  There are
two methods you can choose from, which are related:

The Art of Computer Programming, by Donald E. Knuth, Volume 1,
Fundamental Algorithms, 2nd Edition:

Section 2.3.3 (Binary Tree Representation of Trees), p347, Exercise 20
(rated 22 on a logarithmic scale of 0 to 50, contains more mathematics
than usual, is an especially instructive problem):

"Prove that if u and v are nodes of a tree, u is an ancestor of v if and only
if u preceeds v in preorder and u follows v in postorder."

Answers to Exercises, p571:

"More generally, let u and v be any nodes of a forest.  If u is an ancestor
of v, it is immediate by induction that u preceeds v in preorder, and
follows v in postorder.  Conversely, suppose u preceeds v in preorder and
follows v in postorder; we must show that u is an ancestor of v.  This is
clear if u is the root of the first tree.  If u is another node of the first
tree, v must be also, since u follows v in postorder; so induction applies.
Similarly if u is not in the first tree, v must not either, since u preceeds
v in preorder."

Alternately, you can use the following from The Design and Analysis of
Computer Algorithms, by Aho, Hopcroft and Ullman:

Section 2.4 (Trees), p55:

"Once numbers have been assigned by a traversal, it is convenient to refer
to vertices by their assigned [(pre-, post-, in-) order] numbers.  Thus, v
will denote the vertex which has been assigned the number v. If the vertices
are numbered in the order visited, then the numberings have some interesting
properties.

In preorder all vertices in a subtree with root r have numbers no less than
r.  More precisely, if D(r) is the set of descendants of r, then v is in D(r)
if and only if r <= v <= r + ||D(r)||.  By associating with each vertex v
both a preorder number and the number of descendants we can easily determine
whether a vertex w is a descendant of v.  After initially assigning preorder
numbers and calculating the number of descendants of each vertex, the
question of whether w is a descendant of v can be answered in a fixed amount
of time independant of tree size.  Postorder numbers have an analogous
property."

If you delete nodes from a tree which has already been initialized with the
information for Knuth's version, then it should still work.  I don't know if
this is true for AHU's version.

I don't know if either algorithm has a constant cost for updating
information in the nodes when adding a new node.  Perhaps one of them does if
you piggyback the updating on the code necessary to add a new node into the
tree in the first place.  If updating is less frequent than determining
the ancestor relationship, it may not matter.
				/AHM
310.4BEING::POSTPISCHILWed Jun 26 1985 13:4629
Re .3:

I think we can combine both of the solutions to get something useful here.
Consider the Vi of each node as a pair of numbers, (a, b).  Traverse the tree
in pre-order, counting the nodes and recording the count in the "a" field of
each node.  Then traverse the tree in post-order, but record the count in
the "b" field.  Then node i, with Vi = (ai, bi), is a descendant of node j,
with Vj = (aj, bj), iff ai < aj and bi > bj.  If a node is to be considered a 
descendant of itself, change "<" and ">" to "<=" and ">=".  Thus, the
desired function, C, becomes:

	(ai < aj) xor (bi <= bj), where "xor" represents the Boolean
	exclusive-or function.

This function returns TRUE iff node i is node j, node i is a descendant of
node j, or node j is a descendant of node i.

To prepare for the future insertion of nodes, count by 10 or some other number
instead of 1.  A good number to count by might be sqrt(C) / (ID+1).  This
leaves the maximum space possible between the nodes.

To insert a new node, Vi, find the nodes that precede and succeed it in pre-
and post-order traversals, average their a's and b's, and truncate the
averages, if non-integers are produced.  If this results in a number equal to
one of the adjacent numbers, it will be necessary to renumber at least part of
the tree.


				-- edp
310.5LATOUR::JMUNZERWed Jun 26 1985 16:4018
An approach with less class than the previous replies, but closer to the
requested structure for the answer:

	V0 = 0

	Vi = F (Vp, IDi) = Vp OR IDi

	C (Vi, Vj) = ((Vi AND Vj) = Vi) OR ((Vi AND Vj) = Vj)

Notes:	[1]	1 > Pt > Pf = 0.  This is NOT what was requested.

	[2]	Pt can be greatly reduced by hashing IDi into a large
		number of bits, say (0 <= hash <= 2**512), and letting

			Vi = F (Vp, IDi) = Vp OR (hash (IDi))

I don't know whether this approach has any virtue other than simplicity.