[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

335.0. "Infinity" by BEING::POSTPISCHIL () Sat Sep 21 1985 22:16

Note 203.0 raises the topic of the maximum length of strings with a certain
property.  203.8 gives a proof which claims there is an "infinitely long string"
with that property.

In response to a claim that the proof given in .8 demonstrates the existence
of arbitrarily long strings, but not infinite ones, 203.15 and .17 cited
the "infinity lemma" as support, from Donald E. Knuth's _The Art of Computer
Programming, Volume 1:  Fundamental Algorithms_, second edition, pages 381
to 383.  That volume states:

	Theorem K.  (The "infinity lemma".)  In any infinite oriented
	tree for which every vertex has finite degree, there is an
	"infinite path from the root", i.e., and infinite sequence of
	vertices V[0], V[1], V[2], . . . in which V[0] is the root and
	[V[j] is the parent of V[j+1]] for all j >= 0.

(The brackets in the last line show paraphrasing.)

The construction of .8 can be thought of as a tree, but application of the
infinity lemma merely shows that there is an infinite path through the
construction, but not that an infinitely-long string results as an element
of the tree.

To see this, note that an infinite ordered tree for which every vertex has
finite degree can be easily constructed from the set of non-negative integers:
0 is the root and n+1 is the descendant of n for all n >= 0.  The infinity
lemma correctly tells us there is an infinite path from the root, yet none
of the integers are infinite.

The proof of .8 shows that for any finite integer, it is possible to find
a string with the specified property which has a length greater than the
given integer.  To show there is an infinitely long string, it is necessary
to show there is a string whose length is greater than any integer.

Observe that in the case of the tree of integers, the path 0, 1, 2, 3, .
. . has a length which is greater than any integer.  This is a single path,
it is not necessary to change it when we are requested to show a path with
a longer length.  But in .8, it may be necessary to find a _new_ string when
a _new_ integer is put forth.  Thus, no single string has been demonstrated
to be infinite.

203.13 discusses the possibility of generating a string in a stream fashion,
so that a single string can be continued forever.  If this is possible, there
is a string with infinite length.


				-- edp
T.RTitleUserPersonal
Name
DateLines
335.1BEING::POSTPISCHILTue Sep 24 1985 21:2838
Re 203.29, .30:

"Gilbert's string" refers to the string given in 203.7.

> 	Suppose an infinitely long string (containing no consecutive
> 	duplicated substrings) doesn't exist; then some such string must
> 	be a longest -- let the length of a longest such string be M.

Editor's note:  I believe "some such string", above, should be "some string".

This is incorrect.  Counterexample:  Substitute "finite integer" for "long
string", and change lengths to magnitudes.  Then the argument becomes:
Suppose an infinite finite integer does not exist (known to be true); then
some finite integer must be greatest (known to be false).

Consider the set { "A", "AA", "AAA", . . . }, where the length of each string
is finite.  This set contains neither an infinite string nor a longest string,
yet it contains strings of arbitrary (although finite) length.

>    ->	Note that the algorithm in .13 approaches the lexically first string
> 	from below.  Any particular infinitely long string places an upper
> 	bound on the lexically first string.  If the lower bound and the
> 	upper bound start off the same way, then so must the true lexically
> 	first string.

With a slight change, I agree with this.  The change is to consider, rather
than "any particular infinitely long string", a prefix (beginning of a string)
for which it is known there exist strings of arbitrary length in our set
(which I am going to call NR, for non-repeating, unless somebody else would
like to use a different name) beginning with that prefix.  Then, if we have
such a prefix, and we know the search-from-below algorithm has gotten to
at least that prefix, and we know there is an infinitely-long string, then
the first such string must begin with that prefix.  However, we do not know
there is an infinitely-long string.  The search-from-below algorithm might
continue forever, without ever passing the prefix and without converging.


				-- edp
335.2BEING::POSTPISCHILWed Sep 25 1985 13:3480
You will all be happy to know there is an infinitely-long string in NR (the
set of strings arising from the problem given in 203.0).

A proof follows.  Those of you who know me may have realized I like formal
proofs.  The following proof may be somewhat long, although I have been
relatively loose by my standards in using such things as "clearly" to shorten
the proof.  Also, it occurs to me that some of the points I make may seem
obvious or redundant, but they are probably there because some strange quirk
of infinity is being dealt with.  Feel free to ask questions or make complaints.

Herein, the term "integers" denotes non-negative integers.

Consider the search-from-below algorithm, which tries "A", then "B", then "C",
and backs-up if none of these characters fit.  We know there are
arbitrarily-long strings in NR.  Since there are strings of arbitrary length in
NR, there clearly are an infinite number of strings in NR.  Therefore, the
search-from-below algorithm can never terminate. 

The algorithm is said to reach a string, q, iff q is in NR and the algorithm
generates q as a possible prefix for an infinitely-long string.  I have also
used the expression "go to" to mean "reach" and "passes" to mean "reaches
a string lexically greater than but not longer than".

Clearly, the algorithm never stops reaching new strings (otherwise, it would
terminate).

Let us state that a string, y, is a stable prefix iff y is reached and not
passed by the algorithm.  I will now prove, by induction on the length of stable
prefixes, that stable prefixes of all finite lengths exist. 

Clearly, the empty string is a stable prefix, since there are no strings not
longer than the empty string which are lexically greater than the
empty string. 

Suppose y is a stable prefix.  Then there must be a stable prefix yq, where q is
in { "A", "B", "C" }.  Proof:  Clearly, the algorithm must attempt to place an
"A" after y.  If the algorithm never goes to a "B" or "C" after y, then yA is a
stable prefix, by definition.  If it does go to a "B", but does not go to a "C",
then yB is a stable prefix, by definition.  If it does go to a "C", then it must
stay there; otherwise, it would have to back-up and increase a character.  Note
that it must back-up and increase a character, since the only other option is to
back-up to the empty string and terminate, but we know this does not occur.
Thus, the algorithm would reach a string lexically greater but not longer than
y, which is impossible since y is a stable prefix. 

Note that there must be a finite number of stable prefixes of a given finite
length (in fact, there is exactly one).  Therefore, there is a last stable
prefix of a given finite length.  Consider the last stable prefix of finite
length n, y, and the last stable prefix of length n+1, z.  Then yq = z, where q
is in { "A", "B", "C" }.  Proof:  If, in considering what to place after y, the
algorithm passes z, then z is not a stable prefix, since the algorithm went to a
string lexically greater than z but not longer than z.  If the algorithm does
not reach the q that creates z, then z is not a stable prefix, since stable
prefixes are reached by the algorithm.  Therefore the algorithm reaches and does
not pass z.

Let y(n) denote the last stable prefix of finite length n.  Let c(y,n) denote
the n-th character of string y.  Let s be the string such that
c(s,n) = c(y(n),n), for all finite integers n.  Note that, if s = pq, where
p is a string of finite length, then p is a stable prefix.  Proof:  The length
of p must be some finite integer, n.  Then p = y(n), since c(p,n) = c(y(n),n),
by definition of s, and c(p,n-1) = c(y(n-1),n-1) and c(y(n-1),n-1) =
c(y(n),n-1), since y(n-1)q = y(n) for some q in { "A", "B", "C" }.  It should
be clear that c(p,n-i) = c(y(n-i),n-i) = c(y(n),n-i), so p = y(n).

s is infinite.  Proof: Let m be any finite integer.  The length of s is greater
than m, since the m+1-th character of s exists (it is c(y(m+1),m+1). 

Also, s is in NR.  Proof:  Suppose there is a string, x, in s, such that
s = pxxq for some string p and some string q.  The length of p is finite
(otherwise the characters of x would be at infinite positions in s, but s
has been defined only with finite positions).  Similarly, the length of x
is finite.  So pxx is finite.  By the second preceeding paragraph, pxx is
a stable prefix.  But then pxx must contain no repeating string.  But it
does, so this is a contradiction.  Therefore there is no string, x, in s,
such that s = pxxq for some string p and some string q.  By definition of
NR, s is in NR.


				-- edp
335.3R2ME2::GILBERTWed Sep 25 1985 22:3456
I bow to your "infinite wisdom" :-).

			----------------
re 335.0:

> 203.13 discusses the possibility of generating a string in a stream fashion,
> so that a single string can be continued forever.  If this is possible, there
> is a string with infinite length.

Indeed.  Let f(0) be the string "A", and let f(i+1) be the string generated
by applying the substitution in 203.8 to f(i), where i >= 0.  We note that
any string with the prefix f(j) will generate a string with the prefix f(j+1).
Thus, for all j >= 0, f(j) has f(0) as a prefix (they all start with "A"),
and for all j >= 1, f(j) has f(1) as a prefix (by induction on j), and ...,
and for all j >= k, f(j) has f(k) as a prefix.  That is, for any k >= 0,
f(k) is a prefix for an arbitrarily long string in NR.

This should greatly simplify the proof, and may be sufficient of itself.
That is, you could use: "Let s be the string such that c(s,n) = c(f(n),n)
for all finite integers n", remarking that f(n) has more than n characters,
by induction.

Note that this also provides a fast way to generate the first N characters
of an infinitely long string in NR, in O(NlogN) time and O(logN) space.

			----------------

> Suppose an infinitely long string in NR doesn't exist; ...

I think I see your point.  [Note: "some such string" means "some string in NR"].
The point is that, for an infinitely long string to exist, we should prove a
convergence to an infinitely long string.  Also, a string is of infinite length
if and only if it's length is larger than any finite number; right?

To describe the flaw differently, it's possible to have an infinite number of
distinct strings from the set ("A"|"B"|"C")*, without any string having infinite
length.  Right?  (this is also reflected by the 4th paragraph in 335.2).

			----------------

You seem to have the following problem with your bounds argument (in 335.0)
("it's a problem for you, but not for me", to quote Osman's brother).
If we know there are 'arbitrarily long' strings with that prefix, and that
there is an 'infinitely long' string, how are we assured that that 'infinitely
long' string must come before all 'arbitrarily long' ones?

Is there anything wrong with the original statement, assuming that an infinitely
long string in NR exists (which I'd assumed, and stated as assumed, and which
also happens to be true)?

			----------------
re 335.1:

> Feel free to ask questions or make complaints.

Picky, picky.
335.4BEING::POSTPISCHILThu Sep 26 1985 12:3933
Re .3:

Of course, the string generated by the substitution algorithm does reproduce
itself as a prefix for the generated string!  That makes everything simpler.
                         
> Also, a string is of infinite length if and only if it's length is larger than
> any finite number; right? 

Yes.

> If we know there are 'arbitrarily long' strings with that prefix, and that
> there is an 'infinitely long' string, how are we assured that that 'infinitely
> long' string must come before all 'arbitrarily long' ones?

I think you are referring to 335.1, not .0.  To answer the question, note
that there are an infinite number of arbitrarily-long strings before any
infinitely-long strings (prefixes of the infinitely-long strings).  The actual
question is how do we know the algorithm converges to an infinitely-long
string?  The answer is in .2.  :-)

Quiz Questions:

        We have sort of been assuming (A+B+C)* contains infinitely-long
        strings, since NR is a subset of it.  (This depends upon our
        definition of "string".)  Demonstrate a set which contains
        arbitrarily-long strings but not infinitely-long strings. 
        
        The proof in 335.2 seems to construct a string by specifying
        each character as c(y(n),n).  However, it is not actually
        a proof by construction.  Why not?
        
        
				-- edp
335.5METOO::YARBROUGHThu Sep 26 1985 12:551
re quiz question 1 in .4: see note 325.1.
335.6R2ME2::GILBERTThu Sep 26 1985 17:426
Another problem with your proof is that it relies on the algorithm for
generating the lexically first string.  Specifically, that the algorithm
will back up (to correct a letter) in a finite amount of time.  Imagine
the NR equivalents of "AB", "AAB", "AAAB", et cetera.  The algorithm could
spend an unbounded amount of time with these, before backing up to correct
some letter.
335.7BEING::POSTPISCHILThu Sep 26 1985 19:4623
Re .3:

> Is there anything wrong with the original statement, assuming that an
> infinitely long string in NR exists (which I'd assumed, and stated as assumed,
> and which also happens to be true)? 

Of course, the "happens to be true" part is entirely coincidental.  :-)

I am not sure which original statement you are referring to.  The proof given
in 203.8, as it is originally, is wrong in claiming to prove there are
infinitely-long strings in NR.  With a little extension, it can be used to
show there are infinitely-long strings in NR.


Re .6:

Another problem?  I missed the first one.  I am not sure your objection applies;
the time the algorithm spends looking for strings is irrelevant.  In every
case, it either passes or does not pass a certain string.  I use that fact,
but I do not use any information about how long it takes.


				-- edp
335.8R2ME2::GILBERTThu Sep 26 1985 22:5622
There are an infinite number of finite strings in NR.  Suppose that there are
an infinite number of finite strings in NR that lexically precede the first
infinitely-long string in NR (e.g., none of these is a prefix to an infinitely
long string).  Then, the algorithm will never 'pass' your string, even though
that string is *not* a prefix of any infinitely long string.

I re-read and understand your argument, but the apparent contradiction above
leaves me fuzzy.

A simple assumption I'd made long ago was the following:  If there are an
infinite number of strings (from a finite alphabet) in a set, then that set
contains an infinitely long string (i.e., a string longer than any finite
string).  Isn't this true?  The infinity lemma seems directly applicable:

Let the root of the tree be associated with the null string, and all other
nodes are associated with a single letter from the finite alphabet, so that
every path from the root to a node is associated with a prefix of some string
in this infinite set.  The degree of each node is finite, bounded by the size
of the alphabet, and there are an infinite number of nodes in the tree (one
way to see this is by the 'diagonal argument' often used with real numbers).
Then, by the infinity lemma, there is an infinite path from the root, and this
infinite path is associated with an infinitely long string.
335.9BEING::POSTPISCHILFri Sep 27 1985 13:0119
Re .8:

The algorithm never passes my string (I thought that was your string; how
did it become mine?).  My proof basically shows that the algorithm converges
to the string, not that it passes or even reaches the string.

The infinity lemma does not apply to all sets of strings.  For example, you
could have sets which contains p and pxq, but not px.  If there are an infinite
number of such q's for some p and some x, the pxq's would have to be connected
to p as a prefix to create a tree, which violates the infinity lemma's premise
that the degree of each node is finite.  However, this does not apply to
NR, since pxq is in NR implies px is in NR.

Consider the set of strings with "C" as the last character.  Clearly, no
such string is infinitely-long.  Why does the infinity lemma not apply to
this set?


				-- edp
335.10BEING::POSTPISCHILSat Sep 28 1985 15:2717
The infinity lemma can be used to demonstrate NR has infinity-long strings,
because pxq is in NR implies px is in NR, using the application given in
.8.  Is that application what was originally intended, rather than my
interpretation of applying the lemma to the tree generated by the steps of
the proof, rather than the strings themselves?

The infinity lemma is not applicable to the set of strings ending in "C",
because pxq is in the set and p is in the set does not imply px is in the
set.  However, it does apply to a subset of this set (the set of strings
which contain no character other than "C", for example), so there are
infinitely-long strings in the set.  However, if we consider the set which
contains the strings which end with "C" and contain only that one occurrence
of "C", the infinity lemma is not applicable to the set or any of its subsets,
and the set does not contain infinitely-long strings.


				-- edp
335.11R2ME2::GILBERTSat Sep 28 1985 22:1730
re 335.9:
	If pxq is in the tree, then so is px, since there is a path in the
	tree for each prefix of a string in the set.  For example, if the
	set is {"ABA","BBA","BBCA","ABCABCABC..."}, then the tree would be:

			.
		       / \
		      A   B
		      |    \
		      B     B
		     / \    |\
		    A   C   A C
			|     |
		       ...    A
re 335.10:
	The first paragraph loses me.  Perhaps the above comment helps?

	Second paragraph.  That you say the infinity lemma doesn't apply
	in this case is surprising.  By your argument (where you prove that
	there is an infinitely long string in NR), there is an infinitely
	long string in this set, too.  By the way, are there an infinite
	number of strings that end with "C"?  I'd say yes, and that the
	infinity lemma applies.  Boy, is this confusing.

	Starting over (from the fact that we have arbitrarily long strings
	in NR, this being shown by construction), we remark that there must
	be an infinitely long string in NR, since, for any finite string
	(in or out of NR), say of length M, there exists one in NR that is
	longer, namely the string that results from applying the transformation
	to "A" M times.  Note that this parallels the final step of EDP's proof.
335.12ALIEN::POSTPISCHILMon Sep 30 1985 13:0321
Re .11:

Your first paragraph is correct.  What I was wondering was:  Is this what
you intended when you first mentioned the infinity lemma, in which case my
application would be wrong.  Let's not worry about this, and just assume
my first thought of how to use the infinity lemma was not identical to yours.

To wrap things up, hopefully, if there is a set S which contains an infinite
number of strings and pq is in S implies p is in S for all p and q, then
the infinity lemma can be used to form a tree from the elements of S which
would demonstrate an infinitely-long string in S (with the additional assumption
that the set of characters of the string of S is finite).

In the case of NR, this is true.  In the case of EC, the set of all strings
containing "C" as the last character and "A" or "B" for any other characters,
pq is in EC does not imply p is in EC ("AC" is in EC, but "A" is not), so
we cannot apply the infinity lemma.  Nothing we have shown so far proves
there is not an infinitely-long string in EC, but this is the case.


				-- edp
335.13HARE::GILBERTMon Sep 30 1985 15:2314
I don't understand why you say there are no infinitely-long strings in EC.

    1.	Because there are an infinite number of strings in that set, I
	think that the infinity lemma applies.

    2.	Suppose we form another set, which contains all the elements of
	EC, but with the last character removed.  *This* new set contains
	infinitely long strings -- so must EC.

    3.	For any finite length M, there is a longer string in EC.  Thus,
	some string in EC must be longer than any finite length, and so
	some string in EC must be infinitely long.

Infinitely long strings make for such interesting arguments.
335.14BEING::POSTPISCHILMon Sep 30 1985 15:4929
Re .13:

1.	The infinity lemma requires more than an infinite number of
	elements.  It requires a tree structure.  Unfortunately (for me),
	EC can be used to make such a tree structure, although it is not
	obtained by appending a character to a string to get a descendant;
	it is created by inserting a character at the beginning.  Oh, well.

2.	Darn.  But where is the "C" in this infinitely-long string?  Aha!
	There is a string in E, the set of strings of EC with the last
	character removed, which is infinitely-long.  However, this does
	not prove there is an infinitely-long string in EC, since it might
	not be possible to append a "C" to the infinitely-long string of E
	to get an infinitely-long string of EC.  Maybe we had better set
	some formal definitions of "string" and "concatenation".

3.	Careful, the first sentence just proves there are arbitrarily-long
	strings, not an infinitely-long string.

I think there is something wrong in applying the infinity lemma.  It was
originally used to show there is an infinitely-long path through certain
trees.  Perhaps it does not apply to strings as well.  Consider the set F
of all finite strings formed from a given finite character set.  By
definition, F contains only finite strings.  Obviously, the infinity lemma
yields an incorrect result as we have been applying it.


				-- edp

335.15BEING::POSTPISCHILMon Sep 30 1985 20:1624
I believe I know what is wrong with applying the infinity lemma to sets of
strings.  Knuth made a unstated assumption that a path found through the
method given in the proof of the infinity lemma would be a valid path.  This
is true because there is no restriction on the path; the lemma merely claims
there is a path from the root of the tree, not that it is any particular
kind of path.

When applying the infinity lemma to a set of strings, it will tell us there
is an infinitely-long string related to a path from the root of some tree.
It does not tell us that such a string is in a given set of strings.

After using the infinity lemma to find an infinitely-long string, it is
necessary to prove the string is in a desired set.  In the case of NR, this
can be done by claiming that the infinitely-long string does not contain
any repeating substrings because, if it did, there would be a substring of
the infinitely-long string with finite length which contained repeating
substrings, but we know, from the way the tree is constructed, that all
substrings of the infinitely-long string do not contain repeating substrings.

In the case of EC, this cannot be done, since the last character of the string
must be "C", and infinitely-long strings do not have last characters.


				-- edp
335.16LATOUR::AMARTINTue Oct 01 1985 00:4219
I see you can use something I verified this weekend.  I wanted
someone else to look at it before I put it here, but I better
start things off before you get even more confused.

The traditional definition of "string" is that it is a FINITE sequence
of characters (from a finite alphabet).  You should either stop calling your
infinite sequences of characters "strings", or re-prove every
theorem about strings you are taking for granted for the case of
infinite sequences.

For instance, (A+B+C)* contains only finite sequences, so if there is
a member of NR or EC that is infinite, they certainly aren't subsets
of (A+B+C)*.

And the "search-from-below" algorithm isn't an algorithm unless it
halts, since algorithms always halt.

More on this tomorrow, after I get my longer article reviewed.
				/AHM
335.17ALIEN::POSTPISCHILTue Oct 01 1985 12:3225
Re .16:

Algorithms are not required to halt.  The subject of computability often
focuses on non-halting algorithms.

Let us define a string (for our purposes; if anyone has a _formal_ definition
from somewhere, I would like to see it) as function:  The string, y, formed
from an alphabet, M (a set of characters), is a function, f[y] (brackets
denote sub-scripts), from the non-negative, finite integers (denoted by Z+)
into M such that n is in Z+ and f[y](n+1) is defined implies f[y](n) is defined.

Herein, the notation f[y] will denote the function described above.

If there exists an n in Z+ such that f[y](n) is undefined, the smallest such n
is the length of y and is denoted l(y).  Otherwise, the length of y is said to
be infinite and l(y) is infinity. 

The function f(i) = f[x](i) if i < l(x) and f[y](i-l(x)) if i >= l(x) is,
if it exists, called the concatenation of x and y and is denoted xy.

With these definitions, I believe the recently-made statements about strings
and the various sets of them we have been considering are true.


				-- edp                         
335.18ALIEN::POSTPISCHILTue Oct 01 1985 14:5415
As I defined concatenation, the concatenation of an infinitely-long string, x,
with another string, y, exists and is equal to x.  Is this what we want? 
There are at least two other options:  Change the value of the concatenation,
putting characters beyond the infinity-th position, to get strings with ordinal
lengths, rather than the cardinal lengths we have now or prohibit concatenation
onto an infinitely-long string. 

The second option is what I originally had in mind, although the way things
are currently defined probably will not mess up to many things.

The first option will definitely require some rethinking, but it might make
the discussion more interesting.


				-- edp 
335.19LATOUR::RMEYERSWed Oct 02 1985 09:0639
All page numbers in this note refer to "The Theory of Parsing, Translation,
and Compiling", Volume 1, by Aho and Ullman.

Re .17

An algorithm is required to halt.  It's twin sister the procedure isn't
required to halt.  Many times people loosely use the term algorithm when
they really mean procedure.  (page 27)  Much of the work in algorithmic
theory concerns answering the question "Is this procedure an algorithm?"


Re The Entire Question.

THEOREM:

A infinite string is not a member of any language.

PROOF:

A language is defined by a grammar if and only if it is recognized
by a Turing Machine.  (page 100).

The language defined by a recognizer is the set of strings that it
accepts.  (page 96)

A Turing Machine recognizes its input only if it halts.  (page 96)

A Turing Machine will never halt if given an infinite string.

Q.E.D.


Re  Infinite Strings and Formal Language Theory

You can invent a new formal language theory that permits infinite strings
just as you can invent new geometries that are not based on the parallel
postulate.  I encourage anyone to do so.  But, you will need to start from
the ground up because the theorems of your new branch of mathematics will
change in non-obvious ways from standard formal language theory.
335.20BEING::POSTPISCHILWed Oct 02 1985 11:5653
Re .19:

I assuming by "twin sister the procedure" you do not really mean twin sister;
that seems to imply every algorithm has an equivalent procedure.  I do not
think definitions for "algorithm" are anywhere near as standard as are
definitions for things like "strings" or "graphs".  Note that a definition
of "algorithm" that requires algorithms to stop is undesirable, because that
would mean there probably exist things it is impossible for us to determine
whether or not they are algorithms.

Is Aho and Ullman using this "algorithm" in a formal or informal context?

In regard to the theorem that an infinite string is not a member of any
language, I can construct a Turing Machine that recognizes a language containing
infinite strings, if infinite strings exist at all.  The Turing Machine halts
immediately, regardless of input. 

The problem here is whether or not infinite strings exist at all.  This would
follow immediately from the definition of string (which I would like to see).
After some thought, I am inclined to believe standard definitions should
not permit it.  There are important points that rely on strings being finite.
For example, computatibility often involves assigning a number to each of
the strings of a language.  With infinite strings, the number of strings
in a language could become uncountable, prohibiting a needed 1-1 function
from integers to strings.

So, henceforth, everybody should be aware that we are discussing strings
somewhat different from normal.  I do not believe it is necessary to reprove
very many theorems, for a simple reason:  We probably will not be using very
many theorems.

Also, I am going to redefine concatenation.  The current definition does
not permit the statement xa = xb implies a = b.

If there is an n in Z+ such that n = l(x), then f(i) = f[x](i) for i < n
and f(i) = f[y](i-n) for i >= n is said to be the concatenation of x and
y and is denoted xy.

Now, xa = xb implies a = b, although ax = bx does not imply a = b.

NR contains infinite strings.  EC does not.  The infinity lemma does not
demonstrate there exists an infinite string in each infinite set of strings.

For people who are uneasy with infinity, I would like to point out that it
is not necessary to consider infinity as a "thing".  My definition of infinite
strings does not require infinity to exist.  Saying a string is infinite
is merely saying there is no number in Z+ such that the string-function is
undefined for that number.  Similarly, other statements do not really discuss
infinite things, but only the existence or non-existence of certain finite
integers.
         

				-- edp
335.21LATOUR::AMARTINThu Oct 03 1985 01:24156
Sorry if some of this is repetitious.  However, most of it has been
sitting around for days, and I might as well get it out and done with.

17>Let us define a string (for our purposes; if anyone has a _formal_ definition
17>from somewhere, I would like to see it) as function:  The string, y, formed

If you want to use the definition of "string" which predominates in
formal language and automata theory, you must accept that they are
defined as finite sequences of symbols.  Otherwise, you have a large
job ahead of you - redefining formal language and automata theory (and
computability theory as well).

For definitions which state that strings are of finite length, see:

1.  Knuth, (The Art of Computer Programming, V1: Fundamental Algorithms),
Index, p 632.

2.  Aho, Hopcroft and Ullman (The Design and Analysis of Computer Algorithms),
section 9.1 (Finite Automata and Regular Expressions), p 318.

3.  Aho and Ullman (Principles of Compiler Design, First Edition), section 3.3
(Regular Expressions), p 83.

4.  Gries (Compiler Construction for Digital Computers), section 2.2 (Symbols
and Strings), p 15.

5.  DeRemer (Chapter 2A (Review of Formalisms and Notations) of Compiler
Construction, An Advanced Course), section 1.1 (Unrestricted Rewriting
Systems), p 37.

The only minority opinion I could uncover was "All strings which we shall
encounter will be of finite length" on p 16 in section 0.2.1 (Strings) of
Aho and Ullman's "The Theory of Parsing, Translation and Compiling;
Volume I: Parsing".  However, I find the book covers more than enough
material to discuss the topic at hand, without needing infinite strings.


If all strings are finite, then either the assertion made in several places
that NR is contained in (A+B+C)* is false, or all members of NR have finite
length.


The backtracking program has been called "the search-from-below algorithm" by
a several people, including myself.  However, I realized on further
reflection, that the program is not an algorithm.  While the program is:

1.  Composed of a finite number of simple instructions
2.  Each of which can easily be followed by expending
    a finite amount of effort in a finite amount of time

no one has proved that:

3.  It halts.

At least one reply contains the assertion (proof?) that it doesn't halt.
Since it fails to have one of the requirements of an algorithm, it is at best
a "computational procedure".  Indeed, if it halted, it wouldn't find an
infinite sequence with the NR property.  

16>Algorithms are not required to halt.  The subject of computability often
16>focuses on non-halting algorithms.

I have not studied a lot of computability theory.  I have studied
a fair amount of what is called algorithmic analysis, and texts
on the subject almost invariably define "algorithm" in the very first
chapter, and note that any program which claims to be an algorithm
is required to halt.  Some texts bother with the detail of defining
"computational procedure", which has properties 1 and 2 from above, but
does not necessarily have property 3.

20>								I do not
20>think definitions for "algorithm" are anywhere near as standard as are
20>definitions for things like "strings" or "graphs".

Some references which use the same definition of algorithm:

1.  Horowitz and Sahni (Fundamentals of Computer Algorithms), section 1.1
(What is an algorithm?), pp 1-2.  Defines Algorithm and Computational
Procedure.

2.  Knuth, (The Art of Computer Programming, V1: Fundamental Algorithms),
section 1.1 (Algorithms), pp 4-5.  Defines Algorithm and Computational
Method.

3.  Aho, Hopcroft and Ullman (Data Structures and Algorithms), section 1
(From Problems to Programs), p2.  Defines Algorithm.

4.  Aho, Hopcroft and Ullman (The Theory of Parsing, Translation and
Compiling; Volume I: Parsing), section 0.4 (Procedures and Algorithms),
pp 25-33.  Defines Algorithm and Procedure.

Do books more directly related to computability theory use the same
definitions, or do they redefine "algorithm"?


20>						Note that a definition
20>of "algorithm" that requires algorithms to stop is undesirable, because that
20>would mean there probably exist things it is impossible for us to determine
20>whether or not they are algorithms.

Congratulations.  You have just discovered what is called "the halting
problem".  Alan Turing got there before you, though.  Not only is there
no algorithm to decide whether or not a computational procedure halts,
there is no computational procedure to determine this, either.


20>Is Aho and Ullman using this "algorithm" in a formal or informal context?

Although they claim they could go further, the treatment is still far more
rigorous and thorough than I have related here.


Computational procedures and Turing Machines have the same power to recognize
strings.  Turing Machines recognize the largest class of languages in the
Chomsky hierarchy (any set of finitely-sized sequences of symbols).  And
recognition must take place in a finite amount of time.  This corroborates
my assertion that any member of NR which is not of finite length is not
a string.


Yet another proof that the formal language theoretic definition of strings
requires them to be of finite length can be constructed from properties
of grammars.  A grammar has finite terminal and nonterminal vocabularies,
and both sides of the productions are of finite length.  If a particular
grammar defines a language, then any string in the language can be derived
from the start symbol by a finite number of applications of productions from
the grammar.  A finite number of rewritings, each of which increases the
number of terminals in a sentential form by a finite number, must result
in a sentence of finite length.


4>Quiz Questions:
4>
4>        We have sort of been assuming (A+B+C)* contains infinitely-long
4>        strings, since NR is a subset of it.  (This depends upon our
4>        definition of "string".)  Demonstrate a set which contains
4>        arbitrarily-long strings but not infinitely-long strings. 
4>. . .

Answer: The set of all legal Fortran programs.

(Just in case some of the readers feel more comfortable with Hollerith
cards than rational numbers, which were given as the first answer).


20>In regard to the theorem that an infinite string is not a member of any
20>language, I can construct a Turing Machine that recognizes a language containing
20>infinite strings, if infinite strings exist at all.  The Turing Machine halts
20>immediately, regardless of input. 

I believe that part of the definition of a Turing Machine string recognition
is that the machine must halt at the far end of the input tape, after the
EOT marker.  If so, the proposed machine only recognizes the empty language,
which certainly does not contain any infinite sequences.  I'll look up
the definition of recognition tomorrow.
				/AHM
335.22KOBAL::GILBERTThu Oct 03 1985 11:5812
To avoid the problems with infinite strings, we could consider a string
as a 'decimal' expansion of some value:

	0.abacab...
	0.121312...

Then the argument proceeds: Given a set containing a countably infinite number
of rational numbers, the set must contain an irrational number.  This is absurd,
but brings the major point -- that of convergence -- to the forefront again.
We want to find a series of closer, and arbitrarily close, approximations to
some irrational number, which we can choose.  I don't know whether this is
true when framed this way, but suspect it's a well-known result.
335.23BEING::POSTPISCHILThu Oct 03 1985 12:2522
Re .21:

> If you want to use the definition of "string" which predominates in
> formal language and automata theory, you must accept that they are
> defined as finite sequences of symbols.  Otherwise, you have a large
> job ahead of you - redefining formal language and automata theory (and
> computability theory as well).

It's rather strange that you say this after I have stated that we are using
a non-standard definition of string and explained why it is not necessary
to redefine formal language and automata theory.

Obviously, we will have to extend the definition of "*", so that S* denotes
the set of all strings formed by concatenating strings of S, including an
infinite number of concatenations.

I am not concerned with words, so please change references to "algorithm"
in previous discussions and proofs to whatever term you desire.  The statements
will hold.


				-- edp
335.24RANI::LEICHTERJSun Oct 06 1985 03:1244
As it happens I've done a fair amount of work with computational complexity,
etc.  Be aware that the careful formality of introductory texts is not neces-
sarily followed in more advanced work.  (Also, a book like Knuth, which isn't
introductory but IS meant to be a reference will tend to be more formal - and
may try to "standardize" a notation or terminology.  It may or may not succeed.
The current use of O(n), Theta(n), and Omega(n) was proposed as a "standard"
by Knuth in a article a number of years back; in this case, people thought
Knuth had a good idea and went along.)

"Algorithm" is generally used informally.  Halting algorithms, non-halting
algorithms, non-deterministic algorithms, probablistic algorithms (of many
varieties) are all discussed freely.  When you want to be formal, you tend
not to talk of "an algorithm" but of "a Turing machine" or "a non-deterministic
Turing machine" or whatever.

"String" is also used informally.  Depending on context, they may be finite
or infinite.  The formal term of interest here is "language", which is
generally taken to mean a subset of the set of all FINITE strings over some
alphabet.  The Kleene * operator is defined to produce all FINITE repetitions
of its argument; this has nothing to do with whether it produces "strings"
or not.

"Acceptance of a language" is defined in a number of ways.  Sometimes the
head must be at the end of the string when the machine enters the "accept"
state; sometimes it must return to the starting position; usually it doesn't
matter.  For finite strings, these are all trivially equivalent.  The ques-
tion of "infinite strings" doesn't normally arise because of the definition
of a computation of a TM:  The input tape is assumed to contain blanks in all
but a finite number of cells.  Because of this, the "infinite length" of the
tape on a TM is actually an illusion:  The tape must be unbounded in size, but
any halting computation must of necessity only use a finite amount of tape.
It makes perfect sense to talk about "infinite input tapes", and of accepting
infinite input languages, though if you want "acceptance" to continue to
mean "the machine halts with this inupt in the ACCEPT state", nothing interes-
ting happens - the languages you can recognize are just cylinder sets on the
"finite" languages you could recognize before (i.e., languages that are of
the form px, where p is in some "finite" language and x is ANY suffix.  An
example, trivially recognizable, is L={all strings whose first character is A}.)
The proof is easy....)  I don't know of any work on such "infinite languages",
though there may be some.

Finally, to see if you REALLY understand the infinity lemma, see note 339.

							-- Jerry