[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

277.0. "shrtst str containing everythin" by SPRITE::OSMAN () Thu May 02 1985 19:32

Here's a problem about substring matching.  I'm looking for the shortest
possible string of letters of the alphabet such that a search for any substring
of 8 (arbitrary choice) or less letters can be found.

Here's one string of letters:

	AAABACADAE...

In this string, we can clearly find AA or AB or AC or AD etc.  We quickly
realize that we can let the AA and AB overlap to reveal a shorter string, such
that AA, AB, AC, AD etc. are still to be found, namely:

	AABACADAE...

How long a string is needed, so all substrings up to 8 letters long can
be found ?
-------------------------------
A little history:  I thought of this problem a few years ago while designing
an EMACS macro.  The macro had internal need for the "search" command.  However,
I didn't want the macro to leave the default search string altered when it
was through executing, lest the user's attempt to issue a "search next" command
look for the macro's internal string rather than the last string the user
searched for.

EMACS didn't (does it now ??) have a command for revealing the current default
search string.  If it did, I'd use it at the beginning of my macro in order
to save it, then restore it at the end.

So, I was thinking about how I might reveal the current default search
string in lieu of a builtin function to do it.  My theoretical solution was
to have a buffer called ALLCOMBOS which would contain an appropriate huge
string, constructed so that ALL possible search strings could be found !
That way, I could figure out the default search string by putting the cursor
at the beginning of that buffer, issuing "search next", marking the spot,
"search previous", after which point and mark would surround sought default
search string.

Of course, this was just theoretical, since such an ALLCOMBOS buffer would
probably be larger than the universe.  However, it does lead to the
aforehereto mentioned substring puzzle.

/Eric

                   
T.RTitleUserPersonal
Name
DateLines
277.1GOLLY::GILBERTFri May 03 1985 03:274
Oddly, there is some *old* work on this problem.  It's similar to the
problem of ringing 'cycles' (I think that's what they're called) on a
carillion, or other sets of chimes.  The Encyclopaedia Brittanica had
a pretty interesting article about this.
277.2METOO::YARBROUGHWed May 08 1985 12:178
Here's another related problem: find a sequence of 1's and 0's with the 
following property: given a boolean function B, the leftmost ocurrence in
the string of bits x and y is immediately followed by B(x,y). (I have used
this to implement Boolean algebra in a SNOBOL program, for example.) Some
interesting relationships fall out of this: for example, AND(x,y) can be
represented by '11100010' and OR(x,y) is represented by its complement,
'00011101'. The symetric function XOR(x,y) can be represented either by
'1011000' or by its reversal, '0001101'. And so forth. - Lynn Yarbrough
277.3R2ME2::GILBERTThu May 09 1985 02:0318
Problem: Given a boolean function B(x,y) construct a sequence of 1s and 0s
such that the leftmost occurence of x,y in the sequence is immediately
followed by B(x,y).

Solution.  We construct the sequence by appending digits to a partial solution.

	(1) Initially, set S (the sequence being constructed) to 00.
	(2) If the 2**2 triples x,y,B(x,y) occur in S, then we are done.
	    Otherwise, let x,y be the last two digits of S.
	(3) If this is the leftmost occurence of x,y in S, then append B(x,y),
	    and go to step (2).
	(4) If y,z does not appear in S (for some z), append z to S, and go to
	    step (2).
	(5) Let x',y' be some pair that doesn't appear in S; append x',y' to S,
	    and go to step (2).

Note that a similar procedure can be used to generate a finite solution for any
function B(x,y,...z) over a finite alphabet.
277.4RANI::LEICHTERJTue May 14 1985 02:0910
"A string over an alphabet of k characters is said to be an n-meander if it
contains every possible substring of length n over the alphabet.  For example,
0001111011001010000 is a 4-meander for the alphabet 01."  This is quoted from
the "Reference Manual for the Icon Programming Language", which proceeds to
give an example Icon program that generates meanders.  It referes to the
following article:  "Minimal Meandering Strings", by James F. Gimpel and
William Keister.  Technical report, Bell Labs, Holmdel, NJ July 1970.

The minimal meandering string has length k^n+n-1.
							-- Jerry
277.5R2ME2::STANFri May 24 1985 05:4020
Re: .0

What you want is very closely related to De Bruijn Sequences.

Given m+1 symbols (0,1,2,...,m), a De Bruijn sequence is the minimal
sequence which when arranged in a circle, contains as subsequences
of consecutive symbols, every sequence of length n of the symbols.

[you don't want the sequence arranged in a circle]

Anyhow, there is much literature on this and related problems.
I suggest you start with

Anthony Ralston, De Bruijn Sequences, Mathematics Magazine, 55(1982)131-143.

See also

S. W. Golomb, Shift Register Sequences, Holden Day, 1967, pp. 118-122

and see also Knuth, volume 2.