
Conference rusure::math

Title:                   Mathematics at DEC
Moderator:               RUSURE::EDP
Created:                 Mon Feb 03 1986
Last Modified:           Fri Jun 06 1997
Last Successful Update:  Fri Jun 06 1997
Number of topics:        2083
Total number of notes:   14613

623.0. "let's flip a coin" by AVANTI::OSMAN (and silos to fill before I feep, and silos to fill before I feep) Tue Dec 09 1986 19:03




	Suppose you flip a fair coin, and then continue flipping until
	an equal number of heads and tails have been seen, and write down
	the number of flips that were necessary.

	Now suppose this task is repeated many times.  What is the expected
	average of the numbers written down ?


	Wording this differently, but intending exactly the same thing:
	What is the expected number of flips of a fair coin necessary to
	produce an equal number of heads and tails ?

	/Eric
T.R  Title  User  Personal Name  Date  Lines
623.1  ENGINE::ROTH  Tue Dec 09 1986 19:32  7 lines
    I suspect infinity.  The probability of an equal number of 1 and 0 bits
    in a bitstring of even length 2n is asymptotically 1/sqrt(pi*n), so the
    probability of still flipping for a match after 2n tries also falls off
    only like 1/sqrt(pi*n) -- too slowly for the expected value to be finite...

    This may be closely related to martingales...

    - Jim
623.2  CLT::GILBERT "eager like a child"  Tue Dec 09 1986 21:09  1 line
    What is the probability that you'll stop after exactly 2*n flips?
623.3  "Expected Value diverges"  SQM::HALLYB "Are all the good ones taken?"  Tue Dec 09 1986 23:14  28 lines
    The probability of stopping after 2n flips is given by
        
    	( 2n )      1
    	(    )	*  --- ^ (2n)		a.k.a.    C(2n,n) * 2^(-n)
        ( n  )      2
    
    Where the coefficient on the left is the combination of 2n things
    taken n at a time.
    
    If it exists, the expected value E(X) = sum(X * probability(X))
    would be given by:
    
    	  inf
    	 ----	     ( 2n )
    	 \	2n * (    ) * 2 ^ (-2n)
         /           ( n  )
         ----
         n = 0
    
    For large n, the value of C(2n,n) is approximately 4^n/sqrt(pi*n),
    so the summand is approximately
    
    		2n * 4^n / sqrt(pi * n) * 2^(-2n)
    
    Since 4^n = 2^(2n), the above reduces to 2n / sqrt(pi * n), which
    diverges as n --> oo.  Adding them up gives you no finite total.

    There is therefore no expected value.
623.4  BEING::POSTPISCHIL "Always mount a scratch monkey."  Tue Dec 09 1986 23:38  8 lines
    Re .3:
    
    You need to subtract the probability that you will have stopped before
    reaching 2n flips.  Also, you use 2n for the exponent in the first
    formula, but n for the second.  (2n is correct.)
    
    
    				-- edp 
623.5  "Recurrence Relations and Path Counting?"  RTVAX::BRIDGEWATER  Thu Dec 25 1986 00:45  59 lines
    Re 623.4:
    
    I'm not sure what you meant by subtracting the probability that
    you will have stopped before reaching 2n flips.  I thought you
    meant something like the following:


                               n-1
           ( 2n )   -2n       ----  ( 2i )   -2i
    P(n) = (    )  2      -   \     (    )  2
           (  n )             /     (  i )
                              ----
                               i=1
    where,

    P(n) = probability that the result of a series of 2n flips of a
    fair coin is n heads and n tails and that at no time after the
    first 2i flips, for 0 < i < n, was there a result of exactly i
    heads and i tails.

    Unfortunately, the above formula is not correct.  What should be
    subtracted for each i is the probability of a *first* tie at 2i
    followed by a tie at 2n; the formula instead subtracts the whole
    probability of a tie at 2i, and those events overlap for different
    i, so the subtraction over-counts and can even go negative.

    A quick example (n=3) illustrates that the formula is incorrect:
                 
    P(3) = 20*(1/64) - 2*(1/4) - 6*(1/16) = -9/16

    A negative probability here shows that something is wrong with the
    formula.

    I have been looking at this problem by trying to solve recurrence
    relation equations I found by looking at the probability graph
    (similar to Pascal's triangle).  Unfortunately, I haven't been
    able to solve them for P(n) yet.  It appears that the expected
    number of flips is infinite, but I haven't generated a proof of
    that yet.

    I think the answer might be found by counting the paths in the
    probability graph that go from the node representing 0 tosses
    and ending in the node representing n heads and n tails.  Then
    subtract out the paths that go through nodes that represent equal
    heads and tails where this number of heads or tails is in the
    doubly inclusive range of 1 to n-1.  Each of these paths is
    equiprobable since the coin is fair at every flip.  Divide this
    path count by 2^(2n) and you have calculated P(n).  Unfortunately,
    I haven't (yet!) figured out how to count the paths so that the
    result is a non-recursive function of n.

    There must be a closed-form solution for P(n) lurking somewhere!
    Once we have it, we should be able to show whether the infinite
    series involved in calculating the expected number of tosses
    converges or not.  Intuition and random bits of memory related
    to stationary/non-stationary random processes tell me that the
    expected value is infinite.  But, where's the beef?  ... er, I
    mean, where's the proof?  :^)

    - Don
623.6  "A recursive representation of P(n)"  RTVAX::BRIDGEWATER  Thu Dec 25 1986 02:04  36 lines
    Re 623.5:

    I have found an equation for P(n) but I am having trouble making
    it non-recursive.


                         n-1
           ( 2n )       ----  ( 2n-2i )   2i
           (    )   -   \     (       )  2    P(i)
           (  n )       /     (  n-i  )
                        ----
                         i=1
    P(n) =  ---------------------------------------  ,    for n>0
                             2n
                            2

    where,

    P(n) = probability that the result of a series of 2n flips of a
    fair coin is n heads and n tails and that at no time after the
    first 2i flips, for 0 < i < n, was there a result of exactly i
    heads and i tails.

    The equation for the expected value of the number of flips before
    halting (E) is:

          oo
         ----
         \
    E =   >   2nP(n)
         /
         ----
          n=1


    - Don
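
A small C sketch of this recurrence (double precision, with an arbitrary
cutoff MAXN; the binom helper is just C(m,k)) gives P(1) = 1/2, P(2) = 1/8,
P(3) = 1/16, ... and partial sums of E that keep creeping upward:
-----------------------------------------------------------------------
#include <stdio.h>

#define MAXN 30                         /* arbitrary cutoff, just for the table */

/* C(m,k) as a double -- adequate for the small m used here */
static double binom(int m, int k)
{
    double r = 1.0;
    int i;
    if (k < 0 || k > m) return 0.0;
    for (i = 1; i <= k; i++)
        r = r * (m - k + i) / i;
    return r;
}

int main(void)
{
    double P[MAXN + 1];                 /* P[n] = prob. of first equalization at 2n flips */
    double pow4[MAXN + 1];              /* pow4[n] = 4^n = 2^(2n) */
    double E = 0.0;                     /* partial sum of 2n * P(n) */
    int n, i;

    pow4[0] = 1.0;
    for (n = 1; n <= MAXN; n++)
        pow4[n] = 4.0 * pow4[n - 1];

    for (n = 1; n <= MAXN; n++) {
        double num = binom(2 * n, n);
        for (i = 1; i < n; i++)         /* remove strings whose first tie came at flip 2i */
            num -= binom(2 * (n - i), n - i) * pow4[i] * P[i];
        P[n] = num / pow4[n];
        E += 2.0 * n * P[n];
        printf("n=%2d  P(n)=%.6f  partial E=%.4f\n", n, P[n], E);
    }
    return 0;
}
-----------------------------------------------------------------------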
623.7  "fudge factor"  VINO::JMUNZER  Mon Dec 29 1986 12:44  15 lines
    Seems that the probability of stopping after 2n flips is
        
	    	    1	      ( 2n )      1
		 --------  *  (    )  *  --- ^ (2n)
		  2n - 1      ( n  )      2
    
    E.g. for n = 3:

	 1        6!         1           1              1       1
	---  *  -------  *  --- ^ 6  =  ---  *  20  *  ---  =  ---
	 5      3! * 3!      2           5              64      16
                                                             
    Note there are four ways to stop -- HHHTTT, HHTHTT, TTTHHH, TTHTHH.

    John
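
A brute-force check of this (a sketch -- it enumerates all 2^(2n) head/tail
strings, so only small n are practical): count the strings whose first tie
occurs at flip 2n and compare with C(2n,n)/(2n-1).  For n = 3 it finds
exactly the four strings listed above.
-----------------------------------------------------------------------
#include <stdio.h>

/* C(m,k) as a long -- small m only */
static long binom(int m, int k)
{
    long r = 1;
    int i;
    if (k < 0 || k > m) return 0;
    for (i = 1; i <= k; i++)
        r = r * (m - k + i) / i;
    return r;
}

int main(void)
{
    int n;

    for (n = 1; n <= 10; n++) {         /* up to 20 flips: 2^20 strings, still cheap */
        long count = 0, s;
        long total = 1L << (2 * n);     /* all equally likely strings of length 2n */

        for (s = 0; s < total; s++) {
            int diff = 0, flip, first = 0;
            for (flip = 0; flip < 2 * n; flip++) {
                diff += ((s >> flip) & 1) ? 1 : -1;   /* bit set = head, clear = tail */
                if (diff == 0) { first = flip + 1; break; }
            }
            if (first == 2 * n)         /* first equalization exactly at flip 2n */
                count++;
        }
        printf("n=%2d  counted=%6ld  C(2n,n)/(2n-1)=%6ld\n",
               n, count, binom(2 * n, n) / (2 * n - 1));
    }
    return 0;
}
-----------------------------------------------------------------------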
623.8  BEING::POSTPISCHIL "Always mount a scratch monkey."  Mon Dec 29 1986 15:53  37 lines
    The formula in .7, C(2n,n) * 2^(-2n) / (2n-1), matches a recurrence
    relation I have developed.
    
    Let Q(n,l,c) be the number of head-tail strings of length n such that
    H-T is always greater than or equal to l, where H is the number of
    heads at a point in the string and T the number of tails, and H-T is c
    at the end of the string.
    
    Then Q(n,l,c) is zero if abs(c) > n or l > 0 (since H-T is zero
    initially).   If l <= 0, Q(0,l,0) is 1 (the empty string has the
    required properties).  For other values,
    
    	Q(n+1,l,c) = Q(n,l-1,c-1) + Q(n,l+1,c+1),
    
    because we can start with a head and append a string of length n with
    the indicated properties or start with a tail.
    
    However, Q does not count the number of strings for which H-T is zero
    for the second time after n symbols.  This count, R(n), can be found by
    taking a head and a tail and inserting between them a string of length
    n-2 for which H-T does not drop below zero and the change (c) is zero.
    Q only counts strings for which H-T does not drop below a certain
    value, but by symmetry, we can also start with a tail and a head and
    insert a string of length n-2 for which H-T does not exceed zero and
    the change is zero -- and there are the same number of these strings as
    of the other type.
    
    So R(n) = 2 Q(n-2,0,0).
    
    It is fairly simple to write a program which evaluates Q, and it checks
    with the formula.  If a formula can be found for the Q's instead of
    just the special case of Q(n-2,0,0), it will be simple to complete the
    proof that P(n) = R(2n) / 2^(2n) (my Q and R functions use the
    length of the string, P uses half the length of the string). 
    
    
    				-- edp 
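
A sketch of such a program (plain recursion straight from the definition, so
it is exponential and only good for small n; binom is just C(m,k)):
-----------------------------------------------------------------------
#include <stdio.h>

/* Q(n,l,c): number of head/tail strings of length n in which the running
   value of H-T never drops below l and ends at c, per the definition above. */
static long Q(int n, int l, int c)
{
    if (l > 0 || c > n || c < -n) return 0;    /* H-T starts at 0; |H-T| <= length */
    if (n == 0) return (c == 0) ? 1 : 0;       /* the empty string */
    return Q(n - 1, l - 1, c - 1) + Q(n - 1, l + 1, c + 1);
}

/* C(m,k) as a long -- small m only */
static long binom(int m, int k)
{
    long r = 1;
    int i;
    if (k < 0 || k > m) return 0;
    for (i = 1; i <= k; i++)
        r = r * (m - k + i) / i;
    return r;
}

int main(void)
{
    int n;

    for (n = 1; n <= 10; n++) {
        long R = 2 * Q(2 * n - 2, 0, 0);       /* R(2n) = 2 Q(2n-2,0,0) */
        printf("2n=%2d  R(2n)=%6ld  C(2n,n)/(2n-1)=%6ld\n",
               2 * n, R, binom(2 * n, n) / (2 * n - 1));
    }
    return 0;
}
-----------------------------------------------------------------------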
623.9  CLT::GILBERT "eager like a child"  Mon Dec 29 1986 17:45  15 lines
>   However, Q does not count the number of strings for which H-T is zero
>   for the second time after n symbols.

    It took me a while to understand this -- the 'first time' is the
    null string.

>   Then Q(n,l,c) is zero if abs(c) > n or l > 0 (since H-T is zero
>   initially).   If l <= 0, Q(0,l,0) is 1 (the empty string has the
>   required properties).

    Q(n,l,c) is also zero if c < l.


    It may be useful to note that c-l is an invariant among the Q's
    in the recurrence relation.  Then again....
623.10  BEING::POSTPISCHIL "Always mount a scratch monkey."  Mon Dec 29 1986 18:50  33 lines
    Re .9:
    
    I noted that c-l is invariant.  In fact, c = l in all uses of Q.
    Because of this, we can simply refer to Q(n,l).  Collapsing the
    definition of Q gives: 
    
    Q(0,l) = 1 if l = 0 or 0 if l <> 0.
    Q(n+1,l) = 0 if |l| > n or Q(n,l-1)+Q(n,l+1) otherwise.
    
    It turns out that Q(n,l) = C(p,q) - C(p,q-2), where p = n-1 and q =
    (n-l)/2.  When q is not an integer, Q is zero.
    
    For this, we need to use definitions of C(p,q) where C(p,-h) and
    C(p,p+h) are zero for positive values of h, and for fractional values
    of q.  Observation shows the formula is correct for n = 0 and |l| > n.
    
    For the remaining values, consider Q(n+1,l) = Q(n,l-1)+Q(n,l+1).  We
    need to test that C(n,q0) - C(n,q0-2) = C(n-1,q1) - C(n-1,q1-2) +
    C(n-1,q2) - C(n-1,q2-2), where q0 = (n-l)/2, q1 = [(n-1)-(l-1)]/2, and
    q2 = [(n-1)-(l+1)]/2.  A little work shows q0 = q1 = q2+1, so C(n,q0) -
    C(n,q0-2) = C(n-1,q0) - C(n-1,q0-2) + C(n-1,q0-1) - C(n-1,q0-3), which
    is just two simultaneous instances of the well-known formula governing
    Pascal's triangle.
    
    Since R(n) = 2 Q(n-2,0,0), R(n) = 2 [C(n-3,n/2-1) - C(n-3,n/2-3)].
    
    Some arithmetic shows R(n) = C(n,n/2)/(n-1), the formula given earlier.
    
    Now we must find the expected value, the limit as n goes to infinity of
    the sum from i = 1 to n of i C(i,i/2) 2^(-i) / (i-1).
    
    
    				-- edp
623.11  BEING::POSTPISCHIL "Always mount a scratch monkey."  Mon Dec 29 1986 19:52  14 lines
    Re .10:
    
    The series being summed is (odd terms omitted because they are zero): 
    
    	    1   1*3   1*3*5   1*3*5*7
    	1 + - + --- + ----- + ------- + . . .
    	    2   2*4   2*4*6   2*4*6*8
    
    The ratio between the n-th and (n+2)-th terms is one to (n-1)/n.  I
    will have to look up the tests for convergence and divergence, although
    I would guess this series diverges.
    
    
    				-- edp
623.12  BEING::POSTPISCHIL "Always mount a scratch monkey."  Mon Dec 29 1986 23:14  23 lines
    Re .11:
    
    > 	    1   1*3   1*3*5   1*3*5*7
    >  	1 + - + --- + ----- + ------- + . . .
    >	    2   2*4   2*4*6   2*4*6*8
    
    We may ignore the first term while considering divergence.  We may also
    double each term without altering convergence or divergence.  Observe
    that 1 >= 1, 3/2 >= 1, 3*5/(2*4) >= 1, 3*5*7/(2*4*6) >= 1, and so on.  So
    in the doubled series, 2[1/2] >= 1/1, 2[1*3/(2*4)] >= 1/2, 2[1*3*5/(2*4*6)]
    >= 1/3, and so on -- each term is at least as large as the corresponding
    term of the harmonic series, which diverges, so this series also diverges.
    
    Unless I have made an error, I extend this offer:  I will give any
    taker a dollar if they agree to flip a penny, give it to me, and repeat
    this until the number of heads flipped equals the number of tails
    flipped. 
    
    Sure, it's a good bet -- go for it.  You have a better than 90%
    chance of making money on this deal.
    
                                        
    				-- edp
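
For a feel of how slowly this diverges, a short C sketch (the cutoff and the
print points are arbitrary) carries the partial sums out to ten million terms
using the term ratio noted in .11, and prints them next to the 1 + H(n-1)/2
lower bound that the harmonic comparison above provides:
-----------------------------------------------------------------------
#include <stdio.h>

int main(void)
{
    /* term(n) = 2n * C(2n,n) * 2^(-2n) / (2n-1):  1, 1/2, 3/8, 5/16, ...
       Successive terms satisfy term(n+1) = term(n) * (2n-1)/(2n), so the
       partial sums can be carried far out without any big binomials. */
    double term = 1.0, sum = 0.0, harmonic = 0.0;
    long n, next = 10;

    for (n = 1; n <= 10000000L; n++) {
        sum += term;
        if (n == next) {
            /* harmonic currently holds H(n-1) */
            printf("n=%9ld  partial sum=%10.4f  1 + H(n-1)/2 =%10.4f\n",
                   n, sum, 1.0 + 0.5 * harmonic);
            next *= 10;
        }
        harmonic += 1.0 / n;
        term *= (2.0 * n - 1.0) / (2.0 * n);
    }
    return 0;
}
-----------------------------------------------------------------------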
623.13  "proof of Munzer's conjecture"  CLT::GILBERT "eager like a child"  Tue Dec 30 1986 00:14  91 lines
Let S(n,t) be the number of strings of length 2n with H-T = 2t.
Then S(n+1,t) = S(n,t-1) + 2*S(n,t) + S(n,t+1).  Then we
can easily create a table...

	S(n,t)	    n
	        0  1  2  3
	    3		 1
	    2	      1  6
	    1	   1  4 15
	t   0	1  2  6 20
	   -1	   1  4 15
	   -2	      1  6
	   -3		 1

And we see that S(n,t) = C(2n,n+t).

Similarly, let P(n,t) be the number of strings of length 2n, with H-T = 2t,
such that the string contains no non-null proper prefix with H-T = 0.  We get a
table similar to the above, except that the t=0 entries don't participate
in the recurrence.  The effects of this change are easily accounted:

			  n-1
	P(n,t) = S(n,t) - Sum P(j,0) * S(n-j,t)
			  j=1

That is, a change in P(j,0) propagates just like the recurrence for S(n),
so we subtract all these propagated effects.

Now our primary interest is in P(n,0).  Letting P(n) = P(n,0) and
S(n) = S(n,0) = C(2n,n), we have the recurrence:

		      n-1
	P(n) = S(n) - Sum P(j) * S(n-j)
		      j=1

We know that P(1) = 2 (the strings HT and TH), and (courtesy of J.Munzer)
conjecture that P(n) = C(2n,n) / (2n-1).  We prove this by induction.
We verify that C(2,1) / (2-1) = 2 = P(1), assume the conjecture holds
for all j < n, and then proceed to evaluate the expression:

		      n-1
	Q(n) = S(n) - Sum P(j) * S(n-j)
		      j=1

		 n-1
	= S(n) - Sum C(2j,j) C(2n-2j,n-j) / (2j-1)
		 j=1

		 n-1
	= S(n) - Sum 2 C(2j-1,j-1) C(2n-2j,n-j) / (2j-1)
		 j=1

		(since C(r,k) = (r/k) C(r-1,k-1) if k <> 0)

Now, from Eq. (26) in Section 1.2.6 of "The Art of Computer Programming",
we know that:

	 Sum C(r-tk,k) C(s-t(n-k),n-k) r/(r-tk) = C(r+s-tn,n), for integer n
	k>=0
	
We use this with the simultaneous substitutions:  t = -2, k = j-1, r = 1,
s = 0, to give:

	 Sum C(2j-1,j-1) C(2n-2j+2,n-j+1) / (2j-1) = C(2n+1,n), for integer n
	j>=1

And a shift in n gives:

	 Sum C(2j-1,j-1) C(2n-2j,n-j) / (2j-1) = C(2n-1,n-1), for integer n
	j>=1

Noting that the expression being summed in our equation for Q(n) is zero
if j > n, and substituting the above into our equation, we get:

			n-1
	Q(n) = S(n) - 2 Sum C(2j-1,j-1) C(2n-2j,n-j) / (2j-1)
			j=1

		     n
	= S(n) - 2  Sum C(2j-1,j-1) C(2n-2j,n-j) / (2j-1)
		    j=1

		+ 2 C(2n-1,n-1) C(2n-2n,n-n) / (2n-1)

	= C(2n,n) - 2 C(2n-1,n-1) + 2 C(2n-1,n-1) / (2n-1)

And since 2 C(2n-1,n-1) = C(2n,n) for n <> 0, we have:

	Q(n) = C(2n,n) / (2n-1)

This completes the induction that P(n) = C(2n,n) / (2n-1).
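
Eq. (26) does the heavy lifting here; a small C sketch (binom is just C(m,k)
in double precision) checks the shifted form actually used above numerically:
-----------------------------------------------------------------------
#include <stdio.h>

/* C(m,k) as a double -- exact for the small m used here */
static double binom(int m, int k)
{
    double r = 1.0;
    int i;
    if (k < 0 || k > m) return 0.0;
    for (i = 1; i <= k; i++)
        r = r * (m - k + i) / i;
    return r;
}

int main(void)
{
    int n, j;

    /* check:  Sum_{j>=1} C(2j-1,j-1) C(2n-2j,n-j) / (2j-1)  =  C(2n-1,n-1) */
    for (n = 1; n <= 15; n++) {
        double lhs = 0.0;
        for (j = 1; j <= n; j++)        /* terms with j > n vanish */
            lhs += binom(2 * j - 1, j - 1) * binom(2 * n - 2 * j, n - j)
                   / (2 * j - 1);
        printf("n=%2d  sum=%12.1f  C(2n-1,n-1)=%12.1f\n",
               n, lhs, binom(2 * n - 1, n - 1));
    }
    return 0;
}
-----------------------------------------------------------------------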
623.14  "Hallyburton's quite correct"  CLT::GILBERT "eager like a child"  Tue Dec 30 1986 00:35  27 lines
    The expected value diverges.

    The probability of stopping after 2n flips is given by
	
	  1     ( 2n )   -2n
	------  (    )  2    ,  or  C(2n,n) * 2^(-2n) / (2n-1)
	2n - 1  ( n  )
    
    If it exists, the expected value E(X) = sum(X * probability(X))
    would be given by:
    
	oo
	---    2j     ( 2j )   -2j
	>     ------  (    )  2
	---   2j - 1  ( j  )
	j=1
    
    For large j, the value of C(2j,j) is approximately 4^j/sqrt(pi*j),
    so the summand is approximately
    
    		2j * 4^j / sqrt(pi * j) * 2^(-2j) / (2j-1)
    
    Since 4^j = 2^(2j), the above reduces to 2j / (2j-1) / sqrt(pi * j),
    or roughly 1 / sqrt(pi * j).  The summation of j^r diverges for all
    r >= -1, and so this summation also diverges (r = -1/2 > -1).

    There is therefore no expected value.
623.15  "P(n,t) -- the rest of the story"  CLT::GILBERT "eager like a child"  Tue Dec 30 1986 01:33  35 lines
But if we *were* interested in P(n,t), we'd use Eq. (26) to find that

	Sum  C(2j-1,j-1) C(2n-2j,n-j+t) / (2j-1) = C(2n-1,n+t-1)
	j>=1

And so
			  n-1
	P(n,t) = S(n,t) - Sum 2 C(2j-1,j-1) C(2n-2j,n-j+t) / (2j-1)
			  j=1

	= S(n,t) - 2 C(2n-1,n+t-1) + 2 Sum C(2j-1,j-1) C(2n-2j,n-j+t) / (2j-1)
				       j>=n

	= S(n,t) - 2 C(2n-1,n+t-1) + 0, if t < 0

But
	C(2n,n+t) = (2n)/(n+t) C(2n-1,n+t-1), for n+t <> 0,
so that
	C(2n-1,n+t-1) = (n+t)/(2n) C(2n,n+t), for n+t <> 0, n <> 0.

Hence
	P(n,t) = C(2n,n+t) - (n+t)/n C(2n,n+t), for n+t <> 0, n <> 0, t < 0

	P(n,t) = -t/n C(2n,n+t), if -n < t < 0.

By symmetry, P(n,t) = P(n,-t), and

	P(n,t) = |t|/n C(2n,n+t), if 0 < |t| < n

We note that this also gives the correct values for |t| = n (namely, 1),
and for |t| > n (namely, 0).  Thus,

	P(n,t) = |t|/n C(2n,n+t), if t <> 0
and
	P(n,t) = C(2n,n) / (2n-1), if t = 0.
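
A brute-force check of these last formulas (a sketch; N = 6 is arbitrary and
the enumeration is exponential): for each t, count the strings of length 2N
that end at H-T = 2t with no intermediate return to zero, and compare with
|t|/N * C(2N,N+t), or C(2N,N)/(2N-1) when t = 0.
-----------------------------------------------------------------------
#include <stdio.h>

#define N 6                             /* 2N = 12 flips; 2^(2N) = 4096 strings */

/* C(m,k) as a long -- small m only */
static long binom(int m, int k)
{
    long r = 1;
    int i;
    if (k < 0 || k > m) return 0;
    for (i = 1; i <= k; i++)
        r = r * (m - k + i) / i;
    return r;
}

int main(void)
{
    long count[2 * N + 1] = {0};        /* count[t + N]: strings ending at H-T = 2t
                                           with no earlier return to H-T = 0 */
    long s, total = 1L << (2 * N);
    int t;

    for (s = 0; s < total; s++) {
        int diff = 0, flip, touched = 0;
        for (flip = 0; flip < 2 * N; flip++) {
            diff += ((s >> flip) & 1) ? 1 : -1;
            if (diff == 0 && flip < 2 * N - 1) { touched = 1; break; }
        }
        if (!touched)                   /* no non-null proper prefix with H-T = 0 */
            count[diff / 2 + N]++;
    }

    for (t = -N; t <= N; t++) {
        long at = (t < 0) ? -t : t;
        long formula = (t == 0) ? binom(2 * N, N) / (2 * N - 1)
                                : at * binom(2 * N, N + t) / N;
        printf("t=%3d  counted=%5ld  formula=%5ld\n", t, count[t + N], formula);
    }
    return 0;
}
-----------------------------------------------------------------------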
623.16  ".13 - .15"  VINO::JMUNZER  Tue Dec 30 1986 12:23  3 lines
    Very nicely completed, Peter!
    
    John
623.17  "Good problem, Eric!"  AURORA::HALLYB "Are all the good ones taken?"  Tue Dec 30 1986 14:18  5 lines
    A bit more subtle than it appears at first glance.  Looks like it
    would make a fine question for a take-home final in a probability
    course.
    
      John
623.18  "CAUTION: don't accept edp's $1 bet !"  VIDEO::OSMAN "and silos to fill before I feep, and silos to fill before I feep"  Mon Jan 05 1987 18:16  30 lines
Beware of edp's offer of $1 for which you must pay back flipped pennies
until an equal number of heads and tails have been seen !

Although you'll probably win, the rare chance that you lose can be
quite devastating.  You can fairly easily lose thousands of dollars.

Here's a C program that illustrates the possibility of losing lots.
If it weren't for the damping (see the variable "large"), the program
would sooner or later wander into a very long run of flips -- which is
you losing your shirt.
-----------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    long ntries = 0, heads, tails, large = 10000;
    double average = 0.0;

    srand((unsigned) time(NULL));       /* seed the generator */

    while (1)
    {
        heads = tails = 0;
        /* flip until heads == tails, or give up after "large" flips (the
           damping); rand() stands in for the original VMS mth$random call */
        do { if (rand() & 1) heads++; else tails++; }
            while (heads != tails && heads + tails <= large);

        if (heads + tails <= large) printf("\t%ld\n", heads + tails);
            else printf("\t> %ld\n", large);

        ntries++;
        /* running average of trial lengths; truncated trials enter at their
           actual (capped) length, so this still understates the true average */
        printf("%g\n", average = (average * (ntries - 1) + heads + tails) / ntries);
    }
}
-----------------------------------------------------------------------
/Eric
623.19  BEING::POSTPISCHIL "Always mount a scratch monkey."  Wed Oct 25 1989 01:33  9 lines
    Here's a question with a nice answer:  What's the probability the
    difference of the numbers of heads and tails reaches n before it
    returns to 0?
    
    The answer to the above question provides a simple answer for the base
    note.
    
    
    				-- edp 
623.20  BEING::POSTPISCHIL "Always mount a scratch monkey."  Thu Oct 26 1989 11:46  4 lines
    Is anybody working on the question in .19, or should I post the answer?
    
    
    				-- edp 
623.21  "Every little bit helps"  AKQJ10::YARBROUGH "I prefer Pi"  Thu Oct 26 1989 12:09  4 lines
I'm pretty sure it's 2^-n ... since the probabilities sum to 1.  Actually, I
worked this out a long time ago but I'm too rushed to do the details again.

Lynn 
623.22  KOBAL::GILBERT "Ownership Obligates"  Thu Oct 26 1989 13:46  2 lines
    It appears that the answer to .19 is 1/n.  It's a nice answer, but I
    don't yet have a proof.
623.23  BEING::POSTPISCHIL "Always mount a scratch monkey."  Thu Oct 26 1989 16:28  17 lines
    .22 has it:  The probability that the difference reaches n before it
    returns to 0 is 1/n.  I'll let Gilbert work on the proof a bit longer,
    but here's how it applies to the original problem:
    
    If a difference of n is reached, there must have been at least n flips. 
    
    In all of the trials, a difference of 1 is reached.
    In 1/2 of them, a difference of 2 is reached, 1 more than the 1 that
    	is reached in all trials.
    In 1/3 of them, a difference of 3 is reached, 1 more than the 2 that
    	is reached in the above trials.
    
    And so on -- the expected number of flips is at least 1 + 1/2 + 1/3 +
    1/4 + ..., which increases without bound. 
    
    
    				-- edp 
623.24  "isn't this the same as the "Wandering Coin" ?"  HANNAH::OSMAN "see HANNAH::IGLOO$:[OSMAN]ERIC.VT240"  Thu Oct 26 1989 17:07  5 lines
	This sounds quite similar to the "Wandering Coin" problem I posed
	in # 848 in the BOGGLERS notes file.

	/Eric
623.25  "1/n demonstration"  ESCROW::MUNZER  Thu Oct 26 1989 17:50  38 lines
>    Here's a question with a nice answer:  What's the probability the
>    difference of the numbers of heads and tails reaches n before it
>    returns to 0?
    
 **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  ** 

Let P(k,n) = probability that the difference increases from k to n before it
	     decreases from k to zero

Note P(1,n) is the answer to the question.

It's reasonable to say:

		P(n,n) = 1
		P(0,n) = 0

Then for 0 < k < n:

		P(k,n) = 1/2 * P(k-1,n) + 1/2 * P(k+1,n)

Well:		P(1,n) = 1/2 * P(2,n)
		P(2,n) = 1/2 * P(1,n) + 1/2 * P(3,n)
		       = 1/4 * P(2,n) + 1/2 * P(3,n)
		       = 2/3 * P(3,n)
		P(3,n) = 1/2 * P(2,n) + 1/2 * P(4,n)
		       = 1/3 * P(3,n) + 1/2 * P(4,n)
		       = 3/4 * P(4,n)
		.
		.
		P(k,n) = k/(k+1) * P(k+1,n)
		.
		.
		P(n-1,n) = (n-1)/n * 1

P(1,n) = 1/2 * 2/3 * 3/4 * ... * k/(k+1) * ... * (n-1)/n
       = 1/n	

John
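
A Monte Carlo check of the 1/n answer, in the spirit of the program in .18
(a sketch: rand() is only a crude stand-in for a good random source, and the
trial count is arbitrary):
-----------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* One trial: after the first flip the difference is 1; walk it up or down
   by 1 per flip until it hits 0 or n.  Returns 1 if n is reached before 0. */
static int reaches_n_first(int n)
{
    int diff = 1;
    while (diff != 0 && diff != n)
        diff += (rand() & 1) ? 1 : -1;
    return diff == n;
}

int main(void)
{
    long trials = 1000000L;             /* arbitrary; more trials, less noise */
    long hits, i;
    int n;

    srand((unsigned) time(NULL));
    for (n = 2; n <= 10; n++) {
        hits = 0;
        for (i = 0; i < trials; i++)
            hits += reaches_n_first(n);
        printf("n=%2d  estimated P=%.4f  1/n=%.4f\n",
               n, (double) hits / trials, 1.0 / n);
    }
    return 0;
}
-----------------------------------------------------------------------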
623.26  "re .24"  ESCROW::MUNZER  Fri Oct 27 1989 22:10  3 lines
    Absolutely right, Eric.  Please see bogglers 848.15
    
    John