
Conference rusure::math

Title:                   Mathematics at DEC
Moderator:               RUSURE::EDP
Created:                 Mon Feb 03 1986
Last Modified:           Fri Jun 06 1997
Last Successful Update:  Fri Jun 06 1997
Number of topics:        2083
Total number of notes:   14613

623.0. "let's flip a coin" by AVANTI::OSMAN (and silos to fill before I feep, and silos to fill before I feep) Tue Dec 09 1986 19:03




	Suppose you flip a fair coin, and then continue flipping until
	an equal number of heads and tails have been seen, and write down
	the number of flips that were necessary.

	Now suppose this task is repeated many times.  What is the expected
	average of the numbers written down ?


	Wording this differently, but intending exactly the same thing:
	What is the expected number of flips of a fair coin necessary to
	produce an equal number of heads and tails ?

	/Eric
T.R  Title  User  Personal Name  Date  Lines
623.1  ENGINE::ROTH  Tue Dec 09 1986 19:32  7 lines
    I suspect infinity.  The probability of an equal number of 1 and 0 bits
    in a bitstring of even length 2n is asymptotically 1/sqrt(pi*n), so the
    probability of still flipping for a match after 2n tries also falls off
    only like 1/sqrt(pi*n) -- too slowly for the expected value to be finite...

    This may be closely related to martingales...

    - Jim
623.2  CLT::GILBERT "eager like a child"  Tue Dec 09 1986 21:09  1 line
    What is the probability that you'll stop after exactly 2*n flips?
623.3  "Expected Value diverges"  SQM::HALLYB "Are all the good ones taken?"  Tue Dec 09 1986 23:14  28 lines
    The probability of stopping after 2n flips is given by
        
    	( 2n )      1
    	(    )	*  --- ^ (2n)		a.k.a.    C(2n,n) * 2^(-n)
        ( n  )      2
    
    Where the coefficient on the left is the combination of 2n things
    taken n at a time.
    
    If it exists, the expected value E(X) = sum(X * probability(X))
    would be given by:
    
    	  inf
    	 ----	     ( 2n )
    	 \	2n * (    ) * 2 ^ (-2n)
         /           ( n  )
         ----
         n = 0
    
    For large n, the value of C(2n,n) is approximately 4^n/sqrt(pi*n),
    so the summand is approximately
    
    		2n * 4^n / sqrt(pi * n) * 2^(-2n)
    
    Since 4^n = 2^(2n), the above reduces to 2n / sqrt(pi * n), which
    diverges as n --> oo.  Adding them up gives you no finite total.

    There is therefore no expected value.
623.4  BEING::POSTPISCHIL "Always mount a scratch monkey."  Tue Dec 09 1986 23:38  8 lines
    Re .3:
    
    You need to subtract the probability that you will have stopped before
    reaching 2n flips.  Also, you use 2n for the exponent in the first
    formula, but n for the second.  (2n is correct.)
    
    
    				-- edp 
623.5  "Recurrence Relations and Path Counting?"  RTVAX::BRIDGEWATER  Thu Dec 25 1986 00:45  59 lines
    Re 623.4:
    
    I'm not sure what you meant by subtracting the probability that
    you will have stopped before reaching 2n flips.  I thought you
    meant something like the following:


                               n-1
           ( 2n )   -2n       ----  ( 2i )   -2i
    P(n) = (    )  2      -   \     (    )  2
           (  n )             /     (  i )
                              ----
                               i=1
    where,

    P(n) = probability that the result of a series of 2n flips of a
    fair coin is n heads and n tails and that at no time after the
    first 2i flips, for 0 < i < n, was there a result of exactly i
    heads and i tails.

    Unfortunately, the above formula is not correct.  What should be
    subtracted for each i is the probability of a *first* tie at 2i
    followed by a tie at 2n; the formula instead subtracts the whole
    probability of a tie at 2i, and those events overlap for different
    i, so the subtraction over-counts and can even go negative.

    A quick example (n=3) illustrates that the formula is incorrect:
                 
    P(3) = 20*(1/64) - 2*(1/4) - 6*(1/16) = -9/16

    A negative probability here shows that something is wrong with the
    formula.

    I have been looking at this problem by trying to solve recurrence
    relation equations I found by looking at the probability graph
    (similar to Pascal's triangle).  Unfortunately, I haven't been
    able to solve them for P(n) yet.  It appears that the expected
    number of flips is infinite, but I haven't generated a proof of
    that yet.

    I think the answer might be found by counting the paths in the
    probability graph that go from the node representing 0 tosses
    and ending in the node representing n heads and n tails.  Then
    subtract out the paths that go through nodes that represent equal
    heads and tails where this number of heads or tails is in the
    doubly inclusive range of 1 to n-1.  Each of these paths is
    equiprobable since the coin is fair at every flip.  Divide this
    path count by 2^(2n) and you have calculated P(n).  Unfortunately,
    I haven't (yet!) figured out how to count the paths so that the
    result is a non-recursive function of n.

    There must be a closed-form solution for P(n) lurking somewhere!
    Once we have it, we should be able to show whether the infinite
    series involved in calculating the expected number of tosses
    converges or not.  Intuition and random bits of memory related
    to stationary/non-stationary random processes tell me that the
    expected value is infinite.  But, where's the beef?  ... er, I
    mean, where's the proof?  :^)

    - Don
623.6  "A recursive representation of P(n)"  RTVAX::BRIDGEWATER  Thu Dec 25 1986 02:04  36 lines
    Re 623.5:

    I have found an equation for P(n) but I am having trouble making
    it non-recursive.


                         n-1
           ( 2n )       ----  ( 2n-2i )   2i
           (    )   -   \     (       )  2    P(i)
           (  n )       /     (  n-i  )
                        ----
                         i=1
    P(n) =  ---------------------------------------  ,    for n>0
                             2n
                            2

    where,

    P(n) = probability that the result of a series of 2n flips of a
    fair coin is n heads and n tails and that at no time after the
    first 2i flips, for 0 < i < n, was there a result of exactly i
    heads and i tails.

    The equation for the expected value of the number of flips before
    halting (E) is:

          oo
         ----
         \
    E =   >   2nP(n)
         /
         ----
          n=1


    - Don
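
A small C sketch of this recurrence (double precision, with an arbitrary
cutoff MAXN; the binom helper is just C(m,k)) gives P(1) = 1/2, P(2) = 1/8,
P(3) = 1/16, ... and partial sums of E that keep creeping upward:
-----------------------------------------------------------------------
#include <stdio.h>

#define MAXN 30                         /* arbitrary cutoff, just for the table */

/* C(m,k) as a double -- adequate for the small m used here */
static double binom(int m, int k)
{
    double r = 1.0;
    int i;
    if (k < 0 || k > m) return 0.0;
    for (i = 1; i <= k; i++)
        r = r * (m - k + i) / i;
    return r;
}

int main(void)
{
    double P[MAXN + 1];                 /* P[n] = prob. of first equalization at 2n flips */
    double pow4[MAXN + 1];              /* pow4[n] = 4^n = 2^(2n) */
    double E = 0.0;                     /* partial sum of 2n * P(n) */
    int n, i;

    pow4[0] = 1.0;
    for (n = 1; n <= MAXN; n++)
        pow4[n] = 4.0 * pow4[n - 1];

    for (n = 1; n <= MAXN; n++) {
        double num = binom(2 * n, n);
        for (i = 1; i < n; i++)         /* remove strings whose first tie came at flip 2i */
            num -= binom(2 * (n - i), n - i) * pow4[i] * P[i];
        P[n] = num / pow4[n];
        E += 2.0 * n * P[n];
        printf("n=%2d  P(n)=%.6f  partial E=%.4f\n", n, P[n], E);
    }
    return 0;
}
-----------------------------------------------------------------------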
623.7  "fudge factor"  VINO::JMUNZER  Mon Dec 29 1986 12:44  15 lines
    Seems that the probability of stopping after 2n flips is
        
	    	    1	      ( 2n )      1
		 --------  *  (    )  *  --- ^ (2n)
		  2n - 1      ( n  )      2
    
    E.g. for n = 3:

	 1        6!         1           1              1       1
	---  *  -------  *  --- ^ 6  =  ---  *  20  *  ---  =  ---
	 5      3! * 3!      2           5              64      16
                                                             
    Note there are four ways to stop -- HHHTTT, HHTHTT, TTTHHH, TTHTHH.

    John
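
A brute-force check of this (a sketch -- it enumerates all 2^(2n) head/tail
strings, so only small n are practical): count the strings whose first tie
occurs at flip 2n and compare with C(2n,n)/(2n-1).  For n = 3 it finds
exactly the four strings listed above.
-----------------------------------------------------------------------
#include <stdio.h>

/* C(m,k) as a long -- small m only */
static long binom(int m, int k)
{
    long r = 1;
    int i;
    if (k < 0 || k > m) return 0;
    for (i = 1; i <= k; i++)
        r = r * (m - k + i) / i;
    return r;
}

int main(void)
{
    int n;

    for (n = 1; n <= 10; n++) {         /* up to 20 flips: 2^20 strings, still cheap */
        long count = 0, s;
        long total = 1L << (2 * n);     /* all equally likely strings of length 2n */

        for (s = 0; s < total; s++) {
            int diff = 0, flip, first = 0;
            for (flip = 0; flip < 2 * n; flip++) {
                diff += ((s >> flip) & 1) ? 1 : -1;   /* bit set = head, clear = tail */
                if (diff == 0) { first = flip + 1; break; }
            }
            if (first == 2 * n)         /* first equalization exactly at flip 2n */
                count++;
        }
        printf("n=%2d  counted=%6ld  C(2n,n)/(2n-1)=%6ld\n",
               n, count, binom(2 * n, n) / (2 * n - 1));
    }
    return 0;
}
-----------------------------------------------------------------------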
623.8  BEING::POSTPISCHIL "Always mount a scratch monkey."  Mon Dec 29 1986 15:53  37 lines
    The formula in .7, C(2n,n) * 2^(-2n) / (2n-1), matches a recurrence
    relation I have developed.
    
    Let Q(n,l,c) be the number of head-tail strings of length n such that
    H-T is always greater than or equal to l, where H is the number of
    heads at a point in the string and T the number of tails, and H-T is c
    at the end of the string.
    
    Then Q(n,l,c) is zero if abs(c) > n or l > 0 (since H-T is zero
    initially).   If l <= 0, Q(0,l,0) is 1 (the empty string has the
    required properties).  For other values,
    
    	Q(n+1,l,c) = Q(n,l-1,c-1) + Q(n,l+1,c+1),
    
    because we can start with a head and append a string of length n with
    the indicated properties or start with a tail.
    
    However, Q does not count the number of strings for which H-T is zero
    for the second time after n symbols.  This count, R(n), can be found by
    taking a head and a tail and inserting between them a string of length
    n-2 for which H-T does not drop below zero and the change (c) is zero.
    Q only counts strings for which H-T does not drop below a certain
    value, but by symmetry, we can also start with a tail and a head and
    insert a string of length n-2 for which H-T does not exceed zero and
    the change is zero -- and there are the same number of these strings as
    of the other type.
    
    So R(n) = 2 Q(n-2,0,0).
    
    It is fairly simple to write a program which evaluates Q, and it checks
    with the formula.  If a formula can be found for the Q's instead of
    just the special case of Q(n-2,0,0), it will be simple to complete the
    proof that P(n) = R(2n) / 2^(2n) (my Q and R functions use the
    length of the string, P uses half the length of the string). 
    
    
    				-- edp 
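
A sketch of such a program (plain recursion straight from the definition, so
it is exponential and only good for small n; binom is just C(m,k)):
-----------------------------------------------------------------------
#include <stdio.h>

/* Q(n,l,c): number of head/tail strings of length n in which the running
   value of H-T never drops below l and ends at c, per the definition above. */
static long Q(int n, int l, int c)
{
    if (l > 0 || c > n || c < -n) return 0;    /* H-T starts at 0; |H-T| <= length */
    if (n == 0) return (c == 0) ? 1 : 0;       /* the empty string */
    return Q(n - 1, l - 1, c - 1) + Q(n - 1, l + 1, c + 1);
}

/* C(m,k) as a long -- small m only */
static long binom(int m, int k)
{
    long r = 1;
    int i;
    if (k < 0 || k > m) return 0;
    for (i = 1; i <= k; i++)
        r = r * (m - k + i) / i;
    return r;
}

int main(void)
{
    int n;

    for (n = 1; n <= 10; n++) {
        long R = 2 * Q(2 * n - 2, 0, 0);       /* R(2n) = 2 Q(2n-2,0,0) */
        printf("2n=%2d  R(2n)=%6ld  C(2n,n)/(2n-1)=%6ld\n",
               2 * n, R, binom(2 * n, n) / (2 * n - 1));
    }
    return 0;
}
-----------------------------------------------------------------------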
623.9  CLT::GILBERT "eager like a child"  Mon Dec 29 1986 17:45  15 lines
>   However, Q does not count the number of strings for which H-T is zero
>   for the second time after n symbols.

    It took me a while to understand this -- the 'first time' is the
    null string.

>   Then Q(n,l,c) is zero if abs(c) > n or l > 0 (since H-T is zero
>   initially).   If l <= 0, Q(0,l,0) is 1 (the empty string has the
>   required properties).

    Q(n,l,c) is also zero if c < l.


    It may be useful to note that c-l is an invariant among the Q's
    in the recurrence relation.  Then again....
623.10  BEING::POSTPISCHIL "Always mount a scratch monkey."  Mon Dec 29 1986 18:50  33 lines
    Re .9:
    
    I noted that c-l is invariant.  In fact, c = l in all uses of Q.
    Because of this, we can simply refer to Q(n,l).  Collapsing the
    definition of Q gives: 
    
    Q(0,l) = 1 if l = 0 or 0 if l <> 0.
    Q(n+1,l) = 0 if |l| > n or Q(n,l-1)+Q(n,l+1) otherwise.
    
    It turns out that Q(n,l) = C(p,q) - C(p,q-2), where p = n-1 and q =
    (n-l)/2.  When q is not an integer, Q is zero.
    
    For this, we need to use definitions of C(p,q) where C(p,-h) and
    C(p,p+h) are zero for positive values of h, and for fractional values
    of q.  Observation shows the formula is correct for n = 0 and |l| > n.
    
    For the remaining values, consider Q(n+1,l) = Q(n,l-1)+Q(n,l+1).  We
    need to test that C(n,q0) - C(n,q0-2) = C(n-1,q1) - C(n-1,q1-2) +
    C(n-1,q2) - C(n-1,q2-2), where q0 = (n-l)/2, q1 = [(n-1)-(l-1)]/2, and
    q2 = [(n-1)-(l+1)]/2.  A little work shows q0 = q1 = q2+1, so C(n,q0) -
    C(n,q0-2) = C(n-1,q0) - C(n-1,q0-2) + C(n-1,q0-1) - C(n-1,q0-3), which
    is just two simultaneous instances of the well-known formula governing
    Pascal's triangle.
    
    Since R(n) = 2 Q(n-2,0,0), R(n) = 2 [C(n-3,n/2-1) - C(n-3,n/2-3)].
    
    Some arithmetic shows R(n) = C(n,n/2)/(n-1), the formula given earlier.
    
    Now we must find the expected value, the limit as n goes to infinity of
    the sum from i = 1 to n of i C(i,i/2) 2^(-i) / (i-1).
    
    
    				-- edp
623.11  BEING::POSTPISCHIL "Always mount a scratch monkey."  Mon Dec 29 1986 19:52  14 lines
    Re .10:
    
    The series being summed is (odd terms omitted because they are zero): 
    
    	    1   1*3   1*3*5   1*3*5*7
    	1 + - + --- + ----- + ------- + . . .
    	    2   2*4   2*4*6   2*4*6*8
    
    The ratio between the n-th and (n+2)-th terms is one to (n-1)/n.  I
    will have to look up the tests for convergence and divergence, although
    I would guess this series diverges.
    
    
    				-- edp
623.12  BEING::POSTPISCHIL "Always mount a scratch monkey."  Mon Dec 29 1986 23:14  23 lines
    Re .11:
    
    > 	    1   1*3   1*3*5   1*3*5*7
    >  	1 + - + --- + ----- + ------- + . . .
    >	    2   2*4   2*4*6   2*4*6*8
    
    We may ignore the first term while considering divergence.  We may also
    double each term without altering convergence or divergence.  Observe
    that 1 >= 1, 3/2 >= 1, 3*5/(2*4) >= 1, 3*5*7/(2*4*6) >= 1, and so on.  So
    in the doubled series, 2[1/2] >= 1/1, 2[1*3/(2*4)] >= 1/2, 2[1*3*5/(2*4*6)]
    >= 1/3, and so on -- each term is at least as large as the corresponding
    term of the harmonic series, which diverges, so this series also diverges.
    
    Unless I have made an error, I extend this offer:  I will give any
    taker a dollar if they agree to flip a penny, give it to me, and repeat
    this until the number of heads flipped equals the number of tails
    flipped. 
    
    Sure, it's a good bet -- go for it.  You have a better than 90%
    chance of making money on this deal.
    
                                        
    				-- edp
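
For a feel of how slowly this diverges, a short C sketch (the cutoff and the
print points are arbitrary) carries the partial sums out to ten million terms
using the term ratio noted in .11, and prints them next to the 1 + H(n-1)/2
lower bound that the harmonic comparison above provides:
-----------------------------------------------------------------------
#include <stdio.h>

int main(void)
{
    /* term(n) = 2n * C(2n,n) * 2^(-2n) / (2n-1):  1, 1/2, 3/8, 5/16, ...
       Successive terms satisfy term(n+1) = term(n) * (2n-1)/(2n), so the
       partial sums can be carried far out without any big binomials. */
    double term = 1.0, sum = 0.0, harmonic = 0.0;
    long n, next = 10;

    for (n = 1; n <= 10000000L; n++) {
        sum += term;
        if (n == next) {
            /* harmonic currently holds H(n-1) */
            printf("n=%9ld  partial sum=%10.4f  1 + H(n-1)/2 =%10.4f\n",
                   n, sum, 1.0 + 0.5 * harmonic);
            next *= 10;
        }
        harmonic += 1.0 / n;
        term *= (2.0 * n - 1.0) / (2.0 * n);
    }
    return 0;
}
-----------------------------------------------------------------------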
623.13  "proof of Munzer's conjecture"  CLT::GILBERT "eager like a child"  Tue Dec 30 1986 00:14  91 lines
Let S(n,t) be the number of strings of length 2n with H-T = 2t.
Then S(n+1,t) = S(n,t-1) + 2*S(n,t) + S(n,t+1).  Then we
can easily create a table...

	S(n,t)	    n
	        0  1  2  3
	    3		 1
	    2	      1  6
	    1	   1  4 15
	t   0	1  2  6 20
	   -1	   1  4 15
	   -2	      1  6
	   -3		 1

And we see that S(n,t) = C(2n,n+t).

Similarly, let P(n,t) be the number of strings of length 2n, with H-T = 2t,
such that the string contains no non-null proper prefix with H-T = 0.  We get a
table similar to the above, except that the t=0 entries don't participate
in the recurrence.  The effects of this change are easily accounted:

			  n-1
	P(n,t) = S(n,t) - Sum P(j,0) * S(n-j,t)
			  j=1

That is, a change in P(j,0) propagates just like the recurrence for S(n),
so we subtract all these propagated effects.

Now our primary interest is in P(n,0).  Letting P(n) = P(n,0) and
S(n) = S(n,0) = C(2n,n), we have the recurrence:

		      n-1
	P(n) = S(n) - Sum P(j) * S(n-j)
		      j=1

We know that P(1) = 2 (the strings HT and TH), and (courtesy of J.Munzer)
conjecture that P(n) = C(2n,n) / (2n-1).  We prove this by induction.
We verify that C(2,1) / (2-1) = 2 = P(1), assume the conjecture holds
for all j < n, and then proceed to evaluate the expression:

		      n-1
	Q(n) = S(n) - Sum P(j) * S(n-j)
		      j=1

		 n-1
	= S(n) - Sum C(2j,j) C(2n-2j,n-j) / (2j-1)
		 j=1

		 n-1
	= S(n) - Sum 2 C(2j-1,j-1) C(2n-2j,n-j) / (2j-1)
		 j=1

		(since C(r,k) = (r/k) C(r-1,k-1) if k <> 0)

Now, from Eq. (26) in Section 1.2.6 of "The Art of Computer Programming",
we know that:

	 Sum C(r-tk,k) C(s-t(n-k),n-k) r/(r-tk) = C(r+s-tn,n), for integer n
	k>=0
	
We use this with the simultaneous substitutions:  t = -2, k = j-1, r = 1,
s = 0, to give:

	 Sum C(2j-1,j-1) C(2n-2j+2,n-j+1) / (2j-1) = C(2n+1,n), for integer n
	j>=1

And a shift in n gives:

	 Sum C(2j-1,j-1) C(2n-2j,n-j) / (2j-1) = C(2n-1,n-1), for integer n
	j>=1

Noting that the expression being summed in our equation for Q(n) is zero
if j > n, and substituting the above into our equation, we get:

			n-1
	Q(n) = S(n) - 2 Sum C(2j-1,j-1) C(2n-2j,n-j) / (2j-1)
			j=1

		     n
	= S(n) - 2  Sum C(2j-1,j-1) C(2n-2j,n-j) / (2j-1)
		    j=1

		+ 2 C(2n-1,n-1) C(2n-2n,n-n) / (2n-1)

	= C(2n,n) - 2 C(2n-1,n-1) + 2 C(2n-1,n-1) / (2n-1)

And since 2 C(2n-1,n-1) = C(2n,n) for n <> 0, we have:

	Q(n) = C(2n,n) / (2n-1)

This completes the induction that P(n) = C(2n,n) / (2n-1).
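
Eq. (26) does the heavy lifting here; a small C sketch (binom is just C(m,k)
in double precision) checks the shifted form actually used above numerically:
-----------------------------------------------------------------------
#include <stdio.h>

/* C(m,k) as a double -- exact for the small m used here */
static double binom(int m, int k)
{
    double r = 1.0;
    int i;
    if (k < 0 || k > m) return 0.0;
    for (i = 1; i <= k; i++)
        r = r * (m - k + i) / i;
    return r;
}

int main(void)
{
    int n, j;

    /* check:  Sum_{j>=1} C(2j-1,j-1) C(2n-2j,n-j) / (2j-1)  =  C(2n-1,n-1) */
    for (n = 1; n <= 15; n++) {
        double lhs = 0.0;
        for (j = 1; j <= n; j++)        /* terms with j > n vanish */
            lhs += binom(2 * j - 1, j - 1) * binom(2 * n - 2 * j, n - j)
                   / (2 * j - 1);
        printf("n=%2d  sum=%12.1f  C(2n-1,n-1)=%12.1f\n",
               n, lhs, binom(2 * n - 1, n - 1));
    }
    return 0;
}
-----------------------------------------------------------------------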
623.14  "Hallyburton's quite correct"  CLT::GILBERT "eager like a child"  Tue Dec 30 1986 00:35  27 lines
    The expected value diverges.

    The probability of stopping after 2n flips is given by
	
	  1     ( 2n )   -2n
	------  (    )  2    ,  or  C(2n,n) * 2^(-2n) / (2n-1)
	2n - 1  ( n  )
    
    If it exists, the expected value E(X) = sum(X * probability(X))
    would be given by:
    
	oo
	---    2j     ( 2j )   -2j
	>     ------  (    )  2
	---   2j - 1  ( j  )
	j=1
    
    For large j, the value of C(2j,j) is approximately 4^j/sqrt(pi*j),
    so the summand is approximately
    
    		2j * 4^j / sqrt(pi * j) * 2^(-2j) / (2j-1)
    
    Since 4^j = 2^(2j), the above reduces to 2j / (2j-1) / sqrt(pi * j),
    or roughly 1 / sqrt(pi * j).  The summation of j^r diverges for all
    r >= -1, and so this summation also diverges (r = -1/2 > -1).

    There is therefore no expected value.
623.15  "P(n,t) -- the rest of the story"  CLT::GILBERT "eager like a child"  Tue Dec 30 1986 01:33  35 lines
But if we *were* interested in P(n,t), we'd use Eq. (26) to find that

	Sum  C(2j-1,j-1) C(2n-2j,n-j+t) / (2j-1) = C(2n-1,n+t-1)
	j>=1

And so
			  n-1
	P(n,t) = S(n,t) - Sum 2 C(2j-1,j-1) C(2n-2j,n-j+t) / (2j-1)
			  j=1

	= S(n,t) - 2 C(2n-1,n+t-1) + 2 Sum C(2j-1,j-1) C(2n-2j,n-j+t) / (2j-1)
				       j>=n

	= S(n,t) - 2 C(2n-1,n+t-1) + 0, if t < 0

But
	C(2n,n+t) = (2n)/(n+t) C(2n-1,n+t-1), for n+t <> 0,
so that
	C(2n-1,n+t-1) = (n+t)/(2n) C(2n,n+t), for n+t <> 0, n <> 0.

Hence
	P(n,t) = C(2n,n+t) - (n+t)/n C(2n,n+t), for n+t <> 0, n <> 0, t < 0

	P(n,t) = -t/n C(2n,n+t), if -n < t < 0.

By symmetry, P(n,t) = P(n,-t), and

	P(n,t) = |t|/n C(2n,n+t), if 0 < |t| < n

We note that this also gives the correct values for |t| = n (namely, 1),
and for |t| > n (namely, 0).  Thus,

	P(n,t) = |t|/n C(2n,n+t), if t <> 0
and
	P(n,t) = C(2n,n) / (2n-1), if t = 0.
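
A brute-force check of these last formulas (a sketch; N = 6 is arbitrary and
the enumeration is exponential): for each t, count the strings of length 2N
that end at H-T = 2t with no intermediate return to zero, and compare with
|t|/N * C(2N,N+t), or C(2N,N)/(2N-1) when t = 0.
-----------------------------------------------------------------------
#include <stdio.h>

#define N 6                             /* 2N = 12 flips; 2^(2N) = 4096 strings */

/* C(m,k) as a long -- small m only */
static long binom(int m, int k)
{
    long r = 1;
    int i;
    if (k < 0 || k > m) return 0;
    for (i = 1; i <= k; i++)
        r = r * (m - k + i) / i;
    return r;
}

int main(void)
{
    long count[2 * N + 1] = {0};        /* count[t + N]: strings ending at H-T = 2t
                                           with no earlier return to H-T = 0 */
    long s, total = 1L << (2 * N);
    int t;

    for (s = 0; s < total; s++) {
        int diff = 0, flip, touched = 0;
        for (flip = 0; flip < 2 * N; flip++) {
            diff += ((s >> flip) & 1) ? 1 : -1;
            if (diff == 0 && flip < 2 * N - 1) { touched = 1; break; }
        }
        if (!touched)                   /* no non-null proper prefix with H-T = 0 */
            count[diff / 2 + N]++;
    }

    for (t = -N; t <= N; t++) {
        long at = (t < 0) ? -t : t;
        long formula = (t == 0) ? binom(2 * N, N) / (2 * N - 1)
                                : at * binom(2 * N, N + t) / N;
        printf("t=%3d  counted=%5ld  formula=%5ld\n", t, count[t + N], formula);
    }
    return 0;
}
-----------------------------------------------------------------------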
623.16  ".13 - .15"  VINO::JMUNZER  Tue Dec 30 1986 12:23  3 lines
    Very nicely completed, Peter!
    
    John
623.17  "Good problem, Eric!"  AURORA::HALLYB "Are all the good ones taken?"  Tue Dec 30 1986 14:18  5 lines
    A bit more subtle than it appears at first glance.  Looks like it
    would make a fine question for a take-home final in a probability
    course.
    
      John
623.18  "CAUTION: don't accept edp's $1 bet !"  VIDEO::OSMAN "and silos to fill before I feep, and silos to fill before I feep"  Mon Jan 05 1987 18:16  30 lines
Beware of edp's offer of $1 for which you must pay back flipped pennies
until an equal number of heads and tails have been seen !

Although you'll probably win, the rare chance that you lose can be
quite devastating.  You can fairly easily lose thousands of dollars.

Here's a C program that illustrates the possibility of losing lots.
If it weren't for the damping (see the variable "large"), the program
would sooner or later wander into a very long run of flips -- which is
you losing your shirt.
-----------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    long ntries = 0, heads, tails, large = 10000;
    double average = 0.0;

    srand((unsigned) time(NULL));       /* seed the generator */

    while (1)
    {
        heads = tails = 0;
        /* flip until heads == tails, or give up after "large" flips (the
           damping); rand() stands in for the original VMS mth$random call */
        do { if (rand() & 1) heads++; else tails++; }
            while (heads != tails && heads + tails <= large);

        if (heads + tails <= large) printf("\t%ld\n", heads + tails);
            else printf("\t> %ld\n", large);

        ntries++;
        /* running average of trial lengths; truncated trials enter at their
           actual (capped) length, so this still understates the true average */
        printf("%g\n", average = (average * (ntries - 1) + heads + tails) / ntries);
    }
}
-----------------------------------------------------------------------
/Eric
623.19  BEING::POSTPISCHIL "Always mount a scratch monkey."  Wed Oct 25 1989 01:33  9 lines
    Here's a question with a nice answer:  What's the probability the
    difference of the numbers of heads and tails reaches n before it
    returns to 0?
    
    The answer to the above question provides a simple answer for the base
    note.
    
    
    				-- edp 
623.20  BEING::POSTPISCHIL "Always mount a scratch monkey."  Thu Oct 26 1989 11:46  4 lines
    Is anybody working on the question in .19, or should I post the answer?
    
    
    				-- edp 
623.21  "Every little bit helps"  AKQJ10::YARBROUGH "I prefer Pi"  Thu Oct 26 1989 12:09  4 lines
I'm pretty sure it's 2^-n ... since the probabilities sum to 1.  Actually, I
worked this out a long time ago but I'm too rushed to do the details again.

Lynn 
623.22  KOBAL::GILBERT "Ownership Obligates"  Thu Oct 26 1989 13:46  2 lines
    It appears that the answer to .19 is 1/n.  It's a nice answer, but I
    don't yet have a proof.
623.23  BEING::POSTPISCHIL "Always mount a scratch monkey."  Thu Oct 26 1989 16:28  17 lines
    .22 has it:  The probability that the difference reaches n before it
    returns to 0 is 1/n.  I'll let Gilbert work on the proof a bit longer,
    but here's how it applies to the original problem:
    
    If a difference of n is reached, there must have been at least n flips. 
    
    In all of the trials, a difference of 1 is reached.
    In 1/2 of them, a difference of 2 is reached, 1 more than the 1 that
    	is reached in all trials.
    In 1/3 of them, a difference of 3 is reached, 1 more than the 2 that
    	is reached in the above trials.
    
    And so on -- the expected number of flips is at least 1 + 1/2 + 1/3 +
    1/4 + ..., which increases without bound. 
    
    
    				-- edp 
623.24  "isn't this the same as the "Wandering Coin" ?"  HANNAH::OSMAN "see HANNAH::IGLOO$:[OSMAN]ERIC.VT240"  Thu Oct 26 1989 17:07  5 lines
	This sounds quite similar to the "Wandering Coin" problem I posed
	in # 848 in the BOGGLERS notes file.

	/Eric
623.25  "1/n demonstration"  ESCROW::MUNZER  Thu Oct 26 1989 17:50  38 lines
>    Here's a question with a nice answer:  What's the probability the
>    difference of the numbers of heads and tails reaches n before it
>    returns to 0?
    
 **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  ** 

Let P(k,n) = probability that the difference increases from k to n before it
	     decreases from k to zero

Note P(1,n) is the answer to the question.

It's reasonable to say:

		P(n,n) = 1
		P(0,n) = 0

Then for 0 < k < n:

		P(k,n) = 1/2 * P(k-1,n) + 1/2 * P(k+1,n)

Well:		P(1,n) = 1/2 * P(2,n)
		P(2,n) = 1/2 * P(1,n) + 1/2 * P(3,n)
		       = 1/4 * P(2,n) + 1/2 * P(3,n)
		       = 2/3 * P(3,n)
		P(3,n) = 1/2 * P(2,n) + 1/2 * P(4,n)
		       = 1/3 * P(3,n) + 1/2 * P(4,n)
		       = 3/4 * P(4,n)
		.
		.
		P(k,n) = k/(k+1) * P(k+1,n)
		.
		.
		P(n-1,n) = (n-1)/n * 1

P(1,n) = 1/2 * 2/3 * 3/4 * ... * k/(k+1) * ... * (n-1)/n
       = 1/n	

John
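
A Monte Carlo check of the 1/n answer, in the spirit of the program in .18
(a sketch: rand() is only a crude stand-in for a good random source, and the
trial count is arbitrary):
-----------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* One trial: after the first flip the difference is 1; walk it up or down
   by 1 per flip until it hits 0 or n.  Returns 1 if n is reached before 0. */
static int reaches_n_first(int n)
{
    int diff = 1;
    while (diff != 0 && diff != n)
        diff += (rand() & 1) ? 1 : -1;
    return diff == n;
}

int main(void)
{
    long trials = 1000000L;             /* arbitrary; more trials, less noise */
    long hits, i;
    int n;

    srand((unsigned) time(NULL));
    for (n = 2; n <= 10; n++) {
        hits = 0;
        for (i = 0; i < trials; i++)
            hits += reaches_n_first(n);
        printf("n=%2d  estimated P=%.4f  1/n=%.4f\n",
               n, (double) hits / trials, 1.0 / n);
    }
    return 0;
}
-----------------------------------------------------------------------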
623.26  "re .24"  ESCROW::MUNZER  Fri Oct 27 1989 22:10  3 lines
    Absolutely right, Eric.  Please see bogglers 848.15
    
    John