T.R | Title | User | Personal Name | Date | Lines |
---|
623.1 | | ENGINE::ROTH | | Tue Dec 09 1986 19:32 | 7 |
| I suspect infinity. The probability of an equal number of 1 and 0 bits
in an even length bitstring is asymptotically 1/sqrt(2*pi*n) so the
probability of still flipping for a match after n tries approaches 1...
This may be closely related to martingales...
- Jim
|
623.2 | | CLT::GILBERT | eager like a child | Tue Dec 09 1986 21:09 | 1 |
| What is the probability that you'll stop after exactly 2*n flips?
|
623.3 | Expected Value diverges | SQM::HALLYB | Are all the good ones taken? | Tue Dec 09 1986 23:14 | 28 |
| The probability of stopping after 2n flips is given by
( 2n ) 1
( ) * --- ^ (2n) a.k.a. C(2n,n) * 2^(-n)
( n ) 2
Where the coefficient on the left is the combination of 2n things
taken n at a time.
If it exists, the expected value E(X) = sum(X * probability(X))
would be given by:
inf
---- ( 2n )
\ 2n * ( ) * 2 ^ (-2n)
/ ( n )
----
n = 0
For large M, the value of C(2M,M) is approximately 4^M/sqrt(pi*M)
so the summand is approximately
2n * 4^n / sqrt(pi * n) * 2^(-2n)
Since 4^n = 2^(2n), the above reduces to 2n / sqrt(pi * n), which
diverges as n --> oo. Adding them up gives you no finite total.
There is therefore no expected value.
|
623.4 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Tue Dec 09 1986 23:38 | 8 |
| Re .3:
You need to subtract the probability that you will have stopped before
reaching 2n flips. Also, you use 2n for the exponent in the first
formula, but n for the second. (2n is correct.)
-- edp
|
623.5 | Recurrence Relations and Path Counting? | RTVAX::BRIDGEWATER | | Thu Dec 25 1986 00:45 | 59 |
| Re 623.4:
I'm not sure what you meant by subtracting the probability that
you will have stopped before reaching 2n flips. I thought you
meant something like the following:
n-1
( 2n ) -2n ---- ( 2i ) -2i
P(n) = ( ) 2 - \ ( ) 2
( n ) / ( i )
----
i=1
where,
P(n) = probability that the result of a series of 2n flips of a
fair coin is n heads and n tails and that at no time after the
first 2i flips, for 0 < i < n, was there a result of exactly i
heads and i tails.
Unfortunately, the above formula is not correct. It does not
take into account the non-linear distributional effects that
halting the coin flipping when the number of heads and tails are
equal causes in the probability graph.
A quick example (n=3) illustrates that the formula is incorrect:
P(3) = 20*(1/64) - 2*(1/4) - 6*(1/16) = -9/16
A negative probability here shows that something is wrong with the
formula.
I have been looking at this problem by trying to solve recurrence
relation equations I found by looking at the probability graph
(similar to Pascal's triangle). Unfortunately, I haven't been
able to solve them for P(n) yet. It appears that the expected
number of flips is infinite, but I haven't generated a proof of
that yet.
I think the answer might be found by counting the paths in the
probability graph that go from the node representing 0 tosses
and ending in the node representing n heads and n tails. Then
subtract out the paths that go through nodes that represent equal
heads and tails where this number of heads or tails is in the
doubly inclusive range of 1 to n-1. Each of these paths are
equiprobable since the coin is fair at every flip. Divide this
path count by 2^n and you have calculated P(n). Unfortunately,
I haven't (yet!) figured out how to count the paths so that the
result is a non-recursive function of n.
There must be a closed-form solution for P(n) lurking somewhere!
Once we have it, we should be able to show whether the infinite
series involved in calculating the expected number of tosses
converges or not. Intuition and random bits of memory related
to stationary/non-stationary random processes tell me that the
expected value is infinite. But, where's the beef? ... er, I
mean, where's the proof? :^)
- Don
|
623.6 | A recursive representation of P(n) | RTVAX::BRIDGEWATER | | Thu Dec 25 1986 02:04 | 36 |
| Re 623.5:
I have found an equation for P(n) but I am having trouble making
it non-recursive.
n-1
( 2n ) ---- ( 2n-2i ) 2i
( ) - \ ( ) 2 P(i)
( n ) / ( n-i )
----
i=1
P(n) = --------------------------------------- , for n>0
2n
2
where,
P(n) = probability that the result of a series of 2n flips of a
fair coin is n heads and n tails and that at no time after the
first 2i flips, for 0 < i < n, was there a result of exactly i
heads and i tails.
The equation for the expected value of the number of flips before
halting (E) is:
oo
----
\
E = > 2nP(n)
/
----
n=1
- Don
|
623.7 | fudge factor | VINO::JMUNZER | | Mon Dec 29 1986 12:44 | 15 |
| Seems that the probability of stopping after 2n flips is
1 ( 2n ) 1
-------- * ( ) * --- ^ (2n)
2n - 1 ( n ) 2
E.g. for n = 3:
1 6! 1 1 1 1
--- * ------- * --- ^ 6 = --- * 20 * --- = ---
5 3! * 3! 2 5 64 16
Note there are four ways to stop -- HHHTTT, HHTHTT, TTTHHH, TTHTHH.
John
|
623.8 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Mon Dec 29 1986 15:53 | 37 |
| The formula in .7, C(2n,n) * 2^(-2n) / (2n-1), matches a recurrence
relation I have developed.
Let Q(n,l,c) be the number of head-tail strings of length n such that
H-T is always greater than or equal to l, where H is the number of
heads at a point in the string and T the number of tails, and H-T is c
at the end of the string.
Then Q(n,l,c) is zero if abs(c) > n or l > 0 (since H-T is zero
initially). If l <= 0, Q(0,l,0) is 1 (the empty string has the
required properties). For other values,
Q(n+1,l,c) = Q(n,l-1,c-1) + Q(n,l+1,c+1),
because we can start with a head and append a string of length n with
the indicated properties or start with a tail.
However, Q does not count the number of strings for which H-T is zero
for the second time after n symbols. This count, R(n), can be found by
taking a head and a tail and inserting between them a string of length
n-2 for which H-T does not drop below zero and the change (c) is zero.
Q only counts strings for which H-T does not drop below a certain
value, but by symmetry, we can also start with a tail and a head and
insert a string of length n-2 for which H-T does not exceed zero and
the change is zero -- and there are the same number of these strings as
of the other type.
So R(n) = 2 Q(n-2,0,0).
It is fairly simple to write a program which evaluates Q, and it checks
with the formula. If a formula can be found for the Q's instead of
just the special case of Q(n-2,0,0), it will be simple to complete the
proof that the P(n) = R(2n) / 2^(2n) (my Q and R functions use the
length of the string, P uses half the length of the string).
-- edp
|
623.9 | | CLT::GILBERT | eager like a child | Mon Dec 29 1986 17:45 | 15 |
| > However, Q does not count the number of strings for which H-T is zero
> for the second time after n symbols.
It took me a while to understand this -- the 'first time' is the
null string.
> Then Q(n,l,c) is zero if abs(c) > n or l > 0 (since H-T is zero
> initially). If l <= 0, Q(0,l,0) is 1 (the empty string has the
> required properties).
Q(n,l,c) is also zero if c < l.
It may be useful to note that c-l is an anvariant amoung the Q's
in the recurrence relation. Then again....
|
623.10 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Mon Dec 29 1986 18:50 | 33 |
| Re .9:
I noted that c-l is invariant. In fact, c = l in all uses of Q.
Because of this, we can simply refer to Q(n,l). Collapsing the
definition of Q gives:
Q(0,l) = 1 if l = 0 or 0 if l <> 0.
Q(n+1,l) = 0 if |l| > n or Q(n,l-1)+Q(n,l+1) otherwise.
It turns out that Q(n,l) = C(p,q) - C(p,q-2), where p = n-1 and q =
(n-l)/2. When q is not an integer, Q is zero.
For this, we need to use definitions of C(p,q) where C(p,-h) and
C(p,p+h) are zero for positive values of h, and for fractional values
of q. Observation shows the formula is correct for n = 0 and |l| > n.
For the remaining values, consider Q(n+1,l) = Q(n,l-1)+Q(n,l+1). We
need to test that C(n,q0) - C(n,q0-2) = C(n-1,q1) - C(n-1,q1-2) +
C(n-1,q2) - C(n-1,q2-2), where q0 = (n-l)/2, q1 = [(n-1)-(l-1)]/2, and
q2 = [(n-1)-(l+1)]/2. A little work shows q0 = q1 = q2+1, so C(n,q0) -
C(n,q0-2) = C(n-1,q0) - C(n-1,q0-2) + C(n-1,q0-1) - C(n-1,q0-3), which
is just two simultaneous instances of the well-known formula governing
Pascal's triangle.
Since R(n) = 2 Q(n-2,0,0), R(n) = 2 [C(n-3,n/2-1) - C(n-3,n/2-3)].
Some arithmetic shows R(n) = C(n,n/2)/(n-1), the formula given earlier.
Now we must find the expected value, the limit as n goes to infinity of
the sum from i = 1 to n of i C(i,i/2) 2^(-i) / (i-1).
-- edp
|
623.11 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Mon Dec 29 1986 19:52 | 14 |
| Re .10:
The series being summed is (odd terms omitted because they are zero):
1 1*3 1*3*5 1*3*5*7
1 + - + --- + ----- + ------- + . . .
2 2*4 2*4*6 2*4*6*8
The ratio between the n-th and (n+2)-th terms is one to (n-1)/n. I
will have to look up the tests for convergence and divergence, although
I would guess this series diverges.
-- edp
|
623.12 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Mon Dec 29 1986 23:14 | 23 |
| Re .11:
> 1 1*3 1*3*5 1*3*5*7
> 1 + - + --- + ----- + ------- + . . .
> 2 2*4 2*4*6 2*4*6*8
We may ignore the first term while considering divergence. We may also
double each term without altering convergence or divergence. Observe
that 1 >= 1, 3/2 >= 1, 3*5/(2*4) >= 1, (3*5*7) >= 2*4*6*8. So in the
doubled series, 2[1/2] >= 1/1, 2[1*3/(2*4)] >= 1/2, 2[1*3*5/(2*4*6)] >=
1/3, and so on -- each term is greater than the corresponding term of
the harmonic series, which diverges, so this series also diverges.
Unless I have made an error, I extend this offer: I will give any
taker a dollar if they agree to flip a penny, give it to me, and repeat
this until the number of heads flipped equals the number of tails
flipped.
Sure, it's a good bet -- go for it. You have a better than 90%
chance of making money on this deal.
-- edp
|
623.13 | proof of Munzer's conjecture | CLT::GILBERT | eager like a child | Tue Dec 30 1986 00:14 | 91 |
| Let S(n,t) be the number of strings of length 2n with H-T = 2t.
Then S(n+1,t) = S(n-1,t-1) + 2*S(n-1,t) + S(n-1,t+1). Then we
can easily create a table...
S(n,t) n
0 1 2 3
3 1
2 1 6
1 1 4 15
t 0 1 2 6 20
-1 1 4 15
-2 1 6
-3 1
And we see that S(n,t) = C(2n,n+t).
Similarly, let P(n,t) be the number of strings of length 2n, with H-T = 2t,
such that the string contains no non-null prefix with H-T = 0. We get a
table similar to the above, except that the t=0 entries don't participate
in the recurrence. The effects of this change are easily accounted:
n-1
P(n,t) = S(n,t) - Sum P(j,0) * S(n-j,t)
j=1
That is, a change in P(j,0) propagates just like the recurrence for S(n),
so we subtract all these propagated effects.
Now our primary interest is in P(n,0). Letting P(n) = P(n,0) and
S(n) = S(n,0) = C(2n,n), we have the recurrence:
n-1
P(n) = S(n) - Sum P(j) * S(n-j)
j=1
We know that P(1) = 2 (the strings HT and TH), and (courtesy of J.Munzer)
conjecture that P(n) = C(2n,n) / (2n-1). We prove this by induction.
We verify that C(2,1) / (2-1) = 2 = P(1), assume the conjecture holds
for all j < n, and then proceed to evaluate the expression:
n-1
Q(n) = S(n) - Sum P(j) * S(n-j)
j=1
n-1
= S(n) - Sum C(2j,j) C(2n-2j,n-j) / (2j-1)
j=1
n-1
= S(n) - Sum 2 C(2j-1,j-1) C(2n-2j,n-j) / (2j-1)
j=1
(since C(r,k) = (r/k) C(r-1,k-1) if k <> 0)
Now, from Eq. (26) in Section 1.2.6 of "The Art of Computer Programming",
we know that:
Sum C(r-tk,k) C(s-t(n-k),n-k) r/(r-tk) = C(r+s-tn,n), for integer n
k>=0
We use this with the simultaneous substitutions: t = -2, k = j-1, r = 1,
s = 0, to give:
Sum C(2j-1,j-1) C(2n-2j+2,n-j+1) / (2j-1) = C(2n+1,n), for integer n
j>=1
And a shift in n gives:
Sum C(2j-1,j-1) C(2n-2j,n-j) / (2j-1) = C(2n-1,n-1), for integer n
j>=1
Noting that the expression being summed in our equation for Q(n) is zero
if j > n, and substitute the above into our equation, we get:
n-1
Q(n) = S(n) - 2 Sum C(2j-1,j-1) C(2n-2j,n-j) / (2j-1)
j=1
n
= S(n) - 2 Sum C(2j-1,j-1) C(2n-2j,n-j) / (2j-1)
j=1
+ 2 C(2n-1,n-1) C(2n-2n,n-n) / (2n-1)
= C(2n,n) - 2 C(2n-1,n-1) + 2 C(2n-1,n-1) / (2n-1)
And since 2 C(2n-1,n-1) = C(2n,n) for n <> 0, we have:
Q(n) = C(2n,n) / (2n-1)
This completes the induction that P(n) = C(2n,n) / (2n-1).
|
623.14 | Hallyburton's quite correct | CLT::GILBERT | eager like a child | Tue Dec 30 1986 00:35 | 27 |
| The expected value diverges.
The probability of stopping after 2n flips is given by
1 ( 2n ) -2n
------ ( ) 2 , or C(2n,n) * 2^(-2n) / (2n-1)
2n - 1 ( n )
If it exists, the expected value E(X) = sum(X * probability(X))
would be given by:
oo
--- 1 ( 2j ) -2j
> ------ ( ) 2
--- 2j - 1 ( j )
j=0
For large n, the value of C(2n,n) is approximately 4^n/sqrt(pi*n)
so the summand is approximately
2j * 4^j / sqrt(pi * j) * 2^(-2j) / (2j-1)
Since 4^j = 2^(2j), the above reduces to 2j / (2j-1) / sqrt(pi * j),
or roughly 1 / sqrt(pi * j). The summation of j^r diverges for all
r >= -1, and so this summation also diverges (r = -1/2 > -1).
There is therefore no expected value.
|
623.15 | P(n,t) -- the rest of the story | CLT::GILBERT | eager like a child | Tue Dec 30 1986 01:33 | 35 |
| But if we *were* interested in P(n,t), we'd use Eq. (26) to find that
Sum C(2j-1,j-1) C(2n-2j,n-j+t) / (2j-1) = C(2n-1,n+t-1)
j>=1
And so
n-1
P(n,t) = S(n,t) - Sum 2 C(2j-1,j-1) C(2n-2j,n-j+t) / (2j-1)
j=1
= S(n,t) - 2 C(2n-1,n+t-1) + 2 Sum C(2j-1,j-1) C(2n-j,n-j+t) / (2n-1)
j>=n
= S(n,t) - 2 C(2n-1,n+t-1) + 0, if t < 0
But
C(2n,n+t) = (2n)/(n+t) C(2n-1,n+t-1), for n+t <> 0,
so that
C(2n-1,n+t-1) = (n+t)/(2n) C(2n,n+t), for n+t <> 0, n <> 0.
Hence
P(n,t) = C(2n,n+t) - (n+t)/n C(2n,n+t), for n+t <> 0, n <> 0, t < 0
P(n,t) = -t/n C(2n,n+t), if -n < t < 0.
By symmetry, P(n,t) = P(n,-t), and
P(n,t) = |t|/n C(2n,n+t), if 0 < |t| < n
We note that this also gives the correct values for |t| = n (namely, 1),
and for |t| > n (namely, 0). Thus,
P(n,t) = |t|/n C(2n,n+t), if t <> 0
and
P(n,t) = C(2n,n) / (2n-1), if t = 0.
|
623.16 | .13 - .15 | VINO::JMUNZER | | Tue Dec 30 1986 12:23 | 3 |
| Very nicely completed, Peter!
John
|
623.17 | Good problem, Eric! | AURORA::HALLYB | Are all the good ones taken? | Tue Dec 30 1986 14:18 | 5 |
| A bit more subtle than it appears at first glance. Looks like it
would make a fine question for a take-home final in a probability
course.
John
|
623.18 | CAUTION: don't accept edp's $1 bet ! | VIDEO::OSMAN | and silos to fill before I feep, and silos to fill before I feep | Mon Jan 05 1987 18:16 | 30 |
| Beware of edp's offer of $1 for which you must pay back flipped pennies
until an equal number of heads and tails have been seen !
Although you'll probably win, the rare chance that you lose can be
quite devastating. You can fairly easily lose thousands of dollars.
Here's a C program that illustrates the possibility of losing lots.
If it weren't for the damping (see variable "large"), the program
would quickly run into a very long loop, which indicates you losing
your shirt.
-----------------------------------------------------------------------
main (){
int ntries=0, heads, tails, seed, large=10000;
double average;
while (1)
{
heads = tails = 0;
do {if (mth$random(&seed)&1)heads++;else tails++;}
while (heads != tails && heads+tails <= large);
if (heads+tails <= large) printf (" %d\n", heads*2);
else printf (" > %d\n", large);
ntries++;
printf ("%g\n", average = (average * (ntries-1) + heads*2)/ntries);
}
}
-----------------------------------------------------------------------
/Eric
|
623.19 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Wed Oct 25 1989 01:33 | 9 |
| Here's a question with a nice answer: What's the probability the
difference of the numbers of heads and tails reaches n before it
returns to 0?
The answer to the above question provides a simple answer for the base
note.
-- edp
|
623.20 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Thu Oct 26 1989 11:46 | 4 |
| Is anybody working on the question in .19, or should I post the answer?
-- edp
|
623.21 | Every little bit helps | AKQJ10::YARBROUGH | I prefer Pi | Thu Oct 26 1989 12:09 | 4 |
| I'm pretty sure it's 2^-n ... since the probabilities sum to 1. Actualy, I
worked this out a long time ago but I'm too rushed to do the details again.
Lynn
|
623.22 | | KOBAL::GILBERT | Ownership Obligates | Thu Oct 26 1989 13:46 | 2 |
| It appears that the answer to .19 is 1/n. It's a nice answer, but I
don't yet have a proof.
|
623.23 | | BEING::POSTPISCHIL | Always mount a scratch monkey. | Thu Oct 26 1989 16:28 | 17 |
| .22 has it: The probability that the difference reaches n before it
returns to 0 is 1/n. I'll let Gilbert work on the proof a bit longer,
but here's how it applies to the original problem:
If a difference of n is reached, there must have been at least n flips.
In all of the trials, a difference of 1 is reached.
In 1/2 of them, a difference of 2 is reached, 1 more than the 1 that
is reached in all trials.
In 1/3 of them, a difference of 3 is reached, 1 more than the 2 that
is reached in the above trials.
And so on -- the expected number of flips is at least 1 + 1/2 + 1/3 +
1/4 + ..., which increases without bound.
-- edp
|
623.24 | isn't this the same as the "Wandering Coin" ? | HANNAH::OSMAN | see HANNAH::IGLOO$:[OSMAN]ERIC.VT240 | Thu Oct 26 1989 17:07 | 5 |
|
This sounds quite similar to the "Wandering Coin" problem I posed
in # 848 in the BOGGLERS notes file.
/Eric
|
623.25 | 1/n demonstration | ESCROW::MUNZER | | Thu Oct 26 1989 17:50 | 38 |
| > Here's a question with a nice answer: What's the probability the
> difference of the numbers of heads and tails reaches n before it
> returns to 0?
** ** ** ** ** ** ** ** ** ** ** ** ** ** ** **
Let P(k,n) = probability that the difference increases from k to n before it
decreases from k to zero
Note P(1,n) is the answer to the question.
It's reasonable to say:
P(n,n) = 1
P(0,n) = 0
Then for 0 < k < n:
P(k,n) = 1/2 * P(k-1,n) + 1/2 * P(k+1,n)
Well: P(1,n) = 1/2 * P(2,n)
P(2,n) = 1/2 * P(1,n) + 1/2 * P(3,n)
= 1/4 * P(2,n) + 1/2 * P(3,n)
= 2/3 * P(3,n)
P(3,n) = 1/2 * P(2,n) + 1/2 * P(4,n)
= 1/3 * P(3,n) + 1/2 * P(4,n)
= 3/4 * P(3,n)
.
.
P(k,n) = k/(k+1) * P(k+1,n)
.
.
P(n-1,n) = (n-1)/n * 1
P(1,n) = 1/2 * 2/3 * 3/4 * ... * k/(k+1) * ... * (n-1)/n
= 1/n
John
|
623.26 | re .24 | ESCROW::MUNZER | | Fri Oct 27 1989 22:10 | 3 |
| Absolutely right, Eric. Please see bogglers 848.15
John
|