
Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

1533.0. "Simple Probability Problem" by IMTDEV::ROBERTS (Reason, Purpose, Self-esteem) Mon Dec 23 1991 15:50

    I need some help with an easy probability/statistics problem.
    
    Given that a henhouse produces an average of m eggs per day, what's the
    probability that exactly n eggs will be produced on a particular day?
    
    I'm not even sure what the distribution would be. Normal? Poisson?
    
    Dwayne
    
T.R  Title  User  Personal Name  Date  Lines
1533.1. "Sounds like poor phrasing" by VMSDEV::HALLYB (Fish have no concept of fire) Mon Dec 23 1991 16:17 (14 lines)
>    I'm not even sure what the distribution would be. Normal? Poisson?
    
    This looks like a case of insufficient information.  Are there some
    hidden assumptions (such as "a hen produces one egg with probability p and
    zero eggs with probability q=1-p, per day, independently of other hens")?
    
    Alternatively one might assume the distribution is uniform, that the
    number of eggs produced is as likely to be one possible outcome as any
    other.  Which would mean the possibilities are the integers in [0..2m],
    so the probability of any one value selected is 1/(2m+1).
    
    There are probably :-) a thousand other readings of the problem.
    
      John
1533.2. "eggs and probability theory" by STAR::ABBASI Mon Dec 23 1991 16:52 (16 lines)
    also if we know two more values, the extreme values of egg
    production (the best day and the worst day), we can improve the
    probability in .1 even more.
    
    let w=worst daily production of eggs, b=best.
    
    then if n<w or n>b, then the probability of n production is zero 
    
    else probability is 1/(b-w+1) (given uniform distribution over w..b). 
    
    we can improve this more, by saying if n < m then p= 1/(m-w)
    or if n>m then p=1/(b-m). (a rough step toward a bell shape..)
     
    as .1 said, we need more info to solve this. i am sure of that.
    /nasser
    
1533.3. by IMTDEV::ROBERTS (Reason, Purpose, Self-esteem) Mon Dec 23 1991 17:30 (7 lines)
    "then if n<w or n>b, then the probability of n production is zero"
    
    Does this mean that it's impossible to produce less than the worst
    that's ever been done or more than the best that's been done?
    
    Dwayne
    
1533.4. "p=0 does not mean impossible" by STAR::ABBASI Mon Dec 23 1991 18:02 (16 lines)
    no, no, i've learned from this note file that if p=0, then that does not 
    mean it is impossible. 
    
    it makes sense (to me at least) to say that if the worst case
    ever recorded is n and you ask what is the probability of reaching
    lower than that, to say p=0 (given the previous history of 
    observations in that farm only), the larger your data base, the
    more confident one is that p=0 is true, for example, if we measure
    the production from all farms in the world and normalize all numbers
    to some common base, and we still see that m is less than the worst 
    farm eggs production rate then p=0 seems to make sense more....
    
    but i don't know much about probability. so i could be wrong.
    
    /nasser
    
1533.5. by VMSDEV::HALLYB (Fish have no concept of fire) Tue Dec 24 1991 10:53 (6 lines)
>    no, no, i've learned from this note file that if p=0, then that does not 
>    mean it is impossible. 
    
    p=0 means if such happens, then "it's a miracle".
    
    Oops, wrong note...		:-)
1533.6. by ZFC::deramo (Dan D'Eramo) Tue Dec 24 1991 12:06 (13 lines)
If you model it as a Poisson process with parameter m eggs/day,
then the probability of k eggs on a given day is (m^k/k!) e^(-m).
This sums to 1 over all k (of course) and has mean and variance
both m (numerically, but the units are different).

Alternatively, you can take the henhouse to have N hens each
of which is modelled as a Poisson process with parameter m/N
eggs/day.  I think this gives the same distribution for k as
above.  In fact, I think the hens can have different "parameters"
as long as they all add up to m, and still have the same total
distribution as above.  Neat, huh?
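
A minimal sketch to check the claim numerically, assuming a hypothetical
henhouse of three hens with unequal rates (0.5, 1.5 and 2.5 eggs/day, so
m = 4.5). The names poisson_pmf and convolve are illustrative only. It
convolves the per-hen Poisson distributions and compares the result with a
single Poisson distribution of mean m.

```python
import math

def poisson_pmf(k, m):
    """P(k eggs) for a Poisson distribution with mean m: (m^k/k!) e^(-m)."""
    return math.exp(-m) * m**k / math.factorial(k)

def convolve(p, q):
    """pmf of the sum of two independent counts with pmfs p and q (lists indexed by count)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

# Assumed, illustrative rates: unequal hens whose rates add up to m = 4.5 eggs/day.
rates = [0.5, 1.5, 2.5]
m = sum(rates)
K = 40  # truncation point; the tail beyond this is negligible for these rates

total = [1.0]  # pmf of "zero eggs so far"
for r in rates:
    total = convolve(total, [poisson_pmf(k, r) for k in range(K + 1)])

direct = [poisson_pmf(k, m) for k in range(K + 1)]

# Entries 0..K of the convolution are exact, so compare only those.
max_diff = max(abs(a - b) for a, b in zip(total, direct))
print("largest difference between the two distributions:", max_diff)
```

The difference is at the level of floating-point rounding, which agrees
with the statement that only the sum of the per-hen parameters matters.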

Dan
1533.7. "Poisson model seems fishy" by VMSDEV::HALLYB (Fish have no concept of fire) Tue Dec 24 1991 13:06 (7 lines)
> If you model it as a Poisson process with parameter m eggs/day,
    
    Any rationale for this model?  Meaning, farmer Dan, D'You have any
    statistics on egg production, or is this to be regarded as Another
    Possibility We Should Consider?
    
      John
1533.8. by ZFC::deramo (Dan D'Eramo) Tue Dec 24 1991 13:26 (5 lines)
>    Any rationale for this model?

The base note mentioned it. :-)  (And it was easy to analyze.)

Dan
1533.9. "Two approaches -- but still need more info." by CADSYS::COOPER (Topher Cooper) Tue Dec 31 1991 13:38 (49 lines)
    As has been said previously, you need more information to solve this
    problem.

    There are two approaches to solving this problem which would be used in
    practice.

    The first is what has been suggested -- you look at the physical
    circumstances and choose a (simplified) model of what is going on and
    then derive an appropriate family of distributions from it.  You
    would then take actual measurements (e.g., the average number of eggs
    laid per day) and use that to constrain the family (usually by
    selecting a single "most likely" member of the family) to answer your
    question.  In practice you would want to make additional measurements
    to make sure that your model is reasonable.

    Ideally, your model will generate a "single-parameter" family of
    distributions, such as the Poisson distribution (.6) which you can
    fit with your single statistic "m".  Unfortunately, I don't think
    that that is realistic.

    From what little I know of hens and egg-laying, I would guess that the
    best "first order" model would be as follows:  Each of the N hens in
    the henhouse has a probability, p, of laying an egg on any particular
    day.  The probability of each hen laying an egg on any particular day
    is independent of whether or not any other hen or any combination of
    hens lays an egg.  The appropriate distribution in this case is the
    binomial distribution with parameters N and p.  We can use m/N for p,
    and then just "read" the probability for n eggs being laid out of
    the formula for the binomial distribution.  We can check if the model
    is reasonable by measuring the variance of the number of eggs/day laid.
    This should be about m*(N-m)/N.
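
A short sketch of the binomial model just described, with assumed numbers
(N = 20 hens and m = 15 eggs/day are hypothetical, chosen only for
illustration). It reads off the probability of n eggs from the binomial
formula and confirms that the variance of the model equals m*(N-m)/N.

```python
import math

def binomial_pmf(n, N, p):
    """Probability that exactly n of N independent hens lay, each with probability p."""
    return math.comb(N, n) * p**n * (1 - p)**(N - n)

# Assumed, illustrative numbers: 20 hens averaging 15 eggs/day, so p = m/N = 0.75.
N, m = 20, 15
p = m / N
pmf = [binomial_pmf(n, N, p) for n in range(N + 1)]

mean = sum(n * q for n, q in enumerate(pmf))
variance = sum((n - mean) ** 2 * q for n, q in enumerate(pmf))
expected_variance = m * (N - m) / N  # 15 * 5 / 20 = 3.75

print("P(exactly 15 eggs) =", pmf[15])
print("model variance =", variance, " m*(N-m)/N =", expected_variance)
```

If the variance measured from real records is far from this value, the
independence assumption is the first thing to suspect.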

    Note that this does require you to know something more about the
    problem, though, the number of hens in the henhouse.

    The other approach is more pragmatic and is what would be used in
    practice most of the time.  In a sense, it is a special case of the
    first, but is less formal.  The statement would simply be made that
    the relevant distribution will probably be approximately normal
    whatever it "really" is, since it is, roughly speaking, the "sum" of
    the behaviors of a whole bunch of separate hens, and because it seems
    reasonably symmetric (it does not seem significantly more likely to get
    6 eggs more than the average than 6 eggs fewer than the average).
    Therefore a normal distribution with mean m and a variance (or standard
    deviation) which needs to be estimated by looking at the records of
    egg-laying can be used to estimate the probability of n eggs being
    laid.  More information than is provided is still needed, obviously.
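
A rough sketch of this pragmatic approach. The daily records below are
made up for illustration, and treating "exactly n eggs" as the normal
probability of the interval from n - 0.5 to n + 0.5 (a continuity
correction) is an assumption of this sketch, not something stated above.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def normal_prob_exactly(n, mean, sd):
    """Approximate P(exactly n) by the normal probability of [n - 0.5, n + 0.5]."""
    return normal_cdf(n + 0.5, mean, sd) - normal_cdf(n - 0.5, mean, sd)

# Hypothetical egg-laying records (eggs per day) used to estimate mean and spread.
records = [14, 16, 15, 13, 17, 15, 16, 14, 15, 15]
mean = sum(records) / len(records)
variance = sum((x - mean) ** 2 for x in records) / (len(records) - 1)  # sample variance
sd = math.sqrt(variance)

p15 = normal_prob_exactly(15, mean, sd)
p14 = normal_prob_exactly(14, mean, sd)
p16 = normal_prob_exactly(16, mean, sd)
print("estimated P(15 eggs) =", p15)
```

By construction the estimate is symmetric about the mean, so one egg above
and one egg below the average come out equally likely.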

					Topher
1533.10. "a simpler but similar sort of problem" by HANNAH::OSMAN (see HANNAH::IGLOO$:[OSMAN]ERIC.VT240) Tue Dec 31 1991 16:41 (7 lines)
Here's a simpler but similar sort of problem:

	A normal coin has a probability of 50% that a flip will produce heads.
	What is the probability that if we toss this coin 10 times, EXACTLY
	5 times will be heads?

1533.11. "Happy New Year!" by ZFC::deramo (Dan D'Eramo) Tue Dec 31 1991 16:58 (3 lines)
re .-1, just under a quarter, (10!/(5! 5!))/(2^10)
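
The arithmetic can be checked in a couple of lines (the variable names are
only illustrative):

```python
import math

ways = math.comb(10, 5)  # 10!/(5! 5!) = 252 ways to place 5 heads among 10 tosses
p = ways / 2**10         # each of the 2^10 = 1024 sequences is equally likely
print(ways, p)           # 252 0.24609375
```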

Dan
1533.12. "a late reply, but a possible answer to .0 as stated" by CSSE::NEILSEN (Wally Neilsen-Steinhardt) Tue Feb 18 1992 16:04 (69 lines)
.9>    There are two approaches to solving this problem which would be used in
>      practice.

I agree, although there are several more possibilities, for which see below.

.9>    Note that this does require you to know something more about the
>      problem, though, the number of hens in the henhouse.

If the number of hens is large compared to m, then a Poisson distribution 
is a good approximation to the binomial, so you can use it.  This is an
advantage, since the single datum provided is enough to fix the distribution.
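
A small sketch of how good the approximation is, with m = 5 eggs/day and
two assumed flock sizes (N = 10 and N = 1000, both hypothetical). The
log-gamma form of the formulas is used only to avoid overflow for large N.

```python
import math

def binomial_pmf(n, N, p):
    """Binomial probability of n successes in N trials, computed via logarithms."""
    log_comb = math.lgamma(N + 1) - math.lgamma(n + 1) - math.lgamma(N - n + 1)
    return math.exp(log_comb + n * math.log(p) + (N - n) * math.log(1 - p))

def poisson_pmf(k, m):
    """Poisson probability of k events with mean m, computed via logarithms."""
    return math.exp(-m + k * math.log(m) - math.lgamma(k + 1))

m = 5  # assumed average eggs/day

def max_gap(N):
    """Largest pointwise difference between binomial(N, m/N) and Poisson(m)."""
    p = m / N
    return max(abs(binomial_pmf(k, N, p) - poisson_pmf(k, m)) for k in range(N + 1))

gap_small = max_gap(10)    # N only twice m: the approximation is poor
gap_large = max_gap(1000)  # N much larger than m: the approximation is close
print("N=10:", gap_small, " N=1000:", gap_large)
```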

Note that if m is large compared to 1, then the normal distribution is a good
approximation to both Poisson and binomial, so we could use Topher's second 
method in good conscience.

In either case, we should note that extreme values may give misleading 
probabilities, since Poisson has one long tail and normal has two.  We would
not use normal if n was negative, and we would not use either if n was 
greater than 3m or so.

.9>  The probability of each hen laying an egg on any particular day
>    is independent of whether or not any other hen or any combination of
>    hens lays an egg.

In a real henhouse I would expect that this is not a very good approximation.
Weather and quality and quantity of feed should cause correlations in the 
probabilities of a hen laying an egg.  This would make n values far from
the average more probable.

A simple way I could handle this is to make a histogram of actual egg 
production and compare it to the binomial distribution.  I could express
the actual probability as a product of binomial probability and a 
correction factor:

	PA(n) = PB(n) * PC(n)

and then use the usual graphic or numeric techniques to estimate PC(n).
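
One possible numeric version of this, as a sketch only: the daily counts,
the flock size N = 20 and the function name correction_factors are all
hypothetical. PC(n) is estimated as the observed frequency of n divided by
the binomial probability of n, with p fitted from the observed mean.

```python
import math
from collections import Counter

def binomial_pmf(n, N, p):
    """PB(n): probability that exactly n of N independent hens lay."""
    return math.comb(N, n) * p**n * (1 - p)**(N - n)

def correction_factors(daily_counts, N):
    """Return {n: PC(n)} where PC(n) = observed frequency of n / PB(n)."""
    days = len(daily_counts)
    p = sum(daily_counts) / (days * N)  # fit p from the observed mean: p = m/N
    hist = Counter(daily_counts)        # histogram of actual egg production
    return {n: (hist[n] / days) / binomial_pmf(n, N, p) for n in sorted(hist)}

# Hypothetical records: 12 days of egg counts from a 20-hen henhouse.
counts = [15, 14, 16, 15, 11, 19, 15, 13, 17, 15, 18, 12]
pc = correction_factors(counts, N=20)
for n, factor in pc.items():
    print(n, round(factor, 2))
```

With so few days the factors are very noisy; a real estimate would need a
much longer record, and only values of n that were actually observed get
a factor at all.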

I could also hunt up the modified forms of the normal (skew and kurtosis and
all that) and use my histogram to estimate their parameters.

I could also create a model in which I explicitly build in variations over time
of the probability that a hen will lay an egg.  I could assume a form with
parameters for this variation and then estimate these parameters.  I think 
this would involve me in a convolution integral of the normal distribution.
Hmm... I wonder if this would result in a normal with increased variance?
That would simplify my estimations.
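
A partial check of the "increased variance" hunch, under a deliberately
crude assumption: the laying probability takes one of two values (0.60 on
poor days, 0.90 on good days, equally often). These numbers and the name
mixture_pmf are hypothetical. The sketch says nothing about whether the
result is normal in shape, only how its variance compares with a plain
binomial of the same mean.

```python
import math

def binomial_pmf(n, N, p):
    """Probability that exactly n of N independent hens lay, each with probability p."""
    return math.comb(N, n) * p**n * (1 - p)**(N - n)

def mixture_pmf(n, N, day_types):
    """pmf when p itself varies; day_types is a list of (weight, p) with weights summing to 1."""
    return sum(w * binomial_pmf(n, N, p) for w, p in day_types)

N = 20
day_types = [(0.5, 0.60), (0.5, 0.90)]  # assumed: poor days and good days
pmf = [mixture_pmf(n, N, day_types) for n in range(N + 1)]

mean = sum(n * q for n, q in enumerate(pmf))
variance = sum((n - mean) ** 2 * q for n, q in enumerate(pmf))

p_bar = mean / N
binomial_variance = N * p_bar * (1 - p_bar)  # variance if p were fixed at its average
print("mean:", mean, " variance:", variance, " fixed-p variance:", binomial_variance)
```

Here the mean is still 15 eggs/day, but the variance comes out near 12.3
against 3.75 for a fixed p, so day-to-day variation in p does widen the
distribution considerably.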

Contrary to previous statements, it is possible to answer this question, with 
the information given, if you are willing to be subjectivist enough.  I doubt
this would appeal to the average poultry farmer, but I can think of 
circumstances where this approach would be useful.

Based on the problem statement in .0, we know we have a distribution P(n) with
mean value m.  From our general knowledge of hens, we know that 

	P(n)=0 		when n<0.

We can ask what distribution will satisfy these two statements while introducing 
the minimum new information.  We can quantify information as the sum of

	P(n) ln P(n)

so our problem becomes finding the distribution which gives an extremum for 
this sum.  I can't solve problems like this, but when I have seen them solved
the answer usually turns out to be something simple and well known.  So I will
guess that the solution is the Poisson distribution.  This would provide 
another answer to the question in .7.
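
The guess can be tested numerically. The sketch below takes information
to be the sum of P(n) ln P(n), so that minimum information means maximum
entropy, -sum P(n) ln P(n). With only the two constraints used here
(P(n)=0 for n<0 and mean m), the standard Lagrange-multiplier result is
the geometric distribution P(n) = (1-r) r^n with r = m/(1+m), rather than
the Poisson. The value m = 5 and the truncation point K are assumptions
made for illustration.

```python
import math

def entropy(pmf):
    """-sum P ln P over the listed probabilities, skipping zero terms."""
    return -sum(p * math.log(p) for p in pmf if p > 0)

def poisson_pmf(k, m):
    """Poisson probability of k with mean m, computed via logarithms."""
    return math.exp(-m + k * math.log(m) - math.lgamma(k + 1))

def geometric_pmf(k, m):
    """P(k) = (1 - r) r^k on k = 0, 1, 2, ... with r = m/(1+m), which has mean m."""
    r = m / (1 + m)
    return (1 - r) * r**k

m = 5.0  # assumed average eggs/day
K = 400  # truncation point; both tails are negligible beyond this

h_poisson = entropy([poisson_pmf(k, m) for k in range(K + 1)])
h_geometric = entropy([geometric_pmf(k, m) for k in range(K + 1)])
closed_form = (1 + m) * math.log(1 + m) - m * math.log(m)  # entropy of the geometric

print("Poisson entropy:", h_poisson)
print("geometric entropy:", h_geometric, " closed form:", closed_form)
```

The geometric distribution has the larger entropy for the same mean, so
the Poisson answer seems to need some further assumption (for example,
the many-independent-hens picture earlier in this reply) beyond the
information in .0.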