Title: Mathematics at DEC
Moderator: RUSURE::EDP
Created: Mon Feb 03 1986
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 2083
Total number of notes: 14613
Is there a closed form expression of the integral of the normal probability distribution function f(x) = (1/sqrt(2*pi))*exp(-0.5*x**2)? I'd like to compute the fraction of a population between given values of x. My CRC book lists the derivatives of this function, and indicates (obviously) that the definite integral for a given range is what I want, but I can't find a closed form expression of the integral itself. My calculus skills are very rusty, and I'm having trouble recalling the right substitutions to integrate a form of exp(f(t))dt. - tom powers]
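For the practical question (the fraction of a population between two values of x), the standard route today is the error function, which is available as `math.erf` in modern Python; a minimal sketch, not part of the original thread:

```python
from math import erf, sqrt

def normal_cdf(x):
    """Standard normal CDF. There is no elementary closed form, but it
    is expressible via the error function: Phi(x) = (1 + erf(x/sqrt(2)))/2."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def fraction_between(a, b):
    """Fraction of a standard normal population with a <= x <= b."""
    return normal_cdf(b) - normal_cdf(a)
```

For example, `fraction_between(-1.0, 1.0)` gives roughly 0.6827, the familiar "68% within one standard deviation" figure.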
T.R | Title | User | Personal Name | Date | Lines |
---|---|---|---|---|---|
1180.1 | Sorry, no such thing ... | COOKIE::PBERGH | Peter Bergh, DTN 523-3007 | Wed Jan 10 1990 13:32 | 2 |
To the best of my knowledge, there is no closed form for the integral of the normal probability distribution function.
1180.2 | No can do, can come close | VMSDEV::HALLYB | The Smart Money was on Goliath | Wed Jan 10 1990 14:23 | 5 |
It is a theorem that there is no closed form (in elementary functions) for the integral. However, I recall there is a 5th-degree polynomial (or so) that gives quite an accurate approximation, if that will suffice.

John
1180.3 | ALLVAX::ROTH | It's a bush recording... | Wed Jan 10 1990 16:04 | 4 | |
See note 1136 and some of the replies - there are some routines that can be adapted nicely to your problem...

- Jim
1180.4 | REGENT::POWERS | Tue Jan 16 1990 12:15 | 16 | ||
> However I recall there's a 5th degree polynomial (or so) that is
> quite an accurate approximation, if that will suffice.

That would be handy....

...as would be some partly tongue-in-cheek background:

1) If we don't have a closed form for the integral, how do we know the total area under the curve is, in fact, 1.00000......?

2) Presuming that the answer to 1) is based on connections with the binomial distribution and the sum of the negative powers of 2, what is the derivation of the form of the curve as an exponential of a function of x**2?

- tom powers]
1180.5 | A partial answer ... | COOKIE::PBERGH | Peter Bergh, DTN 523-3007 | Tue Jan 16 1990 14:47 | 54 |
>> 1) If we don't have a closed form for the integral, how do we
>> know the total area under the curve is, in fact, 1.00000......?

The easiest way that I know of to evaluate I(-infinity, +infinity, e**(-x*x), dx) goes roughly as follows (for infinity, I use the symbol oo).

Consider Z = I(-oo, +oo, e**(-x*x), dx) * I(-oo, +oo, e**(-y*y), dy). Notice that this product is the same as the double integral over the whole (x,y) plane:

    II(-oo, +oo, -oo, +oo, e**(-x*x) * e**(-y*y), dx*dy)

which in turn equals

    II(-oo, +oo, -oo, +oo, e**(-x*x - y*y), dx*dy)

Transforming to polar coordinates (the Jacobian contributes the factor r), we get

    Z = II(0, 2*PI, 0, +oo, r*e**(-r*r), dtheta*dr)

Here we can separate the two variables of integration, so

    Z = I(0, 2*PI, 1, dtheta) * I(0, +oo, r*e**(-r*r), dr)

These two integrals are easily evaluated (the second by the substitution u = r*r), and we get Z = PI. Thus we have proved that I(-oo, +oo, e**(-x*x), dx) = sqrt(PI). (Note that I haven't bothered to quote chapter and verse of the appropriate theorems; the integrands are extremely well behaved, so ordinary Riemann-integration theorems ought to suffice to justify these calculations.)

>> 2) Presuming that the answer to 1) is based on connections with
>> binomial distribution and sum of the negative powers of 2,
>> what is the derivation of the form of the curve as an exponential
>> of a function of x**2?

As you noticed, the binomial distribution does not enter into the proof at all; neither do negative powers of two. I don't know what your question here is aiming at, but I can tell you of a theorem in statistics (the central limit theorem) which I think may answer at least part of your question.

The theorem goes roughly as follows: given a set of independent random variables with the same distribution (note that there is no requirement for them to have a binomial distribution; the central limit theorem doesn't "care" what the distribution of a single random variable is), the sum of N of these random variables, suitably centered and scaled, has a distribution that converges in distribution to a normal distribution. This has often been used to get a quick approximation to a normally distributed random variable (one simply adds enough uniformly distributed random variables and, presto, the sum is approximately normally distributed). (Convergence in distribution means roughly "the probability of the sum falling in a given range converges to the corresponding normal probability as the number of terms in the sum increases".)
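The polar-coordinate result above is easy to check numerically; a crude midpoint-rule sketch in Python (a modern aside, not part of the original thread):

```python
from math import exp, pi, sqrt

# Midpoint-rule approximation of I(-oo, +oo, e**(-x*x), dx), truncated
# to [-10, 10]; the integrand decays so fast that the truncation error
# is far below the quadrature error.
n, lo, hi = 200_000, -10.0, 10.0
h = (hi - lo) / n
total = sum(exp(-(lo + (i + 0.5) * h) ** 2) for i in range(n)) * h
```

The computed `total` agrees with sqrt(PI) to well beyond six decimal places.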
1180.6 | AITG::DERAMO | Daniel V. {AITG,ZFC}:: D'Eramo | Wed Jan 17 1990 01:19 | 8 | |
re .5, I believe that the theorem at the end of reply .5 needs the added condition that the random variables' distribution have a well-defined and finite mean and variance.

Dan
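The quick approximation mentioned in .5 (uniform variables certainly have finite mean and variance) can be sketched as follows; the classic recipe sums 12 Uniform(0,1) draws, which have total mean 6 and total variance 1, and subtracts 6:

```python
import random

random.seed(1)  # deterministic demo

def approx_normal():
    """Approximately standard normal variate via the CLT: the sum of 12
    Uniform(0,1) variables has mean 6 and variance 12 * (1/12) = 1."""
    return sum(random.random() for _ in range(12)) - 6.0

samples = [approx_normal() for _ in range(100_000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

The sample mean comes out near 0 and the sample variance near 1, as the normal approximation predicts.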
1180.7 | ALLVAX::ROTH | It's a bush recording... | Wed Jan 17 1990 05:40 | 21 | |
Re .-1

Yes, that's clearly correct on intuitive grounds; there's no way a PDF that's a set of impulses will converge to a proper Gaussian.

An easy way to see that sums of "nicely distributed" random variables converge to a normal distribution is that the distribution of their sum is the convolution of the individual distributions. Convolution causes smoothing and spreading out; try the simplest case of convolving rectangular pulses - very quickly a bell-shaped curve results. In fact, n-fold convolution of a rectangular pulse gives the uniform B-splines.

       +--+                 +
       |  |                / \
       |  |       ->      /   \      ->   etc.
    ---+  +---         ---+   +---

    one constant        2 linear          3 parabolic
    piece               pieces            pieces...

- Jim
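The convolution picture is easy to reproduce; a pure-Python sketch convolving a rectangular pulse with itself, giving first the triangle and then a bell shape:

```python
def convolve(a, b):
    """Discrete (full) convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

rect = [1.0] * 8              # rectangular pulse
tri = convolve(rect, rect)    # triangular pulse (2 linear pieces)
bell = convolve(tri, rect)    # parabolic pieces; already bell-shaped
```

Each convolution pass yields a symmetric, single-peaked curve that looks more and more Gaussian, exactly the smoothing-and-spreading behavior described above.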
1180.8 | REGENT::POWERS | Wed Jan 17 1990 12:51 | 11 | ||
My reference to the binomial theorem was in regard to the physical demonstration of the normal distribution by dumping balls over a pyramid of pegs and seeing how the normal curve appears as a histogram underneath. The reference to the negative powers of two comes from this demonstration (1/2 probability of left or right for each ball at every peg) and the fact that the sum of 2**(-i) for i=1 to infinity is 1.

Admittedly naive....

- tom]
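The peg-board (Galton board) demonstration is easy to simulate: each ball's final bin count is the number of "right" bounces, which is Binomial(rows, 1/2), and the histogram of bins approaches the bell curve. A sketch:

```python
import random

random.seed(2)  # deterministic demo

def galton(rows=10, balls=20_000):
    """Drop balls through a pyramid of pegs; at each peg a ball goes
    left or right with probability 1/2, so its final bin is the number
    of rightward bounces: a Binomial(rows, 1/2) variable."""
    bins = [0] * (rows + 1)
    for _ in range(balls):
        bins[sum(random.random() < 0.5 for _ in range(rows))] += 1
    return bins

bins = galton()
```

Printing `bins` shows the characteristic symmetric pile-up in the middle bins.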
1180.9 | A confirmation and a refutation | COOKIE::PBERGH | Peter Bergh, DTN 523-3007 | Wed Jan 17 1990 13:20 | 19 |
Re .6: the requirement for a finite variance and a finite expected value is correct.

                 <<< Note 1180.8 by REGENT::POWERS >>>

>> My reference to the binomial theorem was in regard to the physical
>> demonstration of the normal distribution by dumping the balls over
>> the pyramid of pegs and seeing how the normal curve appears as a
>> histogram underneath.

According to a book that I read some twenty years ago ("Theory of Probability" by Gnedenko), the fact that the binomial distribution converges to the normal distribution as the number of trials grows is due to de Moivre and Laplace, so that demonstration is probably a very early example of the occurrence (admittedly, only in the limit) of the normal distribution in nature. Thus, this is not naive; it is an excellent example of the use of the theorem in .5.
1180.10 | counter-example and two incomplete derivations | PULSAR::WALLY | Wally Neilsen-Steinhardt | Thu Jan 18 1990 16:06 | 57 |
re: <<< Note 1180.7 by ALLVAX::ROTH "It's a bush recording..." >>>

> Yes, that's clearly correct on intuitive grounds; there's no way
> a PDF that's a set of impulses will converge to a proper Gaussian.

Consider a PDF which is zero everywhere but x=-1 and x=1, and whose values there are such that the integral over the whole line is 1 (a point mass of 1/2 at each). Obviously we need integrals something like Stieltjes integrals (and I cannot even remember how to spell it!). This has zero mean and finite variance, but the sums of these random variables look like binomial distributions, and converge to the normal distribution in the very qualified sense mentioned earlier. To fail to converge to a normal distribution, the starting distribution has to lack a finite mean or variance, as previously stated.

I have seen two other derivations for the form of the bell-shaped curve. I could not reproduce either when I tried, but maybe if I put down what I remember, somebody else will fill in the gaps.

A: Start with any well-known PDF, like the binomial distribution. Let the parameters in the distribution become very large. Take logs of both sides, and apply Stirling's approximation

    log n! = n log n - n, approximately

to all the factorials. After a bit of algebra (which is what I have forgotten) you end up with something like

    log P = something - (r - n/2)**2 / something

Raise e to the power of both sides and you get the Gaussian. The first term just becomes the normalization constant. Obviously, this only proves that a particular PDF converges to the Gaussian, but it is still interesting.

B: Start with the fact that if you take n samples from any PDF, the mean of the sampling distribution is the mean of the PDF, and the variance of the sampling distribution is the variance of the PDF divided by n. Consider the logarithm of the sampling distribution, and expand it around its mean:

    log P(x-xm) = A0 + A1*(x-xm) + A2*(x-xm)^2 + ...

One part I forgot is how you show that xm is also a maximum and therefore A1=0. Another part is how A2 remains constant while you increase n, so that the variance decreases to zero and, for all the x where P is significantly far from zero, the higher terms may be ignored. The result is the limit

    log P(x-xm) = A0 + A2 * (x-xm)^2

and you raise e to both sides as above.

Neither of these is a proof of the central limit theorem, but they may give you a better feeling for where the Gaussian came from.
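Derivation A can be checked numerically without redoing the Stirling algebra: by the de Moivre-Laplace theorem, the Binomial(n, 1/2) probabilities approach a Gaussian with mean n/2 and variance n/4. A sketch comparing the two for n = 200:

```python
from math import comb, exp, pi, sqrt

n = 200
mu, var = n / 2, n / 4

def binom_pmf(k):
    """Exact Binomial(n, 1/2) probability of k successes."""
    return comb(n, k) / 2 ** n

def gauss(k):
    """Gaussian density with mean n/2 and variance n/4, evaluated at k."""
    return exp(-(k - mu) ** 2 / (2 * var)) / sqrt(2 * pi * var)

# Near the mean, the pointwise agreement is already very close.
errs = [abs(binom_pmf(k) - gauss(k)) for k in range(90, 111)]
```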
1180.11 | EVMS::HALLYB | Fish have no concept of fire | Wed Jul 17 1996 16:04 | 13 | |
Here's a problem I've come across that seems intuitively obvious, but no proof comes to mind other than "visualize it and it's obvious".

Suppose N(I) is the area of the interval I under the standard normal curve. Let I be an interval of length dx containing 0 as an interior point. Let I' be an interval of length dx not containing 0, as an interior point or an endpoint.

Claim: N(I) > N(I') is obviously true. Is there any rigorous way to prove this?

John
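One rigorous route: write I = [a, a+dx] with a < 0 < a+dx, reflect the piece [a, 0] across 0 to [0, -a], and use the facts that the density is symmetric and strictly decreasing on [0, oo). Each reflected or shifted piece of I then strictly dominates the corresponding piece of I', giving N(I) > N(I'). The claim is also easy to check numerically; a Python sketch using erf, with a few arbitrarily chosen intervals:

```python
from math import erf, sqrt

def mass(a, b):
    """Standard normal probability N([a, b]) via the error function."""
    Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    return Phi(b) - Phi(a)

dx = 0.5
I = (-0.1, -0.1 + dx)                                   # 0 is interior
others = [(c, c + dx) for c in (0.05, 0.3, 1.0, 2.0)]   # 0 is outside
```

For every interval in `others`, `mass(*I)` exceeds the interval's mass, consistent with the claim.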
1180.12 | AUSS::GARSON | DECcharity Program Office | Wed Jul 17 1996 22:56 | 46 |