| re: < Note 985.0 by BEING::POSTPISCHIL "Always mount a scratch monkey." >
-< Election Distribution >-
> What can we say about the probability of these samples having been
> arrived at by taking samples from some more even distribution and
> adjusting some of the samples? What statistical tests can be
> applied?
Nothing can be said and no tests can be applied because your hypotheses
are too vague. You need to begin by stating more precisely some
hypotheses, and then tests can be formulated for them. Just to
give a trivial example:
H0: these values are randomly drawn from a population characterized
by a normal distribution with some unknown mean and variance
Any of the usual goodness of fit tests would convince anyone who
did not trust their eyes.
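As a rough illustration of testing an H0 of this kind, here is a minimal sketch of one normality test (Jarque-Bera, chosen here only because it needs nothing beyond the standard library; it is not necessarily the test meant above). The function name and the sample values are hypothetical, and the figures are NOT those from note .0. The p-value relies on an asymptotic chi-square approximation, which is crude for small samples.

```python
import math

def jarque_bera(values):
    """Return (JB statistic, approximate p-value) for H0: values are normal
    with unknown mean and variance."""
    n = len(values)
    mean = sum(values) / n
    # Central moments, dividing by n as the JB statistic does.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Under H0, JB is asymptotically chi-square with 2 degrees of freedom,
    # whose upper tail probability is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

# Hypothetical percentages, invented for illustration only.
sample = [41.2, 44.8, 47.5, 49.1, 50.3, 51.0, 52.6, 54.9, 58.4, 71.7]
jb, p = jarque_bera(sample)
print(f"JB = {jb:.3f}, approximate p = {p:.3f}")
```

A small p-value would count as evidence against H0; a large one only means this particular test found nothing to object to.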
Note that the statistical analysis of election results is quite
a cottage industry, heavily supported by the media, the parties
and the pols. You can see the output all over the place, and infer
from it some of the techniques being used. I've never been on the
inside, so what follows is just inference. Has anybody reading
this actually done or supported this work?
The standard technique is to assume that the vote in a state is
a function of current national economic, political, social and cultural
factors, specific statewide or regional economic, political, social
and cultural characteristics and other unpredictable factors. Various
hypotheses are formulated to express these functions. For example,
urban industrial states tend to vote Democratic unless there is
unusual prosperity or a major foreign threat. Note that the
distribution in .0 is consistent with this kind of assumption.
Various means are used to assign numerical values to all these
factors and characteristics, and factor analysis is used to test
hypotheses. The result is a set of more-or-less well confirmed
statements relating voting outcomes to current factors and local
characteristics. The results I've seen are not too impressive,
but they burn MIPS and keep the pols out of worse trouble.
|
| Darrell Huff, in his classic "How to Lie with Statistics", poses the
question: given that party A beats party B by a votes to b, what is the
probability that party A is ahead of party B *throughout* the process of
counting votes? He states (without proof) that the answer is (a-b)/(a+b).
Andrew.
|
| Re 985.3
>> Darrell Huff, in his classic "How to Lie with Statistics", poses the
>> question: given that party A beats party B by a votes to b, what is the
>> probability that party A is ahead of party B *throughout* the process of
>> counting votes? He states (without proof) that the answer is (a-b)/(a+b).
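The process of "counting votes" can be shown as a path in a diagram where the
coordinates represent :
. x : the number of votes counted so far
. y : the lead of party A over party B among the votes counted so far
Each vote moves the path one step to the right, and one step up (vote for A)
or one step down (vote for B).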
 a-b
  |
  |              End
  |             /
  |            /
  | /\      /\/
  |/  \  /\/
  +----------------------------------- a+b
       \/
The number of distinct paths from Origin to End is equal to the number
of distinct orderings of the a+b votes, where the "a" votes for A are
interchangeable and so are the "b" votes for B :
    distinct paths from Origin to End = C ( a+b, a )
                                      = (a+b)! / (a! * b!)
Let's call that quantity N(a,b), so we have :
    N(a,b) = (a+b)! / (a! * b!)                                  [1]
If the votes are counted in random order, all these paths have equal
probability of occurring. So the probability of each path is equal to
1 / N(a,b)
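Relation [1] can be checked on small cases by listing every possible set of positions for the A votes in the count. This is only a sketch; the function names are hypothetical.

```python
from itertools import combinations
from math import comb, factorial

def n_paths(a, b):
    """N(a,b) from relation [1]: (a+b)! / (a! * b!)."""
    return factorial(a + b) // (factorial(a) * factorial(b))

def n_paths_enumerated(a, b):
    """Count paths by choosing which of the a+b positions hold A votes."""
    return sum(1 for _ in combinations(range(a + b), a))

for a, b in [(3, 2), (5, 3), (6, 1)]:
    # The formula, the binomial coefficient and the enumeration must agree.
    assert n_paths(a, b) == comb(a + b, a) == n_paths_enumerated(a, b)
    print(a, b, n_paths(a, b))
```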
For party A to be strictly ahead of party B throughout the process, the
path must stay above the x axis after leaving the Origin, so it must
go via the point X ( first vote good for Party A ) :
 a-b
  |
  |              . End
  |
  |
  | X
  |/
  +----------------------------------- a+b
   \
    Y
The probability of the first vote being good for A is equal to
    P1 = a/(a+b)
At point X, there remain in the box a-1 votes for party A, and b votes for
party B. So the overall number of paths going from X to End can be deduced
from relation [1] for N (a-1,b). We get
    N (a-1,b) = (a+b-1)! / ( (a-1)! * b! )                       [2]
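Then, during the next votes, the path must never touch the x axis.
By symmetry (reflection), there are as many paths going from Y to End
as there are paths going from X to End which touch the x axis: since End
lies above the axis, every path from Y must reach the axis at some point,
and mirroring the part of the path before its first contact with the axis
turns a path from Y into a path from X which touches the axis, and vice versa.
So the number of paths from X which touch the axis is equal to the
number of paths going from Y to End. At point Y there remain a votes for
party A and b-1 votes for party B, so, using the relation [1] above
for N (a,b-1), we get the number of paths from X which touch the axis :
    N (a,b-1) = (a+b-1)! / ( a! * (b-1)! )                       [3]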
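The symmetry argument can be checked by brute force on small cases: count the paths that start at X (lead of 1, with a-1 votes for A and b votes for B still in the box) and reach the axis, then compare with relation [3]. This is a sketch with a hypothetical function name, not a proof.

```python
from itertools import combinations
from math import comb

def paths_from_x_touching_axis(a, b):
    """Count orderings of the remaining a-1 A votes and b B votes, starting
    from a lead of 1 at X, in which the lead drops to 0 at some point."""
    remaining = a + b - 1
    count = 0
    for a_positions in combinations(range(remaining), a - 1):
        a_set = set(a_positions)
        lead = 1  # lead of A at point X
        for i in range(remaining):
            lead += 1 if i in a_set else -1
            if lead == 0:  # the path has fallen onto the x axis
                count += 1
                break
    return count

for a, b in [(3, 2), (5, 3), (6, 4)]:
    # Relation [3]: N(a,b-1) = C(a+b-1, a)
    assert paths_from_x_touching_axis(a, b) == comb(a + b - 1, a)
```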
So, the probability that A is ahead of B throughout the process is :
    P = P1 * [N (a-1,b) - N (a,b-1)] / [N (a-1,b)]
where
. P1 : probability that the path goes via point X
. N (a-1,b) - N (a,b-1) : paths from X to End which don't fall onto the axis
. N (a-1,b) : all paths from X to End
From [2] and [3], N (a,b-1) / N (a-1,b) = b/a, so
    P = a/(a+b) * (1 - b/a)
We finally get :
    P = (a-b)/(a+b)
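The result can be checked exactly for small a and b by listing every counting order and keeping those where A stays strictly ahead after every vote. This is a minimal sketch assuming all counting orders are equally likely; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def prob_always_ahead(a, b):
    """Exact probability that A is strictly ahead after every counted vote."""
    total = a + b
    good = 0
    for a_positions in combinations(range(total), a):
        a_set = set(a_positions)
        lead = 0
        for i in range(total):
            lead += 1 if i in a_set else -1
            if lead <= 0:  # tie or B ahead: this order does not qualify
                break
        else:
            good += 1  # loop finished without a break: A always ahead
    return Fraction(good, comb(total, a))

for a, b in [(3, 2), (5, 3), (7, 2)]:
    assert prob_always_ahead(a, b) == Fraction(a - b, a + b)
    print(a, b, prob_always_ahead(a, b))
```

The enumeration grows as C(a+b, a), so it is only practical for small vote totals.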
Friendly,
##### Remy #####
|