
Conference rusure::math

Title:	Mathematics at DEC

Moderator:	RUSURE::EDP

Created:	Mon Feb 03 1986
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	2083
Total number of notes:	14613

985.0. "Election Distribution" by BEING::POSTPISCHIL (Always mount a scratch monkey.) Mon Dec 05 1988 18:42

    Here's a set of percentages of votes cast in each state (and the
    District of Columbia) in a national election.  Perhaps we should throw
    out D.C. as an exception.  There are two columns, being votes cast for
    one of two choices. There were other minor choices; data from them has
    been ignored in computing the percentages. 

    In the center of the range, note this frequency of samples occurring in
    these one-percent ranges (from the left column; the right column is
    100% minus the left column): 

	sub-range	number
	53 to 54	2 **
	52 to 53	3 ***
	51 to 52	2 **
	50 to 51	0 
	49 to 50	0 
	48 to 49	6 ******
	47 to 48	2 **
	46 to 47	3 ***
	45 to 46	1 *

    What can we say about the probability of these samples having been
    arrived at by taking samples from some more even distribution and
    adjusting some of the samples?  What statistical tests can be applied? 
                                                                          
DIS.COLUMBIA   85.61108      14.38892
RHODEISLAND    56.07379      43.92621
IOWA           55.19371      44.80629
HAWAII         54.80628      45.19373
MASS.          53.94824      46.05177
MINNESOTA      53.60203      46.39798
OREGON         52.62353      47.37647
W.VIRGINIA     52.41817      47.58184
NEWYORK        52.03937      47.96063
WISCONSIN      51.81251      48.18749
WASHINGTON     51.34891      48.65109
PENN.          48.80041      51.1996
MARYLAND       48.76353      51.23648
ILLINOIS       48.68293      51.31707
VERMONT        48.59194      51.40806
CALIFORNIA     48.32645      51.67355
MISSOURI       48.15071      51.8493
NEWMEXICO      47.56053      52.43948
CONNECTICUT    47.46092      52.53908
MONTANA        46.99792      53.00209
S.DAKOTA       46.80474      53.19527
COLORADO       46.05035      53.94965
MICHIGAN       45.93818      54.06183
LOUISIANA      44.82543      55.17457
OHIO           44.51544      55.48456
KENTUCKY       44.18671      55.81329
MAINE          44.16273      55.83728
TEXAS          43.61381      56.38619
N.DAKOTA       43.43195      56.56806
KANSAS         43.30046      56.69955
DELAWARE       43.24046      56.75954
NEWJERSEY      42.86363      57.13637
ARKANSAS       42.66707      57.33293
N.CAROLINA     41.93989      58.06012
TENNESSEE      41.90802      58.09199
OKLAHOMA       41.61208      58.38792
ALABAMA        40.33478      59.66522
GEORGIA        40.07535      59.92466
INDIANA        39.92463      60.07537
VIRGINIA       39.74181      60.2582
MISSISSIPPI    39.54387      60.45613
NEBRASKA       39.51819      60.48181
NEVADA         39.18868      60.81132
ARIZONA        39.17765      60.82236
FLORIDA        39.12862      60.87138
WYOMING        38.57417      61.42584
S.CAROLINA     37.99027      62.00974
ALASKA         37.79483      62.20517
IDAHO          36.77346      63.22655
N.HAMPSHIRE    36.71866      63.28135
UTAH           32.64154      67.35847
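
The one-percent frequency table near the top of this note can be rebuilt from the left column. The following is a minimal Python sketch, not part of the original note; the names `LEFT` and `one_percent_bins` are invented here for illustration, and the values are copied from the table above in the same order.

```python
from collections import Counter
from math import floor

# Left-column percentages copied from the table above (D.C. first, Utah last).
LEFT = [
    85.61108, 56.07379, 55.19371, 54.80628, 53.94824, 53.60203, 52.62353,
    52.41817, 52.03937, 51.81251, 51.34891, 48.80041, 48.76353, 48.68293,
    48.59194, 48.32645, 48.15071, 47.56053, 47.46092, 46.99792, 46.80474,
    46.05035, 45.93818, 44.82543, 44.51544, 44.18671, 44.16273, 43.61381,
    43.43195, 43.30046, 43.24046, 42.86363, 42.66707, 41.93989, 41.90802,
    41.61208, 40.33478, 40.07535, 39.92463, 39.74181, 39.54387, 39.51819,
    39.18868, 39.17765, 39.12862, 38.57417, 37.99027, 37.79483, 36.77346,
    36.71866, 32.64154,
]


def one_percent_bins(values, lo=45, hi=54):
    """Count values in each range [k, k+1) for k = hi-1 down to lo."""
    counts = Counter(floor(v) for v in values)
    return {k: counts.get(k, 0) for k in range(hi - 1, lo - 1, -1)}


if __name__ == "__main__":
    print("sub-range\tnumber")
    for start, n in one_percent_bins(LEFT).items():
        print(f"{start} to {start + 1}\t{n} {'*' * n}")
```

The bin edges are half-open, so a value of exactly 50 would land in "50 to 51"; none of the samples sits on an edge, so the choice does not affect the counts here.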

T.R	Title	User	Personal Name	Date	Lines
985.1	analysing election results	PULSAR::WALLY	Wally Neilsen-Steinhardt	`Thu Dec 08 1988 15:08`	41
    re: < Note 985.0 by BEING::POSTPISCHIL "Always mount a scratch monkey." >
                        -< Election Distribution >-

    > What can we say about the probability of these samples having been
    > arrived at by taking samples from some more even distribution and
    > adjusting some of the samples?  What statistical tests can be
    > applied?

    Nothing can be said and no tests can be applied because your
    hypotheses are too vague.  You need to begin by stating more precisely
    some hypotheses, and then tests can be formulated for them.  Just to
    give a trivial example:

        H0: these values are randomly drawn from a population
            characterized by a normal distribution with some unknown
            mean and variance

    Any of the usual goodness-of-fit tests would convince anyone who did
    not trust their eyes.

    Note that the statistical analysis of election results is quite a
    cottage industry, heavily supported by the media, the parties and the
    pols.  You can see the output all over the place, and infer from it
    some of the techniques being used.  I've never been on the inside, so
    what follows is just inference.  Has anybody reading this actually
    done or supported this work?

    The standard technique is to assume that the vote in a state is a
    function of current national economic, political, social and cultural
    factors, specific statewide or regional economic, political, social
    and cultural characteristics, and other unpredictable factors.
    Various hypotheses are formulated to express these functions.  For
    example, urban industrial states tend to vote Democratic unless there
    is unusual prosperity or a major foreign threat.  Note that the
    distribution in .0 is consistent with this kind of assumption.

    Various means are used to assign numerical values to all these
    factors and characteristics, and factor analysis is used to test
    hypotheses.  The result is a set of more-or-less well confirmed
    statements relating voting outcomes to current factors and local
    characteristics.

    The results I've seen are not too impressive, but they burn MIPS and
    keep the pols out of worse trouble.
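
As one hedged illustration of turning the trivial H0 above into a number: fit a normal distribution to the 50 state values from .0 (D.C. dropped, as .0 suggests) and ask how likely it is that none of 50 independent draws lands between 49% and 51%. This is a sketch in Python and not a method proposed in this reply. The name `STATES` is invented here, independence between states is assumed, and the interval was chosen after looking at the data, which overstates how surprising the result is.

```python
from statistics import NormalDist

# Left-column percentages for the 50 states from note .0 (D.C. omitted).
STATES = [
    56.07379, 55.19371, 54.80628, 53.94824, 53.60203, 52.62353, 52.41817,
    52.03937, 51.81251, 51.34891, 48.80041, 48.76353, 48.68293, 48.59194,
    48.32645, 48.15071, 47.56053, 47.46092, 46.99792, 46.80474, 46.05035,
    45.93818, 44.82543, 44.51544, 44.18671, 44.16273, 43.61381, 43.43195,
    43.30046, 43.24046, 42.86363, 42.66707, 41.93989, 41.90802, 41.61208,
    40.33478, 40.07535, 39.92463, 39.74181, 39.54387, 39.51819, 39.18868,
    39.17765, 39.12862, 38.57417, 37.99027, 37.79483, 36.77346, 36.71866,
    32.64154,
]

# H0: values are independent draws from a normal distribution whose mean
# and variance are estimated from the sample itself.
fit = NormalDist.from_samples(STATES)

# Probability that a single draw falls in the empty 49%..51% band.
p_gap = fit.cdf(51.0) - fit.cdf(49.0)

# Probability that all 50 draws miss the band (assumes independence).
p_empty = (1.0 - p_gap) ** len(STATES)

if __name__ == "__main__":
    print(f"fitted mean = {fit.mean:.2f}, stdev = {fit.stdev:.2f}")
    print(f"P(one draw in 49..51) = {p_gap:.4f}")
    print(f"P(no draw in 49..51 among {len(STATES)}) = {p_empty:.4f}")
```

A small value here argues only against this particular H0; it says nothing about why the band is empty.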
985.2	Law of Cubic Proportions	AUSSIE::GARSON	nouveau pauvre	`Mon Aug 16 1993 22:53`	33
985.3	a related question	HERON::BUCHANAN	The was not found.	`Tue Aug 17 1993 09:09`	6
    Darrell Huff, in his classic "How to Lie with Statistics", poses the
    question: given that party A beats party B by a votes to b, what is the
    probability that party A is ahead of party B throughout the process of
    counting votes?  He states (without proof) that the answer is
    (a-b)/(a+b).

    Andrew.
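
The stated answer (a-b)/(a+b) can be checked by brute force for small counts. The sketch below, in Python, enumerates every order in which the votes could be counted and keeps the orders in which A is strictly ahead after every vote. The function name `prob_always_ahead` is made up for this illustration, and "ahead" is read as strictly ahead, with ties excluded.

```python
from fractions import Fraction
from itertools import combinations


def prob_always_ahead(a, b):
    """Exact probability that A leads strictly after every counted vote.

    Enumerates all C(a+b, a) equally likely counting orders, so it is only
    practical for small a and b.
    """
    n = a + b
    total = 0
    good = 0
    for a_positions in combinations(range(n), a):
        a_set = set(a_positions)
        lead = 0
        ahead = True
        for i in range(n):
            lead += 1 if i in a_set else -1
            if lead <= 0:  # tied or behind at some point in the count
                ahead = False
                break
        total += 1
        good += ahead
    return Fraction(good, total)


if __name__ == "__main__":
    for a, b in [(3, 2), (5, 3), (6, 1)]:
        print(a, b, prob_always_ahead(a, b), Fraction(a - b, a + b))
```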
985.4		AUSSIE::GARSON	nouveau pauvre	`Wed Aug 18 1993 22:05`	4
	re .-1 Do you have a page or chapter reference for that? I quickly flipped through my dusty copy and couldn't find it.
985.5	the proof of Daryll Huff statement	GVAADG::DUBE		`Fri Aug 27 1993 09:15`	100
    Re 985.3

    >> Darrell Huff, in his classic "How to Lie with Statistics", poses the
    >> question: given that party A beats party B by a votes to b, what is the
    >> probability that party A is ahead of party B throughout the process of
    >> counting votes? He states (without proof) that the answer is (a-b)/(a+b).

    The process of "counting votes" can be shown in a diagram where the
    coordinates represent:

      . x : the number of votes counted at a specific time
      . y : the difference between the counts for the 2 parties A and B

    a-b |
        |
        |
        |                  End
        |                 /
        |              /\/
        | /\          /
        |/  \      /\/
        +____\/\__/____________________________ a+b
                \/

    The number of distinct paths from Origin to End is equal to the number
    of distinct permutations of a+b votes, of which a are identical votes
    for A and b are identical votes for B:

      distinct paths from Origin to End = C(a+b, a) = (a+b)! / (a! * b!)

    Let's call that quantity N(a,b), so we have:

      N(a,b) = (a+b)! / (a! * b!)                                   [1]

    All these paths have equal probability of occurring, so the probability
    of each path is equal to 1 / N(a,b).

    For A to be ahead throughout the count, the path must go via the
    point X (first vote good for party A):

    a-b |
        |
        |
        |                 . End
        |
        |
        | X
        |/
        +______________________________________ a+b
         \
          Y

    The probability of the first vote being good for A is equal to

      P1 = a/(a+b)

    At point X, there remain in the box a-1 votes for party A and b votes
    for party B.  So the overall number of paths going from X to End can
    be deduced from relation [1] for N(a-1,b).  We get

      N(a-1,b) = (a+b-1)! / ( (a-1)! * b! )                         [2]

    Then, during the next votes, the path may not fall onto the x axis.
    By symmetry, there are as many paths from Y to End as there are paths
    from X to End that fall onto the x axis: every path from Y must reach
    the axis on its way up to End, and reflecting in the x axis the part
    of a path before it first touches the axis pairs each path from Y
    with exactly one such path from X.  So, using relation [1] for
    N(a,b-1), we get the number of paths falling from X onto the axis:

      N(a,b-1) = (a+b-1)! / ( a! * (b-1)! )                         [3]

    So, the probability that A is always ahead during the process is:

      P = P1 * [N(a-1,b) - N(a,b-1)] / [N(a-1,b)]

    where

      P1                    : the path must go via point X
      N(a-1,b) - N(a,b-1)   : paths from X which don't fall onto the axis
      N(a-1,b)              : paths from X to End

    From [2] and [3], N(a,b-1) / N(a-1,b) = b/a, so

      P = a/(a+b) * (1 - b/a)

    We finally get:

      P = (a-b)/(a+b)

    Friendly,
    ##### Remy #####
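
The path-counting argument above can be checked numerically. The following Python sketch evaluates P1 * [N(a-1,b) - N(a,b-1)] / N(a-1,b) with exact fractions and compares it with (a-b)/(a+b). The function names are chosen here for illustration, and treating N as 0 when an argument is negative (to cover b = 0) is a convention added for this sketch, not something stated in the proof.

```python
from fractions import Fraction
from math import comb


def N(a, b):
    """Number of paths with a up-steps and b down-steps: (a+b)! / (a! * b!)."""
    if a < 0 or b < 0:
        return 0  # convention added for this sketch: no such paths
    return comb(a + b, a)


def p_always_ahead(a, b):
    """P = P1 * [N(a-1,b) - N(a,b-1)] / N(a-1,b), for a > b >= 0."""
    if not a > b >= 0:
        raise ValueError("requires a > b >= 0")
    p1 = Fraction(a, a + b)      # first vote counted is for A (point X)
    from_x = N(a - 1, b)         # all paths from X to End
    touching = N(a, b - 1)       # paths from X that fall onto the axis
    return p1 * Fraction(from_x - touching, from_x)


if __name__ == "__main__":
    for a in range(1, 8):
        for b in range(a):
            assert p_always_ahead(a, b) == Fraction(a - b, a + b)
    print("formula agrees with (a-b)/(a+b) for all a < 8")
```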