| I tried to come up with a concrete example to illustrate it and
couldn't come up with one on my own, so I got this one out of the
Encyclopedia of Statistical Sciences (fleshing it out a bit):
Imagine that a survey, involving a single yes/no question, is done.
Two different locations, NY and LA, are used. What is of interest is the
difference between men and women in the proportion of "yes" answers.
Here are the results:
          NY          LA
         ----        ----
        M    F      M    F
     ------------------------
   Y|   4    3  |   1    3  |
   N|   6    3  |   9   18  |
     ------------------------
Notice that at both locations, a higher proportion of females than
males answered "yes" (3/6 vs. 4/10 in NY, 3/21 vs. 1/10 in LA). But if
we condense the table by eliminating the "nuisance" variable of
location, we get:
        M    F
     ------------
   Y|   5    6  |
   N|  15   21  |
     ------------
and a higher proportion of males than females are seen to have answered
"yes" (5/20 vs. 6/27). This results from the strong difference in the
male/female ratio between the samples at the two locations, combined
with the much lower overall "yes" rate in LA, where most of the women
were sampled; this makes condensation inappropriate. This is known as
Simpson's paradox. It is a paradox in the sense of being a
counter-intuitive truth (the total pointing the opposite way from
every one of its components) rather than in the sense of being a
logical contradiction.
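
For anyone who wants to check the arithmetic, here is a minimal Python
sketch (my own, not from the Encyclopedia; the names `counts`,
`yes_rate`, and `pooled` are made up for illustration) that recomputes
the "yes" proportions for each location and then for the condensed table:

```python
from fractions import Fraction

# counts[location][sex] = (yes, no), copied from the survey table above.
counts = {
    "NY": {"M": (4, 6), "F": (3, 3)},
    "LA": {"M": (1, 9), "F": (3, 18)},
}


def yes_rate(yes, no):
    """Exact proportion of "yes" answers among yes + no respondents."""
    return Fraction(yes, yes + no)


def pooled(sex):
    """Sum the (yes, no) counts for one sex over all locations."""
    yes = sum(counts[loc][sex][0] for loc in counts)
    no = sum(counts[loc][sex][1] for loc in counts)
    return yes, no


# Within each location, compare the male and female "yes" rates.
for loc, by_sex in counts.items():
    m = yes_rate(*by_sex["M"])
    f = yes_rate(*by_sex["F"])
    print(f"{loc}: M {float(m):.3f}  F {float(f):.3f}")

# After condensing over location, the comparison reverses.
m_all = yes_rate(*pooled("M"))
f_all = yes_rate(*pooled("F"))
print(f"Pooled: M {float(m_all):.3f}  F {float(f_all):.3f}")
```

Exact fractions are used so the comparisons do not depend on rounding.
Within each location the female rate comes out higher; in the pooled
figures the male rate does.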
Topher
|