Sunday, September 26, 2004
Methods of poll sampling
PoliPundit.com - Poll Methodology - A 2004 Guide
This is a superb guide to the polling methodologies of the current election cycle. It lists the key details of how each polling company and contractor selects and reports its sample.
Many people are understandably confused by the very different results different polls are publicizing. Two factors are at work here - one is sample size, which determines the margin of error, and one is sampling methodology.
One reason for the confusion is that different pollsters use very different ways to select and report their samples. With a very large sample, you can be reasonably sure your numbers reflect the real opinions of the population. The perfect sample is, of course, the entire electorate that votes on election day, but no one's going to spend the money to call that many people, and in any case the opinions of the electorate will change between the polling period and the election.
When one takes a small sample of a large population, an inherent inaccuracy will be present. For a fair six-sided die, the expected distribution of 1000 throws is about 166.67 of each face, 1 through 6. If you throw the die once, you will get a number in the range of 1 to 6, but whatever result you get will not predict what will happen the next time the die is thrown. If you throw the die a hundred times, you're in much safer territory. If you throw it a thousand times, you're going to get darned accurate results, although they will never be perfectly accurate. You may get 169 ones, for example, but you certainly won't get exactly 166.67.
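To make this concrete, here is a quick simulation (not from the linked guide, just an illustration) of how the observed share of ones converges toward the expected one-sixth as the number of throws grows:

```python
# Sketch: simulate die throws to watch sampling error shrink as n grows.
import random

random.seed(42)  # fixed seed so the run is repeatable

for n in (6, 100, 1000, 100_000):
    throws = [random.randint(1, 6) for _ in range(n)]
    ones = throws.count(1)
    print(f"{n:>6} throws: {ones:>6} ones "
          f"({ones / n:.3%} observed vs. 16.667% expected)")
```

The error never disappears, but with each tenfold increase in throws the observed share hugs the expected 16.667% more tightly - which is exactly why a 1000-person poll beats a 100-person poll.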
A phone poll also carries error that is built into the method itself. That error can be as simple as the fact that older voters may be at home to answer the phone more often than younger voters, so one may end up with a sample skewed toward older voters. Households that have only cell phones aren't called at all. Plus, a person has to make an effort to vote, whereas answering a telephone poll is passive, so the sample includes a very significant number of people who will never be part of the target population - the actual electorate. In order to "fix" such problems, pollsters often adjust their samples to match a presumed demographic spread, which is why a good poll will ask all those annoying extra questions.
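As a rough sketch of how that adjustment works - the age groups, population shares, and respondent counts below are entirely hypothetical - the pollster computes a weight for each demographic group so the weighted sample matches the presumed population:

```python
# Minimal sketch of demographic weighting (post-stratification).
# All group shares and counts here are hypothetical.
population_share = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}
sample_counts   = {"18-34": 150,  "35-54": 350,  "55+": 500}  # skews older

total = sum(sample_counts.values())
weights = {
    group: population_share[group] / (count / total)
    for group, count in sample_counts.items()
}
# Each respondent's answer is multiplied by their group's weight, so the
# overrepresented 55+ group counts for less and the 18-34 group for more.
for group, w in weights.items():
    print(f"{group}: weight {w:.2f}")
```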
One of the demographic factors some pollsters adjust for is party identification. They measure how many respondents identify as Republicans versus Democrats, then throw out the extra interviews from the overrepresented party. This is not a bad method, except that you have to know the correct ratio in the population first.
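Here is a minimal sketch of that "throw out the extras" approach; the raw sample and the assumed target ratio are hypothetical:

```python
# Sketch: randomly discard respondents from the overrepresented party
# until the sample matches an assumed "correct" ratio. Numbers are made up.
import random

random.seed(0)
sample = ["D"] * 550 + ["R"] * 450   # raw sample: 55% D, 45% R
target_d_share = 0.50                # assumed population ratio

r_count = sample.count("R")
# Keep enough Democrats that D / (D + R) equals the target share.
d_keep = round(r_count * target_d_share / (1 - target_d_share))

dems = [x for x in sample if x == "D"]
random.shuffle(dems)
balanced = dems[:d_keep] + [x for x in sample if x == "R"]
print(f"Balanced sample: {balanced.count('D')} D, {balanced.count('R')} R")
```

Everything hinges on that `target_d_share` number - which is the problem.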
But you don't, and using exit polling from four years ago to determine what party identification is now is an extremely dangerous thing to do - even more dangerous than usual given how greatly our national situation has changed, which changes the electorate's priorities. And then there is the coattail effect: a good Democratic candidate will raise the number of people identifying themselves as Democrats, and a good Republican candidate will raise the number identifying themselves as Republicans.
Any reasonably intelligent person with a feel for numbers can mentally correct for some of this error, provided the pollster discloses the raw sample and the underlying assumptions. Suppose a pollster is weighting the sample to the party identification of the 2000 vote, while many other polls show that 15% of Democrats intend to vote Republican but only 8% of Republicans intend to vote Democratic. Such a result tends to indicate that a pollster "balancing" the sample to the 2000 Democratic/Republican ratio will overreport the Democratic vote and underreport the Republican vote, because more of the voters who have historically identified themselves as Democrats will be voting Republican this year.
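A hypothetical worked example makes the bias concrete. The 52/48 and 48/52 party-ID splits below are invented for illustration; the 15% and 8% crossover rates are the ones cited above:

```python
# Hypothetical illustration of the weighting bias described above.
def race(dem_share, rep_share):
    """Two-way result given party-ID shares and the crossover rates above."""
    dem_vote = dem_share * 0.85 + rep_share * 0.08  # D loyalists + R crossovers
    rep_vote = dem_share * 0.15 + rep_share * 0.92  # D crossovers + R loyalists
    return dem_vote, rep_vote

# Pollster forces a stale 2000-style ratio, here assumed to be 52% D / 48% R:
print("forced ratio:", race(0.52, 0.48))   # ~48.0% D, ~52.0% R
# But if identification has actually shifted to 48% D / 52% R:
print("actual ratio:", race(0.48, 0.52))   # ~45.0% D, ~55.0% R
```

Under these made-up numbers, forcing the old ratio reports the Democratic candidate about three points higher than the shifted electorate would actually deliver.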
The link above will give you, the poll consumer, the information you need to figure out which polls to watch.
All of the above is a perplexing puzzle that occupies the minds of pollsters who are striving for accuracy - but there are also pollsters who exist to create spin in the minds of the electorate. If, for example, results look bad for one party, you can be sure that pollsters will be contracted to generate results that look better for that party, just to encourage its base to go to the polls and keep working for the party.