Sampling Methods for Political Polling
It’s impractical to poll an entire population—say, all 145 million registered voters in the United States.
That is why pollsters select a sample of individuals that represents the whole population. Understanding
how respondents come to be selected to be in a poll is a big step toward determining how well their
views and opinions mirror those of the voting population.
To sample individuals, polling organizations can choose from a wide variety of options. Pollsters
generally divide them into two types: those that are based on probability sampling methods and those
based on non-probability sampling techniques.
For more than five decades probability sampling was the standard method for polls. But in recent years,
as fewer people respond to polls and the costs of polls have gone up, researchers have turned to non-
probability based sampling methods. For example, they may collect data on-line from volunteers who
have joined an Internet panel. In a number of instances, these non-probability samples have produced
results that were comparable or, in some cases, more accurate in predicting election outcomes than
probability-based surveys.
Now, more than ever, journalists and the public need to understand the strengths and weaknesses of
both sampling techniques to effectively evaluate the quality of a survey, particularly election polls.
Probability and Non-probability Samples
In a probability sample, all persons in the target population have a chance of being selected for the
survey sample and we know what that chance is. For example, in a telephone survey based on random
digit dialing (RDD) sampling, researchers know the chance or probability that a particular telephone
number will be selected. (A description of RDD sampling and other techniques commonly used in
election surveys appears at the end of this brief.)
The major advantage of probability-based sampling is that we can calculate how well the findings
from the sample represent the total population. That is, we can calculate the margin of sampling error,
which measures how much our estimates vary based on the fact we’re only measuring a sample of the
population and not every member of the population. This ability to estimate, within a specified range,
the accuracy of survey findings has made probability-based sampling the cornerstone of modern
survey research.
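As a rough illustration of the calculation described above, the sketch below computes the margin of sampling error for a single proportion. It assumes a simple random sample and the conventional 95 percent confidence level (z = 1.96); the function name and the example figures are hypothetical, and real polls that use weighting or clustering will generally have a somewhat larger margin.

```python
import math


def margin_of_sampling_error(p, n, z=1.96):
    """Approximate margin of sampling error for a proportion.

    Assumes a simple random sample of n respondents (an assumption of this
    sketch, not a description of any particular poll's design).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return z * math.sqrt(p * (1.0 - p) / n)


# Hypothetical poll: 1,000 respondents, 50 percent support a candidate.
moe = margin_of_sampling_error(0.50, 1000)
print(f"+/- {moe * 100:.1f} percentage points")
```

With these hypothetical inputs the margin works out to roughly plus or minus 3 percentage points. Because the margin shrinks with the square root of the sample size, quadrupling the sample only halves it.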
Non-probability sampling methods lack this feature: researchers cannot say that everyone in the
population has a chance of being selected, nor what that chance is. Participants are typically
not selected at random but come to be included by other means, for instance because they
volunteer, so a person’s chance of being in the sample is unknown. For example,
in an opt-in sample a person accepts an invitation to complete a survey that is offered to all visitors to
a website. The chance of that person visiting that website and then choosing to participate in the survey
cannot be known. One serious consequence is that only certain types of people may choose to opt into
the survey and they may be different than those who do not in ways that could potentially bias the final
results.
With non-probability samples, there is no simple way to calculate the “margin of error”; instead,
estimates of the likely error must be based on statistical models. As a result, AAPOR has
cautioned that it may be misleading to report a margin of sampling error for surveys based on non-
probability samples.
Nonresponse to polls is a major factor affecting the accuracy of poll results. Even in a probability sample, the
people who agree to respond can be thought of as “self-selecting” into the sample. To the extent that respondents
and non-respondents differ systematically on the survey variables--for example, which candidate they
support in an upcoming election--nonresponse can bias the poll results, and that is true even if the
initial sample was a probability sample. In a similar way, the accuracy of non-probability samples, such
as opt-in samples, can be affected by self-selection. In both types of sampling, if the people who
participate in the poll are different from those who do not, results can be biased because of these
differences.
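The arithmetic behind this kind of bias can be sketched in a few lines. The example below is an illustration only: the two groups, their response rates, and their levels of support are invented numbers, and the function names are hypothetical. It compares a candidate's true support in the population with the support a poll would be expected to show when one group responds more readily than the other.

```python
def true_support(groups):
    """Share of the whole population backing Candidate A."""
    return sum(share * support for share, _rate, support in groups)


def expected_poll_support(groups):
    """Expected share of poll respondents backing Candidate A."""
    responding = sum(share * rate for share, rate, _support in groups)
    if responding <= 0:
        raise ValueError("at least one group must have a positive response rate")
    backing = sum(share * rate * support for share, rate, support in groups)
    return backing / responding


# Hypothetical electorate: (population share, response rate, support for A).
groups = [
    (0.5, 0.10, 0.60),  # responds twice as often, leans toward A
    (0.5, 0.05, 0.40),  # responds less often, leans away from A
]

print(f"true support:     {true_support(groups):.1%}")
print(f"expected in poll: {expected_poll_support(groups):.1%}")
```

In this made-up case the candidate's true support is 50 percent, but the poll would be expected to show about 53 percent, because the more responsive group is over-represented among respondents. If both groups responded at the same rate, the two figures would match.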
In addition to sampling method, there are a number of other features of polls that affect the accuracy
of the results. For example, how questions are worded and the order in which they are presented to
respondents have been shown to affect poll results and whether they reflect what people in the total
population really think.
For such reasons, AAPOR’s Code of Professional Ethics calls for transparency in the reporting of sample
design, response rates, and the wording of the questions so that these elements can be assessed along
with poll results.
Types of Sampling Techniques
Probability Samples
Random-Digit Dialing (RDD)
Samples of telephone area codes and exchanges are selected, and then random digits are
added to the end to create 10-digit phone numbers. The first step ensures phone numbers are
distributed properly by geography. The second step, adding the random numbers, makes sure
that even unlisted numbers are included. This has traditionally been the standard practice of
almost all public pollsters. The major advantage of RDD is the coverage of the population:
Everyone with a telephone is eligible to be sampled. The major disadvantage is that it is
expensive, since many of the landline telephone numbers generated are non-working numbers
and cellphone numbers need to be manually dialed by interviewers.
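The two steps described above can be sketched as follows. This is a simplified illustration, not any pollster's production procedure: the function name and the area code and exchange prefixes are hypothetical, and the prefixes are chosen with equal probability, whereas a real design would draw them from a sampling frame, often with stratification by geography.

```python
import random


def rdd_sample(prefixes, k, seed=None):
    """Build k distinct 10-digit numbers from (area_code, exchange) prefixes.

    Step 1: pick an area code and exchange (equal probability here, which is
            an assumption of this sketch).
    Step 2: append four random digits, so unlisted numbers can be generated too.
    """
    if k > len(set(prefixes)) * 10_000:
        raise ValueError("k exceeds the number of possible phone numbers")
    rng = random.Random(seed)
    numbers = set()
    while len(numbers) < k:
        area_code, exchange = rng.choice(prefixes)
        suffix = rng.randrange(10_000)  # 0000 through 9999
        numbers.add(f"{area_code}{exchange}{suffix:04d}")
    return sorted(numbers)


# Hypothetical prefixes for illustration only.
prefixes = [("212", "555"), ("312", "555")]
for number in rdd_sample(prefixes, k=5, seed=1):
    print(number)
```

Many of the numbers generated this way will not be working numbers, which is one reason the method is costly in practice.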
Within-Household Sample Selection
In households in which more than one eligible respondent resides--in the case of
election polls, more than one registered voter--further sampling among the members
of the household should be done to produce a random sample of voters. Journalists
should ask how respondents were selected. Simply taking the person who answers the
telephone will not necessarily result in a representative sample.
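One simple way to picture within-household selection is a random draw among the eligible members. The sketch below is an assumption-laden illustration: the function name, the household records, and the use of an equal-probability draw are all hypothetical, and polling organizations use a variety of selection procedures in practice.

```python
import random


def select_respondent(household, rng=None):
    """Pick one registered voter at random from a household.

    household: list of dicts with 'name' and 'registered' keys (a made-up
    record layout for this sketch). Returns None if no one is eligible.
    """
    rng = rng or random.Random()
    eligible = [member for member in household if member["registered"]]
    if not eligible:
        return None
    return rng.choice(eligible)


# Hypothetical household: the person who answers the phone is listed first.
household = [
    {"name": "Alex", "registered": True},
    {"name": "Blake", "registered": False},
    {"name": "Casey", "registered": True},
]
chosen = select_respondent(household, random.Random(7))
print(chosen["name"])
```

Under this approach every registered voter in the household has the same chance of being interviewed, regardless of who tends to pick up the phone.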
Registration-Based Sampling (RBS)
This begins with a sample of individuals drawn from lists of registered voters, to which phone numbers
are then matched (sometimes the numbers are available from the voter list itself). This is less costly and more efficient than RDD, as
almost all calls result in reaching a working phone number, which is not true of an RDD sample. One
disadvantage of an RBS sample is that voter lists often do not include unlisted telephone numbers or full
coverage of cellphone numbers; additionally, they may not include voters who have just moved or
registered to vote.
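The coverage trade-off described above can be illustrated with a small sketch. Everything here is hypothetical: the function name, the voter names, and the phone lookup table are invented for illustration, and real phone matching is done against commercial or state-provided files rather than a simple dictionary.

```python
import random


def rbs_sample(voter_list, phone_lookup, k, seed=None):
    """Draw k voters at random, then keep those with a matched phone number.

    Returns (drawn, reachable): every sampled voter, and the subset that can
    actually be called, paired with their numbers.
    """
    rng = random.Random(seed)
    drawn = rng.sample(voter_list, k)
    reachable = [(voter, phone_lookup[voter]) for voter in drawn if voter in phone_lookup]
    return drawn, reachable


# Hypothetical voter file and phone matches; two voters have no matched number.
voters = ["Voter A", "Voter B", "Voter C", "Voter D", "Voter E"]
phones = {"Voter A": "2125550101", "Voter C": "2125550102", "Voter E": "3125550103"}

drawn, reachable = rbs_sample(voters, phones, k=4, seed=3)
print(f"sampled {len(drawn)}, reachable {len(reachable)}")
```

Voters who are sampled but have no matched number drop out before any calls are made, which is where the coverage gap in an RBS design arises.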
Non-probability Samples
Self-Selected Samples (SSS)
In self-selected or opt-in samples, respondents have selected themselves, and this means their
answers may not be representative of the larger population. Types of self-selected samples
include dial-in polls popular with the media and many Internet-based polls. The American
Association for Public Opinion Research (AAPOR) cautions that results of surveys based on
respondents who self-select may not be reliable. The characteristics of people who choose to
participate in this type of survey may be different than those who do not in ways that bias the
final results. These polls may sometimes be accurate, but it is very hard to evaluate whether
they are accurate simply because of good luck or because they were able to capture good
information about the population they were trying to represent. AAPOR has not yet made a final
judgment about the reliability of opt-in samples, but warns that this type of sample is not based
on the full target population.
Samples from Internet Panels
One variation of the self-selected sample is the random sample selected from among people
who have signed up to be members of an Internet panel. While the sample itself is random, the
population from which the sample is drawn is made up of people who have signed up to be
members of the panel, which may potentially lead to selection bias.