SAMPLING (GENERAL DISCUSSION)
TYPES OF SAMPLING
SIMPLE RANDOM SAMPLING WITH REPLACEMENT
• Each individual from the population has an equal chance of being selected.
• After an individual is selected, they are placed back into the population for the
next selection.
• This allows the same individual to be chosen more than once.
• Use case: Lottery-style draws in which the same number can come up more than once (for example, when each digit is drawn independently).
SIMPLE RANDOM SAMPLING WITHOUT REPLACEMENT
• Each individual from the population has an equal chance of being selected.
• Once an individual is selected, they are not returned to the population for future
selections.
• This ensures each individual is selected at most once.
• Use case: Randomly choosing participants for a study where duplicates are not
allowed.
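The difference between the two simple random sampling schemes can be sketched in a few lines of Python. This is only an illustration: the population of ten labelled individuals and the fixed seed are assumptions made for the example, not part of the slides.

```python
import random

# Hypothetical population of 10 individuals, labelled 1..10 (illustrative only).
population = list(range(1, 11))

rng = random.Random(42)  # fixed seed so the run is repeatable

# With replacement: every draw is made from the full population,
# so the same individual may appear more than once.
with_replacement = rng.choices(population, k=5)

# Without replacement: a selected individual is not returned,
# so every individual appears at most once.
without_replacement = rng.sample(population, k=5)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```

Duplicates are possible in the first list but can never occur in the second.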
CLUSTER SAMPLING
• The population is divided into groups or clusters, often based on geography or other natural divisions.
• A random sample of clusters is selected, and all individuals within chosen clusters are studied.
• Use case: Used when the population is large and geographically spread out.
• Example: Selecting schools as clusters and studying all students within the selected schools.
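A minimal sketch of the schools example, assuming a made-up dictionary of four schools and their students (all names here are hypothetical). The point it shows is that randomness enters only at the cluster level: once a school is chosen, every student in it is included.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical clusters: each school maps to the list of its students.
schools = {
    "School A": ["a1", "a2", "a3"],
    "School B": ["b1", "b2"],
    "School C": ["c1", "c2", "c3", "c4"],
    "School D": ["d1", "d2"],
}

# Step 1: randomly select whole clusters (here 2 of the 4 schools).
chosen_schools = rng.sample(sorted(schools), k=2)

# Step 2: study every individual inside the chosen clusters.
cluster_sample = [student for school in chosen_schools for student in schools[school]]

print("chosen schools:", chosen_schools)
print("students studied:", cluster_sample)
```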
STRATIFIED SAMPLING
• The population is divided into strata or subgroups that share similar
characteristics.
• A random sample is drawn from each stratum, ensuring representation from all
subgroups.
• Use case: Ensures smaller but important groups are included in the sample.
• Example: Dividing a population by age group and selecting random individuals
from each age group.
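The age-group example might look like the sketch below. The population, the three age bands, and the 10% sampling fraction with proportional allocation are all assumptions chosen for illustration; other allocation rules are equally valid.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population: (person_id, age_group) pairs.
population = (
    [(f"y{i}", "18-34") for i in range(60)]
    + [(f"m{i}", "35-64") for i in range(30)]
    + [(f"s{i}", "65+") for i in range(10)]
)

# Group individuals into strata by age group.
strata = {}
for person, group in population:
    strata.setdefault(group, []).append(person)

# Draw a simple random sample (without replacement) from each stratum.
# Proportional allocation at a 10% sampling fraction is an assumption here;
# max(1, ...) guarantees that even the smallest stratum is represented.
fraction = 0.10
stratified_sample = {
    group: rng.sample(members, k=max(1, round(fraction * len(members))))
    for group, members in strata.items()
}

for group, members in stratified_sample.items():
    print(group, members)
```

In contrast with cluster sampling, every stratum contributes to the sample, but only some individuals within each stratum are selected.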
BIASES IN SAMPLING
1. Selection Bias
Occurs when the sample isn't representative because certain groups are systematically excluded.
Example:
You want to know if dogs or cats are better pets, so you go to a dog park and ask only dog owners. Shockingly, 100% of
your respondents claim dogs are the superior pet!
2. Non-Response Bias
Happens when many people don’t respond, and those who do respond differ significantly from those who don’t.
Example:
You send out a survey to find out how many people hate surveys. Only people who don’t hate surveys respond, telling you
that no one hates surveys. Problem solved... or not?
You go to LBS to find out if people enjoy waking up early. The only people who respond are the morning people who are
already awake at 6 AM. According to your survey, it turns out that nearly everyone loves getting up at the crack of dawn —
except for the night owls who were too busy sleeping to reply!
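The early-riser example can be simulated to see how far non-response can pull an estimate. Every number below is an assumption made for the sketch: 30% of the population are morning people, who respond with probability 0.80, while everyone else responds with probability 0.10.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical population: True = morning person (about 30%), False = not.
population = [rng.random() < 0.30 for _ in range(10_000)]

def responds(is_morning_person):
    """Assumed response behaviour: morning people are far more likely to answer."""
    return rng.random() < (0.80 if is_morning_person else 0.10)

# Only the people who respond end up in the survey data.
respondents = [person for person in population if responds(person)]

true_share = sum(population) / len(population)
observed_share = sum(respondents) / len(respondents)

print(f"true share of morning people:     {true_share:.2f}")
print(f"share among survey respondents:   {observed_share:.2f}")
```

Under these assumptions the share among respondents is far above the true share, even though nobody answered dishonestly.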
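3. Response Bias
Occurs when respondents give inaccurate answers, often because of social pressure or leading question wording.
Example: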
A professor asks students during class, “Is my teaching style effective?” Afraid of upsetting the professor (and
potentially their grades), all the students nod enthusiastically, even though they find the lectures confusing.
A course feedback form includes the question, “How useful did you find the course materials, which were
developed by leading experts?” Students, influenced by the authoritative phrasing, feel pressured to give higher
ratings, even if they struggled with the materials.
THE FAMOUS LITERARY DIGEST EXAMPLE
https://en.wikipedia.org/wiki/The_Literary_Digest
The Literary Digest Poll of 1936:
In 1936, Literary Digest, a popular magazine, conducted a massive opinion poll to predict the outcome of
the U.S. presidential election between Franklin D. Roosevelt and Alf Landon. At the time, Literary Digest
was known for conducting large-scale polls, and it had successfully predicted past elections using similar
methods. However, 1936 was different.
The Sampling Method:
• Mailing Size: Literary Digest mailed out approximately 10 million surveys.
• Respondents: Around 2.4 million people responded to the survey, a substantial number by any standard.
• Sampling Frame: The magazine used data from telephone directories, automobile registrations, and
magazine subscribers to create their list of potential respondents. These lists skewed toward wealthier
individuals since, in the 1930s, owning a telephone, a car, or subscribing to magazines indicated higher
socioeconomic status.
OUTCOME
The Poll's Prediction:
Based on the responses they received, Literary Digest predicted that Alf Landon, the Republican candidate,
would win the election with 57% of the popular vote, leaving Franklin D. Roosevelt with only 43%. The poll
suggested a decisive victory for Landon.
The Actual Outcome:
In reality, the election was a landslide victory for Franklin D. Roosevelt, who won about 61% of the popular vote, while Alf Landon received only about 37%. The result was not only the opposite of what the poll predicted but also a severe blow to Literary Digest's reputation.
WHAT WENT WRONG?
The Literary Digest poll failed because of sampling bias (a form of selection bias) and non-response bias:
1. Sampling Bias:
• The sampling frame was based on telephone directories, automobile registrations, and magazine subscriptions, which disproportionately included wealthier Americans. During the 1930s, lower-income individuals and many rural voters, who were more likely to support Roosevelt, were largely excluded because many could not afford telephones or cars.
• This meant that the sample was not representative of the general population, particularly missing the working-class and poor voters who made up a significant portion of Roosevelt’s base.
2. Non-Response Bias:
• Out of the 10 million surveys mailed, only about 2.4 million were returned, a response rate of roughly 24%. Those who responded were more likely to be politically engaged and to have a strong interest in the election, and this group skewed toward Landon supporters.
• Non-respondents, who were likely Roosevelt supporters, were underrepresented in the final sample.
KEY LESSONS FROM THE LITERARY DIGEST POLL:
• Sample Size Isn't Everything: Despite having a huge sample size (2.4 million
respondents), the poll was inaccurate because the sample was not representative of the
population. A smaller, well-selected sample would have produced more accurate results.
• Importance of Representativeness: The sampling frame must reflect the entire
population. If certain groups are systematically excluded, the results will be biased, no matter
how large the sample is.
• Beware of Non-Response Bias: When a significant portion of people do not respond to a
survey, it’s important to consider how their lack of response might skew the results. In this
case, non-respondents were likely Roosevelt supporters, leading to an underestimation of his
actual voter base.
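The first two lessons can be checked with a small simulation. This is a sketch under invented assumptions, not a reconstruction of the 1936 data: the electorate size, the 61% true support for candidate A, and the frame coverage rates (25% of A's supporters and 60% of the opponent's supporters appear on the list) are all hypothetical.

```python
import random

rng = random.Random(1936)  # fixed seed so the run is repeatable

# Hypothetical electorate of (supports_a, in_frame) pairs.
# Assumed frame coverage: supporters of A are less likely to be on the list.
electorate = []
for _ in range(200_000):
    supports_a = rng.random() < 0.61
    in_frame = rng.random() < (0.25 if supports_a else 0.60)
    electorate.append((supports_a, in_frame))

def share_for_a(sample):
    """Fraction of the sample that supports candidate A."""
    return sum(supports for supports, _ in sample) / len(sample)

true_share = share_for_a(electorate)

# Huge sample, but drawn only from the unrepresentative frame.
frame = [voter for voter in electorate if voter[1]]
big_biased = rng.sample(frame, k=50_000)

# Small simple random sample drawn from the whole electorate.
small_random = rng.sample(electorate, k=1_000)

print(f"true share for A:              {true_share:.3f}")
print(f"biased sample (n=50,000):      {share_for_a(big_biased):.3f}")
print(f"random sample (n=1,000):       {share_for_a(small_random):.3f}")
```

Under these assumptions the 50,000-person sample misses by a wide margin while the 1,000-person random sample lands close to the truth, because increasing the sample size reduces random error but does nothing to remove bias in the frame.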
George Gallup's American Institute of Public Opinion achieved national recognition by
correctly predicting Roosevelt's victory using a much smaller sample size of just
50,000.[5] Gallup's final poll before the election predicted that Roosevelt would receive
55.7% of the popular vote and 481 electoral votes; the official tally gave Roosevelt about
61% of the total popular vote (62.46% of the two-party vote) and 523 electoral votes.
The failure of the 1936 Literary Digest poll was so significant that it marked the end of the
magazine’s credibility in polling, and it stopped conducting such polls after the incident. This
fiasco also paved the way for more sophisticated polling methods, such as those used by
George Gallup, who correctly predicted Roosevelt’s victory using a smaller but
scientifically selected and representative sample.