Topic 4
SAMPLING
Damian Jeremia,
OUTLINE
DEFINITIONS OF TERMS
PURPOSE OF SAMPLING
SAMPLING METHODS
BIAS IN SAMPLING
DEFINITIONS OF TERMS
STUDY POPULATION
A specified group of persons or things
to be studied, e.g. people, schools,
households, etc.
SAMPLE
A subset of a population whose
properties are to be generalized to the
whole population
DEFINITIONS – cont.
SAMPLING
The process of selecting study units
from a population
SAMPLING UNIT
The unit of selection in the sampling
process, e.g. a person, a school, a
household, etc.
Sampling: A Pictorial View
[Figure: Target Population, Sampled (Study) Population, Sample]
DEFINITIONS – cont.
SAMPLING FRAME
A list of units from which a sample is to be
picked
SAMPLING FRACTION (Sampling ratio)
The proportion of sampling units to be picked
from a specified sampling frame
= (number of units in the sample) / (number of units in the sampling frame)
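The ratio above can be checked with a few lines of Python. This is a minimal sketch; the function name `sampling_fraction` and the figures (50 units from a frame of 250) are illustrative assumptions, not part of the lecture.

```python
def sampling_fraction(sample_size: int, frame_size: int) -> float:
    """Return the share of the sampling frame that ends up in the sample."""
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    if not 0 <= sample_size <= frame_size:
        raise ValueError("sample_size must be between 0 and frame_size")
    return sample_size / frame_size


# Hypothetical figures: 50 units picked from a frame of 250 units.
fraction = sampling_fraction(50, 250)
print(fraction)  # one fifth of the frame is sampled
```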
DEFINITIONS – cont.
SAMPLING INTERVAL
An interval at which units are picked
from a sampling frame when systematic
sampling is done.
What makes a "good"
sample?
A RUSsIaN sample:
◦ Representative
◦ Unbiased
◦ Sampling Error can be quantified
◦ Maximum Information for minimum cost
◦ Non-sampling error is corrected for as
much as possible.
What makes a "good"
sample?
When using qualitative research
approaches, however,
representativeness of the sample is
NOT a primary concern
Key respondents should never be
chosen at random, but purposively
from among those who have the best
possible knowledge, experience or
overview with respect to the topic of your
study.
WHY SAMPLING?
Sampling is necessary when the study
population is very big and the resources are
not adequate to reach everyone in the
population.
Important Considerations when sampling:
The sample should be representative of the
study population, i.e. it should have all the
important characteristics of the population
from which it was drawn.
SAMPLING METHODS
Non-probability methods
Convenience sampling
Quota sampling
Probability methods
Simple random sampling
Systematic sampling
Stratified sampling
Cluster sampling
Multi-stage sampling
Non-probability methods
CONVENIENCE SAMPLING
The study units available at the time of
data collection are selected and studied.
◦ Disadvantage: Sample may not be
representative of the study population, thus
generalizations made from the findings in
the sample may be incorrect
QUOTA SAMPLING
(Purposive Selection)
The composition of the sample (e.g. by age, sex,
social class etc.) is determined in advance
by the researcher.
Units are then sought to fill these
quotas.
Disadvantage: May not be
representative of the study population
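Quota filling can be sketched as follows. This is only an illustration: the helper `fill_quotas`, the quota of 2 females and 2 males, and the order in which units are encountered are all hypothetical. Note that nothing in the procedure is random, which is why representativeness is not guaranteed.

```python
def fill_quotas(arrivals, quotas, category_of):
    """Accept units in the order they are met until every quota is full."""
    remaining = dict(quotas)          # how many units each category still needs
    sample = []
    for unit in arrivals:
        category = category_of(unit)
        if remaining.get(category, 0) > 0:
            sample.append(unit)
            remaining[category] -= 1
        if not any(remaining.values()):
            break                     # all quotas filled, stop recruiting
    return sample


# Hypothetical units as (sex, id) pairs, in the order they were encountered.
arrivals = [("F", 1), ("F", 2), ("M", 3), ("F", 4), ("M", 5), ("M", 6)]
sample = fill_quotas(arrivals, {"F": 2, "M": 2}, lambda unit: unit[0])
print(sample)
```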
Probability methods
SIMPLE RANDOM SAMPLING
The study units are selected at random,
using either the lottery method or a
random number table.
e.g. Selection of a sample of 50 students
from a school with 250 students
By lottery method
STEPS:
◦ Prepare sampling frame
◦ Assign numbers to each unit (1 – 250)
◦ Write the numbers on pieces of paper
◦ Fold the papers and put in a box
◦ Shake the box vigorously
◦ Pick 50 pieces of paper, one at a time.
Students with numbers corresponding to
those picked will constitute the sample.
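The lottery steps above can be imitated in Python. This is a minimal sketch using the standard `random` module; the seed value is arbitrary and is only there so the draw can be repeated.

```python
import random

# Sampling frame: students numbered 1 to 250, as in the example above.
frame = list(range(1, 251))

# A seeded generator makes the draw reproducible; the seed is an arbitrary choice.
rng = random.Random(2024)

# random.sample draws without replacement, like picking folded papers from the box.
sample = rng.sample(frame, k=50)

print(len(sample), "students selected")
print(sorted(sample)[:5])  # the five lowest numbers drawn
```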
By use of a random number table
STEPS:
Prepare a sampling frame
Check whether the total number of units is a one-, two- or three-digit
number.
Allocate a number to every unit, e.g. 0, 1, 2, etc.
Decide on how you are going to move across the numbers.
Choose a starting point on the table by pin-pointing with a pointer
Read the numbers successively from the table according to the
predetermined direction.
* Pick only those numbers which are within range of your
sampling frame
* Ignore those which do not appear in the list and those that
reappear after they have been selected
* Continue the process until desired sample is obtained
* Pick those individuals whose numbers coincide with
chosen random numbers
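The reading rule (keep numbers in range, skip repeats, stop when the sample is complete) can be sketched as below. The printed table is replaced here by a generated stream of three-digit numbers, and units are assumed to be numbered 1 to 250; both are assumptions made for illustration, as is the function name `draw_from_table`.

```python
import random


def draw_from_table(table_numbers, frame_size, sample_size):
    """Read numbers in order, keeping those in range and not yet chosen."""
    chosen = []
    seen = set()
    for number in table_numbers:
        if not 1 <= number <= frame_size:
            continue                  # not in the sampling frame: ignore
        if number in seen:
            continue                  # already selected: ignore the repeat
        seen.add(number)
        chosen.append(number)
        if len(chosen) == sample_size:
            return chosen
    raise ValueError("table exhausted before the sample was complete")


# Stand-in for a printed table: a stream of three-digit numbers (000 to 999).
rng = random.Random(7)
table = [rng.randrange(1000) for _ in range(5000)]

sample = draw_from_table(table, frame_size=250, sample_size=50)
print(len(sample), "units selected")
```

With a short hand-made sequence the rule is easy to follow: reading 312, 45, 45, 7, 999, 120 for a frame of 250 units keeps 45, 7 and 120.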
SYSTEMATIC SAMPLING
Units are chosen from the sampling
frame at regular intervals.
The interval is determined by
calculating the sampling fraction first.
e.g. Choose 100 people from a population of 1000:
Sampling fraction = 100/1000 = 1/10
Therefore the sampling interval will be 10, i.e. every
10th person on a list of 1000 people will be picked.
The starting point is selected at random.
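The example above (100 people from a list of 1000) can be sketched in Python. The function name `systematic_sample` is illustrative, and the sketch assumes the frame size is an exact multiple of the sample size and that the random start falls within the first interval.

```python
import random


def systematic_sample(frame, sample_size, rng):
    """Pick every k-th unit after a random start within the first interval."""
    interval = len(frame) // sample_size   # k = 1000 / 100 = 10 in the example
    start = rng.randrange(interval)        # random position 0 .. k-1 in the list
    return [frame[start + i * interval] for i in range(sample_size)]


people = list(range(1, 1001))              # a list of 1000 people, numbered 1..1000
sample = systematic_sample(people, 100, random.Random(1))

print(sample[:3])                          # first three selected, 10 apart
```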
STRATIFIED SAMPLING
The study population is first divided
into sub-groups or strata according to
one or more characteristics (e.g. sex,
age group, area of residence etc.) and
then simple random or systematic
sampling is performed on each group.
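A minimal sketch of stratified sampling is given below. It assumes the same sampling fraction is applied in every stratum and that simple random sampling is used within each stratum; the population of 600 females and 400 males and the function name `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, rng):
    """Group units into strata, then draw a simple random sample in each."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = {}
    for name, members in strata.items():
        k = round(len(members) * fraction)     # same fraction in every stratum
        sample[name] = rng.sample(members, k)  # simple random sampling per stratum
    return sample


# Hypothetical population of (sex, id) pairs: 600 females and 400 males.
population = [("F", i) for i in range(600)] + [("M", i) for i in range(400)]
sample = stratified_sample(population, lambda unit: unit[0], 0.1, random.Random(3))

print({name: len(chosen) for name, chosen in sample.items()})
```

Because each stratum is sampled separately, both sexes appear in the sample in the same proportions as in the population.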
CLUSTER SAMPLING
Groups (or clusters) of individual units
are selected instead of the individual
units themselves.
◦ The clusters may be villages, households,
clinics, schools, etc.
◦ Selection of clusters is done by simple
random sampling.
◦ Units found in the selected clusters are then
studied.
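Cluster sampling can be sketched as follows. The frame of 20 villages with 30 households each, and the choice of 4 clusters, are made-up figures for illustration only.

```python
import random

# Hypothetical frame of clusters: 20 villages, each listing its 30 households.
villages = {
    f"village_{v}": [f"v{v}_hh{h}" for h in range(1, 31)]
    for v in range(1, 21)
}

rng = random.Random(11)

# Simple random sampling is applied to the clusters, not to the households.
selected = rng.sample(sorted(villages), k=4)

# Every unit found in a selected cluster is then studied.
studied = [household for village in selected for household in villages[village]]

print(selected)
print(len(studied), "households to study")
```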
MULTI-STAGE SAMPLING
(Syn: Multi-stage cluster sampling)
This method is used in large scale studies.
It is essentially a cluster method but
the procedure is carried out in phases
(stages).
Example: Multi-stage sampling
Study of latrines in a district with 6
wards
◦ Stage 1: Select 3 wards out of 6
(randomly)
◦ Stage 2: For each ward select 5 villages
(Randomly)
◦ Stage 3: For each village select 10
households
◦ Stage 4: For each household select whom to
interview
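The first three stages of this example can be sketched in Python. The number of villages per ward (12) and households per village (40) are hypothetical, since the example does not state them, and households are assumed to be picked by simple random sampling. Stage 4 (choosing whom to interview) is left out because it depends on the study's own rule.

```python
import random

rng = random.Random(5)

# Hypothetical district: 6 wards, 12 villages per ward, 40 households per village.
district = {
    f"ward_{w}": {
        f"w{w}_village_{v}": [f"w{w}_v{v}_hh{h}" for h in range(1, 41)]
        for v in range(1, 13)
    }
    for w in range(1, 7)
}

households = []
for ward in rng.sample(sorted(district), k=3):                   # Stage 1: 3 of 6 wards
    for village in rng.sample(sorted(district[ward]), k=5):      # Stage 2: 5 villages each
        households += rng.sample(district[ward][village], k=10)  # Stage 3: 10 households

print(len(households))  # 3 wards x 5 villages x 10 households = 150
```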
BIAS IN SAMPLING
DEFINITION
Bias in sampling is the systematic distortion of
results caused by the way the sample was
selected from the population.
SOURCES OF BIAS
Non-response: respondents refuse to respond
or forget to fill in the information
Use of only volunteers as the study population
Use of registered patients only
Use of hospital or clinic populations
Conducting a study in one season only for a
problem which is known to vary with season.
Selecting study areas because they are easily
accessible (tarmac bias).
Missing cases of short duration
STRATEGIES TO REDUCE NON-RESPONSE BIAS
Pre-testing the questionnaire
Follow-up of non-respondents
Include additional people in the sample
(exceed minimum sample size)
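One common way to decide how many additional people to include is to divide the minimum sample size by the expected response rate. The sketch below assumes that convention, which is not stated in the lecture; the function name and the figures (a minimum of 384 and 10% expected non-response) are hypothetical.

```python
import math


def inflate_for_non_response(minimum_sample_size, expected_non_response):
    """Return the number of people to approach so that, after the expected
    share of non-respondents drops out, the minimum sample size remains."""
    if not 0 <= expected_non_response < 1:
        raise ValueError("expected_non_response must be in [0, 1)")
    return math.ceil(minimum_sample_size / (1 - expected_non_response))


# Hypothetical figures: minimum sample of 384, 10% expected non-response.
print(inflate_for_non_response(384, 0.10))  # 427
```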
ETHICS
Discuss all possible biases in the study
Sampling Error
No sample is the exact mirror image of the
population
Magnitude of error can be measured in probability
samples
Expressed by standard error (SE), e
◦ of mean, proportion, differences, etc (e.g. SE of p)
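$$e = \sqrt{\left(1 - \frac{n}{N}\right)\frac{p(1-p)}{n}}$$
where p is the proportion, n the sample size, N the population size, and
(1 - n/N) is the Finite Population Correction Factor.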
Sampling Error
Generally, the standard error is a
function of
◦ sample size, n
◦ amount of variability in the factor of interest
being measured
Usually, with large study populations,
N, the finite population correction
(fpc) factor can be ignored
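The effect of n and of the fpc can be seen numerically. This sketch assumes the standard error of a proportion is e = sqrt((1 - n/N) p(1 - p)/n), with the fpc term dropped when N is not supplied; the function name and the figures (p = 0.5, n = 100) are illustrative.

```python
import math


def standard_error_of_proportion(p, n, N=None):
    """Standard error of a sample proportion p from a sample of size n.
    If the population size N is given, the finite population correction
    (1 - n/N) is applied under the square root."""
    variance = p * (1 - p) / n
    if N is not None:
        variance *= 1 - n / N
    return math.sqrt(variance)


p, n = 0.5, 100
print(standard_error_of_proportion(p, n))               # no correction: about 0.05
print(standard_error_of_proportion(p, n, N=1_000))      # small N: about 0.047
print(standard_error_of_proportion(p, n, N=1_000_000))  # large N: about 0.05 again
print(standard_error_of_proportion(p, 4 * n))           # 4x the sample halves the error
```

With N = 1,000,000 the corrected and uncorrected values are practically identical, which is why the fpc can be ignored for large study populations.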
Non-Sampling Error
Due to problems in design or conduct
◦ Measurement Bias (systematic over or under-
reporting)
◦ Selection Bias (part of target population not in
sampled population)
◦ Questionnaire Design
◦ Interview Bias
◦ Processing Error
◦ Non-response
◦ Under-coverage
END OF SAMPLING
METHODS
ANY QUESTIONS?
Next lecture on….
Hypothesis testing
Enjoy your weekend!!