SAMPLING TECHNIQUE
Mentor: Dr. Sachin Pandey
Department of Community Medicine
CIMS
Presented By: Dr. Darwin Deissuza
PG 1st Yr Dept. of Community Medicine
CONTENT
1. Definitions
2. Advantages and disadvantages of sampling
3. Types of sampling design
4. Factors affecting choice of sampling design
5. Sample size
a. Factors affecting sample size
b. Calculation of sample size
1. DEFINITIONS
A. Population: The target group to which the findings
(of a study) would ultimately apply is called population.
B. Sample: Is that part of the target population, which is
selected in such a way that it is representative of the larger
population.
Definitions Cont..
C. Sampling Unit: Is The Unit Of Selection
D. Unit Of Study Or Element: Is The Subject On Which
Information Is Obtained.
E. Sampling Frame: List Of All Sampling Units In The
Target Population Is Called A Sampling Frame.
Definitions Cont..
F. Sample Size: The Number Of Units Or Subjects
Sampled For Inclusion In The Study Is Called Sample
Size.
G. Sampling Technique: Method Of Selecting
Sampling Units From Sampling Frame
POPULATION VS. SAMPLE
Population of Interest
Sample
Population    Sample
Parameter     Statistic
We measure the sample using statistics in order to draw
inferences about the population and its parameters.
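The parameter/statistic distinction above can be illustrated with a small sketch. The blood pressure values, the population size of 10,000 and the sample size of 200 are all hypothetical, chosen only for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed, an arbitrary choice for reproducibility

# Hypothetical population: systolic blood pressure of 10,000 people.
population = [rng.gauss(120, 15) for _ in range(10_000)]

# Parameter: computed on the whole population (usually unknown in practice).
parameter_mean = statistics.mean(population)

# Statistic: computed on a sample and used to estimate the parameter.
sample = rng.sample(population, 200)
statistic_mean = statistics.mean(sample)

print(f"population mean (parameter): {parameter_mean:.1f}")
print(f"sample mean (statistic):     {statistic_mean:.1f}")
```

The two values will usually be close but not identical; the gap is the sampling error.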
Population you want to generalize results to: TARGET POPULATION
Population you have access to for your study: Study population
How can you get access to study population?
1.
2. Sampling frame
3…..
Study actually done on? Sample
2. Advantage of Sampling
A. Speed: Faster Results Due To Lesser Coverage.
B. To Draw Conclusions About Population From Sample, There
Are Two Major Requirements For A Sample.
a. The Sample Size Should Be Large.
b. The Sample Has To Be Selected Appropriately So That
It Is Representative Of The Population And Should Have All
The Characteristics Of The Population.
Disadvantages Of Sampling
1. Sampling entails an argument from the fraction to the
whole. Validity depends on representativeness of the
sample.
2. Fails to provide precise information in case of small
segments containing few individuals.
3. Not necessary in studies where complete enumeration is
needed.
4. May cause a feeling of discrimination among the subjects
who are not included in the study.
3. TYPES OF SAMPLING
A. Probability sampling
1. Probability of selection of each individual is known and predetermined
2. Simple random sampling
a. Systematic random sampling
b. Stratified random sampling
c. Cluster random sampling
d. Multistage
B. Non probability sampling
1. Probability of selection of each individual is not known
a. Quota sampling
b. Purposive/Judgmental sampling
c. Snowball/Network sampling
d. Convenience/Grab sampling (man in the street)
A. SIMPLE RANDOM SAMPLING
1. Equal probability of selection of units for inclusion in the
study
2. Requires a list of all sampling units (sampling frame)
3. Each individual is chosen randomly.
4. Methods:
a. Lottery method (possible for finite population)
b. Random number tables
c. Software that generates random numbers
A. Lottery method
B. RANDOM NUMBER TABLE
76 58 30 83 64
47 56 91 29 34
10 80 21 38 84
00 95 01 31 76
07 28 37 07 61
C. Random number generator software
SIMPLE RANDOM SAMPLING (CONTD.)
1. Simple random method
a. With replacement
b. Without replacement
2. Advantage
a. Scientific method
b. Equal chance of selection for all subjects
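A minimal sketch of simple random sampling by software, with and without replacement. The sampling frame of 50 numbered units and the sample size of 10 are hypothetical; `random.sample` and `random.choices` are from Python's standard library.

```python
import random

rng = random.Random(7)  # fixed seed, an arbitrary choice for reproducibility

# Hypothetical sampling frame: 50 numbered sampling units.
sampling_frame = list(range(1, 51))

# Without replacement: a unit can appear in the sample at most once.
without_replacement = rng.sample(sampling_frame, k=10)

# With replacement: a unit goes back into the frame and may be drawn again.
with_replacement = rng.choices(sampling_frame, k=10)

print("without replacement:", sorted(without_replacement))
print("with replacement:   ", sorted(with_replacement))
```

In both cases every unit in the frame has the same chance of being picked on each draw.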
B. STRATIFIED RANDOM SAMPLING
a. Preferred method when the population is heterogeneous
with respect to characteristic under study.
b. Population is divided into groups or strata on the basis of
certain characteristics.
c. A simple random sample is selected from each stratum.
d. Ensures representation of different strata/groups in the study
population.
e. Can be done by selecting individuals from different strata in
certain fixed predetermined proportions.
Stratified Random Sampling(contd.)
A. For Example, If We Draw A Simple Random Sample
From A Population, A Sample Of 100 May Contain
1. 10 To 15 From High Socioeconomic Group
2. 20 To 25 From Middle Socioeconomic Group
3. 70 To 75 From Low Socioeconomic Group
B. To Get Adequately Large Representation For All The Three
Socio Economic Structures, We Can Stratify On
Socioeconomic Class And Select Simple Random Samples
From Each Of The Three Strata.
POPULATION
HIGH SOCIOECONOMIC | MIDDLE SOCIOECONOMIC | LOW SOCIOECONOMIC
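The stratification on socioeconomic class described above can be sketched as follows. The stratum sizes (120 high, 230 middle, 650 low) and the allocation of 30/30/40 are hypothetical values picked for illustration, and `stratified_sample` is an illustrative helper, not a library function.

```python
import random

rng = random.Random(1)  # fixed seed, an arbitrary choice for reproducibility

# Hypothetical sampling frame: (person_id, socioeconomic stratum).
frame = (
    [(i, "high") for i in range(0, 120)]
    + [(i, "middle") for i in range(120, 350)]
    + [(i, "low") for i in range(350, 1000)]
)

def stratified_sample(frame, n_per_stratum, rng):
    """Draw a simple random sample of fixed size from each stratum."""
    strata = {}
    for unit, stratum in frame:
        strata.setdefault(stratum, []).append(unit)
    return {s: rng.sample(units, n_per_stratum[s]) for s, units in strata.items()}

# Fixed, predetermined allocation so that the small strata are not under-represented.
sample = stratified_sample(frame, {"high": 30, "middle": 30, "low": 40}, rng)
print({s: len(units) for s, units in sample.items()})
```

Note that a separate list of units is needed for each stratum, which is the sampling frame requirement listed under the disadvantages.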
STRATIFIED RANDOM SAMPLING (CONTD.)
Advantages:
a. All groups are adequately represented.
b. Highlights a specific subgroup within the population.
c. Allows relationships between two or more subgroups to be
observed.
d. Higher statistical precision compared to simple random
sampling (due to lesser variability within strata), so less time and money.
Disadvantage:
a. Requires a sampling frame for each stratum separately.
b. Requires accurate information on proportions of each
stratum
C. SYSTEMATIC RANDOM SAMPLING
Systematic sampling is a commonly employed technique
when a complete and up-to-date list of sampling units is
available.
A systematic random sample is obtained by selecting the first
unit on a random basis; the others are included on the basis of
the sampling interval k = N/n.
SYSTEMATIC RANDOM SAMPLING (CONTD.)
1. For example, suppose there are 100 patients (N) in a hospital and we want to
select a sample of 20 patients (n) by the systematic random
sampling procedure.
A. Step 1: Write the names of the 100 patients in alphabetical order,
or their roll numbers, one below the other.
B. Step 2: Sampling interval: divide N by n to get the sampling
interval (k). In the example k = 100/20 = 5.
C. Step 3: Randomly select any number between 1 and k, i.e.
between 1 and 5. Suppose the number we select is 4.
D. Step 4: Patient number 4 is selected in the sample.
E. Step 5: Thereafter every k-th patient is selected (4 + k = 9, 4 + 2k = 14,
and so on) until we reach the end of the list.
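The hospital example above can be sketched in a few lines. `systematic_sample` is an illustrative helper, and the optional `start` argument exists only so that the random start of 4 from the example can be reproduced.

```python
import random

def systematic_sample(units, n, start=None, rng=random):
    """Pick every k-th unit after a random start between 1 and k."""
    k = len(units) // n                  # sampling interval k = N / n
    if start is None:
        start = rng.randint(1, k)        # random start, 1..k inclusive
    positions = range(start, len(units) + 1, k)
    return [units[p - 1] for p in positions][:n]

patients = list(range(1, 101))           # patient numbers 1..100
selected = systematic_sample(patients, n=20, start=4)
print(selected)                          # 4, 9, 14, ... up to 99
```

With N = 100, n = 20 and a start of 4, the sample consists of patients 4, 9, 14 and so on up to 99, giving 20 patients in total.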
SYSTEMATIC RANDOM SAMPLING (CONTD.)
Advantage:
A. Easy To Draw, Simplicity.
B. Assurance That The Population Will Be Evenly
Sampled.
Disadvantage:
A. Requires Sampling Frame.
Eg. Random Blinded Rechecking Of Slides Under RNTCP.
Slides Are Drawn From The Register By Systematic Random
Sampling.
CLUSTER SAMPLING
1. The population is divided into subgroups (clusters) like
families. A simple random sample is taken of the subgroups
and then all members of the cluster selected are surveyed.
2. Used for heterogeneous population.
3. Clusters are formed by grouping units on the basis of
their geographical locations.
4. Useful for field epidemiological research and health
administrators.
CLUSTER SAMPLING
[Figure: a population divided into five clusters, Cluster 1 to Cluster 5]
CLUSTER SAMPLING (CONTD.)
One Stage – When All Units In The Selected Cluster Are
Selected.
Two Stage – Only Some Units From A Selected Cluster
Are Taken Using Simple Random Or
Systematic Random Sampling.
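A minimal sketch of the difference between one stage and two stage cluster sampling. The 20 clusters of 15 households each, and the choice of 4 clusters and 5 households per cluster, are hypothetical.

```python
import random

rng = random.Random(3)  # fixed seed, an arbitrary choice for reproducibility

# Hypothetical population: 20 clusters (e.g. villages) of 15 households each.
clusters = {c: [f"c{c}-h{h}" for h in range(1, 16)] for c in range(1, 21)}

# First step in both designs: a simple random sample of clusters.
chosen = rng.sample(sorted(clusters), k=4)

# One stage: every unit in each selected cluster is surveyed.
one_stage = [unit for c in chosen for unit in clusters[c]]

# Two stage: only a simple random sample of units within each selected cluster.
two_stage = [unit for c in chosen for unit in rng.sample(clusters[c], k=5)]

print(len(one_stage), len(two_stage))
```

Only the selected clusters need a list of their units, which is why a complete sampling frame for the whole population is not required.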
CLUSTER SAMPLING (CONTD.)
A. Advantages
1. Simple As Complete List Of Sampling Units Within
Population Not Required.
2. Low Cost.
3. Can Estimate Characteristics Of Both Cluster And
Population.
4. Less Travel/Resources Required.
B. Disadvantages
1. Potential Problem Is That Members Of The Same Cluster Are More
Likely To Be Alike Than Members Of Different Clusters
(Homogeneous).
2. Each Stage In Cluster Sampling Introduces Sampling
Error— The More Stages There Are, The More Error
There Tends To Be
3. Usually Less Expensive Than SRS But Not As Accurate
CLUSTER SAMPLING (CONTD.)
A. A Special Form Of Cluster Sampling Called The “30 X 7
Cluster Sampling”, Has Been Recommended By The WHO
For Field Studies In Assessing Vaccination Coverage.
B. In This A List Of All Villages (Clusters) For A Given
Geographical Area Is Made.
C. 30 Clusters Are Selected Using Probability Proportional To
Size (PPS).
D. From Each Of The Selected Clusters, 7 Subjects Are
Randomly Chosen.
E. Thus A Total Sample Of 30 X 7 = 210 Subjects Is Chosen.
F. The Advantage Of Cluster Sampling Is That Sampling Frame
Is Not Required
PROBABILITY PROPORTIONAL TO SIZE (PPS)
A. Steps:
a. List of all clusters (villages and sectors/wards) is made.
b. Population of each cluster is written against them.
c. Cumulative population is then written in serial order.
d. Sampling interval is calculated = Total cumulative
population/30
B. Choose a random number between 1 and the sampling interval (SI). This is the
Random Start (RS). The first cluster to be sampled is the one whose
cumulative population range contains the RS.
Calculate the following series: RS; RS + SI; RS + 2SI; …; RS + (d-1)*SI,
where d is the number of clusters to be selected (30 here).
C. The clusters selected are those whose cumulative
population range contains one of the numbers in the series.
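The PPS steps above can be sketched as follows. The five villages, their populations, the choice of 4 clusters (instead of 30, to keep the example short) and the random start of 120 are all hypothetical; `pps_select` is an illustrative helper, not a library function.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def pps_select(cluster_sizes, n_clusters, random_start=None, rng=random):
    """Select clusters with probability proportional to size (systematic PPS)."""
    names = list(cluster_sizes)
    # Cumulative population, written in the order the clusters are listed.
    cumulative = list(accumulate(cluster_sizes[name] for name in names))
    interval = cumulative[-1] // n_clusters          # sampling interval (SI)
    if random_start is None:
        random_start = rng.randint(1, interval)      # random start (RS), 1..SI
    series = [random_start + j * interval for j in range(n_clusters)]
    # A number x falls in the first cluster whose cumulative population is >= x.
    return [names[bisect_left(cumulative, x)] for x in series]

villages = {"A": 100, "B": 150, "C": 350, "D": 200, "E": 200}
selected = pps_select(villages, n_clusters=4, random_start=120)
print(selected)  # ['B', 'C', 'D', 'E']
```

Here SI = 1000 / 4 = 250, so the series is 120, 370, 620, 870, and the cumulative populations 100, 250, 600, 800, 1000 place these numbers in villages B, C, D and E. A cluster larger than the sampling interval can be selected more than once.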
MULTISTAGE RANDOM SAMPLING
A. Multistage sampling refers to sampling plans where the sampling
is carried out in stages using smaller and smaller sampling units
at each stage.
B. Not all secondary units are sampled. It is normally used to overcome
problems associated with a geographically dispersed
population.
MULTISTAGE RANDOM SAMPLING
A. In this method, the whole population is divided in first
stage sampling units from which a random sample is
selected.
B. The selected first stage is then subdivided into second stage
units from which another sample is selected.
C. Third and fourth stage sampling is done in the same manner
if necessary.
D. Example:
NFHS data is collected by multistage sampling.
Rural areas – 2 stage sampling – Villages from list by PPS,
Households from village
Urban areas – Wards (PPS) – CEB (PPS) – 30 households
from each CEB
WARD – CEB – HOUSEHOLD
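A minimal sketch of the ward, CEB, household chain for urban areas. For brevity it uses simple random sampling at every stage rather than PPS, and the frame of 10 wards with 8 CEBs of 100 households each, as well as the number of wards and CEBs drawn, are hypothetical; only the 30 households per CEB comes from the example above.

```python
import random

rng = random.Random(11)  # fixed seed, an arbitrary choice for reproducibility

# Hypothetical urban frame: 10 wards, 8 CEBs per ward, 100 households per CEB.
frame = {
    f"ward{w}": {
        f"ward{w}-ceb{c}": [f"ward{w}-ceb{c}-hh{h}" for h in range(1, 101)]
        for c in range(1, 9)
    }
    for w in range(1, 11)
}

households = []
wards = rng.sample(sorted(frame), k=3)                 # stage 1: wards
for ward in wards:
    cebs = rng.sample(sorted(frame[ward]), k=2)        # stage 2: CEBs in the ward
    for ceb in cebs:
        households.extend(rng.sample(frame[ward][ceb], k=30))  # stage 3: households

print(len(households))  # 3 wards x 2 CEBs x 30 households
```

Household lists are needed only for the CEBs actually selected, which is what makes the design practical for a geographically dispersed population.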
NON PROBABILITY SAMPLING
The probability of each case being selected from the total
population is not known
Units of the sample are chosen on the basis of personal judgment
or convenience
There are NO statistical techniques for measuring random
sampling error in a non-probability sample. Therefore,
generalizability is never statistically appropriate
NON PROBABILITY SAMPLING
a. Involves non random methods in selection of sample
b. Units do not have an equal chance of being selected
c. Selection depends upon the situation
d. Less expensive
e. Convenient
f. Sample chosen in many ways
TYPES OF NON PROBABILITY SAMPLING
a. Convenience/Grab/Availability
b. Judgment/Purposive sampling
c. Quota sampling
d. Snowball/Network
CONVENIENCE/GRAB/AVAILABILITY
SAMPLING
a. Subjects selected because it is easy to access them.
b. Examples: students in your class, people on the street, friends, etc.
c. Advantages:
1. In pilot studies, convenience sample is usually used to obtain
basic data and trends.
2. In documenting that a particular quality of a
substance or phenomenon occurs within a
given sample.
d. Disadvantages:
1. Not representative of the entire population – skewed
results.
2. Limitation in generalization and inference making about the
entire population – low external validity.
SNOWBALL / NETWORK SAMPLING
A. Used when the population of interest is very rare or is limited to a
very small subgroup of the population.
B. Works like a chain referral.
C. Initial subject helps identify people with a similar trait.
D. Advantages:
1. To reach rare and difficult to access populations.
2. Cheap, cost – efficient.
3. Lesser workforce, lesser planning.
E. Disadvantages:
1. Little control over sampling technique.
2. Representativeness is not guaranteed.
3. Sampling bias due to people referring known people who
are more likely to be similar.
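The chain referral idea can be sketched as a walk over a referral network. The network of people A to F is hypothetical, and `snowball_sample` is an illustrative helper; real referral chains are of course not known in advance.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Follow referrals outward from the initial subjects until max_size is reached."""
    sampled = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)
        # Each recruited subject names contacts who share the trait of interest.
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return sampled

# Hypothetical referral network: who names whom.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["F"]}
sample = snowball_sample(referrals, seeds=["A"], max_size=5)
print(sample)  # ['A', 'B', 'C', 'D', 'E']
```

Everyone in the sample is reachable from the initial subject, which is the source of the sampling bias listed under the disadvantages.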
PURPOSIVE OR JUDGMENTAL SAMPLING
1. An authority with specialist knowledge can select a more
representative sample. Knowledge of the research question is
required.
2. Subjects selected for a good reason tied to purposes of
research.
3. Advantages:
a. Hard-to-get populations that cannot be found
through screening general population.
b. Usually used when a limited number of individuals
possess the trait of interest.
4. Disadvantages:
a. No way to evaluate the reliability of the expert or
the authority.
b. Biased since no randomization was used in obtaining
the sample. So results cannot be generalised.
QUOTA SAMPLING
The population is divided into cells on the basis of relevant
control characteristics.
• a. A quota of sample units is established for each cell.
• b. A convenience sample is drawn for each cell until the quota
is met.
• c. Pre-plan number of subjects in specified categories
(e.g. 100 men, 100 women).
• d. In uncontrolled quota sampling, the subjects chosen for
those categories are a convenience sample.
• e. In controlled quota sampling, restrictions are imposed to
limit interviewer’s choice.
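A minimal sketch of uncontrolled quota sampling, where each cell is filled with whoever happens to come along first. The stream of 30 arrivals and the quota of 5 per cell are hypothetical, and `quota_sample` is an illustrative helper.

```python
def quota_sample(stream, quotas):
    """Fill each cell's quota from subjects in the order they happen to arrive."""
    remaining = dict(quotas)
    chosen = []
    for subject_id, cell in stream:
        if remaining.get(cell, 0) > 0:
            chosen.append((subject_id, cell))
            remaining[cell] -= 1
        if not any(remaining.values()):
            break  # every quota is met, stop recruiting
    return chosen

# Hypothetical stream of arrivals: every third person is female.
arrivals = [(i, "male" if i % 3 else "female") for i in range(1, 31)]
sample = quota_sample(arrivals, {"male": 5, "female": 5})
print(sample)
```

The cell sizes are controlled, but within each cell the subjects are a convenience sample, so selection probabilities remain unknown.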
Quota sampling(cont)
1. To sample a subgroup that is of great interest to the study.
2. To observe relationships between subgroups.
3. Example – an interviewer may be told to sample 50 males
and 50 females.
4. Advantages:
a. Used when research budget limited
b. Introduces some elements of stratification
5. Disadvantages:
a. Variability and bias cannot be controlled or measured
b. Time consuming
FACTORS AFFECTING CHOICE OF
SAMPLING DESIGNS
Heterogeneity: need larger sample to study more diverse
population
Desired precision: need larger sample to get smaller error
Nature of analysis: complex multivariate statistics need
larger samples
Accuracy of sample depends upon sample size, not ratio of
sample to population
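The last point can be checked numerically. The sketch below uses the standard formula for the standard error of a proportion with the finite population correction, which is not given in these slides; the proportion of 0.5 and the sample and population sizes are hypothetical.

```python
import math

def standard_error(p, n, N=None):
    """Standard error of a sample proportion, optionally with the
    finite population correction for a population of size N."""
    se = math.sqrt(p * (1 - p) / n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# Same sample size, very different population sizes: the error barely changes.
for N in (10_000, 1_000_000):
    print(f"n=400,  N={N:>9,}: SE = {standard_error(0.5, 400, N):.4f}")

# Larger sample: the error shrinks, whatever the population size.
print(f"n=1600, N=infinite:  SE = {standard_error(0.5, 1600):.4f}")
```

Quadrupling the sample size halves the standard error, while a hundredfold change in population size has almost no effect.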
THANK YOU