
From Dr. Jason Mwanza’s Chronicles - Practical Handbook for Social Science Research Methods

When you use social media to create a judgment on somebody, it is such a small sampling - Becca
Kufrin

Unit Twenty-Three - Sampling

Objectives (What you must know and do)


At the end of this unit, you should be able to:

1) Describe the importance of sampling in research.


2) Justify the theoretical basis of sampling designs that are appropriate in a qualitative as well as a quantitative project.
3) Describe the theoretical basis of probability and non-probability sampling.
4) Describe how probability sampling differs from non-probability sampling.
5) Identify the various types of sampling designs and describe why a researcher
may use one type over another.
6) Differentiate sample size determination principles between qualitative and
quantitative research.

Here is a point of reflection before we look at this unit.


Reflection

1) Did you ever think that resources are critical in determining sample size in a
quantitative project and not a qualitative project?
2) Would sample size estimation be required a priori in any research project?
3) Think about these questions and see where they take you.
4) You may have to pause for a while, reflect, and write down what you think about your answers to these questions or positions before you actually get started. Once you have reflected and written down what you think, read this unit and then determine if your position is at variance with its contents.


Sampling

In this unit, we are going to devote some time to discussing sampling. All researchers come to a point when they have to collect data, and they have to determine the units of analysis to be studied or who will provide the data they need. You will be in such a position, and as such you will need to pay close attention to this unit. The selection of sampling methods and the determination of sample size, in both nomothetic and idiographic research, are extremely important for drawing correct conclusions. Considerations of sampling are fundamental to any empirical study.

Key Concepts and Terms

Before describing sampling procedures, we need to define a few key terms.

Population: The population must be defined explicitly before a sample is taken. Population refers
to the target population or group of individuals of interest for study to which findings are to be
generalised. Often, the primary objective is to estimate certain characteristics of this population,
called population values. We consider the term population to imply all members that meet a set
of specifications or a specified criterion. For example, the population of Zambia is defined as all
people residing in the Republic of Zambia. The population of the City of Lusaka means all people
living within the city’s limits or boundary. A population of inanimate objects can also exist, such as
all households in a suburb in a particular year.

Figure 23.1 Population

Element: A single member of any given population is referred to as an element. When only some
elements are selected from a population, we refer to that as a sample; when all elements are
included, we call it a census. Let us look at what we could do if we were to answer our research
questions.


Sample: This is a subset of elements selected from the population for the study. A sampling unit is an element or an individual in the target population (see Figure 23.2).

Figure 23.2 Sample


Sampling frame. The sampling frame is the list of ultimate sampling units or entities, which may be people, households, organizations, or other units of analysis. The list of registered students may be the sampling frame for a survey of the student body at a university. In order to select a sample according to your sample design, you need to have, where possible, a complete list of sampling units in the population. The sampling frame is a major determinant of the extent to which a sample in a quantitative project is representative of the population under study. In essence, a frame is any list, material or device that delimits, identifies, and allows access to the elements of the survey population. We can say that a sampling frame is perfect “if every element appears on the list separately, once, only once and nothing else appears on the list” (Kish, 1995). All the elements included in the frame constitute the frame population. Sampling frames are of two general types:
1. Lists, such as electoral registers, the membership rolls of an organization, or a telephone directory; a list of registered students may be the sampling frame for a survey of the student body at a university. Problems can arise from sampling frame bias. Telephone directories, for instance, are often used as sampling frames but tend to under-represent the poor (who have fewer or no phones) and the wealthy (who have unlisted numbers). Random digit dialling (RDD) reaches unlisted numbers but not those with no phones, while over-representing households owning multiple phones. In multi-stage sampling, discussed below, there will be one sampling frame per stage (e.g., a list of the 50 states, lists of Census tracts for sampled states, lists of Census blocks for sampled tracts, and finally a list of residences for sampled blocks).
2. Area frames that may occur as sets of locations on maps (such as townships or rural communities). In most cases, the sampling frame is imperfect: it has missing elements, inappropriate listings, or duplications. Researchers conducting studies may have no up-to-date or accurate lists of community members or households for designing a household sample of the community. The best frames available may be lists of school students, utility customers, and members of local organizations. However, each of these lists will have built-in biases or missing elements that may be significant enough to make it unsuitable for sampling the community as a whole. If no adequate map is available to show the locations of houses, the researchers may have to make their own sketch map or see whether there are aerial photos they can use as the basis for one (with field checking to update it).

1) Enumerations or censuses are collections of data from every person or entity in the population.
2) Strata or clusters are subdivisions of populations that are naturally divided into a number of non-overlapping subpopulations. For example, national or city populations are divided into male and female, and geographically into urban and rural, or into high, medium, and low residential areas. Generally, the population is divided into strata, each consisting of individuals, such that the strata together make up the entire population.

3) In order to answer research questions, researchers are seldom able to collect data from all cases in the population. It may be possible in some cases and not in others, and the researcher will have to choose a method. The sampling method will depend on the demands of each research question, its ontological orientation, the research strategy and the ideal source of data. For instance, the researcher may want to understand an issue in detail for one particular population rather than worry about the ‘generalisability’ of the findings. In such a scenario, the researcher may want to use ‘non-random sampling1’ or purposive sampling for the study. If he wants to explain motivation at a place of work involving numerous units, the researcher may want to use any one of the ‘random sampling2’ techniques for the study.

As a researcher, you will always face the challenge, during the research process, of determining where, when and from whom the information needed to answer your research questions will come. In making such determinations, you will have to answer pertinent questions beforehand, such as the following:

a) Who will be ideal to provide the required information (or informant) for this research
question?
b) Where will I get the informants or material to be studied to answer this research question?
c) When will sampling have to be done?
d) How many study units or sample elements or informants would I need to answer each
research question?

To get started, let us take this imaginary scenario. Assume that you are one of the very curious
students around campus. You are concerned that there is a lot of talk about the risky sexual

1
Non-random sampling is widely used as a case selection method in qualitative research, or for quantitative studies of a preliminary and exploratory nature where random sampling is too costly, or where it is the only feasible alternative. However, random samples are always strongly preferred, as only random samples permit statistical inference; there is no way to assess the validity of findings from non-random samples. This type of sampling is used mostly in idiographic qualitative research.
2
Random sampling is data collection in which every person in the population has a chance of being selected that is known in advance. Normally this is an equal chance of being selected. This type of sampling is used mostly in nomothetic quantitative research.

behaviour among married students at the University of Zambia. You are concerned about two things: (i) you want to get the facts from the insiders’ point of view (the married) and (ii) you want to know what all the students at this college think about the sexual behaviour of married students. You know that married students are very few and hard to reach, but you have a list of all the 9 000 students and their room numbers.

I am sure that this will not be a problem for you. You are already enough of a research expert. A
few friends come to suggest that you use questionnaires. Others choose in-depth interviews and
others suggest focus group discussions. You look back at the draft you have and after all that
preliminary work, you are faced with the most important questions: What should I do with these
proposals? Whom will I ask to complete the questionnaire or to invite for the in-depth interviews
or focus group discussions? Should I get all the 9 000 students? Do I just pick an average student?
Do I need only married students, and if I do, how will I get them? How about asking only fourth-year students, because they are supposed to know what is going on at the university? You cannot
do that because freshers and fourth years may be married as well. What should you do then?
These are decisions that cannot be taken lightly. The success of the research will depend on the
way you select the people who will participate in this sexual risk behaviour study.

However, let me say this now. If one wanted accurate information about people or things, the best thing to do would be to examine every member or element of a group. However, this is not always possible; there are numerous limitations. All sampling problems stem from the limitations that are imposed on observation. If one could observe directly all that one needs to know, there would be no occasion to make inferences about what has not been observed or to generalise one's knowledge. In most projects, we cannot involve all of the people we might like to involve, because data collection costs money and time. Therefore, from the population, we need to determine a sample to work with.

We would have to select a sample to work with and generalise to the population from which the sample has been drawn. This will require that we select the best method of sampling. There is no "best" method of sampling that can be followed blindly in all instances. The most effective sampling methods are those we design ourselves, and these ought to fit specifically the situation in which they are to be used. They are based on the general theory of sampling derived from mathematical statistics and economic theory, and they take advantage of what is known in advance about the population, system of interaction, or process that is to be sampled. They are designed to achieve the specific purposes of the study as effectively as is possible under the limitations set by the funds, personnel, time, and other resources that are available. In a word, they are tailored to fit the circumstances.

It would be a mistake to confine the discussion of sampling to the larger surveys, especially those
nation-wide polls. For each such survey, there may be scores of studies to be conducted on a
smaller scale, geographically and financially. They may be more intensive and complicated in the
variables they measure and the relationships they analyse. In the aggregate, and in some individual
instances, they may turn out to be more important for research than some of the large-scale
studies. So one may well ask, "Isn't there some simple dependable rule I can use to solve my
sampling problem? I'm studying a limited group, not seeking a national average." We believe that
such a question deserves careful examination and a serious answer. If it is put just this way,
without any information about the population to be sampled and the resources that can be devoted
to sampling, then one must necessarily sacrifice the advantages that come from fitting the


sampling rule to the particular case. Hence, the answer could be, "Yes, you can have two choices." These choices are structured under two rules, as described below.

"Rule A is: Follow your common sense and select people by what seems to you to be a good way
to get a group similar to the entire group in which you are interested. Test it with every convenient
set of data you can obtain. Then be prudent in handling the findings of your study, for you may
be far off the beam without knowing it. As the airplane pilots say, you are "flying by the seat of
your pants." With good luck, you may not be too far off too often.

Rule B is for those who do not like to gamble or live dangerously. Such people may be less likely
to make occasional brilliant discoveries or produce a flood of studies, but, like the tortoise, they
may pass the hare before the race is finished. Rule B counsels:
a. Pin down very specifically the definition of the population you are studying and the
variables you wish to measure. (This may be no more than the arduous task of making
up your mind about just what you will attempt.)
b. Obtain a list of all the persons who make up that group or population or, lacking a list,
divide the population into many small parts according to residence, place of work, or other
suitable factors. Do this, however, in a way that tends to make each part a mixture of different kinds of people rather than a cluster of persons who are similar with regard to the variables you are studying. (This is where you may have to go contrary to your intuition.)
c. Select by some strictly random procedure enough of these parts to give you the number
of persons you think you need after allowing for the loss of those you will not be able to
study successfully. You may even make up sets of persons or parts, well balanced on the
variables you know about beforehand, until you have every person in the population in
the same number of sets. Then select strictly at random one such set as a sample.
d. Proceed to study the sample but keep a complete record of persons you miss and what
information you can find out about them.

We have deliberated at length on Rules A and B. Sometimes these two rules may work quite well for you. At other times, the findings will not be satisfactory unless greater care is taken with the sampling. We suggest that you do any of the following. First, you could learn something more about sampling from the reports of previous surveys, though most reports offer little in the way of useful tests of sampling methods, and from journal publications in statistics and related fields. Second, you can get help from advisers who have developed expertise in research methods. Finally, you may seek to develop new methods appropriate for your own situation through your own ingenuity and experimentation.

Sample size

In order to investigate a population, the investigator collects data. If possible, the best option is to include the whole population in the investigation. Yet it is often impossible to collect data on the whole population, so the statistician collects a representative sample. This means that a subset, or sample, is collected in such a way that it provides a miniature image of the whole population. If, moreover, the sample is large enough, then a diligent analysis of the sample will lead to conclusions that are, to a large extent, also valid for the whole population. Such conclusions must be reliable, which means that the probability of their being correct must be known. This therefore requires the researcher to determine the sample size.

Sample-size determination is often an important step in planning any study, and it is usually a difficult one. If the sample size is too small, even a well-conducted study may fail to detect important effects or associations, or may estimate those effects or associations too imprecisely. Similarly, if the sample size is too large, the study becomes more complex and may even lead to inaccuracy in the findings. Moreover, taking too large a sample escalates the cost of the study. The sample size is therefore an essential factor in any scientific research.

Sathian et al. (2010)3 point out that sample size determination is a difficult process to handle and advise researchers to seek the collaboration of a statistician with good scientific knowledge of the art and practice of statistics. There are distinct methods of calculating sample size for different study designs and different outcome measures. Additionally, there are different procedures for calculating the sample size for the two approaches to drawing statistical inference from study findings: the confidence interval approach and the test of significance approach.

When we go about sampling, we make a distinction between the theoretical population of interest to our study and the final sample that we actually measure in our study. Therefore, what we have to do is get a portion of the population for study, or sample. Time, money and talent are not the only reasons we have to sample our population; there are other reasons. It may happen that not all units4 in the population are identifiable, in the sense that we may not have a comprehensive list and as such may not be able to reach them (Bless and Higson, 19955). For instance, this may be true if we wanted to find out about the rates clients are charged by sex workers in Lusaka; to measure those rates, you would simply have to take a sample of sex workers. Moreover, even if all the sex workers in Lusaka could be identified, it would be too expensive and too time-consuming to measure the rates charged, and at times we may not have the money to include all possible study elements. Nonetheless, even if we cannot get to everyone, it is possible to examine a portion of the population and arrive at accurate conclusions. How will you know if you “do the job right”? To understand sampling, you first need to distinguish between two general sampling strategies: probability and non-probability. With probability sampling, the likelihood of any one member of the population being selected is known. If there are 9 000 students on campus, and 1,000 of them are fourth years, then the probability of selecting a fourth year as part of the sample is 1,000/9,000, or about 0.11.

Non-probability sampling is where the likelihood of selecting any one member from the population is not known. For example, if you do not know how many children are enrolled in the district’s high schools, then the likelihood of any one being selected cannot be computed.

3
Sathian B., Jaydevan Sreedharan, Suresh N. Babu, Krishna Sharan, E. S. Abhilash, E. Rajesh (2010):
Relevance of sample size determination in medical research, Nepal Journal of Epidemiology, 1 (1),
4
Usually the term "units" refers to the things that we sample and from whom or where we gather
information. Nevertheless, for some research projects the units are organizations, groups, or
geographical entities like cities or towns.
5
Bless, C., and Higson, C.S. (1995). An African perspective (2nd edition). Cape Town: Juta.

This unit suggests a number of sampling issues to consider in designing a study; it further points to some of the strengths and weaknesses of various types of sampling designs. However, this handbook is not intended to be a primer on sampling strategies, which is a complex subject. For this, the reader should explore some of the suggested readings in the bibliography. Ideally, sampling is the process of selecting, from a given population, the set of research participants who will provide the data for a research project. We are going to look at random sampling, which researchers use when they intend to conduct quantitative (especially nomothetic) research, and non-random sampling, which qualitative researchers use when they embark on idiographic research.

Quantitative Nomothetic Research Sampling – Random sampling

Nomothetic researchers want to generalise; they learn by trying to explain social regularities in terms of a theory, principle or law of phenomena that applies to people in general. Nomothetic research attempts to discover what those theoretical assumptions or systems of laws or principles are (Burrell and Morgan, 19796; Flood and Jackson, 1991). Since it is interested in discovering the laws or principles that govern aspects of reality, nomothetic research cannot depend on information that describes a single individual. It needs information that describes enough cases, which could be hundreds or thousands, so that general patterns or relationships can be seen. It is not concerned, for example, with getting the facts from the insider’s point of view (the married) but with what “all the students at the University of Zambia” think about the sexual behaviour of married students. This therefore calls for the researcher to begin with the population of interest so as to have enough representative cases, not all cases. In essence, this calls for the use of random sampling.

In random sampling, every unit of the population has the same chance of being enumerated. This
implies that the selection of one unit is independent of that of any other. Furthermore, the method
of selection is independent of the characteristic to be examined. Random sampling, therefore,
requires the employment of some mechanical device, like a roulette wheel or a set of random
numbers, emancipating the selection from the control of the sampler, whether that control be
voluntary or involuntary. The result is the kind of sample to which the ordinary theory of uniform
probability applies.

Since researchers are unlikely to study the whole population except when a census is needed, they are faced with time and resource constraints, or even the difficulty of locating particular units for study, and as such they will have to decide on sampling the units. They do so in stages.

Stage 1: Clearly define the ontology, epistemology and assumptions about human nature that guide the demands of the research question. If the research question falls in the positivist paradigm, there are random sampling designs one could choose from. If the research question falls in the anti-positivist paradigm, there are purposive sampling designs to choose from.

Stage 2: Once the paradigm is settled, the second stage in the sampling process is to clearly define the target population. In research, you do not just study anyone; you target units that have relevance to the subject under inquiry. The target population is therefore that group of individuals who are relevant to the inquiry, from which the sample might be drawn.

6
Burrell, G. and G. Morgan. (1979). Sociological Paradigms and Organisational Analysis. Farnham: Ashgate.

Perhaps the most frequently asked questions concerning nomothetic sampling are, "What sample size do I need?" and "How do I go about sampling, that is, identifying the units I need to study?" Let us look at these two questions.

The question of Sample Size

Regarding the first question, the decision about sample size is not a straightforward one at all. It is a source of great anxiety to beginners in social science research. A number of factors influence the answer to this question, including:

 A compromise between the constraints of time and cost.


 The purpose of the study.
 Population size, the risk of selecting a "bad" sample.
 The need to consider an allowable sampling error for purposes of precision. Sampling error is, in essence, the difference between a sample estimate and the true population value.

You may settle these factors and still be troubled by the sample size. What is important is not the relative size but the absolute size. There are statistically valid ways of determining an absolute sample size, depending on whether the analysis will use simple or complex statistics (Kish, 1995).

An important consideration is determining the “crucial subgroup.” This is the group from which the survey must obtain enough observations to result in reasonably accurate statements, such as “sex workers who prefer ‘live sex’ have higher incomes, higher education levels and higher rates of HIV infection than those who use condoms.” If the analysis will come from only a part of the sample, then the sample size has to be increased significantly to maintain the level of accuracy.

Sample Size Criteria

In addition to the purpose of the study and population size, three criteria usually will need to be
specified to determine an absolute sample size: the level of precision, the level of confidence or
risk, and the degree of variability in the attributes being measured (Miaoulis and Michener, 1976).
Each of these is reviewed below.

The Level Of Precision

When properly conducted, a probability sample provides reliable information with a small margin of error for the whole population. When determining sample size, take into account the required levels of precision for the survey estimates, the type of design and estimator to be used, the availability of auxiliary information and budgetary constraints, as well as both sampling factors (e.g., clustering, stratification) and non-sampling factors (e.g., non-response, presence of out-of-scope units, attrition in longitudinal surveys). For periodic surveys, take into account expected births and deaths of units within the changing survey population. The level of precision or reasonable certainty, sometimes called sampling error, is the range in which the true value of the population is estimated to lie. This range is often expressed in percentage points (e.g., ±5 percent), in the same way that findings for political campaign polls are reported


by the media. Thus, if a researcher finds that 78% of voters in the sample voted for the Movement for Multiparty Democracy, with a precision of ±5%, then he or she can conclude that between 73% and 83% of the voters in the population voted for this party.

The Confidence Level

The confidence or risk level is based on ideas encompassed under the Central Limit Theorem. The
key idea encompassed in the Central Limit Theorem is that when a population is repeatedly
sampled, the average value of the attribute obtained by those samples is equal to the true
population value. Furthermore, the values obtained by these samples are distributed normally
about the true value, with some samples having a higher value and some obtaining a lower score
than the true population value. In a normal distribution, approximately 95% of the sample values
are within two standard deviations of the true population value (e.g., mean).

In other words, this means that, if a 95% confidence level is selected, 95 out of 100 samples will
have the true population value within the range of precision specified earlier. There is always a
chance that the sample you obtain does not represent the true population value. This risk is
reduced for 99% confidence levels and increased for 90% (or lower) confidence levels.

The confidence interval (also called the margin of error) is the plus-or-minus figure usually reported in newspaper or television opinion poll findings. For example, if you use a confidence interval of 4 and 47% of your sample picks an answer, you can be "sure" that, had you asked the question of the entire relevant population, between 43% (47 − 4) and 51% (47 + 4) would have picked that answer.

The confidence level tells you how sure you can be. It is expressed as a percentage and
represents how often the true percentage of the population who would pick an answer lies within
the confidence interval. The 95% confidence level means you can be 95% certain; the 99%
confidence level means you can be 99% certain. Most researchers use the 95% confidence level.

When you put the confidence level and the confidence interval together, you can say that you are
95% sure that the true percentage of the population is between 43% and 51%. The wider the
confidence interval you are willing to accept, the more certain you can be that the whole population
answers would be within that range.

For example, if you asked a sample of 1000 people in a city which brand of cola they preferred,
and 60% said Brand A, you can be very certain that between 40 and 80% of all the people in the
city actually do prefer that brand, but you cannot be so sure that between 59 and 61% of the
people in the city prefer the brand.

Factors that Affect Confidence Intervals


There are three factors that determine the size of the confidence interval for a given confidence
level:

 Sample size
 Percentage


 Population size

The larger your sample size, the surer you can be that their answers truly reflect the population.
This indicates that for a given confidence level, the larger your sample size, the smaller your
confidence interval. However, the relationship is not linear (i.e., doubling the sample size does not
halve the confidence interval).
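To see that non-linearity numerically, here is a small Python sketch, again using the standard normal-approximation margin of error for a proportion (an assumption of this illustration, with p = 0.5 and a 95% confidence level):

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Normal-approximation margin of error for a sample proportion."""
    return z * math.sqrt(p * (1 - p) / n)

# Doubling the sample size shrinks the interval by a factor of
# about 1/sqrt(2) = 0.71, not by half:
for n in (250, 500, 1000, 2000):
    print(n, f"+/-{margin_of_error(0.5, n):.1%}")
# 250  +/-6.2%
# 500  +/-4.4%
# 1000 +/-3.1%
# 2000 +/-2.2%
```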

Strategies For Determining Sample Size

There are several approaches to determining the sample size that you can use. These include using
a census for small populations, imitating a sample size of similar studies, using published tables,
and applying formulas to calculate a sample size. Each strategy is discussed below.

Using a Census for Small Populations

One approach is to use the entire population as the sample. However, the detail of information that can be asked of a sample is greater than that of a census, given the cost and time constraints under which most researchers operate. While cost considerations make a census impossible for large populations, a census is attractive for small populations (e.g., 200 or less). A census eliminates sampling error and provides data on all the individuals in the population. In addition, some costs, such as questionnaire design and developing the sampling frame, are "fixed"; that is, they will be the same for samples of 50 or 200. Finally, virtually the entire population would have to be sampled in small populations to achieve a desirable level of precision.

Using a Sample Size of a Similar Study

Another approach is to use the same sample size as that of studies similar to the one you plan. Without reviewing the procedures employed in those studies, however, you run the risk of repeating errors that were made in determining the sample size for another study. Still, a review of the literature in your discipline can provide guidance about "typical" sample sizes that are used.

Using Published Tables

A third way to determine sample size is to rely on published tables, which provide the sample size
for a given set of criteria. Table 23.1a and Table 23.1b present sample sizes that would be
necessary for given combinations of precision, confidence levels, and variability. Please note two
things. First, these sample sizes reflect the number of obtained responses, and not necessarily the
number of surveys mailed or interviews planned (this number is often increased to compensate
for non-response). Second, the sample sizes in Table 23.2 presume that the attributes being
measured are distributed normally or nearly so. If this assumption cannot be met, then the entire
population may need to be surveyed. If you examine Tables 23.1a and 23.1b you will notice that
the researcher has to select a large number of sampling elements from the populating when the
population is small. As the population because bigger and bigger, the sample size becomes smaller
and smaller and so does the sampling error decrease marginally. If it happens that you want to
break down your population into smaller groups or categories, you will in the end have a larger
number of study elements.


Table 23.1a Sample Size Determinations

Sample size for ±5%, ±7% and ±10% Precision Levels Where Confidence Level is 95% and P = .5.

Population Size    Sample Size (n) for Precision (e) of:
                   ±5%    ±7%    ±10%
100                 81     67     51
125                 96     78     56
150                110     86     61
175                122     94     64
200                134    101     67
225                144    107     70
250                154    112     72
275                163    117     74
300                172    121     76
325                180    125     77
350                187    129     78
375                194    132     80
400                201    135     81
425                207    138     82
450                212    140     82


Table 23.1b Sample Size Determinations

Sample size for ±3%, ±5%, ±7% and ±10% Precision Levels Where Confidence Level is 95% and P = .5.

Population Size    Sample Size (n) for Precision (e) of:
                   ±3%      ±5%    ±7%    ±10%
500                a        222    145     83
600                a        240    152     86
700                a        255    158     88
800                a        267    163     89
900                a        277    166     90
1,000              a        286    169     91
2,000              714      333    185     95
3,000              811      353    191     97
4,000              870      364    194     98
5,000              909      370    196     98
6,000              938      375    197     98
7,000              959      378    198     99
8,000              976      381    199     99
9,000              989      383    200     99
10,000             1,000    385    200     99
15,000             1,034    390    201     99
20,000             1,053    392    204    100
25,000             1,064    394    204    100
50,000             1,087    397    204    100
100,000            1,099    398    204    100
>100,000           1,111    400    204    100

a = Assumption of normal population at ±3% is poor (Yamane, 1967). The entire population should be sampled.

Using Formulas to Calculate a Sample Size

A cursory review of the literature shows that sample size can be determined in many ways using
formulas and/or tables and that there is no universal “formula” for sample size calculations. Each
of the methods has a recommended use. You will also find many sample size calculators available
online, many of them based on Cochran’s Sample Size Formula.

Sample size determination when the population size is known


First, know the size of the population with which you are dealing. If your population is small (100 people or less), it may be preferable to do a census of everyone in the population rather than a sample. However, if the population from which you want to gather information is larger, it makes sense to do a sample.

Second, determine the desired precision of the findings. The level of precision is the closeness with which the sample predicts where the true values in the population lie. The difference between the sample and the real population is called the sampling error. If the sampling error is ±5%, this means we add or subtract 5 percentage points from the value found in the survey to find the actual value in the population. For example, if a survey finds that 85% of teachers use dictation and the sampling error is ±5%, we know that in the real-world population, between 80% and 90% of teachers are likely to use dictation. This range is also commonly referred to as the margin of error. The level of precision you accept depends on balancing accuracy and resources. High levels of precision require larger sample sizes and higher costs to achieve those samples, but high margins of error can leave you with findings that are not much more meaningful than unaided estimation. Tables 23.1a and 23.1b above provide sample sizes for precision levels of ±3%, ±5%, ±7% and ±10%.

Third, determine the confidence level. The confidence level involves the risk you are willing to accept that your sample falls outside the average or “bell curve” of the population. A confidence level of 95% means that, were the population sampled 100 times in the same manner, 95 of those samples would have the true population value within the range of precision specified earlier, and 5 would be unrepresentative samples. This level is standard for most social science applications, though higher levels can be used. If the confidence level chosen is too low, the findings will be regarded as statistically insignificant.

Fourth, estimate the degree of variability. Variability is the degree to which the attributes or concepts being measured in the questions are distributed throughout the population. A heterogeneous population, divided more or less 50%-50% on an attribute or a concept, will be harder to measure precisely than a homogeneous population, divided say 80%-20%. Therefore, the higher the degree of variability you expect in the distribution of a concept in your target audience, the larger the sample size must be to obtain the same level of precision. To come up with an estimate of variability, simply take a reasonable guess at the size of the smaller attribute or concept you are trying to measure, rounding up if necessary. If you estimate that 25% of the population in your county farms organically and 75% does not, then your variability would be 0.25 (rounded up to 30% for table look-up purposes). If variability is too difficult to estimate, it is best to use the conservative figure of 50%. Note: when the split is extremely skewed (i.e., greater than 90-10), a larger sample may be needed for an accurate result, because the population with the minority attribute is so small. At this point, using the level of precision and estimate of variability you have selected, you can use either the tables above or the equation below to determine the base sample size for your project.

Fifth, estimate the response rate. The base sample size is the number of responses you must get back when you conduct your survey. However, since not everyone will respond, you will need to increase your sample size, and perhaps the number of contacts you attempt, to account for these non-responses. To estimate the response rate you are likely to get, take into consideration the method of your survey and the population involved. When you have come up with an estimate of the percentage you expect to respond, divide the base sample size by

the percentage of response. For example, if you estimated a response rate of 70% and had a base
sample size of 220, then your final sample size would be 315 (220/0.7). Once you have this, you
are ready to begin your sampling!

When determining your sample size, remember that in order to generalize comfortably from a random sample and avoid sampling errors or biases, the sample must be random and of adequate size. Equation 1 below was used to calculate the sample sizes in Tables 23.1a and 23.1b:

n = N / (1 + N·e²)    (Equation 1)

Where:
a) n is the desired sample size,
b) N is the stated or known population size and,
c) e is the level of precision or the margin of error. This is the risk the researcher is willing to accept. In social research, a 5% margin of error is acceptable.
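As a check on the tables, here is a minimal Python sketch of Equation 1. Rounding up to a whole respondent is my assumption; the equation itself returns a fraction.

```python
import math

def yamane_sample_size(population: int, e: float = 0.05) -> int:
    """Sample size from Equation 1: n = N / (1 + N * e^2).

    population -- the stated or known population size N
    e          -- the level of precision (0.05 for +/-5%)
    The tables assume a 95% confidence level and P = .5.
    """
    return math.ceil(population / (1 + population * e ** 2))

# Reproduce the N = 9,000 row of Table 23.1b:
print(yamane_sample_size(9000, 0.05))   # 383
print(yamane_sample_size(9000, 0.10))   # 99
```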

The smaller the value of e, the greater the sample size required since, technically speaking, sampling error is inversely proportional to the square root of n; however, a large sample cannot guarantee precision (Bryman and Bell, 20037). While it does hold that the larger the sample, the lower the likelihood that findings will be biased, diminishing returns quickly set in once samples exceed a certain size, and this must be balanced against the researcher's resources (Gill et al., 20108). To put it bluntly, larger sample sizes reduce sampling error, but at a decreasing rate. This is evident in Table 23.1c below.

Table 23.1c: Diminishing returns as population increases

Sample size for ±3%, ±5%, ±7% and ±10% Precision Levels Where Confidence Level is 95% and P = .5.

Population Size    Sample Size (n) for Precision (e) of:
                   ±3%      ±5%    ±7%    ±10%
15,000             1,034    390    201     99
20,000             1,053    392    204    100
25,000             1,064    394    204    100
50,000             1,087    397    204    100
100,000            1,099    398    204    100
>100,000           1,111    400    204    100


7
Bryman, A. and Bell, E. (2003). Business Research Methods. Oxford: Oxford University Press.
8
Gill, J., Johnson, P. and Clark, M. (2010). Research Methods for Managers. SAGE Publications.


Other Considerations

In completing this discussion of determining sample size, there are four additional issues. First, the above approaches to determining sample size have assumed that a simple random sample is the sampling design. More complex designs, e.g., stratified random samples, must take into account the variances of subpopulations, strata, or clusters before an estimate of the variability in the population as a whole can be made.
Another consideration with sample size is the number needed for the data analysis. If descriptive statistics are to be used, e.g., means and frequencies, then nearly any sample size will suffice. On the other hand, a good-sized sample, e.g., 200-500, is needed for multiple regression, analysis of covariance, or log-linear analysis, which might be performed for more rigorous impact evaluations. The sample size should be appropriate for the analysis that is planned.

In addition, an adjustment in the sample size may be needed to accommodate a comparative analysis of subgroups (e.g., an evaluation of program participants against non-participants). Sudman (1976) suggests that a minimum of 100 elements is needed for each major group or subgroup in the sample, and a sample of 20 to 50 elements for each minor subgroup. Similarly, Kish (1965)9 says that 30 to 200 elements are sufficient when the attribute is present 20 to 80 percent of the time (i.e., the distribution approaches normality). On the other hand, skewed distributions can result in serious departures from normality even for moderate-size samples (Kish, 1965:179). In that case, a larger sample or a census is required.

Finally, the sample size formulas provide the number of responses that need to be obtained. Many researchers commonly add 10% to the sample size to compensate for persons that the researcher is unable to contact. The sample size is also often increased by 30% to compensate for non-response. Thus, the number of mailed surveys or planned interviews can be substantially larger than the number required for a desired level of confidence and precision.
Cochran’s Sample Size Formula

This formula is used to compute an ideal sample size for a desired level of precision; it is recommended for studies with effectively infinite populations (Cochran, 197710). The formula is described below.

n0 = (z² · p · (1 − p)) / e²

Where

a) e: desired level of precision, the margin of error


b) p: the fraction of the population (as a proportion) that displays the attribute

9
Kish, L. (1965). Survey Sampling. New York: Wiley.
10
Cochran, W.G. (1977). Sampling Techniques. 3rd ed. New York: John Wiley & Sons.


c) z: the z-value, extracted from a z-table.


The entry for z in a z-table represents the area under the normal distribution curve to the left of z (Figure 23.3).

Figure 23.3: Area represented by the z-value

Here is an example. Think of a study of students on a large university campus whose size we do not know; a large campus may have 10 to 15 thousand students. We are interested in finding the percentage of students who eat lunch at the campus dining halls, but we do not have insider information. The question is how many students we would need to ask in order to determine, with reasonable confidence, what percentage of students conform to the behaviour sought. Given the lack of information, we start by assuming that 50% of the students eat lunch at the campus dining halls, which provides the largest variability. Then we consider a 95% confidence level (leading to α = 0.05) and ±5% precision. From the z-tables, the value for z is 1.96. Therefore, the theoretical sample would be:

n0 = (1.96² · 0.5 · (1 − 0.5)) / 0.05² ≈ 385

How do you find the value of z from a z-table? The procedure is as follows (a short computational check is given after the steps):

1. Convert the confidence level from percent form to decimal form, as a value between 0 and 1 (95% → 0.95).
2. Subtract the value from 1 and divide by 2 to find the area in each tail (1 − 0.95 = 0.05; 0.05/2 = 0.025).
3. Add the value from step 2 to the value from step 1 (0.95 + 0.025 = 0.975).
4. Look for the value obtained in step 3 in the body of the z-table. For 0.975, the value sits at the intersection of the row labelled 1.9 and the column labelled 0.06.
5. Determine the value of z by adding the column value to the row value from step 4 (1.9 + 0.06 = 1.96).

Cochran’s Modified Formula for Finite Populations

A slightly modified formula can be used if the size of the population is known.

n = n0 / (1 + (n0 − 1)/N)

n0: Cochran’s sample size, computed using the formula for the ideal sample size;

N: the size of the population. The required sample size depends on the size of the population until the population reaches about 40-50 thousand, after which the increase is almost nil. Therefore, if the estimated population is this large or larger, the theoretical sample size, as computed for an unknown population, is about equal to the one generated by the modified formula.

As an example, let us look at the same problem as before, but for a much smaller campus of N = 600 students. While we could still use the theoretical sample of 385 participants computed before, do we need to? The necessary sample size may be smaller:

n = 385 / (1 + (385 − 1)/600) = 234.76 ≈ 235

The result of this computation indicates that for smaller populations the number of subjects (sample size) can be smaller (235 vs. 385) while the researchers remain reasonably confident of the findings.
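Both Cochran formulas translate directly into code. Here is a minimal Python sketch of the computations above; rounding up to whole respondents is my assumption.

```python
import math

def cochran_infinite(p: float = 0.5, e: float = 0.05, z: float = 1.96) -> float:
    """Cochran's n0 = z^2 * p * (1 - p) / e^2 for an effectively infinite population."""
    return z ** 2 * p * (1 - p) / e ** 2

def cochran_finite(n0: float, population: int) -> int:
    """Modified formula n = n0 / (1 + (n0 - 1)/N) for a known population size N."""
    return math.ceil(n0 / (1 + (n0 - 1) / population))

n0 = math.ceil(cochran_infinite())   # 385, as in the large-campus example
print(cochran_finite(n0, 600))       # 235 for the 600-student campus
```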

Sample size in a regression analysis

For example, in regression analysis, many researchers hold that there should be at least 10 observations per variable. If we are using three independent variables, this rule gives a minimum sample size of 30. Some researchers follow a statistical formula to calculate the sample size.

How do I go about sampling?

This is the second question we must address in relation to sampling. The literature describes some common sample designs; in nomothetic studies, random sampling is of different types, and these are described below.

Types of Probability Sampling

In an ideal world, most studies would aim to obtain random probability samples, in which every
element (person, household, or event) has a known, nonzero probability of being selected. A
probability sampling method is any method of sampling that utilizes some form of random
selection. Probability sampling is used to select a sample from the survey population. The intention
is to gather useful information from the sampled units to allow inferences about the survey
population. This type of sampling implies a probabilistic selection of units from the frame in such
a way that all survey population units have known and positive inclusion probabilities. Sample size
is determined by the required precision and available budget for observing the selected units. The
probability distribution that governs the sample selection, along with the stages and units of
sampling, the stratification, and so on, are collectively called the sampling design or sample design.


In order to have a random selection method, you must set up some process or procedure that
assures that the different units in your population have equal probabilities of being chosen. Most
statistical inferences about means and variances and regression coefficients are based on the
assumption that the sample is a simple random sample. There are several types of random
sampling, each of which affects how significance is computed.

Simple Random Sampling

Simple random sampling is a completely random method of selecting a sample in which each
element and each combination of elements in the population have an equal probability of being
selected as a part of the sample. Being one of the simplest forms of random sampling, this method is a fair way to select a sample, as each member of the population has an equal probability of being selected. As such, it is an equal probability selection method (EPSEM): each member of the population has an equal and independent chance of being selected to be part of the sample. Equal and independent are the key words here: equal because there is no bias that one person will be chosen rather than another, and independent because the choice of one person does not bias the researcher for or against the choice of another. When sampling randomly, the characteristics of the sample should be very close to those of the population. Even though it may not be considered an ideal method of choosing a sample, the results obtained through this method have high external validity, or generalizability, compared with some other methods of sample selection. The process of simple random sampling consists of the following six steps:

1) Define the population from which you want to select the sample.
2) Identify an existing sampling frame of the target population or develop a new one.
3) Assign a unique number to each member of the population.
4) Evaluate the sampling frame for undercoverage, overcoverage, multiple coverage, and clustering, and make adjustments where necessary.
5) Determine the sample size.
6) Randomly select the targeted number of population elements.

Three techniques may be used in fulfilling Step 6: the lottery method, a table of random numbers, and randomly generated numbers from a computer program (i.e., a random number generator). In using the lottery method (also referred to as the “blind draw method” and the “hat model”), the numbers representing each element in the target population are placed on chips (i.e., cards, paper, or some other objects). The chips are then placed in a container and thoroughly mixed. Next, chips are blindly selected from the container until the desired sample size has been obtained. Disadvantages of this method of selecting the sample are that it is time-consuming and limited to small populations.
A table of random numbers may also be used. The numbers in a table of random numbers are not arranged in any particular pattern. They may be read in any manner, i.e., horizontally, vertically, diagonally, forward, or backward. In using a table of random numbers, the researcher should blindly select a starting point and then systematically proceed down (or up) the columns of numbers in the table. The number of digits used should correspond to the total size of the target population. Every element whose assigned number matches a number the researcher comes across is selected for the sample. Numbers the researcher comes across that do not match the numbers assigned to the elements in the target population are ignored. As with the lottery method, using a table of random numbers is a tedious, time-consuming process, and is not recommended for large populations. Instead, statistical software should be used for large populations. Most statistical software and spreadsheet software have routines for generating random numbers. Elements of the population whose assigned numbers match the numbers generated by the software are included in the sample. One may select a number from a table of random numbers for use as the starting number for the process.
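In practice, this software-based selection takes only a few lines. Here is a minimal Python sketch using random.sample, which draws without replacement; the numbered frame of 9 000 students is a hypothetical stand-in for a real list.

```python
import random

# Hypothetical sampling frame: unique numbers assigned to 9 000 students (step 3).
sampling_frame = list(range(1, 9001))

random.seed(42)  # fixed seed only so the draw can be reproduced
sample = random.sample(sampling_frame, k=383)  # 383 = Table 23.1b entry for N = 9,000 at +/-5%

print(len(sample), len(set(sample)))  # 383 383 -- no element selected twice
```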
What are the Subtypes of Simple Random Sampling?
There are two types of simple random sampling: sampling with replacement and sampling without
replacement. In sampling with replacement, after an element has been selected from the sampling
frame, it is returned to the frame and is eligible to be selected again. In sampling without
replacement, after an element is selected from the sampling frame, it is removed from the
population and is not returned to the sampling frame. Sampling without replacement tends to be
more efficient than sampling with replacement in producing representative samples. It does not
allow the same population element to enter the sample more than once. Sampling without
replacement is more common than sampling with replacement. It is the type that is the subject of
this text.
Lottery Method of Sampling
There are several different ways to draw a simple random sample. The most common way is the
lottery method. Here, each member or item of the population at hand is assigned a unique number.
The numbers are then thoroughly mixed, as if you put them in a bowl or jar and shook it. Then,
without looking, the researcher selects n numbers. The population members or items that are
assigned those numbers are then included in the sample.
Using Random Number Table
Most statistics books and many research methods books contain a table of random numbers as a
part of the appendices. A random number table typically contains 10,000 random digits between
0 and 9 that are arranged in groups of 5 and given in rows. In the table, all digits are equally
probable and the probability of any given digit is unaffected by the digits that precede it.
Equal Probability Systematic Sampling
Equal probability systematic sampling has a long tradition in survey sampling (e.g., Madow and
Madow, 194412; Madow, 194911; Madow, 195313). Equal probability systematic sampling is an
improvement over simple random sampling. The method first requires complete information about
the population. Equal probability systematic sampling is a type of probability sampling in which
sample members from a larger population are selected according to a random starting point and a
fixed, periodic interval. This interval, called the sampling interval, is calculated by dividing the
population size by the desired sample size.
Despite the sample population being selected in advance, equal probability systematic sampling is
still thought of as being random if the periodic interval is determined beforehand and the starting
point is random. Equal probability systematic sampling is a very easy method to apply: you choose
every "nth" participant from a complete list or sampling frame. Even though each element has an

11
Madow, W.G. (1949). On the Theory of Systematic Sampling, II. Annals of Mathematical Statistics, 20,
333– 354.
12
Madow, W.G. and Madow, L.H. (1944). On the Theory of Systematic Sampling. Annals of Mathematical
Statistics, 15, 1 –24.
13
Madow, W.G. (1953). On the Theory of Systematic Sampling, III. Annals of Mathematical Statistics, 24,
101– 106.
equal probability of selection, unlike in simple random sampling, different combinations of elements
have different probabilities of selection in systematic sampling.
Sampling begins by selecting the first element at random; subsequent elements are chosen using
a fixed interval, say 'k' (e.g., every tenth element), until you reach the desired sample size. The
researcher must ensure that the chosen sampling interval does not hide a pattern, since any
pattern would threaten randomness. In this sampling, every kth name on the list is chosen, where
the interval k equals the size of the population divided by the size of the sample that you want to
select. For example, here is how to use systematic sampling to select 10 names from the list of 50
shown in Table 23.2 (although these steps apply to any size of population and sample).
Table 23.2: Sampling frame of 50 names

1. Liam 2. Alexander 3. Julian 4. Hunter 5. Jason


6. Noah 7. Henry 8. Luke 9. Cameron 10. Emmett
11. Oliver 12. Jacob 13. Grayson 14. Connor 15. Sawyer
16. William 17. Michael 18. Isaac 19. Santiago 20. Silas
21. Elijah 22. Daniel 23. Jayden 24. Jeremiah 25. Bennett
26. James 27. Logan 28. Theodore 29. Ezekiel 30. Brooks
31. Benjamin 32. Jackson 33. Gabriel 34. Angel 35. Micah
36. Lucas 37. Sebastian 38. Anthony 39. Roman 40. Damian
41. Mason 42. Jack 43. Dylan 44. Easton 45. Harrison
46. Ethan 47. Aiden 48. Leo 49. Miles 50. Waylon

To do this, follow these steps:

• Divide the size of the population by the size of the desired sample. In this case, 50
divided by 10 is 5. Therefore, you will select every fifth name from the list.

Size of the population (50) ÷ Size of the sample (10) = 5 = size of the step

• As the starting point, choose one name from the list at random. Do this by the "eyes
closed, pointing method" or, if the names are numbered, use any digits as the starting
point.
• Once the starting point has been determined, select every kth name; in the example
above, that is every fifth name from the starting point (see the sketch below).
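The whole procedure can be sketched in a few lines of Python; the stand-in list of 50 names, the sample size and the helper name systematic_sample are all assumptions made for illustration:

import random

def systematic_sample(frame, n):
    """Select n elements using a random start and a fixed interval k."""
    k = len(frame) // n                  # sampling interval: population size / sample size
    start = random.randint(0, k - 1)     # random starting point within the first interval
    return frame[start::k][:n]           # every kth element from the starting point

names = ["name_%02d" % i for i in range(1, 51)]   # stand-in for the 50 names in Table 23.2
print(systematic_sample(names, 10))               # k = 5, so every fifth name is chosen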
This kind of sampling is called linear systematic sampling. Rather than selecting these ‘n’ units of
a sample randomly, a researcher can apply a skip logic to select these. It follows a linear path and
then stops at the end of a particular population. Systematic sampling is easier and less trouble
than random sampling, and that is one reason why it is often preferred. It can, however, be less
precise. Strictly speaking, each member of the population still has an equal chance of being
selected; what is violated is the assumption that every combination of members has an equal
chance of selection.
Systematic sampling and simple random sampling differ in the sense that in the latter the
selections are independent of each other, whereas in the former each selection is dependent on
the previous one. The population units are specially prepared and the sample units are selected
in a systematic way by means of various techniques, of which the sampling fraction technique
is the most common. The researcher starts at a random point and selects every kth subject in the
sampling frame. The random starting point equals the sampling interval, k, times a random number
between 0 and 1, plus 1, rounded down. In systematic sampling, there is the danger of order bias:
the sampling frame list may arrange subjects in a pattern, and if the periodicity of systematic
sampling matches the periodicity of that pattern, the result may be the systematic over- or under-
representation of some stratum of the population. If, however, it can be assumed that the sampling
frame list is randomly ordered, systematic sampling is mathematically equivalent to, and as
precise as, simple random sampling. If the list is stratified (e.g., all females listed, then all males),
systematic sampling is mathematically equivalent to stratified sampling and is more precise than
simple random sampling.
Repeated Systematic14 Sampling
This is a variant that seeks to avoid the possibility of systematic biases due to periodicity in the
sampling frame. Repeated systematic sampling is done by taking several smaller systematic
samples, each with a different random starting point, rather than using one pass through the data
as in ordinary systematic sampling. Repeated systematic sampling has the side benefit that the
variability in the subsample means for a given variable provides a measure of the variance of that
estimate in the entire sample.
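A hedged sketch of the idea in Python (the population and pass counts are hypothetical); each pass is an ordinary systematic sample with its own random start, and the subsample means can then be compared:

import random

def repeated_systematic_sample(frame, n_total, n_passes):
    """Take n_passes smaller systematic samples, each with a different random start."""
    per_pass = n_total // n_passes
    k = len(frame) // per_pass               # interval used within each pass
    subsamples = []
    for _ in range(n_passes):
        start = random.randint(0, k - 1)
        subsamples.append(frame[start::k][:per_pass])
    return subsamples                        # kept separate so subsample means can be compared

frame = list(range(1, 201))                  # population of 200 units
print(repeated_systematic_sample(frame, n_total=20, n_passes=4))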
Stratified15 Random Sampling
Stratified random sampling is an improvement over systematic sampling. Stratified random
sampling is a useful method for data collection if the population is heterogeneous. In this method,
the entire heterogeneous population is divided into a number of homogeneous groups, usually
known as strata. Each of these groups is homogeneous within itself, and units are then sampled
at random from each of these strata.
The sampling frame is divided into subsections comprising groups that are relatively homogeneous
with respect to one or more characteristics and a random sample from each stratum is selected.
What we can say is that the two types of random sampling that were just discussed work fine if
specific characteristics of the population (such as age, gender, ethnicity, ability group) are of no
concern. In other words, if another set of 10 names were selected, one would assume that because
both groups were chosen at random, they are, in effect, equal. However, what if the individuals in
the population are not “equal” to begin with? In that case, you need to ensure that the profile of
the sample matches the profile of the population, and this is done by drawing what is referred to
as a stratified sample.

14
Systematic sampling reduces the chance of certain combinations of participants being selected; it is therefore
less unbiased than simple random sampling.
15
Strata are like different layers, representing different characteristics
The theory behind sampling (and the entire process of inference) goes something like this: If you
can select a sample that is as close as possible to being representative of a population, then any
observations you can make regarding that sample should also hold true for the population.
Representativeness refers to how well the sample to be drawn compares with (e.g., is representative
of) the population of interest. Can the reader evaluate the study findings with assurance that the
sample of respondents reflects elements of the population with breadth and depth? So far so good.
Sometimes, though, random sampling leaves too much to chance, especially if you have no
assurance of equal distributions of population members throughout the sample and, most
important, if the factors that distinguish population members from one another (such as race,
gender, social class, or degree of intelligence) are related to what you are studying. This is a very
important point. In that case, stratified sampling is used to ensure that the strata (or layers) in the
population are fairly represented in the sample (which ends up being layered as well, right?).
For example, if the population is 82% Methodists, 14% Catholics, and 4% Jews, then the sample
should have the same characteristics if religion is an important variable in the first place.
Understanding the last part of the preceding sentence is critical. If a specified characteristic of the
population is not related to what is being studied, then there is no reason to be concerned about
creating a sample patterned after the population and stratifying on one of those variables. Let us
assume that the list of names in Table 23.2 above represents a stratified population (females and
males), and attitudes toward abortion is the topic of study. Because gender differences may be
important, you want a sample that reflects gender differences in the population. The list of 50
names consists of 20 females and 30 males, or 40% female and 60% male. The sample of 10
should mirror that distribution and contain 4 females and 6 males. Here is how you would select
such a sample using stratified random sampling. Once again, the example is the population we
created, but these steps apply to all circumstances.
1. All the males and all the females are listed separately.
2. Each member in each group receives a number. In this case, the males would be
numbered 01 through 30 and the females 01 through 20.
3. From a table of random numbers, 4 females are selected at random from the list of 20
using the procedures outlined earlier.
4. From a table of random numbers, 6 males are selected at random from the list of 30
using the procedures outlined earlier.
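Assuming the 50 names in Table 23.2 are tagged with a gender (an assumption; the table itself lists no genders), the steps above translate into a short Python sketch (the rounding rule shown is illustrative and may need a fairer allocation for awkward proportions):

import random

population = ([("F", "female_%02d" % i) for i in range(1, 21)] +   # 20 females (40%)
              [("M", "male_%02d" % i) for i in range(1, 31)])      # 30 males (60%)

def stratified_sample(population, n):
    """Proportional stratified sampling: each stratum contributes n times its share."""
    strata = {}
    for stratum, member in population:               # Step 1: list each stratum separately
        strata.setdefault(stratum, []).append(member)
    sample = []
    for stratum, members in strata.items():
        quota = round(n * len(members) / len(population))    # 40% -> 4, 60% -> 6
        sample.extend(random.sample(members, quota))         # Steps 2-4: random draw per stratum
    return sample

print(stratified_sample(population, 10))             # 4 females and 6 males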
Although simple examples such as this (with only one stratification variable) often occur, you may
also have to stratify on more than one variable at once, for example on both grade level and
gender.
Stratified random sampling is a special form of simple or systematic sampling. The target
population is first separated into mutually exclusive, homogeneous segments (strata), and then a
simple random sample is selected from each segment (stratum). The samples selected from the
various strata are then combined into a single sample. This sampling procedure is sometimes
referred to as “quota random sampling.”
Stratification consists of dividing the population into homogeneous subgroups or subsets (called
strata) within each of which an independent sample is selected. The choice of strata is determined
based on the objective of the survey, the distribution characteristics of the variable of interest, and
the desired precision of the estimates. The division into strata may be based on age, sex,
economic status, etc., and may use more than one criterion. There are several major reasons why
you might prefer stratified sampling to simple random sampling. First, it assures that you will be
able to represent not only the overall population, but also key subgroups of the population,
especially small minority groups. If you want to be able to talk about subgroups, this may be the
only way to effectively assure you will be able to. If the subgroup is extremely small, you can use
different sampling fractions (f) within the different strata to randomly over-sample the small group
(although you will then have to weight the within-group estimates using the sampling fraction
whenever you want overall population estimates). When we use different sampling fractions in the
strata, we call this disproportionate stratified random sampling. Second, stratified random sampling
will generally have more statistical precision than simple random sampling. This will only be true
if the strata or groups are homogeneous. If they are, we expect that the variability within groups
is lower than the variability for the population as a whole.
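A small worked sketch of the disproportionate case: the minority stratum is over-sampled and stratum estimates are then weighted back by the population shares (every number here is hypothetical):

# Hypothetical population: 9,500 majority-group and 500 minority-group members.
N = {"majority": 9500, "minority": 500}

# Disproportionate design: over-sample the minority to get enough cases to analyse.
n = {"majority": 200, "minority": 100}

# Sampling fraction f = n / N within each stratum.
f = {s: n[s] / N[s] for s in N}              # majority ~ 0.021, minority 0.2

# Suppose the stratum sample means of some variable turn out to be:
sample_mean = {"majority": 52.0, "minority": 40.0}

# Weight each stratum by its population share to recover the overall estimate.
overall = sum(sample_mean[s] * N[s] / sum(N.values()) for s in N)
print(round(overall, 2))   # 51.4 - not the unweighted sample average of 48.0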
Even though this method of sampling enables the researcher to get detailed information about the
subgroups of the population, there are concerns regarding the difficulty of dividing the population
into strata in some cases. This method of sampling is cardinal when there is a need to over-sample
a particular subgroup. For example, you can study an equal number of girls and boys in your
university despite an inequality in the total population of female and male students in your
university.
Cluster Sampling
Cluster sampling is a probability sampling technique where researchers divide the population into
multiple groups, which we could call clusters, for research. We then select random groups with
a simple random or systematic random sampling technique for data collection and data analysis.
Researchers opt to use cluster sampling when it is extremely difficult to have in the sample elements
from every sub-population, such as having a student from every university in a country. Instead, we
may have to consider using cluster sampling, where we can group the universities in a country from
each city into one cluster. These clusters of universities in each city then jointly cover the entire
student population in the country. Next, using either simple random sampling or systematic random
sampling, we pick universities randomly for the research study. Subsequently, by using
simple or systematic sampling, the students from each of these selected clusters can be chosen
on whom to conduct the research study.
Consider that we want to conduct a household survey in the City of Lusaka in Zambia. We could
take a random sample of 100 households (HHs). In that case, we need a sampling list or frame of
Lusaka HHs. If the list is not available, we need to conduct a census of HHs. The complete coverage
of the city is required so that all HHs are listed, which could be expensive. Furthermore, since our
sample size is small compared to the number of total HHs, we would need to sample only a few,
say one or two, in each township (subdivision). Alternatively, we could select five townships (say
the city is divided into 200 township blocks), and in each township we administer questionnaires
to 20 HHs. We then need to construct an HH listing frame only for the 5 townships (less time and
cost needed). Furthermore, by limiting the survey to a smaller area, additional costs will be saved
during the execution of interviews. Such a sampling strategy is known as "cluster sampling." In
cluster sampling, a cluster, i.e., a group of population elements, constitutes the sampling unit,
instead of a single element of the population.
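The Lusaka illustration translates into a short two-stage sketch in Python; the township and household identifiers, and the assumption of roughly 400 HHs per township, are hypothetical:

import random

# Stage 1: the 200 township blocks are the clusters (primary sampling units).
townships = ["township_%03d" % i for i in range(1, 201)]
selected_townships = random.sample(townships, 5)

def list_households(township):
    # placeholder for the field listing exercise; assume ~400 HHs per township
    return ["%s/HH_%03d" % (township, i) for i in range(1, 401)]

# Stage 2: build HH frames only for the 5 selected townships, then draw 20 HHs in each.
sample = []
for t in selected_townships:
    frame = list_households(t)
    sample.extend(random.sample(frame, 20))   # 5 x 20 = 100 households in total

print(len(sample))   # 100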

Types of cluster sampling


There are two ways to classify this sampling technique. The first way is based on the number of
stages followed to obtain the cluster sample, and the second way is how the
groups are represented in the entire sample. In most cases, sampling by clusters happens over multiple stages. A
stage is considered to be the step taken to get to the desired sample. We can divide this technique
into single-stage, two-stage, and multiple stages.
Single-stage cluster sampling
As the name suggests, sampling is done just once. An example of single-stage cluster sampling –
An NGO wants to create a sample of girls across five neighbouring towns to provide education.
Using single-stage sampling, the NGO randomly selects towns (clusters) to form a sample and
extend help to the girls deprived of education in those towns.
Two-stage cluster sampling
Here, instead of selecting all the elements of a cluster, only a handful of members are chosen from
each group by implementing systematic or simple random sampling. An example of two-stage
cluster sampling – A business owner wants to explore the performance of his/her plants that are
spread across various parts of the U.S. The owner creates clusters of the plants. He/she then
selects random samples from these clusters to conduct research.

Multiple stage cluster sampling

Multiple-stage cluster sampling takes a step or a few steps further than two-stage sampling.

For conducting effective research across multiple geographies, one needs to form complicated
clusters that can be achieved only using the multiple-stage sampling technique. An example of
multiple-stage sampling by clusters – An organization intends to conduct a survey to analyse the performance
of smartphones across Germany. They can divide the entire country’s population into cities
(clusters) and select cities with the highest population and also filter those using mobile devices.

Steps to conduct cluster sampling

Here are the steps to perform cluster sampling:

1. Sample: Decide the target audience and also the sample size.
2. Create and evaluate sampling frames: Create a sampling frame by using either an
existing framework or creating a new one for the target audience. Evaluate frames
based on coverage and clustering, and make adjustments accordingly. The resulting
groups should be varied with respect to the population, mutually exclusive and, taken
together, comprehensive. Members of a sample are selected individually.
3. Determine groups: Determine the number of groups by giving each group roughly the
same average number of members. Make sure each of these groups is distinct from the
others.
4. Select clusters: Choose clusters by applying a random selection.
5. Create sub-types: It is bifurcated into two-stage and multi-stage subtypes based on the
number of steps followed by researchers to form clusters.

Applications of cluster sampling

This sampling technique is used in area or geographical cluster sampling for market research.
A broad geographic area can be expensive to survey in comparison with surveys that are sent to
clusters divided by region. The sample size has to be increased to achieve accurate results, but
the cost savings involved make this enlargement attainable.

Cluster sampling in statistics

The technique is widely used in statistics where the researcher can’t collect data from the entire
population as a whole. It is the most economical and practical solution for statisticians doing
research. Take the example of a researcher who is looking to understand the smartphone usage
in Germany. In this case, the cities of Germany will form clusters. This sampling method is also
used in situations like wars and natural calamities to draw inferences about a population, where
collecting data from every individual residing in the population is impossible.

Cluster sampling advantages

There are multiple advantages to using cluster sampling. Here they are:

1) Consumes less time and cost: Sampling of geographically divided groups requires less
work, time, and cost. It is a highly economical method: the researcher observes a limited
number of selected clusters, with a limited allocation of resources, instead of sampling
randomly throughout a particular region.
2) Convenient access: Researchers can choose large samples with this sampling technique,
and that’ll increase accessibility to various clusters.
3) Data accuracy: Since there can be large samples in each cluster, loss of accuracy in
information per individual can be compensated for.
4) Ease of implementation: Cluster sampling facilitates information from various areas and
groups. Researchers can quickly implement it in practical situations compared to other
probability sampling methods.
5) In comparison to simple random sampling, this technique can be useful for estimating the
characteristics of a group, such as a population, and researchers can implement it without
having a sampling frame for all the elements of the entire population.

Cluster sampling vs stratified sampling

Since cluster sampling and stratified sampling are pretty similar, there could be issues with
understanding their finer nuances. Hence, the major differences between cluster sampling
and stratified sampling are:

1) In cluster sampling, elements of the population are randomly selected to be part of groups
(clusters); in stratified sampling, the researcher divides the entire population into segments
(strata).
2) In cluster sampling, members from randomly selected clusters form the sample; in stratified
sampling, researchers randomly select individual members from within each stratum as the
sampling units.
3) In cluster sampling, researchers maintain homogeneity between clusters; in stratified
sampling, researchers maintain homogeneity within each stratum.
4) In cluster sampling, the clusters are divided naturally; in stratified sampling, the researchers
or statisticians primarily decide the division into strata.
5) In cluster sampling, the key objective is to minimize the cost involved and enhance
efficiency; in stratified sampling, the key objective is accurate sampling with a properly
represented population.
Systematic sampling and cluster sampling differ in how they pull the sample points to be included
in the sample from the population. As we have seen from the example above, cluster sampling breaks the
population down into clusters, while systematic sampling uses fixed intervals from the larger
population to create the sample. Cluster sampling is considered less precise than other methods
of sampling. It may save costs on obtaining a sample. It may be used when completing a list of
the entire population is difficult. For example, it could be difficult to construct the entire population
of the customers of a grocery store to interview. However, cluster sampling is one of the efficient
methods of random sampling in which the population is first divided into clusters, and then a
sample is selected from the clusters randomly. Unlike the above, in pure cluster sampling, the
whole cluster is sampled. Contrary to stratified sampling, there should be heterogeneity within
the clusters and homogeneity between the clusters. The more homogeneity among the clusters,
the smaller the margin of error, and vice versa. The method is most feasible in the case of a
diverse population spread over different areas.
This is a kind of sampling that requires you to select intact groups representing clusters of
individuals rather than choosing individuals one at a time. Cluster sampling is used when a
sampling frame is not known. The researcher divides the population into clusters of elements,
samples the clusters, and then stratifies the samples. This is followed by resampling, repeating
the process until the ultimate sampling units are selected at the last of the hierarchical
levels of clusters. When the strata are geographic units, this method is sometimes called area
sampling. For instance, at the top level, states may be sampled (with sampling proportionate to
state population size); then cities may be sampled; then schools; then classes; and finally students.
Probability proportional to size sampling (pps) is a related variant in which each of the hierarchical
levels prior to the ultimate level is sampled according to the number of ultimate units (example:
people or households) it contains.
Technically, cluster sampling is where all subjects at the lowest hierarchical level (example: all
students in a school) are sampled for each primary sampling unit (PSUs, which are the second-
lowest hierarchical level, such as schools or Census blocks), whereas multistage sampling is where
only a random sample of lowest hierarchical level subjects is selected. The greater the
heterogeneity of the strata and the finer the stratification (that is, the smaller the clusters involved),
the greater the precision of the findings, depending on the topic of study. For instance, stratifying
by gender at the highest level might well introduce bias in measuring opinions about an item
known to be gender-related, whereas stratifying by state would be less likely to introduce a bias
since there are more categories (more states than genders) and there is less likely to be a
correlation with the opinion item. At each stage, stratified sampling is used to further increase
precision.

Warnings on Use of Cluster Sampled Data


Clustering will produce correlated observations, which violates the assumption of independently
sampled cases - an assumption of many statistical techniques. Multi-level modelling is an example
of a technique that is appropriate for clustered samples (see Goldstein, 1995). Nonetheless, it
should be noted that it is common practice to treat data from cluster sampling as if it were
randomly sampled data.

Multi-stage or cluster sampling

To draw the sample, this method actually uses a combination of various techniques. In this method,
the population is divided into groups at various levels. A group within a group, within a group and
so on. The sample is finally drawn from the smallest group among all the groups. Overall, multi-
stage or cluster sampling is usually less precise than simple random sampling, which in turn is less
precise than one-stage stratified sampling.

An example will suffice at this stage. Suppose you want to study the coverage of gender based
violence in the Daily Nation newspaper for ten years from 2000 to 2010. You may be expected to
break the period under inquiry giving a five-year gap and select years 2000, 2005, 2010 for the
study. Then, you may further select three months – for example, January, June and December -
of the aforementioned years for the study. You may further reduce the number of issues to be
studied and select only last two weeks of all the three months of the selected years. You may
further construct an even-numbered week or odd-numbered week and reduce your sample size.
Finally, you may select an odd-numbered week in January and even-numbered week in June and
odd-numbered week in December. You may follow this sequence and select 63 issues of the
newspaper for the study, out of the roughly 3,640 issues of the newspaper published over the
10 years.
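The newspaper example can be sketched as nested selection stages in Python; fixing the second-to-last week of every selected month is one simple variant of the odd/even-week alternation described above:

import calendar
import datetime

years = [2000, 2005, 2010]        # stage 1: every fifth year in the decade
months = [1, 6, 12]               # stage 2: January, June and December

issues = []
for y in years:
    for m in months:
        n_days = calendar.monthrange(y, m)[1]           # length of the month
        week_start = datetime.date(y, m, n_days - 13)   # second-to-last week of the month
        # stage 3: seven consecutive daily issues from the selected week
        issues += [week_start + datetime.timedelta(days=d) for d in range(7)]

print(len(issues))   # 3 years x 3 months x 7 daily issues = 63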

Some warning is necessary when you select this method. Since multistage sampling is the most
prevalent form for large, national surveys, and since most computer programs use standard error
algorithms based on the assumption of simple random samples, the standard errors reported in
the literature often underestimate sampling error.

Idiographic Qualitative Research Sampling – Purposive Sampling

Now that we have had an opportunity to understand probability sampling within nomothetic
methodology, we turn to purposive sampling and its designs within idiographic methodology.
Sampling, which basically consists of sample size and sampling design considerations, is very
important in all qualitative research. Such considerations help qualitative researchers to
select sample sizes and sampling designs that are most compatible with their research purposes
(Onwuegbuzie and Leech, 200716). While quantitative researchers use complex mathematical
formulae for sample size considerations and promote the use of random sampling, sample size
considerations in qualitative studies are neither mathematical nor systematic. Rather,
they involve making a series of decisions not only about how many individuals to include in a study
and how to select these individuals, but also about the conditions under which this selection will
take place. What we are saying here is that these decisions are extremely important (Curtis et al.,
200017).

The sampling designs in this paradigm are broadly called purposive, judgmental, selective, or
subjective sampling. Purposive sampling is a fad, especially among novice researchers who find
random sampling a challenge, and as such it has been used widely and inappropriately. Despite
the wide use of purposeful sampling, there are numerous challenges in identifying and applying
an appropriate purposeful sampling strategy in a qualitative study. For instance, the range of
variation in a sample from which a purposive sample is to be taken is often not really known at
the outset of a study.

Purposive sampling, which basically consists of sample size and sampling design considerations, is
very important in every qualitative research design. Such considerations would help you as a
qualitative researcher to select sample sizes and sample designs that are most compatible with
your overarching research or subsidiary research questions as the case may be (Onwuegbuzie and
Leech, 2007). In order to determine the sample size and the appropriate sampling design, we wish
to emphasise the need for researchers to appreciate the various purposive sampling designs. This
is because purposive sampling represents a group of different sampling
designs (see Kuzel, 199218 and Patton, 200219 for a complete list). This is so because the different
types of sampling designs rely on the judgement of the researcher when it comes to selecting the
units of analysis (e.g., people, cases/organisations, events, pieces of data). Usually, purposeful
sampling is used for the identification and selection of information-rich cases for the most effective
use of limited resources (Patton, 2002). This would require the researcher to identify and select
individuals or groups of individuals that are especially knowledgeable about or experienced with a
phenomenon of interest (Cresswell and Plano Clark, 201120). Usually, the sample being
investigated is quite small, especially when compared with probability sampling designs.

We need to appreciate that one of the core arguments supporting the application of a purposeful
sampling design is that it is not meant to be comprehensive in terms of getting potentially
representative units of analysis though some researchers would like to disagree with this position.
This is mainly because the interest we have as researchers is not in seeking a single ‘correct’ and
generalizable answer, but rather we want to examine the complexity of the different
conceptualizations of the phenomena that interest us. This, in essence, is based on the qualitative
research questions that we have.

16
Onwuegbuzie, A. J., and Leech, N. L. (2007). A call for qualitative power analyses. Quality and Quantity.
41: 105-121.
17
Curtis, S., Gesler, W., Smith, G., and Washburn, S. (2000). Approaches to sampling and case selection in
qualitative research: Examples in the geography of health. Social Science and Medicine, 50, 1001-1014.
18
Kuzel A (1992). Sampling in qualitative inquiry. In: Crabtree B, Miller W (Eds) Doing Qualitative Research:
Research Methods for Primary Care (Vol. 3). London: Sage, 31-44.
19
Patton M (2002) Qualitative Evaluation and Research Methods. London. Sage.
20
Cresswell, JW.; Plano Clark, VL. (2011). Designing and conducting mixed method research. 2nd. Sage;
Thousand Oaks, CA:
It is a pity that classical abduction, which espouses very small samples or one unit, seems to be
replaced by modified abduction, where we have researchers opting for larger samples.
This shift within the qualitative paradigm (from classical abduction to modified abduction) is
affecting the purest way of determining the sampling strategy and sample size issues. A very large
share of the qualitative research submitted to leading organisational outlets seems to
represent positivist epistemological standpoints on sampling which regrettably "mirror
quantitative research techniques" (Gephardt, 200421: 456). One visible sign of this mimicking is
the reporting of large and rather heterogeneous samples in qualitative studies. However, many of
these claims are based on anecdotal evidence, and there is a short supply of articles (Mason,
201022; Marshall et al., 201323) that actually confirm the dynamics in sample sizes in recent years.
We shall later on, discuss this trend in terms of sample size in studies applying qualitative
interviews as well as focus group discussion as methods that are linked to sample sizes for those
who desire to work with the rule of thumb.

Some people think that the design of a sampling strategy for a qualitative study is not as important
as that for quantitative inquiry. There is a tendency, particularly within the quantitative
environment, to consider that the sampling strategy for qualitative research is of lesser importance
to that where statistical inference is required. This is only true when sampling is being viewed from
the quantitative paradigm. Qualitative researchers do not focus on obtaining evidence from
an accurate sample that will provide a reflection of the population. They instead strive for
a "thick description" of phenomena from any given sample, based on the research question and
design.

The term purposive sampling is the umbrella term for nonprobability sampling strategies used in
qualitative research. Purposive sampling strategies are predominantly used in naturalistic inquiries
when making decisions about selecting units of analysis (e.g., individuals, groups of individuals,
places or institutions). The selection is based on specific purposes associated with answering
research questions in a study (Teddlie and Yu, 2007: 7724). Purposive sampling helps the
researcher focus on the identification and selection of information-rich cases (persons or groups)
that are particularly appropriate or knowledgeable of the issues under investigation or interest
(Cresswell and Plano Clark, 201125). In addition to knowledge and experience, Bernard (2002) and
Spradley (1979) note the importance of availability and willingness to participate, and the ability
to communicate experiences and opinions in an articulate, expressive, and reflective manner
(Bernard, 200026; Patton, 200227; Schutt, 200628; Ary et al., 201029).

21
Gephardt, R.P. (2004). Qualitative research and the Academy of Management Journal. Academy of
Management Journal, 47(4): 454-462.
22
Mason, M. (2010). Sample size and saturation in PhD studies using qualitative interviews. Forum
Qualitative Sozialforschung / Forum: Qualitative Social Research, 11(3). Art. 8.
http://dx.doi.org/10.17169/fqs-11.3.1428 [Accessed: April 11, 2020]
23
Marshall, B., Cardon, P., Poddar, A., and Fontenot, R. (2013). Does sample size matter in qualitative
research?: A review of qualitative interviews in IS research. The Journal of Computer Information Systems,
54(1), 11-22.

24
Teddlie, C., and Yu, F. (2007). Mixed methods sampling. Journal of Mixed Methods Research, 1(1). 77-
100.
25
Cresswell, JW., Plano Clark, VL. (2011). Designing and conducting mixed method research. 2nd. Sage;
Thousand Oaks, CA:
In addition, purposive sampling strategies allow the
researcher to decide the motives for selecting a specific category of informants in the study
(Bernard, 2000), and they provide more in-depth findings than probability sampling methods
(Cohen et al., 201130).

Despite the wide use and mention of the term purposive sampling, there are numerous challenges
in identifying and applying the appropriate purposeful sampling strategy in any study. This is
because there are many objectives that qualitative researchers might have, and the list of
"purposive" strategies that one might follow is correspondingly long. We can say it is virtually
endless, and any given list will reflect only the limited range of situations the author has
considered. For example, suppose you are interested in studying Christian parents in your church
who once had a child who underwent an abortion, and you want to present their lived experiences
of the challenges they faced in negotiating abstinence with their children in the first place. This
would be a difficult population to find if you were to go for random sampling.

What we have affirmed from the beginning is that with a purposive or non-random sample in a
qualitative project, the number of people or places or articles or documents that are required as
units of analysis is less important than the criteria used to select them. The characteristics of the
units are instead used as the basis of selection, most often chosen to reflect the diversity or
homogeneity and breadth of the sample population. The crux of the matter regarding sampling in
a qualitative project is that purposive sampling is promoted for pragmatic reasons, in which time
and resources are not the deciding factors but rather the type and use of the information that is
sought. The pragmatic uses of purposive sampling are often overshadowed by the challenges
researchers have regarding purposive sampling, and we present the most pertinent of these
challenges below.

The first one relates to the inability to understand what purposive sampling is. Patton (2002: 230)
has provided a definition of what purposeful sampling means, stating that it is the selection of
"information-rich cases for study in depth," yielding "insights and in-depth understanding rather
than empirical generalizations." Information-rich cases are those from which one can learn a great
deal about issues of central importance to the purpose of the inquiry. To do this, the researcher
must decide a priori who or what will be required to meet the sample selection criteria; that is,
what characteristics will need to be reflected in the sample population to address the research
question.

The second one relates to the error that many researchers demonstrate: they take purposeful
sampling as one type of non-random sampling among other types of non-random sampling, and
yet it is an umbrella concept – a synonym of non-random sampling.

26
Bernard, H. R. (2000). Social research methods: Qualitative and quantitative approaches. Thousand Oaks,
CA: Sage.
27
Patton, MQ. (2002). Qualitative research and evaluation methods. 3rd. Sage Publications; Thousand Oaks,
CA.
28
Schutt, R. K. (2006). Investigating the social world: The process and practice of research (5 ed.) Thousand
Oaks, CA: Pine Forge.
29
Ary, D., Jacobs, L. C., Razavieh, A., and Sorensen, C. K. (2010). Introduction to research in education (8
ed.). New York, NY: Hult Rinchart and Wiston.
30
Cohen, L., Manion, L., and Morrison, K. (2011). Research methods in education (7 Ed.). New York, NY:
Routledge.
The third one is about the failure to show a transparent audit trail on how researchers have made
a decision to pick the selected purposive technique, and further to describe the process of enlisting
or identifying the sample units (Mason, 201031; Carlsen and Glenton, 201132; Malterud et al.,
201533). This problem is more common in dissertations than in journal articles.

The fourth one is associated with failure to address candidly the sample size. Qualitative research
experts argue that there is no straightforward answer to the question of ‘how many’ and that
sample size is contingent on a number of factors relating to epistemological, methodological and
practical issues (Baker and Edwards, 201234).

One of the core arguments supporting a purposeful sampling approach is that it is not meant at
all times to be comprehensive in terms of accessing all potentially relevant units of analysis, mainly
because the interest of the authors may not be in seeking a single ‘correct’ answer, but rather in
examining the complexity of different conceptualizations. In this case then, they will have to be
exhaustive in selecting their sample elements. However, there are times when researchers do not
want to examine the complexity of different conceptualizations but require finding sufficient cases
to explore patterns and as such would not necessarily attempt to be exhaustive in their searching
of sample elements of analysis. In this case, they may state the sample size.

Noting these two arguments, Patton (200235), Suri (201136) and Palinkas et al. (201537) advise
researchers to consider selecting from the different options of purposive sampling designs.

Despite this promising effort by these authors to theoretically present the different options of
purposive sampling, researchers who claim to have used a purposeful sampling approach often fail
to create a transparent audit trail on how they made a decision to pick the technique and further
describe the process. However, many studies — probably most of those undertaken in African
communities — do not obtain probability samples in the sense that they are idiographic in nature.
The idiographic (qualitative) approach concentrates on the descriptive and interpretative — sensing
and understanding—characteristics of the examined reality. The subject of the study is phenomena
that, by their very nature, do not easily submit to generalization. It is for this reason that the
researcher focuses on an exhaustive description and an understanding of individual phenomena,
their singularity and uniqueness, inclusive of an attempt at grasping and explaining those external
events that might mould or qualitatively change the phenomena. Such study is of an individual

31
Mason, M. (2010). Sample size and saturation in PhD studies using qualitative interviews. Forum:
Qualitative Social Research, 11, Article 8.
32
Carlsen, B., and Glenton, C. (2011). What about N? A methodological study of sample-size reporting in
focus group studies. BMC Medical Research Methodology, 11, Article 26.doi:10.1186/1471-2288-11-26.
33
Malterud, K., Siersma, V.D., and Guassora, A.D. (2015). Sample Size in Qualitative Interview Studies: Guided
by Information Power. Qualitative Health Research, 1–8.
34
Baker SE, Edwards R. (2012). How many qualitative interviews is enough?:Expert voices and early career
reflections on sampling and cases in qualitative research. National Centre for Research Methods Review
Paper. 2012; http://eprints.ncrm.ac.uk/2273/4/how_many_interviews.pdf. Accessed 21st June 2016.
35
Patton MQ.(2002). Qualitative Evaluation and Research Methods (2nd Ed.).
36
Suri H. (2011). Purposeful sampling in qualitative research synthesis. Qual Res J. 11:63–75.
37
Palinkas, L.A., Horwitz, S.M., Green, C.A., Wisdom, J.P., Duan, N., Hoagwood, K. 2015). Purposeful
sampling for qualitative data collection and analysis in mixed method implementation research Adm Policy
Ment Health. 42(5): 533–544.
character and does not need the determination of a representative sample as in nomothetic
research. Of necessity and practicality, idiographic researchers adopt other sampling strategies to
achieve acceptable accuracy at an acceptable cost and these can be part of a good research design.

In idiographic research, which is predominantly qualitative, samples are not treated in the manner a
nomothetic researcher treats them, and this is because the epistemologies and ontologies are different.
The mainstay in pure idiographic research is non-random sampling. Non-random sampling is
widely used as a case selection method in idiographic research of a preliminary and exploratory
nature, where random sampling is not feasible, or as an alternative means of obtaining a case with
particular attributes that can yield the desired information.

There are three points to note about the use of non-probability sampling schemes and these are:
a) We cannot use the mathematics of probability to analyse the findings.
b) In general, we cannot count on a non-probability sampling scheme to produce
representative samples, for this is not the focus of idiographic research.
c) We are not committed to making generalisations when we are using idiographic methods.

Since idiographic sampling is qualitative in nature, and data sources are usually spatially and
temporally non-independent, with population distributions that are non-normal and generally
unknown (not pre-specifiable), this type of sampling is necessary (Miles and Huberman,
199438; Patton, 200239; Marshall et al., 200840) where probability sampling would not be
appropriate. Idiographic sampling is suited to studying one case or small samples. In qualitative
studies, one case or small samples are ideal, given that rich, copious, intensive and lengthy
amounts of data may emerge from them. The focus here is not on presenting quantitative issues
but on outlining qualitative issues (Fielding, 1993; Hoagwood et al., 200741). Having said so, we
can then examine the various types of idiographic sampling designs.

Advantages of Purposive Sampling


Dane (1990) points out that the advantage of purposive sampling is that it allows the researcher to
home in on people or events which, on good grounds, they believe will be critical for the
research. Instead of going for the typical instances, a cross-section or a balanced choice, the
researcher is able to concentrate on instances which display wide variety – possibly even
focusing on extreme cases to illuminate the research question at hand. In this sense it might not
only be economical but might also be informative in a way that conventional probability sampling
cannot be (Denscombe, 1998). With non-probability sampling methods, the researcher accepts that
it is not feasible to include a sufficiently large number of examples in the study; this very much
goes hand in hand with qualitative research. The aim of the study is to explore the quality of the
data, not the quantity (Nachmias, 1996).

38
Miles, M.B., and Huberman,A.M. (1994), Qualitative data analysis: an expanded sourcebook, 2nd ed.
California: Sage.
39
Patton, MQ. (2002). Qualitative research and evaluation methods. 3rd. Sage Publications; Thousand Oaks,
CA.
40
Marshall T, Rapp CA, Becker DR, Bond GR. (2008). Key factors for implementing supported employment.
Psychiatric Services. 59:886–892.
41
Hoagwood KE, Vogel JM, Levitt JM, D’Amico PJ, Paisner WI, Kaplan SJ. (2007). Implementing an evidence-
based trauma treatment in a state system after September 11: the CATS Project. Journal of the American
Academy of Child and Adolescent Psychiatry. 46(6):773–779.
There are, however, some sound theoretical reasons why most qualitative research uses purposive
sampling designs, and good practical reasons why qualitative researchers deal with small numbers
of instances to be researched. There are in fact two things which can be said about sample size
in qualitative research. Firstly, it is unlikely to be known with precision or certainty at the start of
a research project. Secondly, the sample size will generally be very small. Both points can be
unnerving. They go against the grain as far as conventional survey approaches are concerned, and
open up the prospect of accusations of sloppy and biased research design. The researcher should
therefore be quite explicit about the use of non-probability sampling (Miles and Huberman,
199442). Another point is that phenomenology is well suited to purposeful sampling. This type of
sampling permits the selection of interviewees whose qualities or experiences permit an
understanding of the phenomena in question, and who are therefore valuable. This is the strength
of purposive sampling.

One justification for using non-probability purposive sampling stems from the idea
that the research process is one of "discovery" rather than the testing of hypotheses. It is a
strategy that Lincoln and Guba (198543) describe as 'emergent and sequential'. Almost like a
detective, the researcher follows a trail of clues, which leads the researcher in a particular
direction until the questions have been answered and things can be explained (Robson, 200244).

The major problem with purposive sampling is that the type of people who are available for study
may be different from those in the population who cannot be located, and this might introduce a
source of bias. In purposive sampling, we sample with a purpose in mind, based on the skill, need
or judgement involved in identifying the typical sample element that can give us what we want.
Purposive sampling can be very useful in situations where you need to reach a targeted sample
quickly and where sampling for proportionality is not the primary concern. With a purposive
sample, you are likely to get the opinions of your target population, but you are also likely to
overweight subgroups in your population that are more readily accessible.

When setting off to sample, Miles and Huberman (199445) and Kemper and colleagues (200346)
identify seven principles:

1) The sampling strategy should stem logically from the theoretical or conceptual framework
(in case of modified abduction) or the research questions being addressed by the study
(in case of classical abduction);
2) The sample should be able to generate a thorough database on the type of phenomenon
under study;

42
Miles, M.B., and Huberman,A.M. (1994), Qualitative data analysis: an expanded sourcebook, 2nd ed.
California: Sage.
43
Lincoln, Y. S. and Guba, E. G. (1985) Naturalistic Inquiry. Newbury Park, CA: SAGE.
44
Robson, C. (2002). Real world research: A resource for social scientists and practitioner researchers (2nd
ed.). Oxford, UK: Blackwell.
45
Miles, M. B. and Huberman, A. M. (1994) Qualitative Data Analysis: An Expanded Sourcebook, 2nd edn.
Thousand Oaks, CA: SAGE.

46
Kemper, EA., Stringfield, S., Teddlie, C. (2003). Mixed methods sampling strategies in social science
research. In: Tashakkori, A., Teddlie, C. Eds. Handbook of mixed methods in the social and behavioral
sciences. Sage; Thousand Oaks.
3) The sample should at least allow the possibility of drawing clear inferences and credible
explanations from the data;
4) The sampling strategy must be ethical;
5) The sampling plan should be feasible and transparent;
6) The sampling plan should allow the researcher to transfer the conclusions of the study to
settings or populations with similar characteristics; and
7) The sampling scheme should be as efficient as practical.

When these are settled, the researcher could then choose any one or a combination of the following
variants of purposive sampling:

Convenience Sampling

Convenience sampling is used in exploratory research where the researcher is interested in getting
an inexpensive approximation of the truth. As the name implies, the sample is selected because
its members are convenient to access. This nonprobability method is often used during preliminary
research efforts to get a gross estimate of the findings, without incurring the cost or time required
to select a random sample. Accidental samples involve the favourite "person on the street" who is
given a questionnaire or interviewed. Also called haphazard sampling, examples include
interviewing people who emerge from an event or location, interviewing a captive audience such
as one's students, and mail-in surveys printed in magazines and newspapers. There are subtypes
of convenience (availability) sampling and these may include the following:

Quota Sampling

Quota sampling is the nonprobability equivalent of stratified sampling. According to this sampling
technique, the population is first classified by characteristics such as gender, age, etc.
Subsequently, sampling units are selected to complete each quota. For example, in the study by
Larkin et al., the combination of vemurafenib and cobimetinib was tested against vemurafenib plus
placebo in patients with locally-advanced melanoma, stage IIIC or IV, with BRAF mutation.47 The
study recruited 495 patients from 135 health centers located in several countries. In this type of
study, each center has a "quota" of patients.

There are two types of quota sampling: proportional and non-proportional. In proportional quota
sampling you want to represent the major characteristics of the population by sampling a
proportional amount of each. For instance, if you know the population has 40% women and 60%
men, and that you want a total sample size of 100, you will continue sampling until you reach
those percentages and then stop. Therefore, if you already have the 40 women for your sample
but not the 60 men, you will continue to sample men, and even if legitimate women respondents
come along, you will not sample them because you have already "met your quota." The problem
here (as in much purposive sampling) is that you have to decide the specific characteristics on
which you will base the quota. Will it be by gender, age, education, race, religion, etc.?
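A sketch of how a proportional quota fills up during fieldwork, with respondents simulated at random (the 40/60 targets follow the example above; everything else is illustrative):

import random

quotas = {"female": 40, "male": 60}     # targets for a sample of 100 (40% / 60%)
counts = {"female": 0, "male": 0}
sample = []

respondent_id = 0
while sum(counts.values()) < sum(quotas.values()):
    respondent_id += 1
    gender = random.choice(["female", "male"])   # whoever comes along next
    if counts[gender] < quotas[gender]:          # accept only while the quota is open
        counts[gender] += 1
        sample.append((respondent_id, gender))
    # otherwise the respondent is turned away: that quota has already been met

print(counts)   # {'female': 40, 'male': 60}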

47
Larkin J, Ascierto PA, Dréno B, Atkinson V, Liszkay G, Maio M, Mandalà M, Demidov L, Stroyakovskiy D,
Thomas L, de la Cruz-Merino L, Dutriaux C, Garbe C, Sovak MA, Chang I, Choong N, Hack SP, McArthur GA,
Ribas A (2014). Combined vemurafenib and cobimetinib in BRAF-mutated melanoma. New England Journal
of Medicine, 371(20): 1867-1876.
Non-proportional quota sampling is a bit less restrictive. In this method, you specify the minimum
number of sampling units you want in each category. Here, you are not concerned with having
numbers that match the proportions in the population. Instead, you simply want to have enough
to assure that you will be able to reach even small groups in the population. This method is the
non-probabilistic analogue of stratified random sampling in that it is typically used to assure
that smaller groups are adequately represented in your sample.

Like stratified sampling, the researcher first identifies the strata and their proportions as they
are represented in the population. Then convenience or judgement sampling is used to select the
required number of subjects from each stratum. This differs from stratified sampling, where
random sampling fills the strata. Quota sampling is a type of stratified availability sampling, but
with the constraint that proportionality by strata be preserved. Quota sampling is rarely used in
academic social science research; it is frequently used in opinion polls and market research. The
aim of quota sampling is to produce a sample that reflects proportions in different categories, such
as gender, age, ethnicity and socio-economic status, similar to what obtains in the population.
Once the categories have been created, the researcher then selects sample units from each of
these. Quota sampling has a number of variations and the notable ones include:

Maximum Variation (Heterogeneity) Sampling


A maximum variation sample also known maximum diversity or heterogeneous sampling is
constructed by identifying key dimensions of variations and then finding cases that vary from each
other as much as possible. A maximum variation sample, if carefully drawn, can be as
representative as a random sample. Despite what many people (with a little knowledge of
statistics) believe, a random sample is not necessarily the most representative, especially when
the sample size is small. In the event that you wanted to use this technique, you will have to
construct criteria to help you; a short sketch follows the list below. You need to do the following:

1) Identify key dimensions of variation (e.g. socio-economic status, gender and region), and then
2) Find cases that vary from each other as much as possible along these dimensions.
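A minimal sketch of step 2, assuming each case has already been coded on the key dimensions (the socio-economic status, gender and region values below are hypothetical); it greedily adds whichever remaining case is most different from those already chosen, a simple max-min diversity heuristic rather than any canonical algorithm.

```python
# Greedy sketch of maximum variation sampling (illustrative only).
# Cases are coded on key dimensions; we repeatedly add the case that is
# most dissimilar to the cases already selected.

def dissimilarity(a, b):
    """Number of dimensions on which two cases differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def max_variation_sample(cases, k):
    chosen = [next(iter(cases))]  # start from an arbitrary case
    while len(chosen) < k:
        best = max(
            (c for c in cases if c not in chosen),
            key=lambda c: min(dissimilarity(cases[c], cases[s]) for s in chosen),
        )
        chosen.append(best)
    return chosen

# Hypothetical cases coded as (SES, gender, region)
cases = {
    "A": ("low", "F", "urban"),
    "B": ("low", "F", "rural"),
    "C": ("high", "M", "rural"),
    "D": ("high", "F", "urban"),
    "E": ("low", "M", "urban"),
}
print(max_variation_sample(cases, 3))  # ['A', 'C', 'B']
```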

This sampling yields ‘high-quality, detailed descriptions of each case, which are useful for
documenting uniqueness, and important shared patterns that cut across cases and derive their
significance from having emerged out of heterogeneity’ (Patton, 2002: 235).

Employing maximum variation sampling, research synthesists can identify essential features and
variable features of a phenomenon as experienced by diverse stakeholders among varied contexts
to facilitate informed global decision-making. Presuming that different study designs illuminate
different aspects of a phenomenon, maximum variation sampling can be utilised to construct a
holistic understanding of the phenomenon by synthesizing studies that differ in their study designs
on several dimensions.

You will observe that maximum variation sampling is a variant of quota sampling, in which the
researcher purposively and non-randomly tries to select a set of cases that exhibit maximal
differences on variables of interest. Instead of seeking representativeness through equal
probabilities, maximum variation sampling seeks it by including a wide range of extremes. The
principle of maximum variation sampling is that if you deliberately try to interview a very different
selection of people, their aggregate answers can be close to those of the whole population.


The method sounds odd, but works well in places where a random sample cannot be drawn. This
is an extension of the statistical principle of regression towards the mean - in other words, if a
group of people is extreme in several different ways, it will contain people who are average in
other ways. Therefore, if you sought a "minimum variation" sample by only trying to cover the
types of people who you thought were average, you would be likely to miss a number of different
groups that make up quite a high proportion of the population. However, by seeking maximum
variation, average people are automatically included.

Researchers who want to document important shared patterns of some phenomena that cut across
cases, and who prefer to derive their significance from heterogeneity, go for this method. For
example, we might decide to sample social welfare units in urban and rural areas in different parts
of a province to capture maximum variation in location and to document unique or diverse
variations that have emerged in service provision.

Homogenous Sampling
In direct contrast to maximum variation sampling is the strategy of picking a small, homogenous
sample, ‘the purpose of which is to describe some particular subgroup in depth’ (Patton, 2002:
235). Research synthesists are frequently criticised for mixing apples and oranges. Research
synthesists can overcome this problem to some extent by selecting studies that are relatively
homogenous in their study designs and conceptual scope. Homogenous samples can facilitate
meaningful comparisons across studies. Underscoring the epistemological incommensurability of
different qualitative methods, some qualitative research synthesists recommend a certain level of
methodological homogeneity among primary research studies which are included in a qualitative
research synthesis (e.g. Estabrooks et al., 1994; Paterson et al., 2001). Homogenous samples
are particularly suitable for participatory syntheses in which the synthesist co-synthesizes research
with practitioners about a phenomenon that has direct implications for their practice (for a detailed
discussion of participatory synthesis, see Suri, 2007). For instance, a group of secondary
mathematics teachers intending to introduce collaborative learning activities into their classroom
might benefit more from co-synthesizing collaborative learning research in secondary math rather
than collaborative learning research across all grade-levels and different disciplines. Another
example of when homogenous sampling could be used is when the researcher desires to account
for or describe a particular subgroup in depth, to reduce variation, simplify analysis and facilitate
group interviewing. The researcher may, for instance, select school managers to discuss challenges of
implementing the girl child re-entry policy. This is often used for selecting focus group participants.
Typical Case Sampling
The purpose of typical case sampling ‘is to describe and illustrate what is typical to those unfamiliar
with the setting’. Typical cases are selected ‘with the cooperation of key informants’ or using
‘statistical data… to identify “average-like” cases’. When employing typical case sampling, it is
crucial ‘to attempt to get broad consensus about which cases are typical–and what criteria are
being used to define typicality’ (Patton, 2002: 236). Research synthesists can select typical primary
research studies employed in the field with the cooperation of key researchers in the field to
describe typical methodologies and study designs employed to examine the phenomenon. This
would be particularly useful for studying how common themes recurring in the published literature
might be related to the relative strengths and weaknesses of the typical methodologies or theories
underpinning the typical studies.


An example of typical case sampling would be selecting a student undergoing drug rehabilitation,
where the purpose is to describe and illustrate what is typical, normal or average to those
unfamiliar with drug rehabilitation.
Extreme Deviant Case Sampling
In the extreme deviant case sampling approach, the researcher focuses on cases that are rich in
information because they are unusual or special in some way. Extreme (or deviant) case sampling
is a type of purposive sampling that is used to focus on cases that are special or unusual, typically
in the sense that the cases highlight notable outcomes, failures or successes. These extreme (or
deviant) cases are useful because they often provide significant insight into a particular
phenomenon, which can act as lessons (or cases of best practice) that guide future research and
practice. In some cases, extreme (or deviant) case sampling is thought to reflect the purest form
of insight into the phenomenon being studied.
Unusual or special cases may be particularly troublesome or especially enlightening, such as
outstanding successes or notable failures. If, for example, the evaluation was aimed at gathering
data to help a national program reach more clients, one might compare a few project sites that have
long waiting lists with those that have short waiting lists. If staff morale was an issue, one might
study and compare high-morale programs to low-morale programs. The main weakness of extreme
case sampling is its lack of generalisability through representativeness. This weakness is of less
concern for synthesists who focus on how things should be or could be rather than how things
are. This strategy would be particularly suitable for ‘realist syntheses’, proposed by Pawson (2006),
which investigate how a program is likely to work under particular circumstances by examining
successful as well as unsuccessful implementations of the program. Researchers who have an
interest in illuminating both the unusual and the typical go for this strategy. An example would be
selecting teachers from public schools with the best and worst student performance records.
Intensity Sampling
Unlike extreme or deviant case sampling, which goes for the polar cases, intensity sampling in a
research synthesis would involve selecting studies that are ‘excellent or rich examples of the
phenomenon of interest, but not highly unusual cases… cases that manifest sufficient intensity to
illuminate the nature of success or failure, but not at the extreme’ (Patton, 2002: 234; Kramer and
Burns, 200848).
Criterion Sampling
Criterion sampling involves reviewing and studying ‘all cases that meet some predetermined
criterion of importance’ (Patton, 2002: 238). This approach is frequently employed by research
synthesists to construct a comprehensive understanding of all the studies that meet certain pre-
determined criteria. Most research synthesists employ criterion sampling by stating explicit
inclusion/exclusion criteria, which includes specifications for methodological rigour. It is crucial to
reflect critically and realistically on the criteria being used, especially the criteria for methodological
rigour. Very strict criteria for methodological rigour can result in inclusion of such a small number
of studies that the transferability of synthesis findings becomes questionable. At the same time,

48. Kramer TF, Burns BJ (2008). Implementing cognitive behavioral therapy in the real world: a case study of two mental health centers. Implementation Science, 3(14).

including methodologically weak studies can also result in the synthesis findings being based on
questionable evidence (Patton, 2002; Marshall et al., 200849). There are two variants of criterion
sampling: criterion-i and criterion-e. Below, we discuss the two variants of criterion sampling.
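Before turning to the two variants, the basic mechanics of criterion sampling can be sketched as an explicit inclusion/exclusion screen; the study records and criteria below are hypothetical, chosen only to show the shape of the filter.

```python
# Minimal sketch of criterion sampling (illustrative only): keep every case
# that meets all inclusion criteria and trips no exclusion criterion.

studies = [
    {"id": "S1", "design": "RCT", "peer_reviewed": True, "year": 2012},
    {"id": "S2", "design": "case study", "peer_reviewed": True, "year": 2019},
    {"id": "S3", "design": "RCT", "peer_reviewed": False, "year": 2021},
]

inclusion = [
    lambda s: s["peer_reviewed"],           # a methodological rigour criterion
    lambda s: s["year"] >= 2010,            # a recency criterion
]
exclusion = [
    lambda s: s["design"] == "case study",  # e.g. exclude single-case designs
]

selected = [
    s["id"] for s in studies
    if all(rule(s) for rule in inclusion)
    and not any(rule(s) for rule in exclusion)
]
print(selected)  # ['S1']
```

Note how very strict inclusion rules shrink the selected set quickly, which is exactly the transferability risk the paragraph above warns about.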
Expert or Criterion-i Sampling
Criterion-i sampling as a purposeful sampling strategy shares many characteristics with random
probability sampling, despite having different aims and different procedures that you will have to
use to identify and select potential participants. It is used when the researcher desires to identify
and select important cases that meet some predetermined criterion of importance for an inquiry
(Patton, 2002; Palinkas et al., 2015).50 For you to use this technique, you are expected to set a
predetermined criterion of importance that is relevant to the study. Usually there is a population
that has the attributes of interest that can provide typical information to answer your research
question. Therefore, the information we would get from this population is both in-depth and
generalizable to a larger group. To set off, we select individuals based on the assumption that they
possess knowledge of, and experience with, the phenomenon of interest. These respondents will
be in a position to provide information that is both detailed (depth) and generalizable (breadth).
They are usually drawn from a larger pool of participants who all meet the same criteria. For
instance, you may consider selecting credit managers in a
commercial bank as they play a specific role in the organization and/or are involved in a particular
process of concern. To some extent, they are assumed to be “representative” of that role.
Expertise is any special knowledge, not necessarily formal training. Depending on the topic of
study, experts may be policy issue academics or devotees to a popular culture fad. An example is
the selection of consultant trainers and program leaders in the study by Marshall et al. (2008).
This expertise may be required during the exploratory phase of qualitative research, highlighting
potential new areas of interest or opening doors to other participants. Alternately, the particular
expertise that is being investigated may form the basis of your research, requiring a focus only on
individuals with such specific expertise.
Criterion-i sampling is particularly useful where there is a lack of empirical evidence in an area and
high levels of uncertainty, as well as situations where it may take a long period before the findings
from research can be uncovered. Therefore, expert sampling is a cornerstone of a research design
known as expert elicitation. Often, we convene such a sample under the auspices of a "panel of
experts."
There are actually two reasons you might employ criterion-i sampling. First, because it may be the
best way to elicit the views of persons who have specific expertise; in this case, criterion-i sampling
is essentially just a specific subcase of purposive sampling. Second, you might use criterion-i
sampling to provide evidence for the validity of another sampling approach you have chosen.
Criterion-e Sampling

49. Marshall T, Rapp CA, Becker DR, Bond GR (2008). Key factors for implementing supported employment. Psychiatric Services, 59: 886-892.
50. Palinkas, L.A., Horwitz, S.M., Green, C.A., Wisdom, J.P., Duan, N., and Hoagwood, K. (2015). Purposeful sampling for qualitative data collection and analysis in mixed method implementation research. Adm Policy Ment Health, 42(5): 533-544. doi:10.1007/s10488-013-0528-y.


Criterion-e sampling is used to identify and select all cases that exceed, or fall outside, a specified
predetermined criterion. An example is the selection of directors of agencies that failed to move
to the next stage of implementation within the expected period of time. Such cases are often
decisive in explaining the phenomenon of interest: a critical case is one that permits analytic
generalisation, since if a theory can work in the conditions of the critical case, it is likely to be able
to work anywhere. Whilst such cases should not be used to make statistical generalisations, it can
be argued that they help in making logical generalisations, although these should be made
carefully. The closely related strategy of critical case sampling, which builds on this decisive
quality, is discussed in its own section below.

Stratified purposeful sampling

Following on from criterion sampling, where each of the criteria would become a sample, stratified
purposive samples are ‘samples within samples’ where each stratum, or group, is ‘fairly
homogenous’, and cases are analysed within these groups. The purpose of stratified purposeful
sampling is ‘to capture major variations’ even though ‘a common core… may also emerge in the
analysis’ (Patton, 2002: 240).
Stratified purposeful sampling is useful for examining the variations in the manifestation of a
phenomenon as any key factor associated with the phenomenon is varied. In a research synthesis,
this factor may be contextual, methodological, or conceptual. It is particularly useful to study
different models of implementing a particular teaching and learning strategy, such as distinct
models of cooperative learning that are commonly used by teachers. Often, traditional reviewers
tacitly draw on stratified purposeful sampling by clustering studies according to a key dimension
of variation and then discussing each cluster in depth. In developing the methodologically inclusive
research synthesis framework, I employed stratified purposeful sampling to select key publications
from many distinct qualitative research traditions, by seeking input from qualitative researchers
with diverse methodological orientations and by reading widely across general qualitative research
methods literature.
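The clustering-then-picking logic can be sketched as follows, assuming hypothetical case records and an invented "richness" score that stands in for the researcher's judgement within each stratum; none of the field names or values come from a real study.

```python
# Minimal sketch of stratified purposeful sampling (illustrative only):
# group cases by a key dimension of variation, then purposively pick the
# most information-rich case within each stratum.

from collections import defaultdict

cases = [
    {"id": "C1", "model": "jigsaw", "richness": 5},
    {"id": "C2", "model": "jigsaw", "richness": 2},
    {"id": "C3", "model": "peer tutoring", "richness": 4},
    {"id": "C4", "model": "group investigation", "richness": 3},
]

strata = defaultdict(list)
for case in cases:
    strata[case["model"]].append(case)

# The richness score is a stand-in for judgement selection, not a real metric.
sample = {
    model: max(group, key=lambda c: c["richness"])["id"]
    for model, group in strata.items()
}
print(sample)
# {'jigsaw': 'C1', 'peer tutoring': 'C3', 'group investigation': 'C4'}
```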
Snowball, Chain or Referral Sampling
Treading an uneasy line between the dictates of replicable and representative research design and
the more flowing and theoretically led sampling designs of qualitative research, snowball sampling
lies somewhat at the margins of research practice. Snowball sampling may simply be defined as a
technique for finding research subjects whereby one subject gives the researcher the name of
another subject, who in turn provides the name of a third, and so on (Berg, 198851; Spreen, 1992;
Vogt, 199952). Snowball sampling is a special nonprobability method used when the desired sample
characteristic is rare. It may be extremely difficult or cost prohibitive to locate respondents in these

51. Berg, S. (1988). Snowball sampling, in Kotz, S. and Johnson, N. L. (Eds.) Encyclopaedia of Statistical Sciences, Vol. 8.
52. Vogt, W. P. (1999). Dictionary of Statistics and Methodology: A Nontechnical Guide for the Social Sciences. London: Sage.

situations. Snowball sampling relies on referrals from initial subjects to generate additional
subjects. Snowball sampling can be placed within a wider set of link-tracing methodologies
(Spreen, 199253) which seek to take advantage of the social networks of identified respondents to
provide a researcher with an ever-expanding set of potential contacts (Thomson, 1997). This
process is based on the assumption that a ‘bond’ or ‘link’ exists between the initial sample and
others in the same target population, allowing a series of referrals to be made within a circle of
acquaintance (Berg, 1988).
Snowball sampling can be applied for two primary purposes. Firstly, and most easily, it serves as an
‘informal’ method to reach a target population. If the aim of a study is primarily explorative,
qualitative and descriptive, then snowball sampling offers practical advantages (Hendricks, et al.,
199254). Snowball sampling is used most frequently to conduct qualitative research, primarily
through interviews. Secondly, snowball sampling may be applied as a more formal methodology
for making inferences about a population of individuals who have been difficult to enumerate with
descending methods such as household surveys (Snijders, 199255; Faugier and Sergeant, 199756).
Researchers opt for this strategy when they desire to reach members of a population who are
difficult to locate. They use this method to identify cases of interest by sampling people who know
people with generally similar characteristics who, in turn, know other people, also with similar
characteristics. This may take the form of asking recruited sex workers to identify their
regular customers or link persons.
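The referral mechanics can be sketched as a simple wave-by-wave walk over a referral network; the network below is hypothetical, standing in for the contacts each interviewed subject names during fieldwork.

```python
# Minimal sketch of snowball (chain-referral) sampling (illustrative only).
# A hypothetical referral network stands in for real fieldwork contacts.

referral_network = {
    "seed1": ["p2", "p3"],
    "p2": ["p4"],
    "p3": ["p4", "p5"],
    "p4": [],
    "p5": ["p6"],
    "p6": [],
}

def snowball(seeds, network, max_size):
    sampled = []
    queue = list(seeds)
    seen = set(seeds)
    while queue and len(sampled) < max_size:
        person = queue.pop(0)            # interview the next referred subject
        sampled.append(person)
        for referral in network.get(person, []):
            if referral not in seen:     # avoid re-recruiting the same subject
                seen.add(referral)
                queue.append(referral)
    return sampled

print(snowball(["seed1"], referral_network, max_size=10))
# ['seed1', 'p2', 'p3', 'p4', 'p5', 'p6']
```

Note that the walk never reaches anyone outside the seeds' combined network, which is precisely the 'isolates' weakness discussed below.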
However, the technique offers real benefits for studies which seek to access difficult to reach or
hidden populations. These are often obscured from the view of social researchers and policy
makers who are keen to obtain evidence of the experiences of some of the more marginal, excluded
groups. Policy makers and academics have long been aware that certain ‘hidden’ populations, such
as the young, male and unemployed, are often hard to locate. Other groups such as criminals,
prostitutes, drug users and people with unusual or stigmatised conditions (e.g. AIDS sufferers)
pose a range of methodological challenges if we are to understand more about their lives.
Snowball samples have a number of deficiencies, and these relate to problems of representativeness
and sampling principles. The quality of the data, and in particular a selection bias that limits
the validity of the sample, are the primary concerns of recent snowball sampling research (Van
Meter, 1990; Kaplan et al., 1987). Because elements are not randomly drawn, but are dependent
on the subjective choices of the respondents first accessed, most snowball samples are biased and
do not therefore allow researchers to make claims to generality from a particular sample (Griffiths
et al, 199357). Secondly, snowball samples will be biased towards the inclusion of individuals with
inter-relationships, and therefore will over-emphasise cohesiveness in social networks (Griffiths et

53. Spreen, M. (1992). Rare populations, hidden populations and link-tracing designs: what and why? Bulletin Methodologie Sociologique, 36, 34-58.
54. Hendricks, V. M., Blanken, P. and Adriaans, N. (1992). Snowball Sampling: A Pilot Study on Cocaine Use. Rotterdam: IVO.
55. Snijders, T. (1992). Estimation on the basis of snowball samples: how to weight. Bulletin Methodologie Sociologique, 36, 59-70.
56. Faugier, J. and Sargeant, M. (1997). Sampling hard to reach populations. Journal of Advanced Nursing, 26, 790-797.
57. Griffiths, P., Gossop, M., Powis, B. and Strang, J. (1993). Reaching hidden populations of drug users by privileged access interviewers: methodological and practical issues. Addiction, 88, 1617-1626.

al, 1993) and will miss ‘isolates’ who are not connected to any network that the researcher has
tapped into (Van Meter, 199058).
The problem of selection bias may be partially addressed, firstly through the generation of a large
sample and secondly by the replication of findings to strengthen any generalisations. At present,
a statistical formalisation of snowball sample biases is not available (Van Meter, 1990). The ideal
number of links in a referral chain will vary depending on the purpose of the study. More links in
each chain will generate substantial data about a particular sample, and may allow access to those
most difficult to identify (e.g. those respondents who require the greatest level of trust to be built
up before participating). However, it is also more likely that members of such a large single chain
sample will share similar and unique characteristics not shared by the wider population. Thus,
there may be a case for initiating several discrete chains with fewer links, particularly where any
inference about a wider hidden population is considered important.
By their very nature, members of a hidden population are difficult to locate. Often studies require
some previous ‘knowledge of insiders’ in order to identify initial respondents. Such prior knowledge
may not be readily available to researchers and it may be very time consuming and labour intensive
to acquire. Under these circumstances it is possible that people in positions of relative authority or
proximity may provide a route into the required population (Groger et al, 199959).
Respondent Driven Sampling
RDS is a type of snowball sampling used for analysing characteristics of hidden or hard-to-reach
populations. It was developed in 1997 by Dr. Douglas Heckathorn, a professor of Sociology at
Cornell, and has been applied to groups including men who have sex with men, injection drug
users, children living on the street and jazz musicians (Heckathorn, 199760). RDS relies on multiple
waves of peer-to-peer recruitment and statistical adjustments to try and approximate random
sampling. The extent to which RDS-derived estimates are valid and generalizable remains a source
of controversy in the peer-reviewed literature. RDS sampling consists of the following three steps:
Seed selection:
All RDS studies begin with a small number of seeds from the target population (e.g., 3-15 people).
Seeds should be diverse and well-networked, but they do not need to be chosen randomly.
Interviews and recruitment:
Seeds complete the interview process and receive a predetermined number of coupons that they
can use to recruit other people like them (Wave 1). The recruits of Wave 1 then complete the
interview process and recruit Wave 2. This referral chain continues until the desired sample size is
reached.
Incentives:
Participants receive two incentives: one for completing the interview, and one for each peer that
is successfully recruited. RDS only works in populations that are connected to one another.
Furthermore, the population has to be large enough to sustain long referral chains without
repeated participants.

58. Van Meter, K. (1990). Methodological and Design Issues: Techniques for Assessing the Representativeness of Snowball Samples. NIDA Research Monograph, 31-43.
59. Groger, L., Mayberry, P., and Straker, J. (1999). What we didn't learn because of who would not talk to us. Qualitative Health Research, 9(6), 829-835.
60. Heckathorn, D.D. (1997). Respondent-Driven Sampling: A New Approach to the Study of Hidden Populations. Social Problems, 44(2), 174-199.

Once the sample has been recruited, statistical techniques have been developed to try and reduce
biases in the data. RDS inference or analysis focuses on two main sources of bias:

Differential social network sizes:
People with small social networks are weighted more heavily than people with large social networks
to compensate for the fact that people with small networks are likely underrepresented.
Differential recruitment:
People whose probability of recruitment is artificially increased due to homophily (e.g., being the
same race as the recruiter) are weighted less than people who may be left out of the sample simply
because they have certain characteristics that differ from the recruiters' (Wejnert and Heckathorn,
2011; Johnston, 201361).62
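The first adjustment can be sketched as inverse-degree weighting, in the spirit of RDS estimators that weight each respondent by the reciprocal of their reported network size; the respondent records below are hypothetical, and this is a simplification rather than a full RDS estimator with recruitment-chain adjustments.

```python
# Minimal sketch of RDS-style inverse-degree weighting (illustrative only).
# Respondents with small networks are up-weighted because they are less
# likely to be recruited through peer referral.

respondents = [
    {"id": "R1", "degree": 20, "positive": True},
    {"id": "R2", "degree": 5,  "positive": False},
    {"id": "R3", "degree": 2,  "positive": False},
    {"id": "R4", "degree": 10, "positive": True},
]

# Each respondent's weight is the reciprocal of their network size (degree).
weights = {r["id"]: 1.0 / r["degree"] for r in respondents}
total_weight = sum(weights.values())

# Weighted prevalence estimate; the naive sample proportion would be 0.50.
positive_weight = sum(weights[r["id"]] for r in respondents if r["positive"])
estimate = positive_weight / total_weight
print(round(estimate, 3))  # 0.176
```

Here the two respondents with the trait happen to have large networks, so the weighted estimate falls well below the naive 50%, illustrating how the adjustment counteracts the over-recruitment of well-connected people.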
Theoretical Purposive Sampling
Theoretical sampling is distinctive in that it forms part of the collection and analysis of the data,
following provisional sampling and analysis of some initial data (Strauss, 198763; Coyne, 199764; Robinson, 201465).
Theoretical sampling originally came from Grounded Theory but is applied to other methods as
well (Mason, 200266). According to Patton (2002), the situations that may lead you as a qualitative
researcher to consider applying theoretical sampling are as follows:
a) When you desire to select cases that represent important theoretical or operational
constructs (broad concepts or topics) or concepts (terms) about the phenomenon of
interest;
b) When you desire to understand operational definitions or meanings of key theories or
constructs (broad concepts or topics) or concepts (terms) related to the phenomenon of
interest and
c) When you desire to develop boundaries for these by creating specific inclusion and
exclusion criteria in relation to selecting primary studies for the synthesis.
Grounded-theorists define theoretical sampling as the sampling that is based on the concepts
emerging from the data for the purpose of exploring ‘the dimensional range or varied conditions
along which the properties of concepts vary’ (Strauss and Corbin, 1998: 7367). Research synthesists

61. Johnston, L.G. (2013). Introduction to HIV/AIDS and Sexually Transmitted Infection Surveillance, Module 4: Introduction to Respondent Driven Sampling. WHO: Geneva, Switzerland. http://applications.emro.who.int/dsaf/EMRPUB_2013_EN_1539.pdf
62. Wejnert, C., Heckathorn, D. (2011). Chapter 22: Respondent-Driven Sampling: Operational Procedures, Evolution of Estimators, and Topics for Future Research. In Williams, M., Vogt, W. (Eds.), The SAGE Handbook of Innovation in Social Research Methods, pp. 473-498. London: SAGE Publications, Ltd. doi: http://dx.doi.org/10.4135/9781446268261.n27.
63. Strauss, A.L. (1987). Qualitative analysis for social scientists. New York: Cambridge University Press.
64. Coyne, I.T. (1997). Sampling in qualitative research. Purposeful and theoretical sampling; merging or clear boundaries? Journal of Advanced Nursing, 26(3), 623-630.
65. Robinson, O.C. (2014). Sampling in interview-based qualitative research: A theoretical and practical guide. Qualitative Research in Psychology, 11(1), 25-41.
66. Mason, J. (2002). Qualitative researching, 2nd ed. London: Sage.
67. Strauss, A. and Corbin, J. (1998). Basics of qualitative research: Techniques and procedures for developing grounded theory (2nd ed.). Thousand Oaks, CA: Sage.

who employ constant comparative methods or grounded-theory approaches can fruitfully utilise
theoretical sampling to systematically elucidate and refine the ‘variations in, manifestations of, and
meanings of a concept as it is found’ (Patton, 1978: 23868) in the selected primary research studies.
Many qualitative synthesists recommend theoretical sampling as a suitable option for research
syntheses (Dixon-Woods et al. 200569; Mays et al., 200570). For example, in their meta-study,
Paterson and her colleagues (200171) draw on theory-based sampling or operational construct
sampling by setting out operational definitions of the key constructs (broad concepts or topics) or
concepts (terms) about the phenomenon of interest. The boundaries of these operational
definitions are further articulated by explicitly stating inclusion/exclusion criteria in relation to
selecting primary research reports for the synthesis.
Building grounded theory requires an iterative process of data collection, coding, analysis, and
planning what to study next. The researcher needs to be theoretically sensitive as they are
collecting and coding data, to sense where the data are taking them and what to do next. Coming
into a research program with an existing theoretical framework will merely blind them to the
richness of the incoming data. Researchers who want to build a theory that is grounded in the
data use theoretical purposive sampling.
Theoretical purposive sampling strives to identify typical or particular types of sample elements for
in-depth investigation, since sampling probabilistically over all alternative sample elements would
risk introducing atypical and inappropriate elements (Nkhata, 1988; Neuman, 2000). As
this iterative process continues, the researcher may explore the same group more deeply or in
different ways, or may seek out new groups. Comparison groups should be selected based on their
theoretical relevance to further the development of emerging categories. It is best to pick the
groups as you go along rather than choose them all beforehand -- let the data be your guide. In theory
generation, non-comparability of groups is irrelevant, but it can have an effect on the level of
substantive theory developed, and it is thus important to pick the right group for the next part of
the comparative research. Theoretical purposive sampling is best suited for field studies that place
a prime focus on the topic rather than on representativeness of the population (Bernard, 2000).
Theoretical purposive sampling proceeds progressively and sequentially, by a rolling process
interleaved with concurrent analysis of groups or categories, and the sample size will only be
settled once theoretical saturation (not sample saturation) is attained.
Theoretical saturation is an important topic when dealing with purposive sampling. In broad terms,
saturation is used in qualitative research as a criterion for discontinuing data collection and/or
analysis. Its origins lie in grounded theory (Glaser and Strauss 196772), but in one form or another,
it now commands acceptance across a range of approaches to qualitative research. It usually refers
to reaching a point of informational redundancy where ‘no additional data are being found whereby
the sociologist can develop properties of the category’ (Glaser and Strauss, 1967: 61). Theoretical

68. Patton, M. Q. (1978). Utilization-focused evaluation. Thousand Oaks, CA: Sage.
69. Dixon-Woods, M., Agarwal, S., Jones, D., Young, B. and Sutton, A. J. (2005). Synthesising qualitative and quantitative evidence: A review of possible methods. Journal of Health Services Research and Policy, 10(1), 45-53.
70. Mays, N., Pope, C. and Popay, J. (2005). Systematically reviewing qualitative and quantitative evidence to inform management and policy making in the health field. Journal of Health Services Research and Policy, 10(1), 6-20.
71. Paterson, B. L., Thorne, S. E., Canam, C. and Jillings, C. (2001). Meta-study of qualitative health research: A practical guide to meta-analysis and meta-synthesis. Thousand Oaks, CA: Sage.
72. Glaser, B.G. and Strauss, A. L. (1967). The discovery of grounded theory: Strategies for qualitative research. New York, NY: Aldine de Gruyter.

saturation has also become widely recognized as a guide or indicator that sufficient data collection
has been achieved when developing a theory. It is not necessary to strive for equally complete
theoretical saturation of all categories: “Core theoretical categories, those with the most
explanatory power, would be saturated as completely as possible” (Glaser and Strauss, 1967: 100).
Qualitative samples that are obtained by theoretical sampling are usually designed to make
possible analytic generalizations (those that are applied to wider theory on the basis of how
selected cases ‘fit’ with general constructs (broad concepts or topics) or concepts (terms)), but not
statistical generalizations (applied to wider populations on the basis of representative statistical
samples). For example, Miles and Huberman (1994: 27–2873), argue that qualitative sampling can
provide the opportunity to select and examine observations of generic processes, which are key to
our understanding of new theory about the phenomenon being studied.
However, Charmaz distinguishes theoretical saturation from the concept of data saturation,
emphasizing, “The common use of the term saturation (i.e., to imply data saturation) refers to
nothing new happening ….” (Charmaz, 2014: p. 213). In essence, data saturation is about reaching
a point of informational redundancy where additional data collection by recruiting other samples
contributes little or nothing new to the study. We are referring this particular form of saturation,
sample based data saturation to differentiate it from the grounded theory concept of theoretical
saturation, described above.
The concept of data saturation is dependent on the nature of the data source as well as the
synthesis question. There is a higher likelihood of reaching data saturation if the data collection is
purposeful. We must be quick to point out here that qualitative researchers are not occupied so
much with sample size but with the “thick descriptions”. Failure by the researcher in a research
project to reach data saturation, depending on the type of research design, could have an impact
on the quality of the research conducted and hamper the trustworthiness of the research project
(Bowen, 200874; Kerr et al., 201075). Students who design a qualitative research study come up
against the dilemma of data saturation when interviewing study participants (Glaser and Strauss,
196776; Morse, 1995, 2015; Sandelowski, 1995; Bryant and Charmaz, 200777; O’Reilly and Parker,
201278; Walker, 201279). In particular, we urge you to address the question of how many interviews
are enough to reach data saturation (Guest et al., 200680). We shall get to Mason later on to
answer this question.

73. Miles, M.B., and Huberman, A.M. (1994). Qualitative data analysis: an expanded sourcebook, 2nd ed. California: Sage.
74. Bowen, G. A. (2008). Naturalistic inquiry and the saturation concept: A research note. Qualitative Research, 8(1), 137-152.
75. Kerr, C. (2010). Assessing and demonstrating data saturation in qualitative inquiry supporting patient-reported outcomes research. Expert Review of Pharmacoeconomics and Outcomes Research, 10(3), 269-281.
76. Glaser, B., and Strauss, A. (1967). The Discovery of Grounded Theory: Strategies for Qualitative Research. Chicago: Aldine.
77. Bryant, A., and Charmaz, K. (Eds.) (2007). The SAGE Handbook of Grounded Theory. London: Sage.
78. O'Reilly, M., and Parker, N. (2012, May). Unsatisfactory saturation: A critical exploration of the notion of saturated sample sizes in qualitative research. Qualitative Research Journal, 1-8.
79. Walker, J. L. (2012). The use of saturation in qualitative research. Canadian Journal of Cardiovascular Nursing, 22(2), 37-46.
80. Guest, G., Bunce, A., and Johnson, L. (2006). How many interviews are enough? An experiment with data saturation and variability. Field Methods, 18(1), 59-82.

It is important to point out at the outset that there is no one-size-fits-all method to reach data
saturation. This is because study designs are not universal and not uniform. However, researchers
do agree on some general principles and concepts: no new data, no new themes, no new coding,
and ability to replicate the study (Guest et al., 2006). Therefore, what we are saying here is that
when and how one reaches those levels of saturation will vary from study design to study design.
The idea of data saturation in studies is helpful; however, it does not provide any pragmatic
guidelines for when data saturation has been reached (Guest et al., 2006).
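One pragmatic way to operationalise 'no new data, no new themes, no new coding' is to track the cumulative codebook interview by interview and stop once several consecutive interviews add nothing new. The sketch below uses a hypothetical stopping window of three interviews; the window size and the codes are assumptions for illustration, not a published rule.

```python
# Minimal sketch of an operational saturation check (illustrative only):
# stop after `window` consecutive interviews that yield no new codes.

def saturation_point(coded_interviews, window=3):
    seen = set()
    run_without_new = 0
    for i, codes in enumerate(coded_interviews, start=1):
        new_codes = set(codes) - seen
        seen |= new_codes
        run_without_new = 0 if new_codes else run_without_new + 1
        if run_without_new >= window:
            return i   # saturation judged reached at interview i
    return None        # not reached in the data collected so far

# Hypothetical codes assigned to each transcript during analysis
interviews = [
    {"stigma", "access"}, {"access", "cost"}, {"cost"},
    {"stigma"}, {"access"}, {"cost", "stigma"},
]
print(saturation_point(interviews))  # 5 -> no new codes in interviews 3-5
```

Such a mechanical check is only a proxy: as the discussion of rich versus thick data below makes clear, the depth of what each interview adds matters as much as whether a new code label appears.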
The process involves either identifying cases from new groups, which might amount to being a
comparison or a contrast with other groups, or reshaping the sample into a new set of criteria as
a result of the analysis, and in so doing replacing the original sampling strategy chosen a priori
(Draucker et al., 200781; Robinson, 201482).
Data saturation may be attained by thinking around data in terms of rich and thick (Dibley, 201183)
rather than the size of the sample (Burmeister and Aitken, 201284). The easiest way to differentiate
between rich and thick data is to think of rich as quality and thick as quantity. Thick data is a lot
of data; rich data is many-layered, intricate, detailed, nuanced, and more. One can have a lot of
thick data that is not rich; conversely, one can have rich data but not a lot of it. The trick, if you
will, is to have both.
You will hear innumerable researchers during seminar presentations claiming that data saturation
was attained after conducting 12 interviews. This is a fallacy, because one cannot assume that a
given number of interviews has reached data saturation per se. We urge you to consider the depth
of the data (Burmeister and Aitken, 2012). It is a fallacy to hold the view that a given number of
interviews is sufficient to attain saturation (Guest et al., 2006; Dworkin, 201285), or that a fixed
number of focus group discussions is; Guest et al. (2017), for instance, report that roughly 80% of
all themes emerged within two to three focus groups and 90% within three to six. The following
studies subscribe to the misnomer of using rules of thumb in GTM.

Author                            | Numbers when saturation would be attained
----------------------------------|-------------------------------------------------
Guest, Bunce and Johnson (2006)   | 2 to 60 focus groups
Bernard (2000)                    | Samples between 30-60 interviews for ethnoscience
Creswell (1998)86                 | Grounded theory methodology: 20-30 interviews
Creswell (1998)                   | Phenomenology: 5 to 25
Morse (1994)                      | At least 6 in phenomenological research
Bertaux (1981)                    | 15 is the smallest
Ritchie et al. (2003)             | Qualitative samples often "lie under 50"
Green and Thorogood (2009)        | Interview studies: more than 20

81. Draucker, C.B., Martsolf, D.S., Ross, R., et al. (2007). Theoretical sampling and category development in Grounded Theory. Qualitative Health Research, 17(8), 1137-1148.
82. Robinson, O.C. (2014). Sampling in interview-based qualitative research: A theoretical and practical guide. Qualitative Research in Psychology, 11(1), 25-41.
83. Dibley, L. (2011). Analysing narrative data using McCormack's lenses. Nurse Researcher, 18(3), 13-19.
84. Burmeister, E., and Aitken, L. M. (2012). Sample size: How many is enough? Australian Critical Care, 25, 271-274.
85. Dworkin, S.L. (2012). Sample size policy for qualitative studies using in-depth interviews. Archives of Sexual Behaviour, 41, 1319-1320.
86. Creswell, J. (1998). Qualitative inquiry and research design: Choosing among five traditions. Thousand Oaks, CA: Sage.


While these numbers are offered as guidance, the authors do not tend to present empirical
arguments as to why these numbers and not others. Furthermore, the issue of why
some authors feel that certain methodological approaches call for more participants than
others is also not explored in any detail.

We hold the theoretical position, consistent with a classical abductive strategy, that determining
sample size a priori is inherently problematic in qualitative research, given that sample size is often
adaptive and emergent, particularly if the study is based on a grounded theory approach, which
calls for fulfilling the philosophical dictum of saturation and demands linking sampling with the
forward enlistment of respondents. This is what the gurus of theoretical sampling espouse
(Sandelowski, 2008; O'Reilly and Parker, 2013; Fusch and Ness, 2015; Saunders et al, 201987).
Saturation is an essential element within grounded theory and other qualitative research that
stands on the epistemological and methodological assumptions of GTM, implying that sample size
should be determined a posteriori. In other words, one should choose the sample size that gives
the researcher the best opportunity to reach data saturation.
A large sample size does not guarantee one will reach data saturation, nor does a small sample
size—rather, it is what constitutes the sample size (Burmeister and Aitken, 2012). What some do
not recognize is that no new themes go hand-in-hand with no new data and no new coding (O’Reilly
and Parker, 201288). If one has reached the point of no new data, one has also most likely reached
the point of no new themes; therefore, one has reached data saturation.
Morse et al. (201489) made the point that the concept of data saturation has many meanings to
many researchers; moreover, it is inconsistently assessed and reported. What is interesting about
their study findings is that, in their review of 560 dissertations, the authors noted that sample
size was rarely if ever chosen for data saturation reasons. Instead, the sample size was chosen for
other reasons (Morse et al., 2014). We have seen this to be very common in student dissertations,
theses and defences, year in and year out. Data saturation is reached when there is enough information

87. Saunders, N. K., Lewis, P., and Thornhill, A. (2019). Research Methods for Business Students, 8th Edn. London: Pearson Education.
88. O'Reilly, M., and Parker, N. (2012, May). Unsatisfactory saturation: A critical exploration of the notion of saturated sample sizes in qualitative research. Qualitative Research Journal, 1-8.
89. Morse, W. C., Lowery, D. R., and Steury, T. (2014). Exploring saturation of themes and spatial locations in qualitative public participation geographic information systems research. Society and Natural Resources, 27(5), 557-571.

to replicate the study (O’Reilly and Parker, 201290; Walker, 201291), when the ability to obtain
additional new information has been attained (Guest et al., 200692), and when further coding is no
longer feasible (Guest et al., 2006).
From research experience, data saturation can be achieved through both interviews and focus group
discussions. Bernard (201293) states that data saturation in interview studies is attained when the
researcher takes what he or she can get. Moreover, interview questions should be structured to facilitate
asking multiple participants the same questions, otherwise one would not be able to achieve data
saturation as it would be a constantly moving target (Guest et al., 2006). To further enhance data
saturation, Bernard (2012) recommends including the interviewing of people that one would not
normally consider. Here, some caution needs to be taken: there is the shaman effect, in which
someone with specialised information on a topic can overshadow the data, whether intentionally
or inadvertently (Bernard, 2012). Finally, care should be taken when confronting gatekeepers at
the research site who may restrict access to key informants (Holloway et al., 201094) which would
hamper complete data collection and data saturation.
In terms of focus group discussions, the flexible, unstructured dialogue between the members of
a group and an experienced facilitator/moderator that meets in a convenient location will do the
job to attain data saturation (Brockman et al., 2010; Jayawardana and O’Donnell, 200995; Packer-
Muti, 201096). The focus group interview helps in eliciting multiple perspectives on a given topic.
However, we should be mindful that it might not be as effective for sensitive areas (Nepomuceno
and Porto, 201097). This method drives research through openness, which is about receiving
multiple perspectives about the meaning of truth in situations where the observer cannot be
separated from the phenomenon. This helps greatly in attaining data saturation. For focus groups
it is recommended that the size of the group include between six and 12 participants, so that the
group is small enough for all members to talk and share their thoughts, and yet large enough to
create a diverse group (Lasch et al., 201098; Onwuegbuzie et al., 201099).
Now that we have almost covered the aspect of saturation, let us look at the role of the researcher
in determining saturation. One of the challenges in addressing data saturation is about the use of

90. O'Reilly, M., and Parker, N. (2012, May). Unsatisfactory saturation: A critical exploration of the notion of saturated sample sizes in qualitative research. Qualitative Research Journal, 1-8.
91. Walker, J. L. (2012). The use of saturation in qualitative research. Canadian Journal of Cardiovascular Nursing, 22(2), 37-46.
92. Guest, G., Bunce, A., and Johnson, L. (2006). How many interviews are enough? An experiment with data saturation and variability. Field Methods, 18(1), 59-82.
93. Bernard, R. H. (2012). Social research methods: Qualitative and quantitative approaches (2nd ed.). Thousand Oaks, CA: Sage.
94. Holloway, I., Brown, L., and Shipway, R. (2010). Meaning not measurement: Using ethnography to bring a deeper understanding to the participant experience of festivals and events. International Journal of Event and Festival Management, 1(1), 74-85.
95. Jayawardana, A., and O'Donnell, M. (2009). Devolution, job enrichment and workplace performance in Sri Lanka's garment industry. The Economic and Labour Relations Review, 19(2), 107-122.
96. Packer-Muti, B. (2010). Conducting a focus group. The Qualitative Report, 15(4), 1023-1026.
97. Nepomuceno, M., and Porto, J. (2010). Human values and attitudes toward bank services in Brazil. The International Journal of Bank Marketing, 28(3), 168-192.
98. Lasch, K. E., Marquis, P., Vigneux, M., Abetz, L., Arnould, B., and Bayliss, M. (2010). PRO development: Rigorous qualitative research as the crucial foundation. Quality of Life Research, 19(8), 1087-1096.
99. Onwuegbuzie, A. J., Leech, N. L., and Collins, K. M. T. (2010). Innovative data collection strategies in qualitative research. The Qualitative Report, 15(3), 696-726.

a personal lens primarily because novice researchers (such as students) assume that they have no
bias in their data collection and may not recognise when the data is indeed saturated. However, it
is important to remember that a participant’s as well as the researcher’s interests and values or
worldview are present in all social research, both intentionally and unintentionally (Fields and Kafai,
2009100). We therefore need to control for these to some extent. This leads us to examine the
researcher's lens in the study.
In order to address the concept of a personal lens, note that in qualitative research the researcher is
the data collection instrument and as such cannot separate themselves from the research (Jackson,
1990101). This brings up special concerns, and students face challenging questions during
their proposal, thesis or dissertation defence. To be clear, what we are presenting here is
that the researcher ought to be aware that they are operating between multiple worlds while
engaging in research, which includes the cultural world of the study participants as well as the
world of one's own perspective (Denzin, 2009102). Hence, it becomes imperative that the
interpretation of the phenomena represents much more that of the participants than of the researcher
(Holloway et al., 2010103) in order for the data to be saturated. Hearing and understanding the
perspective of others may be one of the most difficult dilemmas that face the researcher. The
better a researcher is able to recognise his/her personal view of the world and to discern the
presence of a personal lens, the better one is able to hear and interpret the behaviour and
reflections of others (Dibley, 2011104; Fields and Kafai, 2009105) and represent them in the data
that is collected. Therefore, how one addresses and mitigates a personal lens/worldview during
data collection and analysis is a key component for the study.
It is important that a novice researcher recognises their own personal role in the study and mitigates
any concerns during data collection (Chenail, 2011106). Part of the discussion should address how
this is demonstrated through understanding when the data are saturated, by mitigating the use of
one's personal lens during the data collection process of the study (Dibley, 2011). Hence, a
researcher's cultural and experiential background will contain biases, values, and ideologies
(Chenail, 2011) that can affect when the data are indeed saturated (Bernard, 2012107).
Critical Case Sampling
Critical case sampling is a type of purposive sampling technique that is particularly useful in
exploratory qualitative research, research with limited resources, as well as research where a single
case (or small number of cases) can be decisive in explaining the phenomenon of interest. Critical

100. Fields, D. A., and Kafai, Y. B. (2009). A connective ethnography of peer knowledge sharing and diffusion in a tween virtual world. Computer Supported Collaborative Learning, 4(1), 47-69.
101. Jackson, J. E. (1990). I am a field note: Field notes as a symbol of professional identity. In R. Sanjek (Ed.), Field notes: The making of anthropology (pp. 3-33). Ithaca, NY: Cornell University Press.
102. Denzin, N. K. (2009). The research act: A theoretical introduction to sociological methods. New York, NY: Aldine Transaction.
103. Holloway, I., Brown, L., and Shipway, R. (2010). Meaning not measurement: Using ethnography to bring a deeper understanding to the participant experience of festivals and events. International Journal of Event and Festival Management, 1(1), 74-85.
104. Dibley, L. (2011). Analyzing narrative data using McCormack's lenses. Nurse Researcher, 18(3), 13-19.
105. Fields, D. A., and Kafai, Y. B. (2009). A connective ethnography of peer knowledge sharing and diffusion in a tween virtual world. Computer Supported Collaborative Learning, 4(1), 47-69.
106. Chenail, R. (2011). Interviewing the investigator: Strategies for addressing instrumentation and researcher bias concerns in qualitative research. The Qualitative Report, 16(1), 255-262.
107. Bernard, R. H. (2012). Social research methods: Qualitative and quantitative approaches (2nd ed.). Thousand Oaks, CA: Sage.

case sampling in a research synthesis might be employed to assist stakeholders in making informed
decisions about the viability of an educational program. For example, consider an innovation that
produces desirable outcomes, but is being rejected by many practitioners, as they believe that its
implementation requires substantial resources. A synthesis of primary research studies, which
describe in detail successful implementation of the innovation with minimal resources, might be
useful to alleviate the practitioners’ resistance towards the innovation. Alternatively, consider an
innovation, which requires substantial financial resources. However, the proponents of the
innovation assert that the innovation is cost-effective provided sufficient resources are invested in
its implementation. In such an area, a research synthesist can selectively synthesise cases reported
in primary research studies that were sufficiently endowed with resources to logically verify, or
challenge, the claims made by those advocating the innovation.
Critical case sampling can facilitate ‘logical generalizations’ with the reasoning ‘that “if it happens
there, it will happen anywhere,” or, vice versa, “if it doesn’t happen there, it won’t happen
anywhere”’ (Patton, 2002: 236108).
Opportunistic or Emergent Sampling
Opportunistic or emergent sampling ‘takes advantage of whatever unfolds as it unfolds’ by utilising
‘the option of adding to a sample to take advantage of unforeseen opportunities after fieldwork
has begun’ (Patton, 2002: 240, emphasis in original). Opportunistic or emergent sampling can be
useful for synthesizing a research area which is at its exploratory stage, such as mobile learning,
or when the synthesist does not have an emic or insider status in the relevant field of research.
Emergent sampling is also suited to participatory syntheses where the synthesis purpose evolves
in response to the changing needs of the participant cosynthesists (Suri, 2007109). For instance,
the purpose of a synthesis in the area of mobile learning might be guided by the key questions or
concerns of a group of professors who are teaching with mobile technologies. The synthesist might
then enter the field and search for reports to address these questions. When the synthesist feeds
this information back to the professors, their questions might also change. In response to their
changing questions, the synthesist might seek further studies with a different set of criteria. While
pursuing these searches, the synthesist is also likely, serendipitously, to find primary research
reports that will provide useful insights into the phenomenon of mobile learning. Given the
exploratory nature of the process of developing the methodologically inclusive research synthesis
framework, I employed opportunistic sampling at the broadest level.
Combination or mixed purposeful sampling
Combination or mixed purposeful sampling means choosing a combination or mix of sampling
strategies to best fit your purpose. For some syntheses it may be useful, for instance, to apply
theoretical sampling in a first stage and deviant case sampling in a second stage. The choice should
be guided by the review methods and purpose, and by the time available.
Size of Sample in Qualitative Studies
What the sample size should be is a critical question, as Anderson notes “size does matter”, but
“more is not always better” (2017: 4110). The rule in qualitative research sample size determination

108. Patton, M. Q. (2002). Qualitative research and evaluation methods (3rd ed.). Thousand Oaks, CA: Sage.
109. Suri, H. (2007). Expanding possibilities within research syntheses: A methodologically inclusive research synthesis framework. Unpublished PhD thesis. Melbourne: The University of Melbourne.
110. Anderson, V. (2017). Criteria for evaluating qualitative research. Human Resource Development Quarterly, 1-9.

is that, generally, the sample size should not be so large that the richness and depth of analysis is
undermined, nor so small that the credibility of the analysis is undermined (see Onwuegbuzie and
Leech, 2007111). There is one most important criterion we desire you to know: the adequacy
of the sample size. Adequacy should be considered relative, and we advise that you justify it before
the study and again after the study when you are rendering your report. When you read journal articles
in peer reviewed journals, you will observe that authors spend time outlining the research purpose
(ethnographic, phenomenological, inter alia). However, for qualitative interviews, questions
determining sample size are primarily related to the philosophical (ontological and/or
epistemological) assumptions and the alignment of the approach taken to the data analysis (Sim et al.,
2018112). Most researchers do not work with an absolute or definitive number per se, as
quantitative researchers do. Small samples can be useful in generating theoretical or hermeneutic
insights, theory-testing or construct-problematising, demonstration of possibility, illustration of
best practice and theory-exemplification (Robinson, 2014; Malterud et al., 2016113). Usually they
serve to develop idiographic descriptions to better understand the participants, evoke empathy,
present the uniqueness of human action, etc. (e.g. Farkic, 2020114).
There are, however, different approaches that you may use to determine your sample size at the
outset. These include rules of thumb, conceptual approaches, numerical guidelines, or combinations
thereof. Although recommendations exist for determining qualitative sample sizes (e.g., Morse,
1994; Creswell, 2013), the literature contains few instances of research on qualitative sample
sizes, and practical guidance for determining sample sizes for rigorous qualitative research is
needed. The lack of guidance poses a problem because researchers planning qualitative studies are
at times expected to estimate sample sizes (though this is not a universal rule) in order to:
a) Allocate resources and budget,
b) Develop acceptable proposals for funding,
c) Develop proposals that may be accepted by rigid institutional review boards, and
d) Conduct rigorous and systematic qualitative research.
Whenever there are graduate proposal or defence presentations, panellists tend to ask students
doing qualitative research "what is your sample size?", and students often ask each other "how
large should a qualitative sample size be?" These questions seem to throw students and novice
researchers into turmoil.

111
Onwuegbuzie, A. J., and Leech, N. L. (2007). Sampling designs in qualitative research: Making the
sampling process more public. Qualitative Report, 12(2), 238–254.
112
Saunders, N. K., Lewis, P., and Thornhill, A. (2019). Research Methods for Business Students, 8th Edn.
London: Pearson Education.
Sim, J., Saunders, B., Waterfield, J., and Kingstone, T. (2018). Can sample size in qualitative research be
determined a priori? International Journal of Social Research Methodology, 21(5), 619–634.
113
Malterud, K., Siersma, V. D., and Guassora, A. D. (2016). Sample size in qualitative interview studies:
Guided by information power. Qualitative Health Research, 26,1753–1760.
114
Farkic, J. (2020). Consuming dystopic places: What answers are we looking for? Tourism Management
Perspectives, 33.
115
Morse, J. M. (1994). Designing funded qualitative research. In Norman K. Denzin and Yvonna S. Lincoln
(Eds.), Handbook of qualitative research (pp. 220-235). Thousand Oaks, CA. Sage.
116
Creswell, J.W. (2013) Research Design: Qualitative, Quantitative, and Mixed Methods Approaches. 4th
Edition, SAGE Publications,
A common misconception about sampling in qualitative research is that numbers are unimportant in
ensuring the adequacy of a sampling strategy. Some give a range of numbers and argue that the
appropriate number depends on the type of data collection tool or the qualitative approach being
used. Others argue that determining adequate sample size in qualitative research is ultimately a
matter of judgment and experience in evaluating the quality of the information collected against
the uses to which it will be put. Still others argue that the particular research method, the type
of sampling strategy or design the researcher employs, and the intended research product will
determine the sample size.
Given these inconclusive positions, Curtis et al. (2000: 1002) note that sampling in qualitative
research 'needs to be addressed rigorously and is fundamental to our understanding of the
trustworthiness of qualitative research', but also suggest that it is a topic that has received
insufficient attention in comparison with methods of data collection and analysis. There are
numerous contradictions around sample size in a qualitative project. Some argue that sample size
must be determined a priori; there are, however, no well-established, scientifically proven
guidelines that allow formal a priori estimation of sample size for a qualitative research project.
Others argue that sample sizes can be established a posteriori. The rule-of-thumb guidelines that
some researchers follow in determining sample sizes a priori appear below. These rules of thumb
have, however, been challenged by those arguing that such a priori sample size decisions are
incompatible with the conceptual and methodological notions underpinning qualitative research.
Rule-of-thumb Based Approach
This is a seemingly popular approach in which numerical guidelines are used (Morse, 1994;
Onwuegbuzie and Leech, 2007; Dworkin, 2012; Creswell, 2013; Guest et al., 2006; 2017). A number of
authors have proposed rules of thumb for sample size in qualitative research, based on
methodological considerations. These rules of thumb commonly lack a clear and detailed rationale,
although there is a degree of similarity in what they propose. The rules of thumb in use have been
widely adopted, with Morse (1994) and Creswell (2013) among their most ardent proponents. Empirical
reviews have shown varying sample size estimates for individualised as well as group-based studies,
as shown in table 23.3 below.

117
Curtis., S., Gesler, W., Smith, G., Washburn, S. (2000). Approaches to Sampling and Case Selection in
Qualitative Research: Examples in the Geography of Health. Social Science and Medicine 50 (7-8):1001-
1014..
118
Onwuegbuzie, A. J., and Leech, N. L. (2007). A call for qualitative power analyses. Quality and Quantity.
41: 105-121.
119
Guest, G., Namey, E., and McKenna, K. (2017). How many focus groups are enough? Building an evidence
base for nonprobability sample sizes. Field Methods, 29, 3–22.
Table 23.3: Sample size and the rule-of-thumb approach

Creswell (2002):
- Ethnography: one cultural-sharing group
- Case study: three to five cases
- Grounded theory: interviews with 15-20 people
- Narrative research: narrative stories of one individual, and up to 10 individuals when
  developing a collective story

Creswell (1998):
- Phenomenological research: up to 10 people
- Grounded theory: interviews with 20-30 people

Morgan (1997), Langford et al. (2002), Johnson and Christensen (2004):
- Focus group discussion (FGD) based designs: usually 6-12 persons per FGD
- Key informant interviews: 5 informants are ideal

Krueger (2000):
- FGD based designs: 6-9 focus group members recommended; groups with more than 12 participants
  tend to "limit each person's opportunity to share insights and observations"

Morgan (1997):
- Grounded theory FGD-based studies: 3-5 focus groups are typically sufficient to reach saturation

Morse (1994):
- Phenomenological design: six participants where the goal is to understand the essence of
  experience
- Grounded theory research: 30-50 interviews
- Large ethnographic studies: select a large and representative sample (purposeful or random,
  based on purpose) with numbers similar to those in a quantitative study: 100-200 units of
  observation
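To make the spread of recommendations in Table 23.3 easier to consult, the short sketch below
restates them as a simple lookup in Python. This is our illustration only: the keys and the
(minimum, maximum) ranges merely summarise the table above, and suggested_range is a hypothetical
helper, not a rule from any of the cited authors.

# Compact, illustrative restatement of the rule-of-thumb ranges in Table 23.3.
# Ranges are (minimum, maximum) participant counts; this is a summary, not a rule.
RULES_OF_THUMB = {
    "case study": (3, 5),                    # Creswell (2002)
    "phenomenology": (6, 10),                # Morse (1994); Creswell (1998)
    "grounded theory interviews": (20, 50),  # Creswell (1998, 2002); Morse (1994)
    "focus group size": (6, 12),             # Morgan (1997); Krueger (2000)
    "large ethnography": (100, 200),         # Morse (1994)
}

def suggested_range(design: str):
    """Return the (low, high) rule-of-thumb range, or None if the design is not tabled."""
    return RULES_OF_THUMB.get(design.lower())

print(suggested_range("Phenomenology"))  # (6, 10)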
The use of numerical guidelines in the rule-of-thumb approach appears compelling because the
thinking stems from quantitative paradigmatic reasoning: the research environment surrounding
qualitative research continues to be dominated by the quantitative paradigm. In fact, those who
use numerical guidelines are often not well trained in post-positivist standpoints, especially
qualitative research, and are less aware of the interpretive and constructivist paradigms that are
the core of qualitative research. The question classical qualitative researchers ought to ask is
"Why, then, should we care about a priori sample size determination in qualitative studies?"
Moreover, even where such a need exists, there is also a need to ensure that the whole issue of
sample size does not assume a disproportionate prominence.

120
Create groups that average 5-10 people each. In addition, consider the number of focus groups you need
based on the "groupings" represented in the research question. For instance, when studying males and
females across three different age groupings, plan for six focus groups: one for each gender within
each of the three age groups.
Nor should it overshadow other essential elements within the process of qualitative data collection
and analysis relating to epistemological, methodological and practical issues (Baker and Edwards,
2012). It is no wonder that Morse (2000) posits that the more usable data are collected from each
person, the fewer participants are needed. She invites researchers to take into account parameters
such as the scope of the study, the nature of the topic (i.e. its complexity and accessibility),
the quality of data, and the study design.

While rules of thumb like Morse's (1994) and Creswell's (2013) have been taken as bibles providing
researchers with concrete numbers, Emmel (2013) and a number of other researchers have cautioned
against reliance on these suggested sizes and urged researchers to consider additional factors,
such as the demand for saturation in some studies. Some authors' cautions regarding sample sizes
based on these rules of thumb are worth noting. The argument is that researchers who follow them
determine their sample sizes without claiming a good fit between their research problem and the
composition of the sample; numerical justification by pre-set benchmarks is rarely sought in
qualitative research. Furthermore, it is of the essence of qualitative research that samples are
highly context-based, so universal numerical recommendations have weak explanatory power (Morse,
2000; Guetterman, 2015; Kindsiko and Poltimäe, 2019). Thus, meta-summaries and meta-syntheses are
needed to gather evidence-based data regarding suitable sample sizes.

Conceptual models
Some authors have used a rather more formal conceptual model, based upon specific
characteristics of the proposed study, such as its aim, its underlying theoretical framework, and
the type of analysis intended. Morse (2000), for example, argues that sample size will depend
upon: the scope of the research question (the broader the scope, the larger the sample size
needed); the nature of the topic (the more ‘obvious’, the smaller the sample size); the quality of
the data (the richer the data, the smaller the sample size); the study design (a longitudinal design
in which a group is the unit of analysis will require a smaller sample size than one in which there
is one interview per participant); and shadowed data (if interviews reveal something about others’
perspectives, in addition to the interviewee’s own, this may require a smaller sample size).
Use of Previous Research

121
Baker, S.E., Edwards, R. (2012).How many qualitative interviews is enough?: expert voices and early
career reflections on sampling and cases in qualitative research. National Centre for Research Methods
Review Paper.
122
Morse, J.M. (2000). Determining sample size. Qualitative Health Research 10(1):3–5.
123
Morse, J. M. (1994). Designing funded qualitative research. In Norman K. Denzin and Yvonna S. Lincoln
(Eds.), Handbook of qualitative research (pp. 220-235). Thousand Oaks, CA. Sage.
124
Emmel, N. (2013). Sampling and choosing cases in qualitative research: A realist approach. London. Sage.
125
Morse, J. M.(2000). Editorial. Qualitative Health Research. 10 (1): 3-5.
126
Kindsiko, E., and Poltimäe, H. (2019). The Poor and Embarrassing Cousin to the Gentrified Quantitative
Academics: What Determines the Sample Size in Qualitative Interview-Based Organization Studies? Forum
Qualitative Research. 20 (3): Art. 1.
127
Guetterman, T.C. (2015 ). Descriptions of Sampling Practices Within Five Approaches to Qualitative
Research in Education and the Health Sciences. Forum Qualitative Sozialforschung / Forum: Qualitative Social
Research. 16 (2) Art. 25.
128
Morse, J. M. (2000). Determining sample size. Qualitative Health Research, 10, 3–5.
One way to approach the topic of qualitative sampling is to focus on methods, examining the
sampling practices reported in recent studies of the same kind and taking these as a benchmark.
Saturation

By far the most widespread instrument for limiting sample size in qualitative research has been
the saturation point. Saturation has its origins in classical Glaserian and Straussian (1967)
grounded theory, where researchers empirically generate an idiographic theory. Saturation is
inextricably linked to theoretical sampling (Glaser and Strauss, 1967; Bowen, 2008; O'Reilly and
Parker, 2013; Fusch and Ness, 2015; Morse, 2015). Saturation is a product of grounded theory
methods, and according to the original grounded theory texts, data collection should continue
until there are no new discoveries (Glaser and Strauss, 1967). However, recent revisions of this
process have discussed how rarely data collection is an exhaustive process, and researchers should
instead rely on how well their data are able to create a sufficient theoretical account, or
'theoretical sufficiency' (Dey, 1999). Theoretical sampling is a process of data collection for
generating theory whereby the analyst jointly collects, codes and analyses data and decides what
data to collect next and where to find them, in order to develop a theory as it emerges (Glaser,
1978). The initial stage of data collection depends largely on a general subject or problem area,
based on the analyst's general perspective of the subject area. The initial decisions are not
based on a preconceived theoretical framework: the analyst enters the field with what Glaser
(1978) calls 'abstract wonderment'.

Theoretical sampling starts with data collection from people and/or data sources considered
relevant to answering the research question and the research objectives. As the first data
collected are analysed, the next subjects or data sources can be listed according to the specific
need to deepen knowledge or to fill gaps; it is possible to change the characteristics of
subjects, situations or events (Gomes et al., 2015; Dantas et al., 2009).

129
Glaser, B.G. and Strauss, A. L. (1967). The discovery of grounded theory: Strategies for qualitative
research. New York, NY: Aldine de Gruyter.
130
Bowen, G. A. (2008). Naturalistic inquiry and the saturation concept: A research note. Qualitative
Research, 8(1): 137-152.
131
Charmaz K. (2006). Constructing grounded theory: a practical guide through qualitative analysis. London:
Sage.
Bowen, G.A. (2008). Naturalistic inquiry and the saturation concept: a research note. Qualitative Research.
8(1):137–52. 48.
Morse, J.M. (2015). Data were saturated. Qualitative Health Research. 25(5):587–8. 49.
O’Reilly, M,and Parker, N. (2013). Unsatisfactory saturation’: a critical exploration of the notion of saturated
sample sizes in qualitative research. Qualitative Research. 3(2):190–197.
132
Fusch, P.I, and Ness, L.R. (2015). Are we there yet? Data saturation in qualitative research Qualitative
Report. 20(9):1408–1416.
133
Dey I. (1999). Grounding grounded theory. San Francisco, CA: Academic Press.
134
Glaser, B. G. (1978) Theoretical Sensitivity. California: Sociology Press.
135
Gomes IM, Hermann AP, Wolff LDG, Peres AM, Lacerda, MR. (2015). Grounded theory in nursing: an
integrative review. J Nurs UFPE On line [Internet]. 2015 [cited 2017 Apr 29];9(supl.1):466-74. Available
from: http://www.revista.ufpe.br/revistaenfermagem/index.php/revista/article/viewArticle/5380
Kenny M, Fourie R. (2015). Contrasting classic, straussian, and constructivist grounded theory:
methodological and philosophical conflicts. Qual Rep [Internet]. 2015 [cited 2017 Mar 10]; 20(8):1270-89.
Available from: http://nsuworks.nova.edu/tqr/vol20/iss8/9.
One strategy for achieving theoretical sampling is to sample groups composed of different
participants who nevertheless have relevant experiences in relation to the research phenomenon.

Dey (2007) cautions researchers not to confuse an "open mind with an empty head" (p. 176). Initial
ideas or hunches can benefit theoretical development by providing a point of departure and by
raising important preliminary questions (Walker and Myrick, 2006). Coyne (1997) has explained that
"the researcher must have some idea of where to sample, not necessarily what to sample for, or
where it will lead" (p. 625). In this sense, theoretical sampling may involve the purposeful
selection of an initial starting point before moving into theoretical sampling proper once data
analysis begins to yield theoretical concepts. The researcher therefore begins by collecting data
based on the demands of the research questions in a rather loose manner, using one or a few cases
to start with, and analyses data from this sample, comparing cases as much as possible. Based on
the emerging data being analysed, the researcher sorts the emergent phenomena into categories
(lower-level codes) and then core theoretical categories (higher-level codes), ensuring that
categories are saturated. Saturation is the determinant of whether to continue sampling or not.
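For readers who find this loop easier to see as pseudocode, here is a minimal schematic sketch in
Python. It is our illustration, not an algorithm given by Glaser and Strauss; the three callables
(collect_slice, code_and_compare, is_saturated) are hypothetical stand-ins for fieldwork, constant
comparison, and the saturation judgment.

# Schematic sketch of the theoretical sampling loop (illustrative only; the
# three callables are hypothetical stand-ins, not a published procedure).
def theoretical_sampling(collect_slice, code_and_compare, is_saturated):
    sample = []      # slices of data gathered so far
    categories = {}  # emerging categories and their properties
    while not is_saturated(categories):
        # the emerging theory directs where the next slice of data comes from
        slice_of_data = collect_slice(categories)
        sample.append(slice_of_data)
        # constant comparison: code the new slice against existing categories
        categories = code_and_compare(categories, slice_of_data)
    # the sample size is known only a posteriori, once saturation is reached
    return sample, categories

Note that nothing in the loop fixes the number of iterations in advance: the stopping rule is a
property of the emerging categories, which is exactly why grounded theorists hold that the sample
size cannot be set a priori.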
Beyond these initial decisions, it is impossible to anticipate where to go, or even the direction
in which sampling will proceed, until a preliminary theoretical framework emerges (Glaser and
Holton, 2004). It is pertinent to remember that the starting point is only that, and
the researcher should avoid formulating a preconceived conclusion that these initially sampled
characteristics will contribute to theoretical variation (Glaser, 1978). For example, to sample only
according to demographic characteristics is to deduce that they will be relevant to the emerging
theory (Glaser 1978; Morse, 1991). It is important to recognise that deductive logic does have a
legitimate place in classic grounded theory; themes emerge inductively from the data but in
following up these themes through further inquiry the researcher is essentially engaged in a
process of ‘deducing’ who else to sample or what to sample (Dey 2007). Glaser (1978) has referred
to this deductive logic as ‘conceptual elaboration’ whereby theoretical possibilities and probabilities
are deduced from the emerging theory. However, because points of departure such as
demographic characteristics have not emerged from the theory, they must be considered merely
another phenomenon awaiting a verdict as to its relevance. Indeed, descriptive data may be
elevated into abstract theory only by way of comparing theoretical categories and properties, not
mere demographic opposites (Hood, 2007). By saturating categories that seem to have the most
explanatory power and integrating these into and around a core category (the 'core variable', to
borrow from quantitative research), the grounded theorist is able to present the theoretical
essence of a substantive area and decide to stop enlisting new samples.

Dantas CC, Leite JL, Lima SBB, Stipp MAC. (2009). Grounded theory - conceptual and operational aspects:
a method possible to be applied in nursing research. Rev Latino Am Enfermagem [Internet]. 2009 [cited
2017 Mar 10];17(4):573-9. Available from: http://dx.doi.org/10.1590/S0104-11692009000400021.
136
Dey, I. (2007) Grounding Categories. in: Bryant, A., Charmaz, K. (Eds.). The Sage Handbook of Grounded
Theory (pp.167-190). London: Sage Pub. Ltd.
137
Walker, D. and Myrick, F. (2006) Grounded Theory: An Exploration of Process and Procedure, Qualitative
Health Research 16, 547-559.
138
Coyne, I. T. (1997) Sampling in Qualitative Research. Purposeful and Theoretical Sampling; Merging or
Clear Boundaries? Journal of Advanced Nursing, 26, 623-630.
139
Glaser, B. G. and Holton, J. (2004) Remodelling Grounded Theory. Forum: Qualitative Social Research,
5. Retrieved April 25, 2019, from http://www.qualitativeresearch.net/fqs-texte/2-04/2-04glaser-e-htm.
140
Hood, J. C. (2007) Orthodoxy vs. Power: The Defining Traits of Grounded Theory. in: Bryant, A., Charmaz,
K. (Eds.) The Sage Handbook of Grounded Theory (pp.151-164). London: Sage Pub. Ltd
Figure 23.1 below illustrates a core category used to guide theoretical sampling and saturation.
The figure shows how the core category is connected to slices of data grouped into categories,
which are treated as subordinate categories.
[Figure 23.1: Slices of reality and its properties. The diagram links a core category ('sick role
performance') to subordinate categories (Categories 1-4, among them 'barriers').]

It is for this reason that the researcher ought to be theoretically sensitive, so that a theory can
be conceptualised and formulated as it emerges from the data being collected (Glaser and Strauss,
1967). For the novice grounded theorist, the initial concern about where to start is often
accompanied by a similar concern regarding the decision to stop data collection. Given the
inductive nature of theory generation, it is understood that theoretical sampling, including the
point at which sampling will cease, is controlled throughout the study by the emerging theory.
This happens as the researcher writes memos, decides whether to add further slices of data, and
constantly tracks back and forth checking subordinate conceptual categories and their properties,
welding them together through theoretical sampling and saturation, as shown in figure 23.2.

[Figure 23.2: Memoing and constant comparison with sampling. The diagram shows theoretical
sampling by way of ideational constructs and extant theory such as 'hunches' bringing in slices of
data; memoing and the development of categories; additional slices of data collected determining
the sample size; and constant tracking back and forth, checking subordinate conceptual categories
and their properties and welding them into a story line that can be taken as an organised whole
theory made up of core categories or a core category.]
141
Glaser, B.G. and Strauss, A. L. (1967). The discovery of grounded theory: Strategies for qualitative
research. New York, NY: Aldine de Gruyter.
Therefore, sampling is discontinued once a point of saturation has been reached, whereby
categories and their properties are considered sufficiently dense and data collection no longer
generates new leads (Glaser and Strauss, 1967). Glaser (1992) has described this as the point at
which the researcher has reached the full extent of the data, and thus “sampling is over when the
study is over" (p. 107). As such, sample size in grounded theory cannot be determined a priori, as
it is contingent on the evolving theoretical categories that are being added in the form of slices
of data.

Information Power

Looking beyond the rules of thumb presented above, when considering sampling researchers need to
move past "how many?" to address the questions of "how?" and "why?". Malterud et al. (2016)
instead propose the concept of the 'information power' that a given sample holds as a pragmatic
guiding principle. They suggest that the size of a sample with sufficient information power
depends on (a) the aim of the study, (b) sample specificity, (c) use of established theory, (d)
quality of dialogue, and (e) analysis strategy. As Emmel (2013: 154) reminds us, 'it is not the
number of cases that matters; it is what you do with them that counts'. In other words, sample
size can be determined in relation to the information power that a given sample holds. This
information power is influenced by:

1) The aim of the study: the broader the aim, the greater the required sample size.
2) The specificity of the sample: the more specific the characteristics of the participants in
relation to the study aims, the smaller the sample size.
3) The theoretical background: the less developed the underlying theory, the greater the sample
size.
4) The quality of dialogue: the richer the dialogue in the interviews, the smaller the sample
size.
5) The analysis strategy: a study aiming for an exploratory cross-case analysis will require a
larger sample size than one aiming for in-depth analysis of a few informants.
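As a mnemonic only (not a procedure offered by Malterud et al.), the sketch below encodes the
direction in which each of the five dimensions pushes the required sample size. The function and
its argument names are our own hypothetical labels.

# Toy mnemonic for the five information power dimensions (illustrative only;
# not a scoring procedure from Malterud et al., 2016).
def dimensions_pushing_larger(broad_aim, specific_sample, established_theory,
                              rich_dialogue, cross_case_analysis):
    """Return the dimensions that push toward a larger sample."""
    pressures = {
        "aim": broad_aim,                    # broader aim -> larger sample
        "specificity": not specific_sample,  # less specific sample -> larger
        "theory": not established_theory,    # weaker theory -> larger
        "dialogue": not rich_dialogue,       # thinner dialogue -> larger
        "analysis": cross_case_analysis,     # cross-case analysis -> larger
    }
    return [name for name, pushes_up in pressures.items() if pushes_up]

# A broad-aim, cross-case study with a specific sample, established theory and
# rich dialogue still has two dimensions pushing the sample size upward:
print(dimensions_pushing_larger(True, True, True, True, True))  # ['aim', 'analysis']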
Statistical Calculation of Size
A more recent, and particularly spirited, discussion of these issues centred on a paper by Fugard
and Potts (2015), in which a statistical calculation of sample size for qualitative research is
proposed.

142
Malterud, K., Siersma, V. D., and Guassora, A. D. (2016). Sample size in qualitative interview studies:
Guided by information power. Qualitative Health Research. 26: 1753–1760.
143
Emmel, N. (2013). Sampling and choosing cases in qualitative research: A realist approach. London:
Sage.
144
Malterud, K., Siersma, V. D., and Guassora, A. D. (2016). Sample size in qualitative interview studies:
Guided by information power. Qualitative Health Research. 26: 1753–1760.
145
Although these authors indicate that their model applies to the planning of a study, it is not solely focussed
on the prior determination of sample size; they note that the adequacy of the sample size should be
continuously reassessed during a study.
146
Fugard, A.J.B., and Potts, H.W.W. (2015a). Supporting thinking on sample sizes for thematic analyses:
a quantitative tool. International Journal of Social Research Methodology, 18, 669– 684. doi:
10.1080/13645579.2015.1005453
Fugard and Potts (2015) present tables based on a binomial distribution to show the minimum number
of participants needed in order to detect, with a stated level of confidence (e.g. 80%), a given
number of instances of a theme with an assumed prevalence in the population of interest.
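The logic can be sketched in a few lines of code. The function below is our reading of the
binomial reasoning Fugard and Potts describe, not their published tool: it searches for the
smallest n such that the probability of observing a theme at least k times reaches the stated
confidence, given an assumed prevalence.

# Minimal sketch of the binomial logic described by Fugard and Potts (2015):
# find the smallest n with P(X >= instances) >= confidence, where
# X ~ Binomial(n, prevalence). Illustrative only, not their published tool.
from math import comb

def min_sample_size(prevalence: float, instances: int, confidence: float) -> int:
    n = instances
    while True:
        # probability of seeing the theme fewer than `instances` times in n participants
        p_too_few = sum(
            comb(n, x) * prevalence**x * (1 - prevalence) ** (n - x)
            for x in range(instances)
        )
        if 1 - p_too_few >= confidence:
            return n
        n += 1

# To be 80% confident of observing at least one instance of a theme held by
# 10% of the population of interest:
print(min_sample_size(0.10, 1, 0.80))  # 16 participants

Much of the spirited discussion the paper provoked centred on whether themes can meaningfully be
treated as traits with a fixed prevalence at all, so such calculations are best read as one input
among many rather than as a determinant of sample size.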
Key takeaways 23

It is important to understand the different sampling methods used in social research.


We have just discovered that we can study one unit or all units. It is, however, more or
less impossible to study every single unit in a target population, so researchers select a
sample, or sub-group of the population, that is likely to render the sought answers to the
research questions. We must have strong theoretical reasons for our choice of units (or
cases) to be included in the sample.

Activity 23

1) In what five ways could you explain the importance of sampling in research?

2) What variants of sampling designs are appropriate in a qualitative as well as a
quantitative design?
3) Describe the theoretical basis of determining sample size for probability and
non-probability sampling types.
