Enfermería Intensiva 33 (2022) 44–47
[Link]/ei
SPECIAL ARTICLE: EDUCATION
Sampling techniques and sample size calculation: How and how many participants should I select for my research?☆
Técnicas de muestreo y cálculo del tamaño muestral: Cómo y cuántos
participantes debo seleccionar para mi investigación
O. Arrogante (RN, MSc, PhD)
Fundación San Juan de Dios, Centro de Ciencias de la Salud San Rafael, Universidad Nebrija, Madrid, Spain
E-mail address: oarrogan@[Link]
DOI of original article: [Link]03.004
☆ Please cite this article as: Arrogante O. Técnicas de muestreo y cálculo del tamaño muestral: Cómo y cuántos participantes debo seleccionar para mi investigación. Enferm Intensiva. 2022;33:44–47.
2529-9840/© 2021 Sociedad Española de Enfermería Intensiva y Unidades Coronarias (SEEIUC). Published by Elsevier España, S.L.U. All rights reserved.

The way the sample is selected and its size is calculated is crucial for the generalisability of the results of our research. A poor choice of sampling technique and/or a miscalculation of the sample size can mean that the results apply only to the participants included in our study. Since we cannot study the entire target population, as it is practically inaccessible, we must select a sample that allows us to infer, extrapolate and generalise our results to the reference population (which is more accessible and is defined by the researcher's inclusion and exclusion criteria). This sample must be representative of that population for the results of our study to have external validity and, furthermore, it must be of an adequate size: large enough to ensure that it represents the reference population, and small enough to facilitate its analysis.1–5 Therefore, the representativeness of our sample will be conditioned by the sampling technique used to select it (probability or non-probability) and by the size of the sample.

Probabilistic sampling techniques

The participants selected using these techniques have a known, non-zero probability of being included in the sample. In this way, these techniques avoid possible researcher bias in sample selection, so the sample selected tends to be more representative of the reference population. Another advantage is that they allow the application of statistical techniques capable of quantifying the random error, that is, the error due to chance that we make in selecting the sample.3–5 However, chance itself may cause the distribution of the variable in our sample to differ from that in the reference population.6
Probability sampling techniques are divided into:3–5
• Simple random sampling: participants are selected randomly using random number tables or software (freely available on the internet), so everyone has the same probability of being selected. In addition to being the quickest and easiest method, as only randomness is involved, more representative samples are achieved. However, it requires
listing the entire reference population, so it is rarely used unless the reference population is small.
• Stratified random sampling: this is a variant of the previous technique that is used when the variable we wish to study is not distributed homogeneously within the reference population but differs between mutually exclusive groups or strata. The aim is to ensure that the sample reproduces the distribution of that variable in the reference population. It is recommended that these strata be determined according to some confounding variable that may influence the results. Subsequently, a random sample is selected from each stratum.
• Systematic random sampling: in this technique, the first participant is chosen randomly, and the following participants are selected by adding a previously defined sampling constant (k) until the sample size is reached.
• Multi-stage sampling: when the reference population is very large or dispersed and a complete list of it is not available, it is convenient to select sampling units from the reference population in a first stage (primary units) and, in subsequent stages, to select samples from each previously selected unit (secondary units). As many stages as necessary can be used, and a different probability sampling technique (simple, stratified, systematic) can be applied at each stage. If all secondary units are included in the sample, it is known as cluster sampling. Therefore, although we do not have a list of the entire reference population, we can have a list of its groups or clusters.

In general, we will choose a probability sampling technique when the reference population is sufficiently accessible and well differentiated before starting our study.3–5 But once we have opted for probability sampling, which technique should we choose? If the reference population is very large, dispersed and grouped by some characteristic, we will choose a multi-stage sampling technique. If this is not the case, and we are interested in controlling the distribution of some confounding variable, stratified sampling is more suitable. Within it, if we decide to include all groups or clusters of the reference population, we will choose cluster sampling. However, if we are not interested in controlling for any confounding variable, and the reference population is small and adequately enumerated in a list, it is best to choose a simple random or systematic sampling technique.3

Non-probabilistic sampling techniques

If, on the other hand, the reference population is not easily accessible and is not sufficiently differentiated, it is most convenient to use non-probability sampling techniques.3–5 In these techniques, the probability of each participant being included in the sample is unknown, participants are selected by methods that do not involve chance, and random error cannot be calculated.3–5 Participants are therefore selected largely on the basis of the researcher's judgement, on the assumption that the samples selected are free of bias and representative of the reference population.3–5
The most common non-probabilistic techniques are:3–5
• Consecutive sampling: this is the most commonly used technique, especially in clinical trials. It consists of selecting the participants who meet our selection criteria during the recruitment period of the study. It is usually used to recruit patients who come to the clinic and are diagnosed or admitted within a certain time period.
• Convenience, accidental or chance sampling: in this case, participants are selected because they are easily accessible to the researcher or because they wish to participate voluntarily. In this way, the researcher chooses participants based on their availability (proximity, friendship, etc.). It is recommended only when the distribution of the variable under study is sufficiently homogeneous within the reference population, as there is a high risk that the sample will be biased.
• Purposive or intentional sampling: here the researcher selects the participants that he/she believes can contribute the most to the study. This ensures that important participants are not missed, as could happen with a random or convenience technique. This technique is mainly used in qualitative studies or when a sample of experts is to be selected.
• Quota sampling: firstly, the composition of the reference population is determined according to a characteristic or variable (frequently sex or age) and, subsequently, the quota or number of participants who must meet that characteristic or variable is set. The aim is to recruit the number of participants needed to complete each of the quotas.
• Avalanche, snowballing or chain sampling: this technique is particularly useful and efficient when participants are difficult to reach. It is more practical than convenience sampling and is mainly used in qualitative studies. It consists of selecting a participant who meets the selection criteria and who is asked to inform the researcher about other participants, and so on until a sufficient sample is obtained.
• Theoretical sampling: this technique is mainly used in qualitative studies whose theoretical framework is based on grounded theory. Participants are selected gradually in order to capture all possible meanings and so develop a theory.7

Sample size

By calculating the sample size, we aim to define the approximate number of participants that need to be included in the sample for it to be representative of the reference population.3,4 If we include an insufficient number of subjects, we run the risk of not finding significant differences when they do in fact exist (type II error, or β). On the other hand, if we include too many participants, we will be wasting time and resources in our research.1,2,8,9
It should be noted that it is generally not necessary to calculate the sample size in qualitative studies, since the main aim is to achieve information saturation, which occurs when the information collected becomes redundant and the study participants provide no new information.10
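As a rough illustration of the trade-off described above between too few and too many participants, the sketch below approximates the power (one minus the probability of a type II error) of a two-sided z-test comparing the means of two groups, for several group sizes. The function name `approx_power`, the normal approximation with a known common standard deviation, and the example values (a true difference of 5 units, a standard deviation of 10, a 5% significance level) are assumptions chosen for illustration and are not taken from the article; a real study should rely on dedicated sample size software or statistical advice.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group, difference, sd, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two means.

    Assumes two independent groups of equal size, a known common standard
    deviation and normally distributed sample means (illustrative only).
    """
    if n_per_group <= 0 or sd <= 0:
        raise ValueError("n_per_group and sd must be positive")
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the chosen alpha
    se = sd * sqrt(2 / n_per_group)             # standard error of the difference in means
    shift = abs(difference) / se                # true difference expressed in SE units
    # Probability of falling in either rejection region when the difference is real.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical scenario: true difference of 5 units, standard deviation of 10.
    for n in (10, 30, 63, 200):
        power = approx_power(n, difference=5, sd=10)
        print(f"n per group = {n:>3}: power ~ {power:.2f}, type II error ~ {1 - power:.2f}")
```

Under these assumed inputs, small groups leave a high probability of missing a real difference, whereas very large groups add little power in return for the extra recruitment effort.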
In quantitative studies, however, it is necessary to perform this calculation very carefully, as the design of the study will depend on it (e.g., whether the sample recruitment period needs to be extended to achieve the calculated size).3,4 A number of standard error formulas are used for this purpose, which can be cumbersome and depend on the statistical test to be used in the study. Fortunately, there are freely available tables and software that facilitate the calculation from the estimated parameters. Some of these epidemiological calculators are available online (such as GRANMO or [Link]), others are free software that can be downloaded to personal computers (such as Epidat or G*Power), and others are even applications for mobile devices (n4Studies).
Depending on the objective of our research, the sample size can be calculated for one of two purposes:3,4
• Estimating population parameters: from the values collected in the sample, researchers aim to estimate the value of a parameter in the reference population. These parameters are statistically inferred and may be proportions (e.g., the proportion or percentage of critically ill patients presenting with a given complication) or means (e.g., the mean of a physiological variable collected in critically ill patients). To estimate these parameters, investigators must determine the following values:
  o The variability of the estimated parameter: this is usually unknown, so the researcher must approximate it by carrying out a pilot study or by taking data from previous research.
  o The precision of the estimate: this corresponds to the width of the confidence interval (CI). The narrower the interval, the greater the precision (i) and the larger the sample size required.
  o The confidence level or statistical significance of the estimate: as a minimum, and as a rule, it is set at 95% (α = 0.05). The higher the confidence level (Z) we want, the lower the value of α will be, so a larger sample will be needed.
To calculate the sample size in these cases, only the variability of the parameter under investigation needs to be known, as both the precision and the confidence level are set by the researcher according to his/her own interests.
• Hypothesis testing: researchers aim to evaluate the results obtained in terms of previously established hypotheses (e.g., to assess which of two nursing interventions or types of care is more effective in critically ill patients). This type of sample size calculation is therefore applied mainly in clinical trials. For this purpose, researchers can compare whether the proportions or means obtained differ according to the intervention applied. In this case, researchers need to determine the following values:
  o Direction of the alternative hypothesis (unilateral or bilateral): in general, it is recommended that the hypothesis be bilateral, as it is more conservative.
  o Accepted risk of committing a type I error (α): i.e., of rejecting the null hypothesis when it should not have been rejected because it is true in the population. Generally, a risk of 5% (α = 0.05) is accepted.
  o Accepted risk of committing a type II error (β): i.e., of not rejecting the null hypothesis when it should have been rejected because it is false in the population. Generally, it is set between 5% and 20%. However, it is easier to make this decision based on statistical power (1 − β), since accepting an error of 20% implies that our study has an 80% chance of detecting the difference if it really exists.
  o Magnitude of the expected difference, effect or association: the estimate of what we expect to obtain in our research should be realistic and based on previously conducted studies.
  o Variability of the response variable in the reference population: this should be approximated on the basis of existing literature and previous research.
Of these five values, only the last one needs to be known in order to calculate the sample size, as all the others are set by the researcher according to his/her own interests.

Loss-adjusted sample size

Finally, all of the above calculations should be extended to include possible losses that may occur during the conduct of our research. This ensures that the study will end with the calculated sample. For this purpose, the expected proportion of losses (R) is defined and the formula Na = N [1/(1 − R)] is applied, where N is the theoretical number of participants without losses and Na is the adjusted number of participants.3

Financing

The author has no source of funding to declare.

Conflict of interests

The author has no conflict of interests to declare.

References

1. Altman DG. Statistics and ethics in medical research: III How large a sample? Br Med J. 1980;281:1336–8, [Link]
2. Kraemer HC, Thiemann S. How many subjects? Statistical power analysis in research. Newbury Park, CA, USA: Sage Publications; 1987.
3. Argimon Pallás JM, Jiménez Villa P. Métodos de investigación clínica y epidemiológica. 5a ed. Barcelona, España: Elsevier; 2019.
4. Grove S, Gray J. Investigación en enfermería: desarrollo de la práctica enfermera basada en la evidencia. 7a ed. Barcelona, España: Elsevier; 2019.
5. León OG, Montero I. Métodos de investigación en psicología y educación: las tradiciones cuantitativa y cualitativa. 4a ed. Madrid, España: McGraw-Hill Interamericana; 2020.
6. Odgaard-Jensen J, Vist GE, Timmer A, Kunz R, Akl EA, Schünemann H, et al. Randomisation to protect against selection bias in healthcare trials. Cochrane Database Syst Rev. 2011;2011:MR000012, [Link]
7. Strauss A, Corbin J. Basics of qualitative research: techniques and procedures for developing grounded theory. 2nd ed. Thousand Oaks, CA: Sage Publications; 1998.
8. Bland JM. The tyranny of power: is there a better way to calculate sample size? BMJ. 2009;339:b3985, [Link]
9. Wittes J. Sample size calculations for randomized controlled trials. Epidemiol Rev. 2002;24:39–53, [Link]
10. Sandelowski M. Sample size in qualitative research. Res Nurs Health. 1995;18:179–83, [Link]