🟪 1. Introduction (5 marks)
Sample size estimation is a scientific process used in research to
determine the minimum number of participants or units required to
achieve accurate, reliable, and valid results while maintaining
statistical power and minimizing error.
🔹 It ensures:
Efficiency of resources
Valid statistical inference
Reduced Type I & Type II errors
✏️ Definition (APA Style):
“Sample size estimation refers to the process of calculating the minimum
number of observations required to detect an effect of a given size with a
given degree of confidence.” — (Polit & Beck, 2012)
🟪 2. Why Is Sample Size Estimation Important? (5 marks)
Use the mnemonic: "VAPES"
V – Validity of results
A – Avoids under/over-sampling
P – Power maintained
E – Ethical responsibility
S – Saves time and resources
🟪 3. Key Concepts Before Estimation (10 marks)
Understanding these is crucial before estimating sample size:
Concept | Description
Population size (N) | Total number of individuals in the target group
Confidence level (Z) | Degree of certainty (commonly 95% or 99%)
Margin of error (E) | Precision of the estimate (e.g., ±5%)
Standard deviation (σ) | Variability within the population
Power (1−β) | Ability to detect an effect if it exists (usually 80% or 90%)
Effect size (d) | Magnitude of difference you're trying to detect
🧠 Memory trick: “ZEDS-P” — Z-score, Error, Deviation, Sample Size, Power
🟪 4. Sample Size Estimation Process (20 marks)
Break into 6 major steps using the mnemonic "DEEPER":
✅ D – Define research objective
Type of study (e.g., prevalence, experimental, correlational)
One-tailed or two-tailed test?
✅ E – Estimate effect size (d)
Use previous research or pilot studies
Cohen’s d conventions:
o 0.2 (small), 0.5 (medium), 0.8 (large)
✅ E – Estimate standard deviation (σ)
Use known data or conduct pilot study
✅ P – Power level (1–β)
Usually 0.80 (80%) or higher
Power tables (e.g., Cohen’s Power Primer) or software (G*Power)
✅ E – Error margin and confidence level
Confidence level: Z = 1.96 (95%), Z = 2.58 (99%)
Margin of error: ±5% is a common default
✅ R – Run the calculation
Use statistical formula or software
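The Z values quoted in the confidence-level step can be recovered from the standard normal distribution. A minimal sketch in Python, assuming a two-sided interval; the helper name z_for_confidence is illustrative only:

```python
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Two-sided critical z-value for a confidence level such as 0.95."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1 - confidence
    # Split alpha across both tails, then invert the standard normal CDF.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.95, 0.99):
    print(f"{level:.0%} confidence -> Z = {z_for_confidence(level):.2f}")
```

Rounded to two decimals, this reproduces the 1.96 and 2.58 used in these notes.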
🟪 5. Common Formulae (10 marks)
For Proportions (Cochran’s formula):
$$n = \frac{Z^2 \cdot p(1-p)}{e^2}$$
Where:
n = required sample size
Z = z-score (e.g., 1.96 for 95%)
p = estimated proportion (often 0.5 if unknown)
e = margin of error
For Means:
$$n = \left(\frac{Z \cdot \sigma}{E}\right)^2$$
σ = population SD
E = margin of error
🧠 Memory trick: “Zp over e or Zσ over E squared!”
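Both formulae are easy to check by computer. The sketch below is one possible Python version, not a standard library routine; the function names and the example values (σ = 15, E = 3) are assumptions for illustration, and results are rounded up because a fraction of a participant cannot be recruited:

```python
import math
from statistics import NormalDist


def z_value(confidence: float = 0.95) -> float:
    """Two-sided z-score for the chosen confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def n_for_proportion(p: float = 0.5, e: float = 0.05, confidence: float = 0.95) -> int:
    """Cochran's formula: n = Z^2 * p(1 - p) / e^2, rounded up."""
    z = z_value(confidence)
    return math.ceil(z**2 * p * (1 - p) / e**2)


def n_for_mean(sigma: float, E: float, confidence: float = 0.95) -> int:
    """Sample size for a mean: n = (Z * sigma / E)^2, rounded up."""
    z = z_value(confidence)
    return math.ceil((z * sigma / E) ** 2)


print(n_for_proportion())            # p unknown, so 0.5; e = 0.05; 95% confidence
print(n_for_mean(sigma=15, E=3))     # hypothetical SD and margin of error
```

Using p = 0.5 gives the largest possible n for a given e, which is why it is the safe default when no estimate is available.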
🟪 6. Adjustments
Finite population correction if N is small:
$$n_{adj} = \frac{n}{1 + \frac{n - 1}{N}}$$
Non-response rate or dropout rate:
$$n_{final} = \frac{n}{1 - \text{dropout rate}}$$
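The two adjustments can be chained: correct for a small population first, then inflate for expected dropout. A hedged Python sketch, with illustrative function names and made-up inputs (n = 385, N = 2000, 10% dropout), and with the dropout rate written as a proportion:

```python
import math


def finite_population_correction(n: float, N: int) -> int:
    """Apply n_adj = n / (1 + (n - 1) / N) and round up."""
    return math.ceil(n / (1 + (n - 1) / N))


def inflate_for_dropout(n: float, dropout_rate: float) -> int:
    """Apply n_final = n / (1 - dropout_rate) and round up."""
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be a proportion in [0, 1)")
    return math.ceil(n / (1 - dropout_rate))


n_adj = finite_population_correction(385, N=2000)
n_final = inflate_for_dropout(n_adj, dropout_rate=0.10)
print(n_adj, n_final)
```

With these inputs the corrected size drops from 385 to 323, and recruiting 359 allows for the expected 10% loss.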
🟪 7. Tools & Software
G*Power
Epi Info
WHO Sample Size Calculators
SPSS/Excel plugins
🟪 8. Conclusion
Sample size estimation is the backbone of any valid research design. A poor
estimate leads to Type I and Type II errors, invalid generalizations, and
ethical issues. Tools, theory, and pilot data make this process scientific and
replicable.
✅ QUESTION 2 (50 marks): What is Sample Size? What Are the Ways
to Estimate It?
🟪 1. Introduction: What is Sample Size? (5 marks)
Sample size is the number of observations or participants selected from
the population to participate in a study.
📘 According to Siegel & Castellan (1988):
“Sample size refers to the number of units selected from a population for
inclusion in a study.”
🟪 2. Why is Sample Size Important? (5 marks)
Ensures representativeness
Maintains power
Reduces bias and error
Allows valid inferences to population
🟪 3. Factors Affecting Sample Size (10 marks)
Use the mnemonic: “ZEPSI”
Factor | Description
Z – Confidence level | Z-score related to desired confidence
E – Margin of error | Precision required
P – Population variability | Standard deviation or proportion
S – Sample type | Probability vs non-probability
I – Intended power | Higher power = bigger sample
🟪 4. Methods to Estimate Sample Size (20 marks)
A. Using Standard Formulae
As explained in Q1 (for mean/proportion)
B. Using Pilot Study
Conduct a mini-study (n = 30–50)
Estimate variance, effect size
C. Using Previous Studies
Borrow SD or proportion (p) estimates from published research
D. Using Rules of Thumb
For regression: n ≥ 50 + 8m (m = number of predictors)
For factor analysis: 5–10 cases per variable
E. Using Power Analysis
With software like G*Power
Choose:
o Effect size
o Alpha (α)
o Power (1–β)
o Type of test (t-test, ANOVA, etc.)
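To see how the four choices interact, here is a rough Python sketch for one case only: a two-tailed comparison of two independent group means with effect size given as Cohen's d. It uses a normal approximation, so it is not the computation G*Power performs; dedicated software works with the t distribution and may report a slightly larger n. The function name is an assumption for illustration.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate n per group for a two-tailed, two-sample comparison of means."""
    if effect_size == 0:
        raise ValueError("effect_size must be non-zero")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # significance level, two-tailed
    z_beta = std_normal.inv_cdf(power)           # desired power (1 - beta)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Cohen's conventional small, medium, and large effects
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} participants per group")
```

The pattern to remember: smaller effects, stricter alpha, and higher power all push the required sample size up.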
🟪 5. Sample Size Estimation for Different Designs (5 marks)
Study Type | Suggested Method
Cross-sectional | Cochran’s formula
Clinical trial | Power analysis with expected effect size
Qualitative | Thematic saturation (10–30 interviews typical)
Correlational | Based on expected r values
🟪 6. Sample Size in Qualitative Research (5 marks)
📌 Not calculated statistically; based on saturation.
Phenomenology: 5–10
Grounded theory: 20–30
Thematic analysis: ~15–25
Case study: 1–10
🧠 Trick: “Small but deep” — qualitative = fewer participants, richer insights
🟪 7. Conclusion
Sample size and its estimation are central to the scientific rigor of
research. Various quantitative formulae, software tools, and pilot
methods help researchers make this choice appropriately. In qualitative
designs, saturation is the guiding principle rather than numbers.
✅ UNIT X OVERVIEW: SAMPLE SIZE ESTIMATION
This unit focuses on how we determine the number of participants
needed in various statistical situations.
📚 POSSIBLE THEORY QUESTIONS
🟪 1. What is Sample Size? Why is it Important? (5–10 marks)
Key Points to Cover:
Definition: Sample size = number of observations in a study
Importance:
o Reduces bias
o Increases accuracy
o Ensures reliability
o Avoids under/over-sampling
Linked with power, error margin, and confidence level
🟪 2. Factors Affecting Sample Size Estimation (10 marks)
List & explain:
1. Confidence Level (Z-value) – typically 95% (Z = 1.96)
2. Margin of Error (e) – smaller error = larger sample
3. Standard Deviation (σ) – more variability = larger sample
4. Effect Size – smaller effect = larger sample
5. Power of the study – usually 80% or 90%
6. Expected Proportion (p) – used in categorical data
7. Dropout rate – add 10–20% to the sample
🟪 3. Steps/Process of Sample Size Estimation (10–20 marks)
🪷 Step-by-step (Use this to write structured answers):
1. Define the research objective (mean/proportion/comparison)
2. Choose the acceptable margin of error (e)
3. Select desired confidence level (Z-score)
4. Estimate population standard deviation (σ) or proportion (p)
5. Decide the statistical power
6. Use the appropriate formula or software
7. Adjust for non-response or dropout
🧠 Mnemonic: “R-E-C-A-P-S-A”
(Research – Error – Confidence – Assumption – Power – Sample size –
Adjustment)
🟪 4. Cochran’s Formula for Estimating a Proportion (5–10 marks)
$$n = \frac{Z^2 \cdot p(1 - p)}{e^2}$$
Used when estimating proportions (e.g., % of smokers)
Use p = 0.5 if no estimate is available
Z = 1.96 for 95% confidence
e = margin of error (e.g., 0.05)
🧠 Memory Trick: "Z squared p one minus p over e squared"
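Worked example for Cochran's formula, using the default values listed here (Z = 1.96, p = 0.5, e = 0.05); these inputs are illustrative, not from a real study:

```latex
n = \frac{(1.96)^2 \cdot 0.5\,(1 - 0.5)}{(0.05)^2}
  = \frac{3.8416 \times 0.25}{0.0025}
  = 384.16
  \;\Rightarrow\; n = 385 \quad \text{(always round up)}
```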
🟪 5. Formula for Estimating Sample Size for Mean (5–10 marks)
$$n = \left(\frac{Z \cdot \sigma}{E}\right)^2$$
For quantitative data (e.g., mean blood pressure)
σ = population SD
E = margin of error
🧠 Trick: “Z sigma over E squared”
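Worked example for the mean formula, assuming a hypothetical blood pressure study with σ = 15 mmHg, a desired margin of error E = 3 mmHg, and 95% confidence (Z = 1.96):

```latex
n = \left(\frac{1.96 \times 15}{3}\right)^2
  = (9.8)^2
  = 96.04
  \;\Rightarrow\; n = 97 \quad \text{(round up)}
```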
🟪 6. Estimating Sample Size for Comparing Two Means (20–25 marks)
💡 When? → To test if two groups differ in average (e.g., CBT vs no CBT)
Formula:
$$n = \frac{2(Z_{\alpha/2} + Z_\beta)^2 \cdot \sigma^2}{(\mu_1 - \mu_2)^2}$$
Where:
Symbol | Meaning
Zα/2 | Z-score for significance level
Zβ | Z-score for power
σ² | Pooled variance
μ₁ − μ₂ | Difference between means (effect size)
🧠 Trick: "Double Z squared sigma over difference squared!"
📌 Mention assumptions: normality, equal variances, independent samples.
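Worked example for two means, with assumed values: α = 0.05 two-tailed (Zα/2 = 1.96), 80% power (Zβ = 0.84), pooled SD σ = 10, and an expected difference μ₁ − μ₂ = 5. The result is the number needed in each group:

```latex
n = \frac{2\,(1.96 + 0.84)^2 \cdot 10^2}{5^2}
  = \frac{2 \times 7.84 \times 100}{25}
  = 62.72
  \;\Rightarrow\; n = 63 \text{ per group (126 in total)}
```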
🟪 7. Estimating Sample Size for Comparing Two Proportions (20–25
marks)
💡 When? → E.g., % of depression in males vs females
Formula:
$$n = \frac{(Z_{\alpha/2} + Z_\beta)^2 \cdot [p_1(1 - p_1) + p_2(1 - p_2)]}{(p_1 - p_2)^2}$$
🧠 Trick: “Z combo squared times p1q1 plus p2q2 over difference
squared”
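The two-proportion formula can be sketched in Python as below. This is an illustration under the same normal approximation as the formula; the function name and the example proportions (30% vs 20%) are assumptions, and the result is the size of each group:

```python
import math
from statistics import NormalDist


def n_per_group_two_proportions(p1: float, p2: float,
                                alpha: float = 0.05, power: float = 0.80) -> int:
    """n per group = (Z_a/2 + Z_b)^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2."""
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-tailed significance level
    z_beta = std_normal.inv_cdf(power)           # power (1 - beta)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)


print(n_per_group_two_proportions(0.30, 0.20))
```

Note how sensitive n is to the gap between p1 and p2: halving the difference roughly quadruples the required sample.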
🟪 8. Short Note: Software Tools for Estimation (5 marks)
G*Power
OpenEpi
Raosoft calculator
EpiInfo
Advantages: easy to use for complex tests, graphical interfaces
🟪 9. Short Note: Thumb Rules vs Formula-Based Estimation (5
marks)
Thumb Rules | Formula-Based
General rule of 30–50 per group | Precise, based on math
Quick and easy | More accurate
Not suitable for complex analysis | Best for research
🟪 10. Short Note: Adjusting for Dropout/Attrition (5 marks)
Add 10–20% more participants
Especially for longitudinal studies
Prevents underpowering due to loss
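Worked example, assuming a calculated n of 385 and an expected dropout of 15% (both values hypothetical), using the dropout formula n_final = n / (1 − dropout rate):

```latex
n_{final} = \frac{385}{1 - 0.15}
          = \frac{385}{0.85}
          \approx 452.94
          \;\Rightarrow\; n_{final} = 453

\text{Check: } 453 \times 0.85 \approx 385 \text{ expected completers}
```

Simply adding 15% (385 × 1.15 ≈ 443) would leave about 377 completers, slightly short of the target, so dividing by (1 − dropout rate) is the safer habit.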