BCB303/BIO301: Experimental Design, Research
Methods and Biostatistics 1
Dr. Md. Rezaul Karim
PhD(KULeuven & UHasselt), MS(Biostatistics), MS(Statistics)
Associate Professor, Department of Statistics
Jahangirnagar University (JU), Savar, Dhaka - 1342, Bangladesh
Mobile: 01912605556, Email: mrkarim5556sets@[Link]
Summer - 2023
1 These course slides should not be reproduced nor used by others (without
permission).
Dr. Md. Rezaul Karim (Associate Prof., Dept. of Statistics, JU) Statistics Summer - 2023 1 / 37
Lecture Outline I
1 Chapter 5: Design of Experiments
1.1 Problem & Motivation
1.2 Design of Experiment
1.3 Principles of Experimental Design
1.4 Completely Randomized Design (CRD)
Chapter 5: Design of Experiments
Problem & Motivation
Problem 1
To investigate the effect of different feeds on the weight gain of rats, six
litters of rats of the same strain were selected. Three rats were randomly
selected from each litter, and each was fed with feed A, B, or C. The
experimental conditions other than the feeds were kept consistent. After 4
weeks, the weight gain (g) of all rats was recorded (Table 10.1). Are there
differences in the rats' weight gain according to their feed? Are there
differences in the weight gain of rats in different litters? (Assume that the
rats' weights at the beginning of the experiment had no significant
difference and the weight gain follows a normal distribution.)
How to analyze these data? How to generate similar data from the infinite
population?
Another Example: CRD for the Chemitech Assembly Method
Experiment
let
▸ µ1 = mean number of units produced per week using method A
▸ µ2 = mean number of units produced per week using method B
▸ µ3 = mean number of units produced per week using method C
the hypothesis is
\[
\begin{cases}
H_0: \mu_1 = \mu_2 = \mu_3 \\
H_1: \text{not all population means are equal}
\end{cases}
\]
how can we test whether more than two population means are equal?
analysis of variance (ANOVA) is the statistical procedure used to
determine whether the observed differences in the three sample means
are large enough to reject H0
the t test methodology generalizes nicely in this case to a procedure
called the one-way analysis of variance (ANOVA)
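As a rough illustration of how the one-way ANOVA test statistic is formed, the sketch below computes the F ratio (between-treatment mean square divided by within-treatment mean square) in plain Python. The function name `one_way_anova` and the three samples are assumptions made for this example; the numbers are illustrative values and are not read from the slides.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples (one list per treatment)."""
    k = len(groups)                      # number of treatments
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n

    # Between-treatment sum of squares: spread of the treatment means
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-treatment (error) sum of squares: spread inside each treatment
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Illustrative weekly output for three assembly methods (assumed values)
method_a = [58, 64, 55, 66, 67]
method_b = [58, 69, 71, 64, 68]
method_c = [48, 57, 59, 47, 49]

f_stat, df1, df2 = one_way_anova([method_a, method_b, method_c])
print(f"F = {f_stat:.2f} on ({df1}, {df2}) degrees of freedom")
```

H0 is rejected when F exceeds the tabulated critical value for the chosen significance level; for (2, 12) degrees of freedom, standard F tables give a 5% critical value of about 3.89.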
Observational and Experimental Study
In an observational study, data are usually obtained through sample
surveys and not a controlled experiment
in an experimental statistical study, an experiment is conducted to
generate the data
an experiment begins with identifying a variable of interest and then
one or more other variables, thought to be related, are identified and
controlled, and data are collected about how those variables influence
the variable of interest
for instance, in a study of the relationship between smoking and lung
cancer the researcher cannot assign a smoking habit to subjects
the researcher is restricted to simply observing the effects of
smoking on people who already smoke and the effects of not smoking
on people who do not already smoke
Design of experiment (DoE)
1 experiment
▸ a scientific way of getting an answer to a question which the
experimenter wants to know
2 design of experiment means how to design an experiment, in the sense
of how the observations or measurements should be obtained to
answer a query in a valid, efficient and economical way
3 data collection from non-existing or infinite population
4 is a systematic method to determine the relationship between factors
affecting a process and the output of that process
5 is used to find cause-and-effect relationships
6 one of the main objectives of designing an experiment is how to verify
the hypothesis in an efficient and economical way
1 so the main question is how to obtain the data such that the
statistical assumptions are met and the data is readily available for the
application of tools like analysis of variance
2 the designing of such a mechanism to obtain such data is achieved by
the design of the experiment
experimental unit
▸ for conducting an experiment, the experimental material is divided into
smaller parts and each part is referred to as an experimental unit
▸ the experimental unit is randomly assigned to a treatment
▸ the phrase "randomly assigned" is very important in this definition
treatment
▸ the different objects or procedures which are to be compared in an
experiment are called treatments
replication
▸ it is the repetition of the experimental situation by replicating the
experimental unit
experimental error
▸ the unexplained random part of the variation in any experiment is
termed as experimental error
▸ an estimate of experimental error can be obtained by replication
Example
suppose some varieties of fish food are to be investigated on some
species of fish
the food is placed in the water tanks containing the fish
the response is the increase in the weight of the fish
the experimental unit is the tank, as the treatment is applied to the
tank, not to the fish
note that if the experimenter had taken each fish in hand and placed
the food in its mouth, then the fish would have been the
experimental unit, as long as each fish got an independent scoop
of food
factor
▸ a factor is a variable defining a categorization
treatments and levels
▸ a factor can be fixed or random in nature
∎ fixed: specific treatment levels are selected and are of interest
∎ random: individual levels are randomly selected from a population
▸ a factor is termed a fixed factor if all the levels of interest are
included in the experiment
▸ a factor is termed a random factor if not all the levels of interest are
included in the experiment and those that are can be considered to be
randomly chosen from all the levels of interest
Fixed and Random Factors
Outputs/Results
response variable called yield
effect of factor(s)
▸ main effects
▸ interaction
▸ nonlinear
Principles of experimental design (Fisher's principles)
1 randomization
2 replication
3 local control (e.g., blocking)
Some Popular Experimental Designs
1 Completely Randomized Design (CRD)
2 Randomized Block Design (RBD)
3 Latin Square Design (LSD)
4 Factorial Design
5 Split-plot Design
6 Incomplete Block Design
7 ...
Completely Randomized Design (CRD)
all experimental units are considered the same and no division or
grouping among them exists
a design in which the selected treatments are allocated or distributed
to the experimental units completely at random
this is the simplest design involving the principles of replication and
randomization without local control
the number of replications for different treatments need not be equal
and may vary from treatment to treatment depending on the
knowledge (if any) on the variability of the observations on individual
treatments as well as on the accuracy required for the estimate of
individual treatment effect
Layout of CRD
The following steps are needed to design a CRD
divide the entire experimental material or area into a number of
experimental units, say n
fix the number of replications for different treatments in advance (for
a given total number of available experimental units)
no local control measure is provided as such except that the error
variance can be reduced by choosing a homogeneous set of
experimental units
Procedure
let the k treatments be numbered 1, 2, ..., k and let $n_j$ be the number of
replications required for the j th treatment, such that $\sum_{j=1}^{k} n_j = n$
select $n_1$ units out of the n units randomly and apply treatment 1 to these
$n_1$ units
select $n_2$ units out of the remaining $(n - n_1)$ units randomly and apply treatment 2
to these $n_2$ units
continue with this procedure until all the treatments have been utilized
generally, an equal number of experimental units is allocated to each
treatment unless practical limitations dictate otherwise or some
treatments are more variable and/or of more interest
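The allocation procedure above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `crd_allocation`, the unit labels 1..n, the treatment labels and the replication numbers are all hypothetical choices for illustration.

```python
import random


def crd_allocation(replications, seed=None):
    """Allocate treatments to units 1..n completely at random.

    `replications` maps each treatment label to its number of replications n_j;
    the total number of units is n = sum of all n_j.
    """
    rng = random.Random(seed)
    n = sum(replications.values())
    remaining = list(range(1, n + 1))        # units not yet assigned
    allocation = {}
    for treatment, n_j in replications.items():
        chosen = rng.sample(remaining, n_j)  # draw n_j of the remaining units
        for unit in chosen:
            allocation[unit] = treatment
            remaining.remove(unit)
    return allocation


# Hypothetical design: three treatments with unequal replication (n = 12)
plan = crd_allocation({"A": 4, "B": 3, "C": 5}, seed=2023)
for unit in sorted(plan):
    print(unit, plan[unit])
```

Fixing the seed only makes the layout reproducible for record keeping; any seed gives a valid completely randomized layout.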
Layout of CRD/Layout of One-Way ANOVA
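the data set is arranged as follows ($y_{ij}$ is the i th observation under treatment j):

                      Treatment
                      1           2           ...   j           ...   k
                      y_{11}      y_{12}      ...   y_{1j}      ...   y_{1k}
                      y_{21}      y_{22}      ...   y_{2j}      ...   y_{2k}
                      ...         ...         ...   ...         ...   ...
                      y_{n_1 1}   y_{n_2 2}   ...   y_{n_j j}   ...   y_{n_k k}
Mean                  ȳ_{.1}      ȳ_{.2}      ...   ȳ_{.j}      ...   ȳ_{.k}
Standard Deviation    s_1         s_2         ...   s_j         ...   s_k
Sample Size           n_1         n_2         ...   n_j         ...   n_k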
Hypothesis for CRD
there is only one factor affecting the outcome, through the treatment
effect
$y_{ij}$: individual measurement on the i th experimental unit for the j th treatment,
$j = 1, 2, \ldots, k$, $i = 1, 2, \ldots, n_j$
$\mu_j$: mean response under the j th treatment
$\mu$: overall mean
the statistical hypothesis is
\[
\begin{cases}
H_0: \mu_1 = \mu_2 = \cdots = \mu_k \\
H_1: \text{not all } \mu_j \text{ are equal}
\end{cases}
\]
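One way to see how this hypothesis is tested is through the usual one-way ANOVA decomposition of the total variation. The block below is a sketch under the standard normal-error assumptions; the symbols $\bar{y}_{.j}$ (mean of treatment j), $\bar{y}_{..}$ (grand mean) and $n = \sum_j n_j$ are notation assumed here for illustration.

```latex
\begin{align*}
\underbrace{\sum_{j=1}^{k}\sum_{i=1}^{n_j} (y_{ij}-\bar{y}_{..})^2}_{\text{SST}}
  &= \underbrace{\sum_{j=1}^{k} n_j\,(\bar{y}_{.j}-\bar{y}_{..})^2}_{\text{SSTr}}
   + \underbrace{\sum_{j=1}^{k}\sum_{i=1}^{n_j} (y_{ij}-\bar{y}_{.j})^2}_{\text{SSE}}, \\[4pt]
F &= \frac{\text{SSTr}/(k-1)}{\text{SSE}/(n-k)} \sim F_{k-1,\;n-k}
   \quad \text{under } H_0 .
\end{align*}
```

A large F means the treatment means differ by more than the within-treatment scatter would explain, which is evidence against $H_0$.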
Example
Problem 2
Pulmonary Disease: A topic of public-health interest is whether passive
smoking (exposure among nonsmokers to cigarette smoke in the
atmosphere) has a measurable effect on pulmonary health. White and
Froeb studied this question by measuring pulmonary function in several
ways in the following six groups: (i) Non-smokers (NS) (ii) Passive smokers
(PS) (iii) Non-inhaling smokers (NI) (iv) Light smokers (LS) (v) Moderate
smokers (MS) and (vi) Heavy smokers (HS)
A principal measure used by White and Froeb to assess pulmonary function
was forced mid-expiratory flow (FEF). They were interested in comparing
mean FEF among the six groups. How can the means of these six groups
be compared?
How to analyze these data?
Advantages of CRD
CRD is the basic and the simplest design
any treatment can be replicated any number of times, i.e. there is complete
flexibility regarding replication
error variance is small when the experimental units are homogeneous, as CRD assumes
analysis of data is quite simple and straightforward even if different
treatments have unequal numbers of replications
the analysis remains easy even if some observations are missing; in fact, the
property of orthogonality is not lost by missing values in a CRD
Disadvantages of CRD
relatively inefficient design, as no local control method is adopted to
reduce error variation in this design
it is seldom used in field experiments because homogeneous units over
the whole experimental area are rarely available in practice
due to inflated error variation there is a greater chance of wrongly
accepting the null hypothesis
the precision of CRD is lower than that of other designs
Application or uses of CRD
most useful in laboratory techniques and methodological studies,
e.g. in physics, chemistry, in chemical and biological experiments, in
some greenhouse studies etc.
conveniently used in situations having homogeneous experimental units
suitable in situations where a large fraction of experimental units may
not respond or may be lost in the course of the experiment
advantageous for small experiments because it furnishes the maximum
number of error degrees of freedom
Randomized Block Design (RBD)
a design in which the whole set of experimental units is arranged in
several blocks which are internally homogeneous and externally
heterogeneous, and then the selected treatments are randomly
allocated to the experimental units within each block such that each
treatment occurs once or the same number of times in each block
example
▸ to determine how a new type of short wave UVA-blocking sunscreen
affects the general health of skin in comparison to a regular long wave
UVA-blocking sunscreen, 40 trial participants were randomly separated
into equal groups of 20: an experimental group and a control group
▸ all participants' skin health was then initially evaluated. The
experimental group wore the short wave UVA-blocking sunscreen daily,
and the control group wore the long wave UVA-blocking sunscreen daily
▸ after one year, the general health of the skin was measured in both
groups and statistically analyzed. In the control group, wearing long
wave UVA-blocking sunscreen daily led to improvements in general skin
health for 60% of the participants
▸ in the experimental group, wearing short wave UVA-blocking sunscreen
daily led to improvements in general skin health for 75% of the
participants.
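The within-block randomization in the RBD definition above can be sketched as follows. This is a minimal sketch assuming each block holds exactly one unit per treatment; the function name `rbd_allocation` and the litter and rat labels (which echo Problem 1) are hypothetical.

```python
import random


def rbd_allocation(blocks, treatments, seed=None):
    """Within every block, assign each treatment to exactly one unit at random.

    `blocks` maps a block label to the list of its experimental units.
    """
    rng = random.Random(seed)
    plan = {}
    for block, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError("each block needs exactly one unit per treatment")
        order = list(treatments)
        rng.shuffle(order)                  # fresh random order for this block
        plan[block] = dict(zip(units, order))
    return plan


# Hypothetical layout: six litters (blocks) of three rats, feeds A, B and C
blocks = {f"litter{i}": [f"rat{i}.{r}" for r in (1, 2, 3)] for i in range(1, 7)}
plan = rbd_allocation(blocks, ["A", "B", "C"], seed=1)
for block, assignment in plan.items():
    print(block, assignment)
```

The randomization is restricted to each block separately, which is what distinguishes this layout from a CRD, where treatments are shuffled over all units at once.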
the RBD model with a single observation per cell can be written as
\[
y_{ij} = \mu + \alpha_j + \beta_i + \epsilon_{ij}, \qquad j = 1, \ldots, k;\; i = 1, \ldots, b \tag{1}
\]
assumptions for the ANOVA model (1)
▸ all observations of random error $\epsilon_{ij}$ are independent
▸ all the effects are additive in nature
▸ $\alpha_j$ is the fixed effect of the j th treatment
▸ $\beta_i$ is the fixed effect of the i th block
▸ $\epsilon_{ij}$ are independent and identically distributed following $N(0, \sigma^2)$
▸ the restrictions for the estimation are $\sum_{j=1}^{k} \alpha_j = 0$ and $\sum_{i=1}^{b} \beta_i = 0$
$y_{ij}$: independently distributed following $N(\mu + \alpha_j + \beta_i, \sigma^2)$ with
$\sum_{j=1}^{k} n_j \alpha_j = 0$, where $n_j = b$ for every treatment in this model, so the
condition coincides with $\sum_{j=1}^{k} \alpha_j = 0$
Hypothesis for the RBD model
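the RBD model with a single observation per cell can be written as
\[
y_{ij} = \mu + \alpha_j + \beta_i + \epsilon_{ij}, \qquad j = 1, \ldots, k;\; i = 1, \ldots, b \tag{2}
\]
for the model (2) the statistical hypothesis for the treatment effect is
\[
\begin{cases}
H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_k = 0 \\
H_1: \text{not all } \alpha_j \text{ are zero}
\end{cases}
\]
for testing the block effect the hypothesis is
\[
\begin{cases}
H_0: \beta_1 = \beta_2 = \cdots = \beta_b = 0 \\
H_1: \text{not all } \beta_i \text{ are zero}
\end{cases}
\]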
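Both hypotheses can be tested from one sum-of-squares decomposition. The sketch below assumes a complete table with one observation per cell, as in the model above; the function name `rbd_anova` and the weight-gain numbers are hypothetical and serve only to show the arithmetic.

```python
def rbd_anova(table):
    """F ratios for an RBD; table[i][j] = response in block i under treatment j."""
    b, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (b * k)
    block_means = [sum(row) / k for row in table]
    treat_means = [sum(table[i][j] for i in range(b)) / b for j in range(k)]

    ss_total = sum((y - grand) ** 2 for row in table for y in row)
    ss_treat = b * sum((m - grand) ** 2 for m in treat_means)
    ss_block = k * sum((m - grand) ** 2 for m in block_means)
    ss_error = ss_total - ss_treat - ss_block   # what is left after both effects

    df_treat, df_block = k - 1, b - 1
    df_error = df_treat * df_block
    ms_error = ss_error / df_error
    return {
        "F_treatment": (ss_treat / df_treat) / ms_error,
        "F_block": (ss_block / df_block) / ms_error,
        "df": (df_treat, df_block, df_error),
    }


# Hypothetical weight gains (g): rows = 4 litters (blocks), columns = feeds A, B, C
gains = [
    [10, 12, 17],
    [12, 15, 18],
    [14, 15, 22],
    [16, 18, 23],
]
result = rbd_anova(gains)
print(result)
```

Each F ratio is compared with the F distribution on its own numerator degrees of freedom and the shared error degrees of freedom (k − 1)(b − 1).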
How to analyze these data?
Advantages of RBD
RBD is more efficient than CRD and thus provides more accurate and
precise results than CRD
any number of blocks and any number of treatments can be used in
RBD, except for the restriction that at least two replicates are needed to
carry out the test of significance
analysis of data is simple and straightforward in RBD
RBD provides a method of eliminating or reducing the effects of trends
it is flexible, readily adaptable and easy to analyze
Disadvantages of RBD
RBD is not suitable for a large number of treatments, as large error
variation may arise in such a case and when the blocks are internally
heterogeneous
the property of orthogonality is lost by missing values in an RBD, leading to
complicated analysis of data
RBD has fewer error degrees of freedom than a comparable CRD
since RBD controls variability due to one extraneous factor, it is
unsatisfactory when several extraneous factors exist among the
experimental units
the efficiency of RBD decreases as the block size increases
Reasons for Blocking
one major reason for use of blocks is to make inferences over a large
number of environmental conditions
another major reason is to reduce error variation by removing an
unwanted source of variation from error variation
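The second reason can be illustrated numerically: the block sum of squares is exactly the part that blocking takes out of the error. The sketch below compares the error mean square when blocks are ignored with the one obtained after removing the block effect. The function name `error_mean_squares` and the data are hypothetical, and analysing blocked data as if it came from a CRD is done here purely for comparison.

```python
def error_mean_squares(table):
    """Return (MSE ignoring blocks, MSE with blocks removed).

    table[i][j] = response in block i under treatment j, one observation per cell.
    """
    b, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (b * k)
    treat_means = [sum(row[j] for row in table) / b for j in range(k)]
    block_means = [sum(row) / k for row in table]

    # Error SS when only treatments are fitted (one-way, CRD-style analysis)
    sse_ignoring_blocks = sum(
        (row[j] - treat_means[j]) ** 2 for row in table for j in range(k)
    )
    # Variation between blocks, which blocking removes from the error
    ss_block = k * sum((m - grand) ** 2 for m in block_means)
    sse_with_blocks = sse_ignoring_blocks - ss_block

    mse_ignoring = sse_ignoring_blocks / (b * k - k)      # error df = n - k
    mse_blocked = sse_with_blocks / ((b - 1) * (k - 1))   # error df = (b-1)(k-1)
    return mse_ignoring, mse_blocked


# Hypothetical responses: rows = 4 blocks, columns = 3 treatments
gains = [
    [10, 12, 17],
    [12, 15, 18],
    [14, 15, 22],
    [16, 18, 23],
]
mse_crd, mse_rbd = error_mean_squares(gains)
print(f"MSE ignoring blocks: {mse_crd:.3f}, MSE with blocks: {mse_rbd:.3f}")
```

Blocking costs b − 1 error degrees of freedom, so it pays off only when the blocks account for a real share of the variation, as they do in these made-up numbers.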
Uses of Blocking
RBD removes one extraneous source of variation from experimental
error and so increases precision. Thus RBD is used to increase
precision.
its use is found to be satisfactory in many experimental situations and
thus it avoids the necessity of using more complex designs.
RBD provides unbiased estimates of block means in addition to those of
treatment means and thus furnishes additional information from the
experiment. It is not necessary that all blocks be conducted at the
same location or at the same time.
In what situations should we apply RBD?
CRD is appropriate only for experiments having homogeneous
experimental units and a small number of treatments. In experiments
having heterogeneous experimental units and a large number of
treatments, a randomized block design is appropriate. It is used especially
to control the heterogeneity and variability among the experimental
units, with blocks as the local control measure. If CRD is used in
situations having heterogeneous experimental units, then the variation
in yields from different experimental units inflates the experimental
error and the design is no longer appropriate.