Intro To Stat Module 1
RESOURCES
Module Number: 1
OWNERSHIP
(COPYRIGHTS)
June, 2016
Table of contents
Module Overview
Unit 1: Descriptive Statistics
1.0 Introduction
1.1 Unit Objectives
1.2 Basic statistical terms
1.3 Summary of data using tables
1.4 Graphical summary of data
1.4.1 Histogram
1.4.2 Frequency polygon
1.4.3 Ogive
1.4.4 Stem and leaf diagram
1.4.5 Bar graph
1.4.6 Pie chart
1.5 Numerical summaries of data
1.5.1 Measures of central tendency
1.5.1.1 The mode
1.5.1.2 The median
1.5.1.3 The mean
1.5.2 Measures of location
1.5.2.1 Quartiles
1.5.2.2 Percentiles
1.5.2.3 Deciles
1.5.3 Measures of data variability
1.5.3.1 The range
1.5.3.3 Variance and standard deviation
1.5.3.4 Standard Deviation (s, sd)
1.5.4 The Five Number Summary
1.7 Reflection
Unit summary
2.0 Introduction
2.1 Unit Objectives
2.2 Scatter diagrams and correlations
2.3 Pearson and Spearman rank correlation coefficient
2.3.1 Pearson correlation coefficient
2.3.2 Spearman’s rank correlation coefficient
2.4 Linear regression
2.6 Reflection
Unit Summary
Unit 3: Elementary probability theory
3.0 Introduction
3.1 Unit Objectives
3.2 Basic probability terms
3.3 Basic probability rules
3.4 Intersection and union of events
3.5 Conditional Probabilities
3.6 Law of total probability and Bayes rule
3.6.1 Law of total probability
3.8 Reflection
Summary
4.0 Introduction
4.1 Unit Objectives
4.2 Random variable and probability distribution
4.3 Discrete probability distributions
4.3.1 Properties of discrete distributions
4.3.2 Mean and variance of a discrete probability distribution
4.3.3 Examples of discrete random variables and their distributions
4.3.3.1 Discrete uniform random variable
4.3.3.2 Binomial random variable
4.3.3.3 Bernoulli random variable and probability distribution
4.3.3.4 Poisson random variable and distribution
4.3.3.5 Multinomial random variable
4.3.3.6 Hypergeometric random variable
4.4 Property of continuous random variable and distribution
4.5 Some continuous random variables and distributions
4.5.1 Normal random variable and distribution
4.5.2 Standard Normal Distribution
4.5.2.1 Using standard normal to find probability of normal random variable
4.5.3 The student t-distribution
4.5.4 The Chi-square (χ²) random variable and distribution
4.5.5 The F random variable and distribution
4.8 Sampling Distributions
4.8.1 Distribution of sample mean (X̄)
4.8.2 The sampling distribution of (X̄ - μ)/(s/√n)
4.9 Reflection
Summary
Unit 5: Inferential Statistics
5.0 Introduction
5.1 Unit Objectives
5.2 Hypothesis Testing
5.3 Hypothesis test about population mean
5.3.1 Hypothesis about one population mean
5.3.2 Hypothesis about difference between two population means
5.4 Testing hypothesis about population proportion
5.4.1 Hypothesis about single population proportion
5.4.2 Hypothesis about the difference between two population proportions
5.5 Testing for equality of multiple populations means using F-test
5.6 Testing for association in contingency tables using the Pearson chi-square
5.7 Point and interval estimation of population mean and proportion
5.8 Reflection
Unit Summary
References
Module Overview
The aim of this module is to introduce you to organizing and summarizing data. It also
introduces the basics of correlation and regression, and the introductory probability
theory that forms the basis for statistical inference. The module has been written to
equip you with basic statistical knowledge and skills so that you can organize and
summarize data and form a first impression of the data. The module will also give you
basic statistical inference skills, particularly in hypothesis testing and interval
estimation: you will be able to set up a research hypothesis, select an appropriate
sample statistic (e.g. the sample mean) and then draw a conclusion about the research
hypothesis. The knowledge and skills learnt in the module are a foundation for later
courses such as experimental designs and research project courses, and for real-life
situations, particularly research projects. In a nutshell, this module will enable
you to:
Unit 1: Descriptive Statistics
In this unit you will learn introductory issues regarding descriptive statistics. First
you will learn how to organize and present data using frequency tables and graphs. You
will also learn to describe data using numerical summaries such as measures of the
centre and spread. Apart from measures of the centre, you will look at measures of
location that are not necessarily measures of the centre, such as quartiles and
percentiles.
Unit 2
In this unit you will learn about assessing correlation using scatter plots. You will
then learn how to compute and interpret the Pearson and Spearman correlation
coefficients as numerical measures of correlation. Finally, you will learn about the
simple linear regression model as a means of investigating a linear relationship between
two variables.
Unit 3: Elementary Probability Theory
In this unit you will learn elementary probability theory. You will specifically learn
elementary probability rules such as the addition and multiplication rules. This will
be followed by conditional probability, the law of total probability and Bayes' rule.
What you learn in this unit will help you to work out basic probability problems and
will give you a foundation for Bayesian inference, which stems from Bayes' theorem.
Unit 4
The unit continues from the previous unit by looking at random variables and their
probability distributions. In the unit you will learn about two types of random
variables: discrete and continuous. For discrete random variables, you will look at the
uniform, Poisson, binomial and hypergeometric, and for continuous random variables you
will learn about the normal, t, chi-square and F random variables. Finally, you will
learn about the sampling distribution of the sample mean and related quantities. What
is presented in this unit is important for the statistical inference procedures
presented in unit 5.
Unit 5: Inferential Statistics
This unit introduces you to statistical inference in terms of hypothesis testing and
interval estimation. In the unit you will learn about hypothesis tests about the mean
and the proportion. You will further learn about hypothesis tests about more than two
means using the F test in ANOVA. You will also learn about hypothesis tests of
independence or association in a cross table. Finally, you will learn about interval
estimation of the population mean and proportion.
Method of Assessment
This module is divided into a number of units. Each unit addresses some of the learning
outcomes. In each unit there are a number of concepts to be learnt and corresponding
activities to help you internalize the concepts discussed therein. The activities vary
in nature to address the three levels of Bloom's learning taxonomy. The module details
will help you to succeed in all your lesson tasks.
All the unit tasks, which are self-reflection, self-assessment, group work or a unit
test, will account for 40% of your final grade. As part of continuous assessment in each
unit, and to promote your learning, you will also do activities throughout this module
that will help you prepare for your major assignment or for the final examination.
Depending on the way your course lecturer wishes to assess you, you might be given a
written examination set by the university. The module test at the end will account for
60% of your final grade.
This module gives you a unit-by-unit guide to the course you are studying. Each unit
includes information, case studies, activities, self-help questions and readings to
help you comprehend the unit. These are all designed to help you achieve the learning
outcomes that are stated at the beginning of the module.
The activities, self-help questions and case studies are part of the course activities.
They help make your learning more active, enjoyable and effective. The tasks will help
you to engage with the concepts being discussed and to check your own understanding.
Where you are not sure, work with your peers or, if possible, consult the course
lecturer.
Please note that the activities may be reflective exercises designed to get you thinking
about the issues raised in the unit. They may be practical tasks to undertake on your
own or with fellow students. The self-help questions are usually more specific and
require a brief written response. The answers are given at the end of each unit;
however, the answers are not definitive, and you are free to think of your own. If you
wish, you can also record your answers to the self-help questions in your learning
journal, or you may use a separate notebook.
Unit 1:
Descriptive Statistics
1.0 Introduction
In this unit you will learn about descriptive statistics, that is, how to organize and
summarize data. Specifically, you will learn to organize and summarize data using tables
and graphs, and to summarize data using numerical summaries such as measures of centre,
location and dispersion. The unit is important because it provides you with preliminary
data analysis tools to use before carrying out inferential statistics.
Key terms
• Frequency distribution
• Measure of centre
• Measure of location
• Measure of dispersion
standard deviation which describe population variation.
• A statistic is a numerical description of a sample characteristic. Examples
include the sample mean, which describes the sample centre, and the sample standard
deviation, which describes how sample values vary.
• Descriptive statistics is the branch of statistics that involves the organization,
summarization, and display of data.
• Inferential statistics is the branch of statistics that involves using a sample to
draw conclusions about a population. Some of the problems addressed in
inferential statistics are those of estimation and hypothesis testing.
• Data types: There are two types of data
o Qualitative and quantitative
o Qualitative data is non-numeric data and is usually treated as categorical
data. This means the observed data values can be put into categories. Examples of
qualitative/categorical data include gender (male/female) and college
(poly/chanco/bunda).
Activity 1.1
State the type of the following data
a) Fertilizer type
b) Fish weight
A frequency table is a list of possible values and their frequencies.
It is used to summarize categorical data so that the frequency distribution across the
categories can be seen.
Example
Frequency distribution table for political affiliation
Political     Frequency   Relative          Cumulative   Cumulative relative
affiliation               frequency         frequency    frequency
PP            70          70/300 = 0.23     70           0.23
AFORD         30          30/300 = 0.10     100          0.33
DPP           100         100/300 = 0.33    200          0.67
UDF           20          20/300 = 0.07     220          0.73
MCP           80          80/300 = 0.27     300          1.00
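The columns of such a table can be computed mechanically. The following is a minimal sketch in Python, using the frequencies from the political affiliation example; the variable names are illustrative only. Cumulative relative frequencies are obtained by dividing the cumulative frequency by n, rather than by adding up rounded relative frequencies, so rounding errors do not build up.

```python
from itertools import accumulate

# Frequencies taken from the political affiliation example.
freq = {"PP": 70, "AFORD": 30, "DPP": 100, "UDF": 20, "MCP": 80}

n = sum(freq.values())                      # total number of observations
cum_freq = list(accumulate(freq.values()))  # running totals of the frequencies

rows = []
for (party, f), cf in zip(freq.items(), cum_freq):
    # Each row: category, frequency, relative frequency,
    # cumulative frequency, cumulative relative frequency.
    rows.append((party, f, round(f / n, 2), cf, round(cf / n, 2)))

for row in rows:
    print(*row, sep="\t")
```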
If you have continuous data and want a frequency table, put the data into groups or
categories.
Example
Group the internet use data below into a grouped frequency table.
7, 7, 11, 17, 17, 18, 19, 20, 21, 22, 23, 28, 29, 29, 30, 30, 31, 31, 33, 34, 36, 37,
39, 39, 39, 40, 41, 41, 42, 44, 44, 46, 50, 51, 53, 54, 54, 56, 56, 56, 59, 62, 67, 69,
72, 73, 77, 78, 80, 88
Step 2: Find the class width by dividing the range by the number of classes and rounding
up. With the seven classes used here, class width = (88 - 7)/7 = 11.6, rounded up to 12.
Step 3: Find the class boundaries by starting from the lowest observation and repeatedly
adding the class width to obtain each upper limit. This gives boundaries of 7, 19, 31,
43, 55, 67, 79 and 91.
NOTE: The classes run from 7 to 19, 19 to 31, and so on, where the upper bound is
exclusive, i.e. the first class does not include 19.
Class      Freq   Rel freq         Cum freq   Cum rel freq
7 - 19     6      6/50 = 0.12      6          0.12
19 - 31    10     10/50 = 0.20     16         0.32
31 - 43    13     0.26             29         0.58
43 - 55    8      0.16             37         0.74
55 - 67    5      0.10             42         0.84
67 - 79    6      0.12             48         0.96
79 - 91    2      0.04             50         1.00
Step 4: For each class, count the number of observations that fall in that class. This
number is called the class frequency.
Step 5: The relative frequency of a class is calculated by f/n, where f is the frequency
of the class and n is the number of observations in the data set.
Step 6: Find the cumulative frequency and the cumulative relative frequency.
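The grouping steps can be sketched in Python as follows. This is an illustration rather than a prescribed method: the choice of 7 classes is an assumption read off the table for the internet use data, and the names `num_classes`, `bounds` and `freqs` are made up for the example.

```python
import math
from itertools import accumulate

data = [7, 7, 11, 17, 17, 18, 19, 20, 21, 22, 23, 28, 29, 29, 30, 30, 31, 31,
        33, 34, 36, 37, 39, 39, 39, 40, 41, 41, 42, 44, 44, 46, 50, 51, 53,
        54, 54, 56, 56, 56, 59, 62, 67, 69, 72, 73, 77, 78, 80, 88]

num_classes = 7                      # assumption: seven classes, as in the table
low, high = min(data), max(data)

# Class width: range divided by number of classes, rounded up.
width = math.ceil((high - low) / num_classes)

# Class boundaries: start at the lowest value and keep adding the width.
bounds = [low + i * width for i in range(num_classes + 1)]

# Class frequency: count values with lower <= x < upper (upper bound exclusive).
freqs = [sum(lo <= x < hi for x in data) for lo, hi in zip(bounds, bounds[1:])]

n = len(data)
rel_freqs = [round(f / n, 2) for f in freqs]   # relative frequency f/n
cum_freqs = list(accumulate(freqs))            # cumulative frequency

for lo, hi, f, rf, cf in zip(bounds, bounds[1:], freqs, rel_freqs, cum_freqs):
    print(f"{lo} - {hi}\t{f}\t{rf}\t{cf}\t{round(cf / n, 2)}")
```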
Activity 1.2
A sample of the variable x assumes the following values:
1 2 6 7 12 13 2 6 9 5
18 7 3 15 15 4 17 1 14 5
4 16 4 5 8 6 5 18 5 2
Generate a frequency distribution indicating the groups of x and the frequency of x.
1.4 Graphical summary of data
1.4.1 Histogram
A histogram displays the frequency distribution of a quantitative variable by showing
the frequency (count) or percent of the values that are in various classes.
The classes are typically intervals of numbers that cover the full range of the variable.
Histograms are used to assess the distribution of the quantitative variable.
To construct a histogram, group the data into a frequency table, i.e. put the data into
classes and determine the frequency for each class.
Then plot class frequency against class intervals to obtain the histogram.
In a right-skewed distribution most of the data lie to the left, and in a left-skewed
distribution most of the data lie to the right.
(c) Bimodal
The bimodal distribution looks like the back of a two-humped camel.
[Figure: histogram with Frequency on the vertical axis and Temperature on the horizontal axis]
Fig 1.5: Bimodal shape of histogram
A bimodal distribution suggests that the data may come from two distinct populations.
1.4.2 Frequency polygon
A frequency polygon is a line graph obtained by plotting class frequency against class
interval, usually at the class midpoints. It shows the frequency distribution of the
classes as a line graph. For example, the frequency polygon for the internet use data is
as shown below:
1.4.3 Ogive
An ogive is a plot of cumulative frequency against the class intervals, usually at the
upper class boundaries. The diagram below shows an ogive for the internet use data.
1.4.4 Stem and leaf diagram
A stem and leaf diagram displays the frequency distribution of quantitative data by
showing the actual data values rather than frequencies.
Example
Draw a stem and leaf diagram for the students' scores:
35, 36, 38, 40, 42, 42, 44, 45, 45, 47, 48, 49, 50, 50, 50.
Solution
Range: 35 to 50
A stem-and-leaf plot shows the shape and distribution of data. It can be clearly seen in
the diagram that the data cluster around the row with a stem of 4. The distribution is
roughly normal (i.e. most data are at the centre and a few towards the ends).
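A stem and leaf diagram for the student scores can be built with a short Python sketch like the one below. It assumes two-digit whole-number data, so that the tens digit is the stem and the units digit is the leaf; the function name `stem_and_leaf` is made up for this illustration.

```python
from collections import defaultdict

scores = [35, 36, 38, 40, 42, 42, 44, 45, 45, 47, 48, 49, 50, 50, 50]

def stem_and_leaf(values):
    """Group sorted values by stem (tens digit); leaves are the units digits."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(scores)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
# 3 | 5 6 8
# 4 | 0 2 2 4 5 5 7 8 9
# 5 | 0 0 0
```

The longest row is the one with stem 4, which matches the clustering described in the text.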
Example
A teacher asked 10 of her students how many books they had read in the last 12 months.
Their answers were as follows:
Prepare a stem and leaf plot for these data. Tip: The number 6 can be written as 06,
which means that it has a stem of 0 and a leaf of 6. The stem and leaf plot should look
like this:
Using the data from table above, we make the ordered stem and leaf plot shown below:
Example
The weights (to the nearest tenth of a kilogram) of 30 students were measured and
recorded as follows:
59.2, 61.5, 62.3, 61.4, 60.9, 59.8, 60.5, 59.0, 61.1, 60.7, 61.6, 56.3, 61.9, 65.7, 60.4,
58.9, 59.0, 61.2, 62.1, 61.4, 58.4, 60.8, 60.2, 62.7, 60.0, 59.3, 61.9, 61.7, 58.4, 62.2
Prepare an ordered stem and leaf plot for the data. Briefly comment on what the
analysis shows.
Solution
In this case, the stems will be the whole number values and the leaves will be the
decimal values. The data range from 56.3 to 65.7, so the stems should start at 56 and
finish at 65
1.4.5 Bar graph
A bar graph shows the frequency distribution of a categorical variable graphically.
Example
Draw the bar graph for the data below:
1.4.6 Pie chart
A pie chart is an alternative to the bar graph.
Example
Draw the pie chart for the data in example above.
A pie chart is a circle with slices proportional to the relative frequency of each
category. It is made by representing the relative frequency of each category by an
angle of the circle, determined by:
Angle = relative frequency x 360º = (f/n) x 360º
Example
A shop sells different sizes of gloves. The table shows the percentage of gloves sold in
a year that were each size.
As the values are percentages, the total must be 100% (but check to make sure).
Note: Sometimes rounding the angles leads to a total angle of more or less than 360º.
If this happens, adjust the angle of the largest sector so that the total is correct,
e.g. if the total comes to 361º, take 1º away from the largest angle.
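The angle calculation and the rounding adjustment in the note can be sketched in Python. The example reuses the political affiliation frequencies from the frequency table example, and the function name `pie_angles` is hypothetical.

```python
def pie_angles(freq):
    """Return whole-degree sector angles that add up to exactly 360."""
    n = sum(freq.values())
    # Angle of each sector = (f / n) * 360, rounded to the nearest degree.
    angles = {category: round(f * 360 / n) for category, f in freq.items()}
    # If rounding pushed the total away from 360, correct the largest sector.
    excess = sum(angles.values()) - 360
    if excess != 0:
        largest = max(angles, key=angles.get)
        angles[largest] -= excess
    return angles

angles = pie_angles({"PP": 70, "AFORD": 30, "DPP": 100, "UDF": 20, "MCP": 80})
print(angles)
```

For these frequencies no adjustment is needed, because every angle is already a whole number of degrees.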
Activity 1.3
Measures of the centre indicate where the centre, or the most typical value, of the
variable lies in the collected set of measurements. Measures of centre are often
referred to as averages.
Example
(a)
Student   Mid-term score
1         94
2         90
3         90
4         90
5         81
6         70
7         65
8         56
9         30
(b)
Solution
(a) The mode is 90. (b) The mode is the category "no anaemia".
Note that for a categorical variable the mode is the category with the highest
frequency, and for a quantitative (numeric) variable it is the value that occurs most
frequently.
If the number of observations is odd, then the sample median is the observed value
exactly in the middle of the ordered list. If the number of observations is even, then
the sample median is the number halfway between the two middle observed values in the
ordered list. In both cases, if we let n denote the number of observations in a data
set, then the sample median is at position (n+1)/2 in the ordered list.
Example
Seven participants in bike race had the following finishing times in minutes:
28,22,26,29,21,23,24. What is the median?
Solution
The median is the middle observation in the ordered list 21, 22, 23, 24, 26, 28, 29,
i.e. 24. This means that half of the observations are less than or equal to 24 and half
of the observations are greater than or equal to 24 (check).
Example
Eight participants in bike race had the following finishing times in minutes:
28,22,26,29,21,23,24,50. What is the median?
Solution
Ordered list: 21, 22, 23, 24, 26, 28, 29, 50. The median is the sum of the two middle
observed values divided by 2, i.e. (24 + 26)/2 = 50/2 = 25.
In both cases the median is at position (n+1)/2 in the ordered list (check).
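The position rule (n+1)/2 can be checked with a small Python sketch. The function name `median_by_position` is made up for this illustration; when the position is not a whole number, the two neighbouring observations are averaged.

```python
import math

def median_by_position(values):
    """Median located at 1-based position (n + 1) / 2 of the ordered list."""
    ordered = sorted(values)
    n = len(ordered)
    pos = (n + 1) / 2                       # e.g. 4.0 for n = 7, 4.5 for n = 8
    lower = ordered[math.floor(pos) - 1]    # observation at or just below pos
    upper = ordered[math.ceil(pos) - 1]     # observation at or just above pos
    return (lower + upper) / 2

print(median_by_position([28, 22, 26, 29, 21, 23, 24]))      # 24.0
print(median_by_position([28, 22, 26, 29, 21, 23, 24, 50]))  # 25.0
```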
Definition (Mean). The sample mean of the variable is the sum of the observed values in
a data set divided by the number of observations.
Example
Seven participants in bike race had the following finishing times in minutes:
28,22,26,29,21,23,24. What is the mean?
Solution
Mean = (28+22+26+29+21+23+24)/7 = 173/7 = 24.7 (approximately 25)
For a symmetric distribution such as this one, the mean, mode and median coincide (see
Fig 11).
For a quantitative variable with a skewed distribution, the median is a good choice
for the measure of centre. This is because the mean can be highly influenced by an
observation that falls far from the rest of the data, called an outlier.
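The effect of an outlier on the two measures can be seen by comparing the seven bike race times with the eight times that include the finishing time of 50 minutes. The sketch below uses the `mean` and `median` functions from Python's standard `statistics` module.

```python
from statistics import mean, median

times_7 = [28, 22, 26, 29, 21, 23, 24]
times_8 = times_7 + [50]      # same data plus one outlying finishing time

mean_7, median_7 = mean(times_7), median(times_7)
mean_8, median_8 = mean(times_8), median(times_8)

# The outlier shifts the mean by about 3 minutes but the median by only 1.
print(round(mean_7, 2), median_7)   # 24.71 24
print(round(mean_8, 2), median_8)   # 27.88 25.0
```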
Activity 1.4
1.5.2.1 Quartiles
The quartiles of a variable divide the observed values into quarters, or 4 equal parts.
The variable has 4 quartiles, denoted by Q1, Q2, Q3 and Q4.
Roughly speaking, the first quartile, Q1, is the number that divides the bottom 25% of
the observed values from the top 75%; second quartile, Q2, is the median, which is the
number that divides the bottom 50% of the observed values from the top 50%; and the
third quartile, Q3, is the number that divides the bottom 75% of the observed values
from the top 25%. The fourth quartile (Q4) is the maximum value in the data set.
Definition (Quartiles). Let n denote the number of observations in a data set. Arrange
the observed values of variable in a data in increasing order.
1.5.2.2 Percentiles
Percentiles divide the data into 100 equal parts, giving the 1st, 2nd, 3rd, 4th, ...,
100th percentile. This means that 1% of the data lie below the 1st percentile, 2% below
the 2nd percentile, 3% below the 3rd, 50% below the 50th percentile and 90% below the
90th percentile. To find a percentile, sort the data from low to high, then apply the
formula for percentiles to locate the percentile you are interested in.
Example
Find
a) the 50th percentile
b) 20th percentile in this ordered data: 5 7 9 10 11 13 14 15 16 17 18 18 20 21 37
Solution
A percentile position with a decimal part does not correspond to a single observation,
so interpolate between the neighbouring values: at position 3 there is 9 and at
position 4 there is 10, therefore the 20th percentile is 9 + 0.2 x (10 - 9) = 9.2.
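The interpolation can be sketched in Python. The position rule used here, position = p(n+1)/100, is an assumption: it is consistent with the worked value of 9.2 (position 3.2 for n = 15) and with the median position (n+1)/2, but other textbooks and software use slightly different rules. The function name `percentile` is made up for the example.

```python
import math

def percentile(ordered, p):
    """p-th percentile of an already sorted list.

    Assumed rule: 1-based position = p * (n + 1) / 100, with linear
    interpolation between the two neighbouring observations.
    """
    n = len(ordered)
    pos = p * (n + 1) / 100
    k = math.floor(pos)          # whole part of the position
    frac = pos - k               # decimal part of the position
    if k < 1:                    # position falls before the first observation
        return ordered[0]
    if k >= n:                   # position falls at or after the last one
        return ordered[-1]
    return ordered[k - 1] + frac * (ordered[k] - ordered[k - 1])

data = [5, 7, 9, 10, 11, 13, 14, 15, 16, 17, 18, 18, 20, 21, 37]
p50 = percentile(data, 50)   # position 8.0, the 8th observation
p20 = percentile(data, 20)   # position 3.2, between the 3rd and 4th
print(p50, round(p20, 1))
```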
Relationship between quartiles and percentiles
1st quartile → 25th percentile
2nd quartile → 50th percentile (MEDIAN)
3rd quartile → 75th percentile
4th quartile → 100th percentile
Notice that the 2nd quartile, or 50th percentile, is the same as the median. Thus
quartiles can be found with the same formula as percentiles.
1.5.2.3 Deciles
Deciles are the values that divide the data series into ten equal parts. There are nine
deciles in a series, i.e. D1, D2, D3, ..., D9. The 5th decile (D5) is the same as the
median and the 2nd quartile, because these values divide the series into two equal parts.
Example
Seven participants in bike race had the following finishing times in minutes:
28,22,26,29,21,23,24. What is the range?
Solution
Range=Max-Min=29-21=8
Example
Eight participants in bike race had the following finishing times in minutes:
28,22,26,29,21,23,24,50. What is the range?
Solution
Range=Max-Min=50-21=29
Example
Seven participants in bike race had the following finishing times in minutes:
28,22,26,29,21,23,24. What is the interquartile range?
Solution
Ordered list: 21,22,23,24,26,28,29
Interquartile range=Q3-Q1=28-22=6
As with the variance, the larger the standard deviation (sd), the larger the distance of
the measurements from the mean.
Example
Find the variance and standard deviation of the following data: 2,3,5
Solution
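One way to work through the calculation is sketched below in Python. It assumes the sample formulas, in which the sum of squared deviations from the mean is divided by n - 1; if the population formulas (divisor n) are intended, the numbers will differ.

```python
import math

data = [2, 3, 5]
n = len(data)

mean = sum(data) / n                               # 10 / 3
squared_devs = [(x - mean) ** 2 for x in data]     # squared distances from mean
variance = sum(squared_devs) / (n - 1)             # sample variance (divisor n - 1)
sd = math.sqrt(variance)                           # standard deviation

print(round(mean, 3), round(variance, 3), round(sd, 3))   # 3.333 2.333 1.528
```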
Measurements in A are closer to the mean than measurements in C, which are far from
the mean. Standard deviation is often used instead of variance because variance has
squared units, whereas standard deviation has the same units as the original data.
Boxplots are basically a way of graphing the five-number summary. The simplest type
of boxplot is of the form:
The box spans the interquartile range, that is, it runs from the 1st quartile (Q1) to
the 3rd quartile (Q3).
The line inside the box, between Q1 and Q3, denotes the median (the 2nd quartile or 5th
decile).
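The five numbers that a boxplot draws (minimum, Q1, median, Q3 and maximum) can be computed with a short Python sketch. The quartile positions k(n+1)/4 are an assumption, chosen because they reproduce Q1 = 22 and Q3 = 28 for the bike race times; the function names are made up for the example.

```python
import math

def value_at(ordered, pos):
    """Value at a 1-based, possibly fractional, position in a sorted list."""
    k = math.floor(pos)
    frac = pos - k
    if k < 1:
        return ordered[0]
    if k >= len(ordered):
        return ordered[-1]
    return ordered[k - 1] + frac * (ordered[k] - ordered[k - 1])

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumed rule: the k-th quartile sits at position k * (n + 1) / 4.
    q1, q2, q3 = (value_at(ordered, k * (n + 1) / 4) for k in (1, 2, 3))
    return ordered[0], q1, q2, q3, ordered[-1]

summary = five_number_summary([28, 22, 26, 29, 21, 23, 24])
print(summary)                           # (21, 22.0, 24.0, 28.0, 29)
print("IQR =", summary[3] - summary[1])  # IQR = 6.0
```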
Activity 1.5
1.7 Reflection
An exam is given to students in an introductory statistics course. What is likely to be
true of the shape of the histogram of scores if the exam is quite easy?
Unit summary
In this unit you have learnt about ways of describing a population using a sample. You
have learnt about tabular, graphical and numerical summaries for describing data. What
you have learnt in this unit is important for forming a first impression of the data
before carrying out the tests of statistical significance presented in unit 5.
3. Which of these statements is false?
A. A parameter, in practice, is an unknown number describing the
population.
B. A statistic is used to estimate an unknown parameter.
C. A parameter is used to estimate an unknown statistic.
D. Statistics can change from sample to sample.
4. Which of the following is/are true about a skewed right distribution with
extreme outliers?
I) The mean is greater than the median
II) The median should be used as the measure of center because
it is more resistant to extreme observations than the mean
III) The standard deviation should be used as the measure of spread because
it is more resistant to extreme observations than the range or
interquartile range
A. I and II only
B. I and III only
C. II only
D. I, II, and III
5. The lengths (to the nearest tenth of a cm) of 30 fish in a pond were measured
and recorded as follows: 59.2, 61.5, 62.3, 61.4, 60.9, 59.8, 60.5, 59.0, 61.1,
60.7, 61.6, 56.3, 61.9, 65.7, 60.4, 58.9, 59.0, 61.2, 62.1, 61.4, 58.4, 60.8, 60.2,
62.7, 60.0, 59.3, 61.9, 61.7, 58.4, 62.2. Prepare an ordered stem and leaf plot
for the data. Briefly describe the distribution of lengths.
Range = 18 - 1 = 17
Number of classes = 5
Class width = 17/5 = 3.4, rounded up to 4
Class boundaries: 1 to 5, 5 to 9, 9 to 13, 13 to 17, 17 to 21.
Class Limits        Frequency
______________________________________
1 – 5               9
5 – 9               11
9 – 13              2
13 – 17             5
17 – 21             3
______________________________________
Answer to activity 1.3
A histogram displays continuous data while a bar graph displays categorical data.
Furthermore, a histogram has no gaps between bars while a bar graph has gaps between
bars.
Answer to unit activity
Unit 2:
In this unit you will learn about correlation using scatter plots and the Pearson and
Spearman rank correlation coefficients. Correlation is related to linear regression as both
look at the relationship between variables. Linear regression forms the basis for most
modeling techniques, and hence it will help you to understand them easily.
Key terms
• Correlation
• Regression
Scatter plots are plots of two quantitative variables, one on vertical and another on the
horizontal axis. Patterns of scatter diagrams are used to show relationship between two
variables. Scatter diagram correlations are shown and described below:
Fig 2.1: Scatter patterns and correlations
Strong Positive: If one variable increases at the same time the other variable increases,
they are said to be positively correlated.
Strong Negative: If one variable decreases at the same time the other variable
increases, or vice versa, they are said to be negatively correlated.
Complex: The data points are scattered in a curved pattern. The shape may look like
a rainbow or an arch. The two variables are correlated, though not linearly. As X
increases, Y first increases, then it decreases (or vice versa).
Weak Relationships: A weak correlation does not necessarily mean that the factor
being studied is not a cause. It may simply be a weak cause or a cause that requires the
presence of another contributing factor to bring about the effect. In this latter case, both
the factor under study and the contributing factor are perfectly good causes; you just
need them both to be active simultaneously to get the effect.
No Relationship: The data points are scattered in a shapeless pattern. You can
conclude that the two variables are not correlated over the ranges for which the data
was collected.
Activity 2.1
A scatter plot allows one to see:
A. whether there is any relationship between two variables
B. what type of relationship there is between two variables
C. both a and b
D. neither a nor b
2.3 Pearson and Spearman rank correlation coefficient
Note that -1 ≤ r ≤ 1, i.e. r is between -1 and 1. If r > 0, there is positive correlation
between the two data sets, and if r < 0, there is negative correlation. If r = 1 there is
perfect positive correlation, and if r = -1 there is perfect negative correlation. If r is
close to 1, there is strong positive correlation, and if r is close to -1, there is strong
negative correlation.
Example
Find Pearson correlation coefficient between Kips’ fat and calories using data below:
Solution
How can you describe the correlation between X and Y data?
Solution
There is a strong positive linear relationship, i.e. as x increases, y increases as well.
The scatter plot of y against x confirms this.
Note that this value is particularly useful when the exact data values are not known but
have already been ranked (e.g. positions in a class or competition). The Spearman
rank correlation coefficient is also used when the two sets of measurements are
asymmetrically distributed (non-normal).
Example
These are the marks obtained by 8 pupils in Mathematics and Physics. Calculate
Spearman’s coefficient of rank correlation.
Note that one may rank the observations by assigning the largest data point the smallest
rank (1) instead of assigning the largest rank to the largest data point. See next example.
Example
The marks of 12 pupils in geography and history essays are as follows:
Calculate Spearman’s rank correlation coefficient.
Solution
First we must rank the data.
Geography
19 = 1
18 = 1/3 (2 + 3 + 4) = 3
17 = ½ (5 + 6) = 5.5
16 = ½ (7 + 8) = 7.5
15 = ½ (9 + 10) = 9.5
14 = 11
10 = 12
History
13 = ½ (1 + 2) = 1.5
12 = 1/3 (3 + 4 + 5) = 4
11 = 1/3 (6 + 7 + 8) = 7
10 = 9
9 = 10
8 = 11
7 = 12
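The tie-handling rule used in the ranking above (tied marks share the mean of the positions they occupy, with the largest mark ranked 1) can be sketched in Python. The function name `average_ranks` is hypothetical, and the list of geography marks is an assumption reconstructed from the ranking working, since the original data table is not reproduced here.

```python
def average_ranks(values):
    """Rank values so the largest gets rank 1; tied values share the mean of their positions."""
    ordered = sorted(values, reverse=True)
    positions = {}
    for pos, value in enumerate(ordered, start=1):
        positions.setdefault(value, []).append(pos)
    return {value: sum(p) / len(p) for value, p in positions.items()}

# Geography marks implied by the ranking working (assumed, not taken from the data table).
geography = [19, 18, 18, 18, 17, 17, 16, 16, 15, 15, 14, 10]
geo_ranks = average_ranks(geography)
print(geo_ranks)
```

Each distinct mark maps to the single rank shared by all pupils who obtained it.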
There is some positive correlation between the geography and history results.
Activity 2.2
At the Agriculture show 10 countries' sheep were ranked by the qualified
judge and by a trainee judge. Their rankings are shown below:
Calculate the Spearman rank correlation coefficient between the ranks given by the
qualified judge and the trainee judge, and comment on the correlation.
The central purpose of linear regression is to create a linear equation relating the
independent variable X to the dependent variable Y. The linear equation can be used to
predict Y given X.
Example
Solution
(a)
(c)
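28 530 14840 784
$\bar{X} = 20.6$, $\bar{Y} = 416$, $\sum X_i Y_i = 46950$, $\sum X_i^2 = 2421$
$$b_1 = \frac{\sum_{i=1}^{n} Y_i X_i - n\bar{Y}\bar{X}}{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2} = 13.70989$$
$$b_0 = \bar{Y} - b_1\bar{X} = 416 - 13.70989 \times 20.6 = 133.5762$$
Thus the linear regression equation/model between Y and X is
$$Y = 133.57 + 13.7X$$
(c) To predict y given x = 2 we substitute the x value into the regression equation:
$$Y = 133.57 + 13.7X = 133.57 + 13.7 \times 2 = 160.97$$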
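The slope and intercept above can be checked from the summary values alone. The sketch below assumes a sample size of n = 5, which is not stated in the surviving text; it is the value that reproduces b1 = 13.70989 from the given sums. The variable names and the helper `predict` are illustrative.

```python
# Summary statistics quoted in the worked solution.
n = 5                      # assumption: sample size inferred from the summary values
x_bar, y_bar = 20.6, 416.0
sum_xy, sum_x2 = 46950.0, 2421.0

# Least-squares slope and intercept from the summary formula.
b1 = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
b0 = y_bar - b1 * x_bar

def predict(x):
    """Predicted Y for a given X using the fitted line."""
    return b0 + b1 * x

print(round(b1, 5), round(b0, 4), round(predict(2), 2))
```

With unrounded coefficients the prediction at x = 2 is about 161.0; the value 160.97 in the solution comes from using the rounded coefficients 133.57 and 13.7.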
Note that the assumption to fit a linear regression line to the given data is that the
response (Y) must be normally distributed. Now suppose the scatter plot between two
variables is not linear, that is, complex like quadratic. Then the model between
response (Y) and predictor (X) has to be modified by including a term in the model to
take into account the complexity, e.g. a quadratic relationship between Y and X would
be represented by the model:
$$Y = b_0 + b_1 X^2$$
Answers to unit activities
Answer to activity 2.1
C
Answer to activity 2.2
Qualified Judge    1   2   3   4   5   6   7   8   9   10
Trainee Judge      1   2   5   6   7   8   10  4   3   9
d                  0   0   -2  -2  -2  -2  -3  4   6   1
d^2                0   0   4   4   4   4   9   16  36  1
$\sum d^2 = 78$
$$r = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 78}{10(10^2 - 1)} = 1 - 0.47 = 0.53$$
Comment: there is a moderate positive correlation between the rankings of the qualified
judge and the trainee judge.
Answer to activity 2.3
Y=29.25+0.57×78=73.71
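As a check on activity 2.2, the Spearman coefficient can be computed directly from the two sets of ranks. This is a minimal sketch using the standard formula r = 1 - 6Σd²/(n(n² - 1)), which applies when there are no tied ranks; the function name `spearman_from_ranks` is hypothetical.

```python
def spearman_from_ranks(rank_a, rank_b):
    """Spearman rank correlation from two lists of untied ranks."""
    n = len(rank_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

qualified = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
trainee = [1, 2, 5, 6, 7, 8, 10, 4, 3, 9]

r_s = spearman_from_ranks(qualified, trainee)
print(round(r_s, 3))
```

Here Σd² = 78, so the quantity subtracted from 1 is 468/990 ≈ 0.473.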
Answer to Unit practice activity
a) Based on the scatter plot, there is a strong positive linear correlation between yield
and fertilizer concentration.
[Figure: scatter plot of crop yield (vertical axis, 1 to 8) against fertilizer concentration (horizontal axis, 10 to 25)]
(b) y=4+3x
(c) y=4+3(12)=40
Answer to Unit test
STUDENT   HOURS (X)   SCORE (Y)   X^2   Y^2   XY
A         1           1           1     1     1
B         1           3           1     9     3
C         3           2           9     4     6
D         4           5           16    25    20
E         6           4           36    16    24
F         7           5           49    25    35
G         8           7           64    49    56
H         8           8           64    64    64
Total     ΣX = 38     ΣY = 35     ΣX^2 = 240   ΣY^2 = 193   ΣXY = 209
$$r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{[N\sum X^2 - (\sum X)^2][N\sum Y^2 - (\sum Y)^2]}}$$
$$= \frac{8(209) - (38)(35)}{\sqrt{[8(240) - (38)^2][8(193) - (35)^2]}} = \frac{342}{\sqrt{151844}} = \frac{342}{389.6717} = 0.878$$
Interpretation: There is a strong positive linear correlation between students' study time
and grade.
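The unit test calculation can be reproduced from the raw hours and scores. The sketch below applies the same computational formula for Pearson's r; the function name `pearson_r` is illustrative.

```python
from math import sqrt

hours = [1, 1, 3, 4, 6, 7, 8, 8]    # X, students A to H
scores = [1, 3, 2, 5, 4, 5, 7, 8]   # Y, students A to H

def pearson_r(xs, ys):
    """Pearson correlation using the sums-of-products computational formula."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

r = pearson_r(hours, scores)
print(round(r, 3))
```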
Unit 3:
Statistical inference is based on probability theory. The aim of this unit is to introduce
you to the basic elements of probability. In the unit you will learn about elementary
probability rules, in particular the multiplication and addition rules. This
will be followed by the law of total probability and the Bayes theorem. The Bayes rule
provides the starting point for Bayesian inference, where prior knowledge is included
in the estimation of parameters.
3.1 Unit Objectives
Key terms
• Probability
• Conditional probability
• Bayes rule
Example
In throwing a coin, sample space is the set s={H,T} where H means head and T means
tail.
In throwing a die sample space is the set s={1,2,3,4,5,6}
Example
A family has three children. Write the sample space.
Solution
We may be helped by the tree diagram:
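The branches of such a tree can also be listed programmatically. The sketch below assumes each child is labelled B (boy) or G (girl), labels that are not fixed by the text, and enumerates every ordered outcome for three children.

```python
from itertools import product

# Each child is either B (boy) or G (girl); the order of births matters.
sample_space = ["".join(outcome) for outcome in product("BG", repeat=3)]
print(sample_space)
print(len(sample_space))
```

Each of the three levels of the tree doubles the number of branches, giving 2 × 2 × 2 = 8 outcomes.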
The entry (2, 5), for example, indicates that the red die shows a two, and the green a 5.
Event: An event is a subset of a sample space, or it is an outcome of interest. For
example, if you have thrown a coin twice and you are looking at the probability that you
have two heads (HH), then your event is the two heads (HH).
Probability: $P(E) = \frac{n(E)}{n(S)}$, where n(E) is the number of outcomes in the event E
and n(S) is the number of outcomes in the sample space S.
Activity 3.1
A coin and a die are thrown, write the sample space.
3.3 Basic probability rules
Suppose P(A) is the probability of some event defined by A. P(A) is always in the
interval [0, 1], i.e. 0 ≤ P(A) ≤ 1. Now P(A) = 1 means the event occurs with certainty
and P(A) = 0 means that it will certainly not occur. The probability of all events put
together must add up to 1, so long as we don’t double-count by including events that
overlap. For example, consider a sample space of an experiment of throwing a coin,
S = {H, T}. Now p(H) + p(T) = 1. The complement of event A, also known as not event A, is
denoted by ~A (or in other books A’). Its probability, P(~A), is the probability that A
will not occur. P(~A) = 1 – P(A), and P(A) + P(~A) = 1.
Activity 3.2
The meteorological department says that the probability that it will rain in the central
region today is 0.8. What do you think is the probability that it will not rain today?
3.4 Intersection and union of events
If the two events A and B are mutually exclusive, meaning they cannot happen at the
same time, then in the equation
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
the last term is zero and the formula simplifies to
P(A ∪ B) = P(A) + P(B).
Note that if events A and B are mutually exclusive their intersection is empty, i.e.
A ∩ B = ∅ and P(A ∩ B) = 0, see below
Solution
Sketch of events:
(a) Let the event of picking a mathematics student be M and that of picking a statistics
student be S. Then the probability of picking a mathematics or statistics student is
denoted as p(M or S), sometimes p(M ∪ S), which is defined as
p(M ∪ S) = p(M) + p(S) - p(M and S)
= 100/170 + 90/170 - 60/170
= 130/170
(b) Let the event of neither M nor S be N. We need p(M or N), also denoted as p(M ∪ N),
defined as
p(M ∪ N) = p(M) + p(N) - p(M and N)
= p(M) + p(N), because the event M and N is impossible
= 100/170 + 40/170
= 140/170
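The two addition-rule calculations can be verified with exact fractions. This sketch uses the counts from the solution (170 students in total, 100 taking mathematics, 90 taking statistics, 60 taking both); the variable names are illustrative.

```python
from fractions import Fraction

total = 170
p_m = Fraction(100, total)     # mathematics
p_s = Fraction(90, total)      # statistics
p_both = Fraction(60, total)   # mathematics and statistics

# General addition rule: subtract the overlap so it is not counted twice.
p_union = p_m + p_s - p_both

# "Neither" is the complement of the union, and it cannot overlap with M.
p_neither = 1 - p_union
p_m_or_neither = p_m + p_neither

print(p_union, p_neither, p_m_or_neither)
```

Fractions are printed in lowest terms, so 130/170 appears as 13/17 and 140/170 as 14/17.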
Activity 3.3
Let E1 be the event that a cow gives birth to a male calf and E2 be the event that a cow
gives birth to a female calf. How can we find p(E1 ∩ E2)?
3.5 Conditional Probabilities
A conditional probability tells you the probability of some event given that you already
know that another event has occurred. The formula for the conditional probability of A
given B is:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
What about the conditional probability of B given A, denoted as P(B|A)? We have
$$P(B \mid A) = \frac{P(B \cap A)}{P(A)}$$
Example
Consider the situation above.
Find the probability of picking a mathematics pupil given that he also takes statistics.
Solution
We need
$$P(M \mid S) = \frac{P(M \cap S)}{P(S)} = \frac{60/170}{90/170} = \frac{60}{90} = \frac{2}{3}$$
Example
The following table shows the distribution by gender of students at Bunda Campus who
use public transport and the ones who drive to school.
The events M, F, P, and D are self explanatory. Find the following probabilities.
a. P(D | M) b. P(F | D) c. P(M | P)
Solution
We use the conditional probability formula $P(E \mid F) = \frac{P(E \cap F)}{P(F)}$.
a. $P(D \mid M) = \frac{P(D \cap M)}{P(M)} = \frac{39/100}{47/100} = \frac{39}{47}$.
b. $P(F \mid D) = \frac{P(F \cap D)}{P(D)} = \frac{40/100}{79/100} = \frac{40}{79}$.
c. $P(M \mid P) = \frac{P(M \cap P)}{P(P)} = \frac{8/100}{21/100} = \frac{8}{21}$.
Multiplication Rule
Events are called independent if the occurrence of one does not affect the probability of
the other. That is, P(A|B) = P(A).
So
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = P(A)$$
This would mean P(A ∩ B) = P(A) ∙ P(B). Thus the multiplication rule says that if A
and B are independent events, then you can multiply the probabilities together to get
the joint probability (the probability of the intersection). That is, P(A ∩ B) = P(A) ∙ P(B).
In general, if events $A_i$ for i = 1, 2, 3, …, n are independent, then
$$P(A_1 \cap A_2 \cap \dots \cap A_n) = P(A_1) \cdot P(A_2) \cdots P(A_n)$$
Example
A die and a coin are thrown together once. What is the probability of getting an even
number on the die and a head on the coin?
Solution
Let the event of an even number on the die be denoted as E and a head on the coin as H.
Then we need the event E and H, also denoted as E ∩ H. But these two events are
independent, that is, they cannot influence each other since they occur on two different
objects. Thus
P(E ∩ H) = P(E) × P(H)
= 3/6 × 1/2
= 1/4
Example
The probability that Jaime will visit his aunt in Mzuzu this year is 0.30, and the
probability that he will go river rafting on Rukuru river is 0.50. If the two events are
independent, what is the probability that Jaime will do both?
Solution:
Let A be the event that Jaime will visit his aunt this year, and R be the event that he will
go river rafting. We are given P(A) = 0.30 and P(R) = 0.50, and we want to find
P(A ∩ R). Since we are told that the events A and R are independent,
P(A ∩ R) = P(A) × P(R) = (0.30)(0.50) = 0.15.
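The multiplication rule for independent events can be checked by brute-force enumeration. The sketch below lists all outcomes of throwing one die and one coin, counts those with an even number and a head, and compares the result with P(E) × P(H); it assumes a fair die and a fair coin, as the worked example does implicitly.

```python
from fractions import Fraction
from itertools import product

# All 12 equally likely (die, coin) outcomes.
outcomes = list(product(range(1, 7), "HT"))

# Outcomes where the die is even and the coin shows a head.
favourable = [(d, c) for d, c in outcomes if d % 2 == 0 and c == "H"]
p_joint = Fraction(len(favourable), len(outcomes))

# Marginal probabilities used in the multiplication rule.
p_even = Fraction(3, 6)
p_head = Fraction(1, 2)

print(p_joint, p_even * p_head)
```

The same rule gives Jaime's joint probability directly: 0.30 × 0.50 = 0.15.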
Example
Given P(B | A) = 0.4. If A and B are independent, find P(B).
Solution
If A and B are independent, then by definition P(B | A) = P(B). Therefore, P(B) = 0.4.
Activity 3.4
A cow gives birth twice in ten years. How can you get the probability of a female birth
at the second birth given a male birth at the first birth?
3.6 Law of total probability and Bayes rule
3.6.1 Law of total probability
Let the sample space S be partitioned by events $A_1, A_2, \dots, A_n$, i.e.
$S = A_1 \cup A_2 \cup \dots \cup A_n$ and $A_i \cap A_j = \emptyset$ for $i \neq j$. Let B be an event in S
that can occur together with the events $A_i$. Then
$$P(B) = \sum_{i=1}^{n} P(B \mid A_i) P(A_i)$$
Proof:
Note $B = B \cap S = (B \cap A_1) \cup (B \cap A_2) \cup \dots \cup (B \cap A_n)$, so
$$P(B) = P(B \cap A_1) + P(B \cap A_2) + \dots + P(B \cap A_n)$$
by the addition rule of probability. Note that the intersection terms in the addition rule
are 0, as the events $B \cap A_i$ for i = 1, 2, 3, …, n are disjoint. Hence
$$P(B) = P(B \mid A_1)P(A_1) + P(B \mid A_2)P(A_2) + \dots + P(B \mid A_n)P(A_n)$$
since $P(B \mid A) = \frac{P(B \cap A)}{P(A)}$, which means $P(B \cap A) = P(B \mid A)P(A)$. In compact form,
$$P(B) = \sum_{i=1}^{n} P(B \mid A_i) P(A_i)$$
In the law of total probability, the total probability of an event B depends on the
probability of all the other events that result in its occurrence.
Example
The following contingency table gives the results of operations in a hospital according
to the complexity of the operation.
               Simple    Complex    Total
Successful     1990      950        2940
Unsuccessful   10        50         60
Total          2000      1000       3000
Solution
Let A be the event that an operation is simple and B the event that an operation is
successful.
$$P(A) = P(A \mid B) \times P(B) + P(A \mid B^C) \times P(B^C)$$
because for A to occur, it occurs either together with B or together with not B.
$$= \frac{1990}{2940} \times \frac{2940}{3000} + \frac{10}{60} \times \frac{60}{3000} = \frac{2000}{3000} = \frac{2}{3}$$
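The law of total probability can be checked numerically on the hospital contingency table. This is a minimal sketch with illustrative variable names; it computes P(A), the probability that an operation is simple, by conditioning on whether the operation was successful (B) or not.

```python
from fractions import Fraction

# Counts from the contingency table: (outcome, complexity) -> number of operations.
counts = {
    ("successful", "simple"): 1990,
    ("successful", "complex"): 950,
    ("unsuccessful", "simple"): 10,
    ("unsuccessful", "complex"): 50,
}
total = sum(counts.values())

n_success = counts[("successful", "simple")] + counts[("successful", "complex")]
n_fail = total - n_success

p_b = Fraction(n_success, total)                    # P(B)
p_not_b = Fraction(n_fail, total)                   # P(B^C)
p_a_given_b = Fraction(counts[("successful", "simple")], n_success)
p_a_given_not_b = Fraction(counts[("unsuccessful", "simple")], n_fail)

# Law of total probability over the partition {B, B^C}.
p_a = p_a_given_b * p_b + p_a_given_not_b * p_not_b
print(p_a)
```

The result agrees with reading P(A) directly off the table as 2000/3000.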
3.6.2 Bayes rule
The Bayes rule of probability was coined by Thomas Bayes (1761).
The Bayes theorem is stated as follows: suppose $A_1, A_2, \dots, A_n$ again partition S, i.e.
$S = A_1 \cup A_2 \cup \dots \cup A_n$ and $A_i \cap A_j = \emptyset$ for $i \neq j$. Let B be an event in S
that can occur together with the events $A_i$. Now B may occur after the occurrence of
$A_1$, $A_2$, $A_3$, …, or $A_n$, and thus by the law of total probability
$$P(B) = P(B \mid A_1)P(A_1) + P(B \mid A_2)P(A_2) + \dots + P(B \mid A_n)P(A_n).$$
Now suppose B has occurred after some $A_k$ occurred. We want the probability of any one
of the events $A_k$, for k = 1, 2, 3, …, n, that made B occur, given that B has occurred,
denoted as $P(A_k \mid B)$. This probability is found by the Bayes Theorem:
$$P(A_k \mid B) = \frac{P(B \mid A_k) P(A_k)}{P(B)} = \frac{P(B \mid A_k) P(A_k)}{\sum_{i=1}^{n} P(B \mid A_i) P(A_i)}$$
Note that P(B) has to be the total probability of B. The Bayes theorem seeks to find the
probability of the cause ($A_k$) given the effect (B).
Example
In a binary communication system a zero and a one are transmitted with probability 0.6
and 0.4 respectively. Due to error in the communication system a zero becomes a one
with
(i) ofa receiving
probability 0.1 and
a one and a one becomes a zero with a probability 0.08. Determine the
probability
(ii)of
that a one was transmitted when the received message is one.
(i) receiving a one and
(i) of receiving a one and
(ii) that a one was transmitted when the received message is one.
(ii) that a one was transmitted when the received message is one.
Solution
Let S be the sample space corresponding to binary communication. Let $T_0$ be the event of transmitting 0 and $T_1$ the event of transmitting 1, and let $R_0$ and $R_1$ be the corresponding events of receiving 0 and 1 respectively.
Given $P(T_0)=0.6$, $P(T_1)=0.4$, $P(R_1\mid T_0)=0.1$ and $P(R_0\mid T_1)=0.08$, so that $P(R_1\mid T_1)=1-0.08=0.92$.
(i) $P(R_1)$ = probability of receiving 'one'
$=P(T_1)P(R_1\mid T_1)+P(T_0)P(R_1\mid T_0)$
$=0.4\times 0.92+0.6\times 0.1$
$=0.428$
(ii) Using Bayes' rule
$P(T_1\mid R_1)=\dfrac{P(T_1)P(R_1\mid T_1)}{P(R_1)}=\dfrac{0.4\times 0.92}{0.428}\approx 0.860$
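The two steps of this example, the law of total probability followed by Bayes' rule, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `total_probability` and `bayes_posterior` are chosen here and do not come from the module. It prints the two probabilities asked for in parts (i) and (ii).

```python
def total_probability(priors, likelihoods):
    """Law of total probability: P(B) = sum over k of P(A_k) * P(B | A_k)."""
    return sum(p * l for p, l in zip(priors, likelihoods))


def bayes_posterior(priors, likelihoods, k):
    """Bayes' rule: P(A_k | B) = P(A_k) * P(B | A_k) / P(B)."""
    return priors[k] * likelihoods[k] / total_probability(priors, likelihoods)


# Causes: T1 (a one was sent) and T0 (a zero was sent).
# Effect: R1 (a one is received).
priors = [0.4, 0.6]              # P(T1), P(T0)
likelihoods = [1 - 0.08, 0.1]    # P(R1 | T1), P(R1 | T0)

p_r1 = total_probability(priors, likelihoods)            # part (i)
p_t1_given_r1 = bayes_posterior(priors, likelihoods, 0)  # part (ii)
print(round(p_r1, 3), round(p_t1_given_r1, 3))
```

The same two functions work for any number of causes, as long as the priors sum to one and each likelihood is listed in the same order as its prior.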
Activity 3.5
A chicken has bird flu in Malawi if it is from Madagascar or Indonesia. It is known that 1% of chickens are from Madagascar and 15% of chickens are from Indonesia. The probability that a chicken has flu given that it is from Madagascar is 0.2, and the probability that it has flu given that it is from Indonesia is 0.06. Find the probability that a chicken has flu.
3.8 Reflection
Which of the two rules, Bayes' rule or the law of total probability, is used to find the probability of a cause given an effect?
Unit Summary
In this unit you have learnt basic elements of probability theory. You have learnt that the probability of an event is never negative and that it is less than or equal to one. You have also learnt about basic rules of probability like the multiplication and addition rules, the law of total probability and Bayes' theorem. What you have learnt in this unit forms the foundation for further probability theory and for statistical inference, such as Bayesian inference, which stems from Bayes' rule.
End of unit test
e) Suppose we know that P(F|E)=0.9. What is P(E|F)?
2. A prisoner has just escaped from jail. There are three roads leading away from the jail. If the prisoner selects road A to make good her escape, the probability that she succeeds is 0.25. If she selects road B, the probability that she succeeds is 0.2. If she selects road C, the probability that she succeeds is (1/6). Furthermore, the probability that she selects each of these roads is the same. It is (1/3). If the prisoner succeeds in her escape, what is the probability that she made good her escape by using road B?
Answers to Unit 3 Activities
Answer to Activity 3.1
               die
          1    2    3    4    5    6
coin  H   H,1  H,2  H,3  H,4  H,5  H,6
      T   T,1  T,2  T,3  T,4  T,5  T,6
The sample space is the set
S={(H,1),(H,2),(H,3),(H,4),(H,5),(H,6),(T,1),(T,2),(T,3),(T,4),(T,5),(T,6)}
Answer to Activity 3.2
Probability of not learning today is 1-0.8=0.2
Answer to Activity 3.3
p(M ∪ F) = p(M) + p(F), since the intersection part is 0, i.e. we cannot have a birth that is male and at the same time female.
Answer to Activity 3.4
Let F be the event of a female birth at second birth and M the event of a male birth at first birth. We need p(F|M), which is p(F|M) = p(F), since these two events are independent.
Answer to Activity 3.5
Let B be the event that a chicken has flu, let A1 be the event that the chicken is from Madagascar, and let A2 be the event that the chicken is from Indonesia. We need p(B).
Now p(B)=p(B|A1)p(A1)+p(B|A2)p(A2)
=0.2×0.01+0.06×0.15
=0.011
Answer to Unit Activity
A1: the students attend class on Thursday
A2: the students do not attend class on Thursday
B1: the students pass the course
B2: the students do not pass the course
(a) $P(A_1)=0.6$, $P(A_2)=1-P(A_1)=0.4$, $P(B_1\mid A_1)=0.98$, $P(B_1\mid A_2)=0.2$
$P(B_1)=P(A_1)P(B_1\mid A_1)+P(A_2)P(B_1\mid A_2)$
$=0.6\times 0.98+0.4\times 0.2$
$=0.668$
(b) By Bayes' theorem,
$P(A_1\mid B_1)=\dfrac{P(A_1)P(B_1\mid A_1)}{P(A_1)P(B_1\mid A_1)+P(A_2)P(B_1\mid A_2)}$
$=\dfrac{0.6\times 0.98}{0.6\times 0.98+0.4\times 0.2}$
$=\dfrac{0.588}{0.668}\approx 0.880$
Unit 4:
4.0 Introduction
In this unit you will learn about random variables. Specifically, you will learn what a random variable is, and the types of random variables. In the unit you will also learn about some discrete and continuous random variables. The discrete random variables that you will learn about are the binomial, Poisson, multinomial and hypergeometric random variables, while the continuous random variables are the normal, t, chi-square and F random variables. This will be followed by the sampling distribution of the sample mean and related quantities. The unit is important as it prepares you for statistical inference in terms of hypothesis testing and interval estimation, where the z-test, t-test, chi-square test and F-test are used.
Key terms
• Random variable
• Probability distribution
• Sampling distribution
Example
Discrete random variable: a quantity that assumes either a finite number of values or an infinite sequence of values, such as 0, 1, 2, …; for example, the number of HIV infections in a year, or the number of children a family has in a lifetime.
Continuous random variable: a quantity that assumes any numerical value in an interval, such as time, height, weight, distance, and temperature.
We will denote the name of a random variable by a capital letter, e.g. X, and a value of the random variable by a small letter, e.g. x.
Definition (probability distribution (or density) function (p.d.f)): a function that
describes how probabilities are distributed over the values of the random variable.
Example
Consider an experiment of tossing a coin. Let X be the outcome of the experiment. Find the probability distribution of X.
Solution
X        H    T
P(X=x)   0.5  0.5
Note we have used a table to show how probabilities are distributed over the values of the random variable X.
We can also just define the probabilities for each value of the random variable to show the distribution of probabilities:
OR
S={H,T}
P(X=H)=0.5
P(X=T)=0.5
OR we may use a formula to define the probability distribution of X:
P(X=x)=0.5, where x=H,T
Example
Find the probability distribution when tossing a die.
Solution
X        1    2    3    4    5    6
P(X=x)   1/6  1/6  1/6  1/6  1/6  1/6
Or
P(X=x)=1/6, where x=1,2,3,4,5,6.
Now, sometimes we may present the probability distribution by using a density plot, by plotting the probabilities against the values of the random variable.
Example
The following probability distribution table shows the probability of selecting 0, 1, 2, 3, or 4 red chips when 4 chips are selected. Represent the probability distribution using a histogram plot.
x    P(x)
0    0.24
1    0.412
2    0.265
3    0.076
4    0.008
Solution
Fig 4.1: Probability density plot
Thus there are three ways to display a probability distribution for a discrete random variable:
(1) through a table
(2) through a formula/equation
(3) through a density plot, e.g. a histogram
So far we have had examples of discrete probability distributions because the random variables are discrete. Note that we can also have probability distributions for continuous random variables, but at the moment we will focus on discrete probability distributions.
Self-Activity 4.1
A coin is tossed two times. Let X denote the number of tails. Write the probability distribution for X.
4.3 Discrete probability distributions
4.3.1 Properties of discrete distributions
The following are properties of discrete random variables:
1) ∑ P(X) = 1
2) P(X) ≥ 0
3) P(X ≤ x) = ∑ P(X = t), summed over all values t ≤ x
Property 3 is about cumulative probability, where probabilities are added up to the specified value of the random variable.
Example
Find the value of c so that the following function is a probability distribution: P(x)=c(x+2), where x=0,1,2,3, and hence find p(X≤1).
Solution
Note ∑ P(X) = 1, for all X, if p(X) is a probability distribution.
P(X=0)+P(X=1)+P(X=2)+P(X=3)=1 (Note that X can only take 0, 1, 2, 3)
c(0+2)+c(1+2)+c(2+2)+c(3+2)=1, since p(X=x)=c(x+2)
2c+3c+4c+5c=1,
So c=1/14.
Now p(X≤1)=p(X=0)+p(X=1)
=1/14(0+2)+1/14(1+2)
=1/7+3/14
=5/14
4.3.2 Mean and variance of a discrete probability distribution
Expected value or mean is a single average value that summarizes a probability distribution. We denote the expected value of X by E(X), and it is defined as: E(X) = ∑XP(X). Now the variance of a discrete random variable X is defined as:
σ² = Var(X) = E[(X - E(X))²] = ∑[X - E(X)]²P(X), this is because E(X) = ∑XP(X).
Example
Toss a coin twice, let X = # of heads. Find E(X) and Var(X).
Solution
The probability distribution is
X        0     1    2
P(X=x)   0.25  0.5  0.25
E(X)=∑XP(X)
=0×0.25+1×0.5+2×0.25
=1 (the number of heads we can expect to get if we toss a coin twice).
σ² = Var(X) = E[(X - E(X))²] = ∑[X - E(X)]²P(X)
=(0-1)²×0.25+(1-1)²×0.5+(2-1)²×0.25
=0.25+0+0.25
=0.5
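The definitions of E(X) and Var(X) translate directly into sums over a probability table. The sketch below is illustrative only: the helper names `mean` and `variance` are chosen here, and the table is stored as a Python dictionary mapping each value x to P(X = x). Exact fractions are used so that no rounding enters the check.

```python
from fractions import Fraction


def mean(pmf):
    """E(X): sum of x * P(X = x) over all values x."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(X): sum of (x - E(X))^2 * P(X = x) over all values x."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


# Number of heads in two tosses of a fair coin.
heads = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

assert sum(heads.values()) == 1        # the probabilities must sum to 1
print(mean(heads), variance(heads))    # prints: 1 1/2
```

Any other discrete distribution given as a table can be checked the same way by replacing the dictionary.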
Example
In a particular game, a coin is tossed. If the coin comes up a head, the player wins $100. If the coin comes up a tail, the player loses $50. What is the expected value of the game?
Solution
X (Dollar value)   $100   -$50
P(X=x)             1/2    1/2
Now E(X)=∑XP(X)
=$100×1/2+(-$50)×1/2
=$25
4.3.3 Examples of discrete random variables and their distributions
4.3.3.1 Discrete uniform random variable
It is the simplest discrete random variable. The discrete uniform random variable X is one for which the probability of observing a particular value of X is equal for all possible values of X. Since the probabilities are the same, this random variable is called the discrete uniform random variable and its probability distribution is called the discrete uniform distribution. If the random variable X assumes values x1, x2, …, xk with equal probabilities, then the discrete uniform distribution is given by P(X=x; k) = 1/k, for x = x1, x2, …, xk. Note that x1, x2, …, xk are just a convenient way to list all possible values that X can take on. The following are examples of discrete uniform random variables and distributions:
1. When you throw a die, the outcome on the die is a discrete uniform random variable because the probability for every value is the same, 1/6.
Probability distribution:
X        1    2    3    4    5    6
P(X=x)   1/6  1/6  1/6  1/6  1/6  1/6
2. When you toss a coin, the outcome on the coin is discrete uniform, where the probability of each outcome is 1/2.
Probability distribution:
X        H    T
P(X=x)   0.5  0.5
4.3.3.2 Binomial random variable
The binomial random variable arises from the binomial experiment. The following explains the binomial experiment:
1. The experiment is repeated for a fixed number of trials, where each trial is independent of other trials.
2. There are only two possible outcomes of interest for each trial. The outcomes can be classified as a success (S) or as a failure (F).
Trial 1   Trial 2   Trial 3, …, Trial n
S/F       S/F       S/F, ..., S/F
3. The probability of a success S, denoted by p, is the same for each trial; similarly, the probability of failure is denoted by 1-p.
4. The binomial random variable, call it X, is the number of successes in the n independent trials.
Now in a binomial experiment, the probability of getting exactly x successes in n trials, which we call the probability distribution of X, is
$$P(X=x)=\binom{n}{x}p^{x}(1-p)^{n-x},$$ where x=0,1,2,…,n.
This is the binomial probability distribution in formula form. The mean and variance of a binomial random variable are given as E(X) = np and V(X) = np(1-p) respectively.
Example
Assume a farmer is examining fruits to see whether they are ripe or not. The probability of a ripe fruit is 0.1. If the farmer examines 10 of the fruits, (a) what is the probability that 2 fruits are ripe? (b) What is the probability that at most 2 fruits are ripe?
Solution
Examining 10 fruits is a binomial experiment because for each examination there are two possible outcomes (ripe/not). Let X be the number of ripe fruits. Then:
(a) We need P(X=2).
Using the formula $P(X=x)=\binom{n}{x}p^{x}(1-p)^{n-x}$, where n = 10, x = 2, p = 0.10, we have:
$$P(X=2)=\binom{10}{2}(0.10)^{2}(0.90)^{8}\approx 0.1937$$
(b) What is the probability that at most 2 fruits are ripe?
It means we need p(X≤2).
Using the formula for cumulative probability p(X≤x), we have
$$P(X\le 2)=P(X=0)+P(X=1)+P(X=2)$$
$$=\binom{10}{0}(0.10)^{0}(0.90)^{10}+\binom{10}{1}(0.10)^{1}(0.90)^{9}+\binom{10}{2}(0.10)^{2}(0.90)^{8}$$
$$\approx 0.3487+0.3874+0.1937=0.9298$$
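The binomial probabilities in this example can be checked numerically. The following is a minimal sketch using only the Python standard library (`math.comb` gives the number of ways to choose x successes out of n trials); the function names `binomial_pmf` and `binomial_cdf` are chosen here for illustration.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """P(X <= x): add up P(X = k) for k = 0, 1, ..., x."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


n, p = 10, 0.10                           # 10 fruits, P(ripe) = 0.1
print(round(binomial_pmf(2, n, p), 4))    # (a) exactly 2 ripe fruits
print(round(binomial_cdf(2, n, p), 4))    # (b) at most 2 ripe fruits
print(n * p, n * p * (1 - p))             # mean np and variance np(1-p)
```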
Bernoulli experiment:
1. There is one trial.
2. In the trial there are two possible outcomes, success (S) or failure (F).
3. The probability of success is denoted as p and the probability of failure is denoted as 1-p.
Now the Bernoulli random variable, call it X, is the number of successes in the one trial, thus X=0,1. The probability distribution function is defined as $P(X=x)=p^{x}(1-p)^{1-x}$, where x=0,1.
Note that the Bernoulli probability distribution is a special case of the binomial distribution where the number (n) of trials is 1. Thus E(X)=p and V(X)=p(1-p).
Examples of Bernoulli random variables
1. When throwing a coin once, the number of times we have a head (X=0,1)
2. When undergoing HIV testing, the number of times one has a positive result (X=0,1)
4.3.3.4 Poisson random variable and distribution
The Poisson distribution is a discrete probability distribution of a random variable X that satisfies the following conditions. The experiment consists of counting the number of times an event occurs in a given continuous interval/space. The interval can be an interval of time, area, or volume. The probability of the event occurring is the same for each interval. The number of occurrences in one interval is independent of the number of occurrences in other intervals. The probability of exactly x occurrences in an interval is given by
$$P(X=x)=\frac{\mu^{x}e^{-\mu}}{x!},$$ where x=0,1,2,…, e ≈ 2.71828 and μ is the mean number of occurrences in the interval.
Example
The mean number of extension visits to a farmer in Lilongwe rural is 4 per year. (a) Find the probability that in a given year, there are exactly 3 visits. (b) Find the probability that in a given year there are at most 2 visits.
Solution
The number of extension visits to farmers is a Poisson random variable where the mean number of visits is μ=4, since you are looking at the number of observations in a continuous space, time.
(a) Now $P(X=3)=\dfrac{4^{3}e^{-4}}{3!}\approx 0.195$
Note that ,
so .
(b) Now here we need $P(X\le 2)=P(X=0)+P(X=1)+P(X=2)=e^{-4}\left(1+4+8\right)\approx 0.238$, the cumulative probability up to X=2.
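Both parts of the extension-visits example can be checked with a short Python sketch of the Poisson formula. The function names `poisson_pmf` and `poisson_cdf` are chosen here for illustration; only the standard library is used.

```python
from math import exp, factorial


def poisson_pmf(x, mu):
    """P(X = x) = mu^x * e^(-mu) / x!."""
    return mu**x * exp(-mu) / factorial(x)


def poisson_cdf(x, mu):
    """P(X <= x): add up P(X = k) for k = 0, 1, ..., x."""
    return sum(poisson_pmf(k, mu) for k in range(x + 1))


mu = 4                                  # mean number of visits per year
print(round(poisson_pmf(3, mu), 3))     # (a) exactly 3 visits
print(round(poisson_cdf(2, mu), 3))     # (b) at most 2 visits
```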
Example 1
Suppose you are testing 10 cows for kind of disease, in each trial there are more than
one kinds of diseases(success), and assuming probability of each kind of disease is
same in all trials, then the number of each kind of disease in 10 trials is multinomial
Example 2
Suppose you are testing 1000 children for severity of child anaemia. The outcome may
be mild anaemia, moderate or severe for each testing of the child. Assuming that the
probability of each kind of anaemia is same across 1000 tests, the outcome the number
of each kind of anaemia in 1000 tests is multinomial.
74
may be mild anaemia, moderate or severe for each testing of the child. Assuming
may be mild anaemia, moderate or severe for each testing of the child. Assuming
that the probability of each kind of anaemia is same across 1000 tests, the outcome
that the probability of each kind of anaemia is same across 1000 tests, the outcome
the number of each kind of anaemia in 1000 tests is multinomial.
the number of each kind of anaemia in 1000 tests is multinomial.
The probability distribution of a multinomial random variable is given by
$$P(X_1 = x_1, X_2 = x_2, \ldots, X_k = x_k) = \frac{n!}{x_1!\,x_2!\cdots x_k!}\, p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$$
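As a rough illustration of the multinomial formula, the Python sketch below evaluates it for an anaemia-style setting. The function name `multinomial_pmf`, the sample of 10 children and the probabilities 0.5, 0.3 and 0.2 are hypothetical values chosen for illustration; they do not come from the module.

```python
import math

def multinomial_pmf(counts, probs):
    """P(X1 = x1, ..., Xk = xk) for counts x1..xk and category probabilities p1..pk."""
    n = sum(counts)
    # n! / (x1! x2! ... xk!), computed with exact integer arithmetic
    coefficient = math.factorial(n)
    for x in counts:
        coefficient //= math.factorial(x)
    probability = float(coefficient)
    for x, p in zip(counts, probs):
        probability *= p**x
    return probability

# Hypothetical: 10 children, P(mild) = 0.5, P(moderate) = 0.3, P(severe) = 0.2.
# Probability of observing 5 mild, 3 moderate and 2 severe cases.
prob = multinomial_pmf([5, 3, 2], [0.5, 0.3, 0.2])
print(round(prob, 5))
```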
4.3.3.6 Hypergeometric random variable
Consider a population of size N consisting of m items with a characteristic of interest; thus there are N-m items without the characteristic of interest, e.g. a population with m people with HIV and N-m with no HIV. If a sample of n items is drawn without replacement and X is the number of sampled items with the characteristic, then
$$P(X = x) = \frac{\binom{m}{x}\binom{N-m}{n-x}}{\binom{N}{n}}, \quad \text{where } 0 \le x \le m.$$
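A minimal Python sketch of the hypergeometric formula follows. The function name `hypergeom_pmf` and the numbers used (a population of 20 with 5 having the characteristic, and a sample of 4) are hypothetical and only serve to show how the three combinations fit together.

```python
import math

def hypergeom_pmf(x, N, m, n):
    """P(X = x): x items with the characteristic in a sample of n drawn
    without replacement from N items, m of which have the characteristic."""
    return math.comb(m, x) * math.comb(N - m, n - x) / math.comb(N, n)

# Hypothetical values: N = 20, m = 5, sample size n = 4
probs = [hypergeom_pmf(x, 20, 5, 4) for x in range(5)]  # x = 0, 1, 2, 3, 4
print([round(p, 4) for p in probs])
```

The probabilities over all possible values of x add up to 1, which is a quick check that the formula has been applied correctly.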
4.4 Property of continuous random variable and distribution
Recall in discrete probability that if X is a discrete random variable, then the probability that X takes a specific value is defined as P(X=x), where x is the specific value of X. Now if X is continuous, that is, it can assume any of the values in a given interval, e.g. a<X<b, then the probability distribution is not described by p(X=x) but rather by p(x<X<x+∆x)≈f(x)dx, where dx is the small change in X. Consequently we talk of X in an interval, so that we try to have p(a<X<b), which is the area under the curve f(x), the probability density function.
(2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$. (The total area beneath the curve f(x) is exactly 1.)
(3) The cumulative distribution function (cdf), denoted by F(x), is
$P(X<x) = F(x) = \int_{-\infty}^{x} f(t)\,dt$, which is the area under f(x) from x to the left.
so c = 1/2.
Example
The delivery time X is uniformly distributed from 1 to 5 days, which gives
$$f(x) = \begin{cases} \frac{1}{4}, & 1 \le x \le 5 \\ 0, & \text{otherwise} \end{cases}$$
Find the probability that the delivery time is two or more days.
Solution
$$P(X \ge 2) = \int_{2}^{5} \frac{1}{4}\,dx = \frac{1}{4}(5 - 2) = \frac{3}{4}$$
Definition: The mean, or expected value, or expectation, of a continuous r.v. X with p.d.f. f(x) is given by
$$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx.$$
Definition: Let X be a continuous r.v. with p.d.f. f(x) and mean μ. The variance of X, or the variance of the distribution of X, is given by
$$\sigma^2 = V(X) = E[(X - \mu)^2] = E(X^2) - [E(X)]^2 = \int_{-\infty}^{\infty} (x - E(X))^2 f(x)\,dx.$$
The standard deviation of X is just the square root of the variance.
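The definitions of mean and variance can be tried out numerically on the delivery-time example, where X is uniform from 1 to 5 days. The sketch below approximates each integral with a simple midpoint rule; the helper name `integrate` and the number of steps are illustrative choices, not part of the module.

```python
def integrate(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def f(x):
    """Uniform density on [1, 5]: f(x) = 1/4 inside the interval, 0 outside."""
    return 0.25 if 1 <= x <= 5 else 0.0

mean = integrate(lambda x: x * f(x), 1, 5)              # E(X)
second_moment = integrate(lambda x: x**2 * f(x), 1, 5)  # E(X^2)
variance = second_moment - mean**2                      # V(X) = E(X^2) - [E(X)]^2
p_two_or_more = integrate(f, 2, 5)                      # P(X >= 2)

print(round(mean, 4), round(variance, 4), round(p_two_or_more, 4))
```

For this density the exact values are E(X) = 3, V(X) = 4/3 and P(X ≥ 2) = 3/4, so the numerical results should agree with them to several decimal places.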
Example
For a lathe in a machine shop, let X denote the percentage of time out of a 40-hour workweek that the lathe is actually in use. Suppose X has a probability density function given by
$$f(x) = \begin{cases} 3x^{2}, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$
Find the mean and variance of X.
Solution
From the definition of expected value we have,
$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{0}^{1} x(3x^{2})\,dx = \int_{0}^{1} 3x^{3}\,dx = 3\left[\frac{x^{4}}{4}\right]_{0}^{1} = \frac{3}{4} = 0.75$$
Thus, on the average, the lathe is in use 75% of the time. To compute V(X), we first find E(X²) as follows:
$$E(X^{2}) = \int_{-\infty}^{\infty} x^{2} f(x)\,dx = \int_{0}^{1} x^{2}(3x^{2})\,dx = \int_{0}^{1} 3x^{4}\,dx = 3\left[\frac{x^{5}}{5}\right]_{0}^{1} = \frac{3}{5} = 0.60$$
Then, $V(X) = E(X^{2}) - [E(X)]^{2} = 0.60 - (0.75)^{2} = 0.0375$.
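The lathe calculation can be reproduced exactly with fractions. This sketch assumes the density is f(x) = 3x² on 0 ≤ x ≤ 1, an assumption that is consistent with the 75% mean stated in the example; the function name `moment` is illustrative.

```python
from fractions import Fraction

def moment(k):
    """E(X^k) for the assumed density f(x) = 3x^2 on [0, 1].

    The integral of x^k * 3x^2 from 0 to 1 equals 3 / (k + 3).
    """
    return Fraction(3, k + 3)

mean = moment(1)                  # E(X)
second_moment = moment(2)         # E(X^2)
variance = second_moment - mean**2

print(mean, second_moment, variance, float(variance))
```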
Activity 4.3
1. Suppose Y, the grams of lead per liter of gasoline, has the density
function f(y) = 12.5y – 1.25 for 0.1 <y< 0.5. What is the
probability that the next liter of gasoline has less than 0.3 grams
of lead?
2. Show that $V(X) = E[(X - E(X))^{2}] = E(X^{2}) - [E(X)]^{2}$
shape. The following are normal curves with varying standard deviation, sigma(σ) but with
same centre(μ):
4.5.2 Standard Normal Distribution
A normal distribution with μ = 0 and σ = 1 is known as a standard normal distribution. The letter Z is used to indicate the standard normal variable. The density function of the standard normal is defined as $f(z) = \frac{1}{\sqrt{2\pi}} e^{-z^{2}/2}$, where $-\infty < z < \infty$. The expected value and variance of the standard normal Z are E(Z) = 0 and V(Z) = 1 respectively.
Solution
The P(Z ≤ 1.53) = $\int_{-\infty}^{1.53} \frac{1}{\sqrt{2\pi}} e^{-z^{2}/2}\,dz$ is equal to the shaded area in figure below.
Fig 4.10: P(Z 1)
b) P(Z < -1.5) = P(Z > 1.5) = 1 - P(Z < 1.5) = 1 - 0.9332 = 0.0668 as shown in figure below
e) To find the value of Z, say z0, such that P(Z < z0) = 0.99, we must look for the given
probability 0.99 in the Z-tables. The closest we can come is 0.9901, which
corresponds to the z-value of 2.33. Hence, z0 = 2.33.
Now the probability that the machine will dispense more than 17 ounces (p(X>17)) is just the
same as P(Z>1) = 1 - P(Z<1) = 0.1587
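The Z-table lookups used in these solutions can be cross-checked in Python. The sketch below writes the standard normal cdf in terms of the error function from the `math` module; the function name `phi` is an illustrative choice.

```python
import math

def phi(z):
    """Standard normal cdf, P(Z < z), expressed with the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

p_below_minus_1_5 = phi(-1.5)   # P(Z < -1.5)
p_above_1 = 1 - phi(1.0)        # P(Z > 1)
p_below_2_33 = phi(2.33)        # should be close to 0.99

print(round(p_below_minus_1_5, 4), round(p_above_1, 4), round(p_below_2_33, 4))
```

Small differences in the fourth decimal place can occur because printed tables are rounded.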
4.5.3 The Student t-distribution
The following are the properties of the t-distribution:
1. It is bell-shaped like the normal curve.
2. It is similar to the standard normal in that it also has a mean of 0, but it has a standard deviation of more than 1.
3. Hence the t-curve is wider than the standard normal curve, but for larger sample sizes, n ≥ 30, the t-curve is closer to the standard normal curve.
Fig 4.17: F distribution
Activity 4.4
1. Let Z be a standard normal random variable; find z such that p(Z > z) = 0.012.
2. How is the chi-square random variable related to the F random variable?
4.8 Sampling Distributions
4.8.1 Distribution of sample mean ($\bar{X}$)
Suppose we take many repeated samples (of the same size) from the same population, and each time calculate the sample mean, so that we have a set of sample means. For example, take 500 samples of size 20 as follows:
$x_{1}^{1}, x_{2}^{1}, \ldots, x_{20}^{1} \rightarrow \bar{x}_{1}$
$x_{1}^{2}, x_{2}^{2}, \ldots, x_{20}^{2} \rightarrow \bar{x}_{2}$
$\vdots$
$x_{1}^{500}, x_{2}^{500}, \ldots, x_{20}^{500} \rightarrow \bar{x}_{500}$
We may be interested in the histogram of these sample means to see the distribution of the sample means. What would that set of sample means, $\bar{X}$ values, look like in a histogram?
Consider the sampling distribution of the sample mean $\bar{X}$ when we take samples of size n from a normal population with mean µ and standard deviation σ. The sampling distribution of $\bar{X}$ is also a normal distribution with centre (mean) µ and standard deviation $\sigma/\sqrt{n}$.
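The idea of taking 500 samples of size 20 can be simulated. The sketch below draws the samples from a normal population and compares the spread of the sample means with σ/√n. The population mean of 50, standard deviation of 10 and the random seed are hypothetical choices made only for this illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so that the run is repeatable

mu, sigma = 50.0, 10.0      # hypothetical population mean and standard deviation
n, num_samples = 20, 500    # 500 samples of size 20, as in the text

# One sample mean per sample
sample_means = [
    statistics.mean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_sd = sigma / n**0.5

print(round(mean_of_means, 2), round(sd_of_means, 2), round(theoretical_sd, 2))
```

The mean of the sample means should be close to µ, and their standard deviation close to σ/√n, which is about 2.24 for these values.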
Now we need the standardized sample mean, $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$, which has a standard normal distribution.
4.8.2 The sampling distribution of $\frac{\bar{X} - \mu}{s/\sqrt{n}}$
In practice, the population standard deviation σ in $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is typically unknown, so we estimate σ with the sample standard deviation s, to have $\frac{\bar{X} - \mu}{s/\sqrt{n}}$. The quantity $\frac{\bar{X} - \mu}{s/\sqrt{n}}$ has approximately a standard normal distribution if $n \ge 30$, that is, if the sample size is large enough. If the sample size is small, that is, if $n < 30$, then the quantity $\frac{\bar{X} - \mu}{s/\sqrt{n}}$ has a t-distribution ("Student's t") with n – 1 degrees of freedom (the parameter of the t-distribution). The t-distribution resembles the standard normal (symmetric, centered at zero) but it is more spread out. The fewer the degrees of freedom, the more spread out the t-distribution is, and as the d.f. increase, the t-distribution gets closer to the standard normal.
Activity 4.5
1. Suppose extension workers' salaries have a mean of $90,000 and a standard deviation of $30,000 (highly skewed, not normal). Given a sample of extension workers, can we find the probability the sample mean is less than $100,000 if n = 5? If n = 30?
2. A sample of 45 maize grains was taken to test a hypothesis about a population maize mean weight of 8.7g. The sample standard deviation was 2g. What would be the distribution of $\frac{\bar{X} - \mu}{s/\sqrt{n}}$?
Unit Activity
A farmer cooperative knows that mean of earnings per year is $200,000
and needs to know the probability that in a given quarter the earnings of the
cooperative will be less than $100,000, P(X < 100,000) with a cooperative
standard deviation of $80,000. What is the probability?
4.9 Reflection
What are other examples of normal random variables in real life situation?
Unit Summary
In this unit you have learnt about random variables and their distributions. For
example you have learnt about Binomial, Poisson random variables for discrete
random variables, and normal, t, chi-square and f random variable for continuous
random variables. The unit has also covered the sampling distribution for sample
means, and the related quantities. What has been covered in this unit will enable
you to do inferential statistics like hypothesis testing and interval estimation in the
subsequent unit.
The following are properties of the normal density curve:
1. It is symmetric about its mean, μ; the area (probability) to the left of the mean is the same as the area to the right of the mean.
2. The mean = median = mode.
3. Total area under the curve (total probability) = 1.
4. Area under the curve to the right of μ equals the area under the curve to the left of μ, which equals ½.
5. As x increases without bound (gets larger and larger), the graph approaches, but never reaches, the horizontal axis. As x decreases without bound (gets larger and larger in the negative direction), the graph approaches, but never reaches, the horizontal axis.
End of unit test
1. Suppose there are three balls in a box. One of the balls is number 1, another is the number 2, and the third is the number 3. You select two balls at random and without replacement from the box and note the two numbers observed. Let X be the sum of the two balls selected.
a. Draw the table of distribution for X.
b. What is the probability that the sum is at least 4?
c. What is the mean of X?
2. Suppose you take a sample of 25 high-school students, and measure their IQ. Assuming that IQ is normally distributed with μ = 100 and σ = 15, what is the probability that your sample's mean IQ will be 105 or greater?
Answers to activities in Unit 4
Answer to Activity 4.1
X        0     1     2
P(X=x)   0.25  0.5   0.25
2. $V(X) = E[X - E(X)]^{2} = E[X^{2} - 2XE(X) + (E(X))^{2}]$, expanding, $= E(X^{2}) - 2E(X)E(X) + [E(X)]^{2}$, entering expectation operator, $= E(X^{2}) - [E(X)]^{2}$.
Answer to Activity 4.5
1. n=5: we may not find the probability because we don't know the distribution of the sample mean.
n=30: we may find the probability using the standard normal, because the distribution of the sample mean is normal since the sample size is large enough (central limit theorem).
2. Standard normal, since the sample size is large enough, i.e. greater than 30.
Answers to Unit test
1. (a) The sample space consists of six equally likely outcomes {(1,2), (1,3), (2,1), (2,3), (3,1), (3,2)}.
X={3, 4, 3, 5, 4, 5}.
Table of distribution of X
X      3    4    5
P(X)   1/3  1/3  1/3
(b) P(the sum is at least 4) = P(4)+P(5) = 2/3
(c) Mean of X = 3×(1/3) + 4×(1/3) + 5×(1/3) = 4
2. $P(\bar{X} \ge 105) = P\left(Z \ge \frac{105 - 100}{15/\sqrt{25}}\right) = P(Z \ge 1.67) = 1 - 0.9525 = 0.0475$
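Both unit test answers can be verified with a short Python sketch. It lists the six ordered draws of two balls to build the distribution of the sum, and then standardizes the sample mean for the IQ question. Variable names are illustrative only.

```python
import math
from fractions import Fraction
from itertools import permutations

# Question 1: two balls drawn without replacement from balls numbered 1, 2, 3
outcomes = list(permutations([1, 2, 3], 2))  # six equally likely ordered draws
dist = {}
for first, second in outcomes:
    total = first + second
    dist[total] = dist.get(total, 0) + Fraction(1, len(outcomes))

p_at_least_4 = sum(p for s, p in dist.items() if s >= 4)
mean_x = sum(s * p for s, p in dist.items())

# Question 2: sample of 25, IQ normal with mean 100 and standard deviation 15
z = (105 - 100) / (15 / math.sqrt(25))
p_mean_105_or_more = 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(dist, p_at_least_4, mean_x, round(z, 2), round(p_mean_105_or_more, 4))
```

The probability for Question 2 comes out slightly different from a table answer because the code uses z = 5/3 without rounding it to 1.67 first.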
Unit 5:
Inferential Statistics
5.0 Introduction
The aim of this unit is to introduce you to hypothesis testing and interval estimation.
The unit will cover the definition of a hypothesis, testing hypothesis about the mean
and proportion using the z-test, t-test, and F-test. In the unit you will also learn about
testing for association in cross tables using the chi-square. The unit offers a background
in testing of research hypothesis.
Key terms
• Hypothesis
• Significance level
• Type I error
• Type II error
Under hypothesis testing we will use the sample to make inferences about the claims
in the population. A statistical hypothesis is a conjecture/claim about a population
parameter which may or may not be true. Statistical hypothesis testing is a
decision-making process for evaluating claims about a population using the sample.
The following are examples of hypothesis:
There are two types of statistical hypotheses: the null hypothesis and the alternative
hypothesis. The null hypothesis, symbolized by H0, is a statistical hypothesis that
states that there is no difference between a parameter and a specific value, or that there
is no difference between two parameters. The null hypothesis contains an equal sign =.
The alternative hypothesis, symbolized by H1, is a statistical hypothesis that states the
existence of a difference between a parameter and a specific value, or states that there
is a difference between two parameters. The alternative hypothesis usually contains the
symbol >, <, or ≠.
Activity 5.1
Which of the examples of hypothesis given above are null and which are
alternative?
The following is a real life example of a left tailed hypothesis formulation:
Ho: Average burley tobacco yield in 2014 was 50 000 000 kg or more
H1: Average burley tobacco yield in 2014 was less than 50 000 000 kg
Now the appropriate sample statistic to use to test the above hypotheses is the sample mean $\bar{x}$.
If the sample used to test the hypotheses is from a normal($\mu, \sigma^{2}$) population, then the sample mean $\bar{x}$ is normal($\mu, \sigma^{2}/n$), so that we can use the z-test defined as $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$. Note the Z-test is after standardizing the sample mean. If σ is unknown but $n \ge 30$, then the z-test $z = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ is still used, since by the central limit theorem $\bar{x}$ is normally distributed, so that $\bar{x}$ can still be standardized to have $z = \frac{\bar{x} - \mu}{s/\sqrt{n}}$. If σ is unknown and n < 30, then $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ is used. The following are the rejection criteria. For the right tailed Z test, H0 is rejected when $z \ge z_{\alpha}$.
Figure below shows the rejection region for the right tailed Z test.
Fig 5.5: Left tailed t test rejection region
For the two tailed t-test H0 is rejected when $|t| \ge t_{\alpha/2,\,n-1}$, i.e. when $t \le -t_{\alpha/2,\,n-1}$ or $t \ge t_{\alpha/2,\,n-1}$.
Thus we have one right tailed test. Now the appropriate statistic to test a claim about the population mean ($\mu$) is the sample mean ($\bar{x}$). Now the sampling distribution of the sample mean ($\bar{x}$) must be normal($\mu, \sigma^{2}/n$) since $n \ge 30$, by CLT. Thus our test statistic is $Z = \frac{\bar{x} - \mu}{s/\sqrt{n}}$, after standardizing $\bar{x}$. Rejection criteria: we reject the null when $Z \ge z_{\alpha}$, using the Z-tables. Computing Z from the sample data gives a value less than the critical value, so we fail to reject the null hypothesis, i.e. the mean salary is likely to be 12837.
Example
A nutritionist believes that a 12 ounce box of breakfast cereal should contain an average of 1.2 ounces of bran. The nutritionist measures a random sample of sixty boxes of a popular cereal for bran content. Suppose the sample mean is $\bar{x} = 1.170$ and the standard deviation is s = 0.111. Do the data indicate that the mean bran content of all boxes of this brand of cereal differs from 1.2 ounces? Use α = .05.
Solution
Stating null and alternative hypothesis:
$H_0: \mu = 1.2$
$H_1: \mu \ne 1.2$
Thus we have a two tailed test. Now since $n = 60 \ge 30$, we will use the z test statistic defined as $z = \frac{\bar{x} - \mu}{s/\sqrt{n}}$. Rejection criteria: we reject $H_0$ if $|z| \ge z_{\alpha/2}$, i.e. when $|z| \ge 1.96$. Now $z = \frac{1.170 - 1.2}{0.111/\sqrt{60}} \approx -2.09$. Since $|z| = 2.09 \ge 1.96$, we reject $H_0$; that is, the data indicate that the mean bran content differs from 1.2 ounces.
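The z statistic for the cereal example can be computed with a few lines of Python. The function name `z_statistic` is an illustrative choice; the sample values come from the example, and 1.96 is the usual Z-table value for a two tailed test at α = 0.05.

```python
import math

def z_statistic(sample_mean, mu0, s, n):
    """Standardized sample mean: (x-bar - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (s / math.sqrt(n))

# Values from the cereal example
z = z_statistic(sample_mean=1.170, mu0=1.2, s=0.111, n=60)

critical = 1.96                 # z_{alpha/2} for alpha = 0.05, from Z-tables
reject_h0 = abs(z) >= critical  # two tailed rejection rule

print(round(z, 2), reject_h0)
```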
Example
The average amount of rainfall during the summer months for the northeast part of the United States is 11.52 inches. A researcher selects a random sample of 10 cities in the northeast and finds that the average amount of rainfall for 1995 was 7.42 inches. The standard deviation of the sample is 1.3 inches. At α = 0.05, can it be concluded that for 1995 the mean rainfall was below 11.52 inches?
Solution
Stating null and alternative hypothesis we have:
$H_0: \mu \ge 11.52$
$H_1: \mu < 11.52$
This means we will have one left tailed test. The appropriate statistic to test about the population mean is the sample mean $\bar{x}$. Now since n = 10 < 30, $\frac{\bar{x} - \mu}{s/\sqrt{n}}$ is distributed as t with n-1 = 9 degrees of freedom. For the rejection criteria: we reject the null hypothesis when $t \le -t_{0.05,\,9} = -1.833$, using t-tables. Now calculating the t test statistic using sample data we have: $t = \frac{7.42 - 11.52}{1.3/\sqrt{10}} \approx -9.97$. Thus since $-9.97 < -1.833$, we reject $H_0$, that is, rainfall is likely to be below 11.52 inches.
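The rainfall test can be checked in the same way. This sketch computes the t statistic from the sample values in the example and compares it with the table value 1.833 for 9 degrees of freedom at α = 0.05. Variable names are illustrative.

```python
import math

sample_mean, mu0 = 7.42, 11.52   # sample mean and hypothesized mean (inches)
s, n = 1.3, 10                   # sample standard deviation and sample size

t = (sample_mean - mu0) / (s / math.sqrt(n))
df = n - 1

t_table = 1.833                  # t_{0.05, 9} from t-tables
reject_h0 = t <= -t_table        # left tailed rejection rule

print(round(t, 2), df, reject_h0)
```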
, as alternative hypothesis.
The test of these hypotheses would be left tailed because rejection region would be the left
tail of the probability distribution of the test statistic. Real life examples of hypothesis
formulations about difference between two population means are as follows:
Example 1
Example 2
Example 3
In all of the above hypothesis formulations the test statistic is the difference between two sample means $\bar{x}_1 - \bar{x}_2$, that is, one may collect two samples of size $n_1$ and $n_2$ respectively and have $\bar{x}_1$ and $\bar{x}_2$, and then see if $\bar{x}_1 = \bar{x}_2$ or $\bar{x}_1 \ne \bar{x}_2$. If one has $\bar{x}_1 = \bar{x}_2$, then he/she may have an idea that $H_0$ is true; otherwise he/she may say that $H_1$ is true. Now note that if we sample from normal populations, that is, normal($\mu_1, \sigma_1^{2}$) and normal($\mu_2, \sigma_2^{2}$), then $\bar{x}_1 - \bar{x}_2$ will also be normal($\mu_1 - \mu_2,\ \sigma_1^{2}/n_1 + \sigma_2^{2}/n_2$). So that we can use the z test after standardizing $\bar{x}_1 - \bar{x}_2$, defined as
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2}}}$$
The z test is also used if $n_1 \ge 30$ and $n_2 \ge 30$ and $\sigma_1^{2}$, $\sigma_2^{2}$ are estimated by the sample variances. If $n_1 < 30$ and $n_2 < 30$ and $\sigma_1^{2}$, $\sigma_2^{2}$ are estimated by sample variances $s_1^{2}$ and $s_2^{2}$ respectively, then we use the t-distribution, that is,
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^{2}}{n_1} + \frac{s_2^{2}}{n_2}}}$$
with $n_1 - 1$ or $n_2 - 1$ degrees of freedom. Normally to use this t-test we use the smaller degrees of freedom of the two, $n_1 - 1$ or $n_2 - 1$. If you assume that population variances are equal, that is, $\sigma_1^{2} = \sigma_2^{2} = \sigma^{2}$, then you have test statistic
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma^{2}}{n_1} + \frac{\sigma^{2}}{n_2}}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma^{2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{\bar{x}_1 - \bar{x}_2}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
since under the null hypothesis $\mu_1 = \mu_2$, that is, $\mu_1 - \mu_2 = 0$. Now if the common variance $\sigma^{2}$ is not known and $n_1 < 30$ and $n_2 < 30$, then we may use the sample variances to estimate $\sigma^{2}$ as follows:
$$s_p^{2} = \frac{(n_1 - 1)s_1^{2} + (n_2 - 1)s_2^{2}}{n_1 + n_2 - 2} \quad \text{and} \quad s_p = \sqrt{s_p^{2}}.$$
The $s_p^{2}$ is called the pooled sample variance because it is the weighted average of the two sample variances. Now the z-test statistic then becomes the t-test statistic defined as
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
with $n_1 + n_2 - 2$ degrees of freedom.
Example
Two types of fertilizers, UREA and CAN, were applied to two maize plots 1 and 2 respectively. Farmers think that there is no difference in maize yield between the two fertilizers. A researcher takes a sample of 40 maize grains in plot 1 and 32 maize grains in plot 2. The average weight of maize grains is 10kg in plot 1 and 7kg in plot 2. The standard deviation for weights for plot 1 is 2kg and for plot 2 is 4kg. Test whether there is a difference in maize yield for UREA and CAN.
Solution
The two hypotheses are:
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \ne \mu_2$
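Since both samples in the fertilizer example are larger than 30, the z test for two means applies. The sketch below computes the statistic from the figures in the example. The significance level α = 0.05 (table value 1.96) is an assumption made here, because the example does not state one, and the variable names are illustrative.

```python
import math

# Plot 1 (UREA) and plot 2 (CAN), values from the example
mean1, s1, n1 = 10.0, 2.0, 40
mean2, s2, n2 = 7.0, 4.0, 32

# Standard error of the difference between the two sample means
standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
z = (mean1 - mean2) / standard_error

critical = 1.96                 # assumed alpha = 0.05, two tailed
reject_h0 = abs(z) >= critical

print(round(z, 3), reject_h0)
```

Under the assumed level the statistic lies well beyond the table value, so the data would point to a difference between the two fertilizers.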
Example
Example
A farmer thinks that local chickens lay eggs with larger weight than hybrid. She collects 10
A farmer thinks that local chickens lay eggs with larger weight than hybrid. She
eggs for local and 12 eggs for hybrid. The mean weight for local is found to be 5kg and that
collects 10 eggs for local and 12 eggs for hybrid. The mean weight for local is found to
for hybrid is found to be 12kg. The standard deviation for local weight is 2kg and that for
be 5kg and that for hybrid is found to be 12kg. The standard deviation for local weight
hybrid is 3kg. Test the claim for the farmer.
is 2kg and that for hybrid is 3kg. Test the claim for the farmer.
Solution
Data: $\bar{x}_1 = 5$, $\bar{x}_2 = 12$, $s_1 = 2$, $s_2 = 3$, $n_1 = 10$, $n_2 = 12$, where $\bar{x}_1$ denotes the sample mean weight for local chicken eggs, $\bar{x}_2$ denotes the sample mean weight for hybrid chicken eggs, and $s_1$ and $s_2$ denote the sample standard deviations for the weight of local chicken eggs and hybrid chicken eggs respectively.
Hypothesis: $H_0: \mu_1 \le \mu_2$ versus $H_1: \mu_1 > \mu_2$
This is a one tailed test. The test statistic to use is
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
with the smaller degrees of freedom of $n_1 - 1$ and $n_2 - 1$, since sample sizes are less than 30. We reject $H_0$ if $t \ge t_{\alpha, df}$, where df is the smaller of $n_1 - 1$ and $n_2 - 1$, that is, when $t \ge t_{\alpha, 9}$. Now
$$t = \frac{5 - 12}{\sqrt{\frac{2^2}{10} + \frac{3^2}{12}}} = \frac{-7}{\sqrt{1.15}} = -6.53$$
Thus since $t = -6.53 < t_{\alpha, 9}$ we fail to reject the null hypothesis, that is, based on the available data $\mu_1 \le \mu_2$, that is, eggs of local chickens are smaller or equal to those of hybrid chickens.
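The egg example can be checked with a short Python sketch. It follows the approach described above (separate variances, with the smaller of $n_1 - 1$ and $n_2 - 1$ as degrees of freedom). The function name `unequal_variance_t` is a hypothetical helper, not part of any library.

```python
import math

def unequal_variance_t(mean1, mean2, sd1, sd2, n1, n2):
    """t statistic with separate variances and conservative degrees of freedom."""
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    t = (mean1 - mean2) / standard_error
    df = min(n1 - 1, n2 - 1)  # smaller of the two sample degrees of freedom
    return t, df

# Local eggs: n=10, mean 5, sd 2; hybrid eggs: n=12, mean 12, sd 3
t, df = unequal_variance_t(5, 12, 2, 3, 10, 12)
print(round(t, 2), df)  # -6.53 9
```

Because the statistic is negative, it cannot exceed a positive upper-tail critical value, which agrees with the conclusion of the worked solution.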
Example
Is there a difference in book return times between students of two universities?
Book-Return times for two University bookstores (in days)
LUANAR    Mzuzu
2         3
4.3       6.5
8.5       5
3         7.5
2         8
4
3
Solution
$H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \neq \mu_2$, where $\mu_1$ and $\mu_2$ denote the mean return time for LUANAR and Mzuzu university students respectively.
This is a two tailed test. The test statistic is
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
with $n_1 + n_2 - 2$ degrees of freedom, where
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}.$$
Note here we assume that the two universities have equal return time variances, but you can also use
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
with the smallest degrees of freedom of $n_1 - 1$ and $n_2 - 1$, when you assume that the university book return time variances are different. Why do we use the t-test instead of the z-test? Because $n_1, n_2 < 30$. Here,
$\bar{x}_1$ = Mean for LUANAR University
$\bar{x}_2$ = Mean for Mzuzu University
$s_1^2$ = Variance for LUANAR University
$s_2^2$ = Variance for Mzuzu University
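The pooled-variance t statistic described in this solution can be sketched in Python as follows. The function name `pooled_t` is a hypothetical helper, and the two short samples in the usage lines are made-up illustrative values, not the book return data.

```python
import math
from statistics import mean, variance

def pooled_t(sample1, sample2):
    """Two-sample t statistic assuming equal population variances."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled variance: weighted average of the two sample variances
    pooled_variance = ((n1 - 1) * variance(sample1)
                       + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = (mean(sample1) - mean(sample2)) / standard_error
    return t, n1 + n2 - 2  # statistic and degrees of freedom

# Illustrative (hypothetical) samples
t, df = pooled_t([1, 2, 3], [2, 3, 4])
print(round(t, 4), df)
```

The returned statistic is compared with the two-sided critical value $t_{\alpha/2,\, n_1 + n_2 - 2}$ read from the t Table.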
Activity 5.3
1. A farmer wants to test the claim that farmers in Malawi spend on average 5hrs at a farm. He collects the times 10 farmers spent at a farm and finds that the times spent have an average of 6hrs and a standard deviation of 3hrs. Assuming that times are approximately normal, what would be the appropriate test statistic to be used in this test?
2. Suppose a production line operates with a mean filling weight of 16 ounces per container. Since over- or under-filling can be dangerous, a quality control inspector samples 30 items to determine whether or not the filling weight has to be adjusted. The sample revealed a mean of 16.32 ounces. From past data, the standard deviation is known to be .8 ounces. Using a 0.10 level of significance, can it be concluded that the process is out of control (not equal to 16 ounces)?
5.4 Testing hypothesis about population proportion
5.4.1 Hypothesis about single population proportion
For large samples the sample proportion $\hat{p}$ is approximately normal, so the test statistic is obtained by standardizing $\hat{p}$:
$$Z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}$$
where $p_0$ is the value of $p$ under the null hypothesis. That is, when testing a hypothesis about a population proportion $p$, we will assume large samples so as to use the z-test statistic.
Example
A marketing company claims that it receives 4% responses from its mailing. To test this claim, a random sample of 500 was surveyed with 25 responses. Test at the α = 0.05 significance level.
Solution
$H_0: p = 0.04$
$H_1: p \neq 0.04$
This is a two-sided rejection region test since sample proportions that are either much smaller than 0.04 or much larger than 0.04 would cause you to reject the null and support the alternative. Rejection region: we reject the null hypothesis when $Z \ge z_{\alpha/2}$ or $Z \le -z_{\alpha/2}$, that is, when $Z \ge z_{0.025}$ or $Z \le -z_{0.025}$, that is, whenever $Z \ge 1.96$ or $Z \le -1.96$. Now $\hat{p} = 25/500 = 0.05$ and the value of the test statistic is
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}} = \frac{0.05 - 0.04}{\sqrt{\frac{0.04 \times 0.96}{500}}} = 1.14.$$
In conclusion, since $z = 1.14 < 1.96$, we fail to reject the null hypothesis, that is, the data give no evidence against the claim of the marketing company. Note we used the Z-test since, by the CLT, $\hat{p}$ is approximately normal for large samples.
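A minimal Python sketch of the one-proportion z-test for the mailing example follows. The function name `one_proportion_z` is a hypothetical helper. The critical value 1.96 is the two-sided 5% value used in the solution.

```python
import math

def one_proportion_z(successes, n, p0):
    """z statistic for testing H0: p = p0 with a large sample."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)  # uses p0, the null value
    return (p_hat - p0) / standard_error

# Mailing example: 25 responses out of 500, claimed response rate 4%
z = one_proportion_z(25, 500, 0.04)
print(round(z, 2), abs(z) >= 1.96)
```

Note that the standard error is computed from the hypothesised proportion $p_0$, not from $\hat{p}$, because the statistic is evaluated under the null hypothesis.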
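When comparing two population proportions with large samples, the difference of sample proportions is approximately normal after standardising $\hat{p}_1 - \hat{p}_2$. Under the assumption of equal population proportions, that is, $p_1 = p_2$, we have
$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where $\hat{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$, that is, $p$ is estimated by the pooled sample proportion.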
Example
A farmer club in Mzuzu claims that the proportion of rotten ground nuts in their 50kg bag is the same as that for Mulli Brothers. A researcher collects 100 ground nuts from a bag of the farmer club and finds that 20% are rotten, and collects 80 from a Mulli bag and finds that 12% are rotten. Test the claim of the farmers club.
Solution
Data: $\hat{p}_1 = 0.20$, $\hat{p}_2 = 0.12$, $n_1 = 100$, $n_2 = 80$.
The test statistic is
$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \quad \text{where } \hat{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2},$$
since sample sizes are more than 30. We reject the null hypothesis when $|Z| > z_{\alpha/2}$, that is, when $Z \ge z_{\alpha/2}$ or $Z \le -z_{\alpha/2}$, that is, when $Z \ge z_{0.025}$ or $Z \le -z_{0.025}$, that is, when $Z \ge 1.96$ or $Z \le -1.96$. Now
$$\hat{p} = \frac{100(0.20) + 80(0.12)}{100 + 80} = 0.164$$
and
$$Z = \frac{0.20 - 0.12}{\sqrt{0.164(1 - 0.164)\left(\frac{1}{100} + \frac{1}{80}\right)}} = 1.44.$$
Since $Z < 1.96$ we fail to reject the null hypothesis, that is, based on the available data there is no evidence of a difference in proportions.
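The two-proportion calculation can be sketched in Python as below. The function name `two_proportion_z` is a hypothetical helper. It pools the two sample proportions as described in this section.

```python
import math

def two_proportion_z(p1_hat, n1, p2_hat, n2):
    """z statistic for H0: p1 = p2 using the pooled sample proportion."""
    pooled = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / standard_error

# Ground nut example: 20% rotten of 100 (farmer club), 12% rotten of 80 (Mulli)
z = two_proportion_z(0.20, 100, 0.12, 80)
print(round(z, 2), abs(z) >= 1.96)
```

For these data the statistic is roughly 1.44, inside the interval (-1.96, 1.96), so the null hypothesis is not rejected at the 5% level.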
Activity 5.4
Two pesticides are applied to two maize plots respectively with the aim of comparing the effectiveness of the pesticides. The following shows the proportion of pests that died after the treatment of the maize plots.
Sample 1                                      Sample 2
$n_1 = 368$                                   $n_2 = 405$
$x_1 = 175$                                   $x_2 = 182$
$\hat{p}_1 = x_1/n_1 = 175/368 = 0.476$       $\hat{p}_2 = x_2/n_2 = 182/405 = 0.449$
Test whether there is a significant difference in effectiveness of the pesticides.
5.5 Testing for equality of multiple populations means using F-test
Consider the data in the table below:
Treatments/groups    Observations
1                    $x_{1,1}$  $x_{1,2}$  ….  $x_{1,n_1}$
2                    $x_{2,1}$  $x_{2,2}$  ….  $x_{2,n_2}$
.                    .          .          .   .
.                    .          .          .   .
.                    .          .          .   .
k                    $x_{k,1}$  $x_{k,2}$  ….  $x_{k,n_k}$
One may wish to test whether the treatments/groups are equal or not, e.g. testing whether crop yields under different fertilizers are equal or not. Testing equality of such treatments is the same as testing whether the treatment means are equal or not. The following is the hypothesis formulation when testing for equality of treatments/groups:
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$ versus
$H_1:$ not all treatment means are equal / at least two treatment means are different.
The appropriate test statistic is the F-test statistic, defined as $F = \frac{MST}{MSE}$, which is $\sim F_{k-1,\,N-k}$, that is, it has an $F$ distribution with degrees of freedom for the numerator as $k - 1$ and degrees of freedom for the denominator as $N - k$, where $N = n_1 + n_2 + \dots + n_k$ is the total number of observations.
$$MST = \frac{\sum_{j=1}^{k} n_j(\bar{x}_j - \bar{x})^2}{k - 1}$$
where $n_j$ is the sample size for group/treatment j, $\bar{x}_j$ is the sample mean for group j for j=1,2,3,…,k, $\bar{x}$ is the grand/overall mean of all the data regardless of group, and k-1 is the degrees of freedom for between groups/treatments, which is the number of groups (k) minus 1.
$$MSE = \frac{\sum_{j=1}^{k}(n_j - 1)s_j^2}{N - k} = \frac{\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{j,i} - \bar{x}_j)^2}{N - k}$$
where $s_j^2$ is the group j sample variance, $x_{j,i}$ is an observation in group $j$ and column $i$, and $\bar{x}_j$ is the group $j$ sample mean. The null hypothesis is rejected when the F-test statistic is greater than or equal to the critical F-value, i.e. when $F \ge F_{\alpha,\,k-1,\,N-k}$.
Note the F-test is always a right tailed test, that is, the rejection region for the F-test is always to the right, because the F-values are always positive.
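Solution
Here $k = 3$ and $N = 18$, with group sample means $\bar{x}_1 = 195.8$, $\bar{x}_2 = 175$ and $\bar{x}_3 = 161.6$.
Then,
$$MSE = \frac{\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{j,i} - \bar{x}_j)^2}{N - k} = \frac{(210 - 195.8)^2 + \dots + (190 - 195.8)^2 + (180 - 175)^2 + \dots + (155 - 175)^2 + (145 - 161.6)^2 + \dots + (175 - 161.6)^2}{18 - 3} = 216.91$$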
difference in the means of products of the three production lines. Note that the F-test for equality of the treatment means in the table below
Treatments/groups    Observations
1                    ….  …
2                    ….  …
.                    .   .
.                    .   .
.                    .   .
k                    ….  …
is based on ANOVA, analysis of variance. In this case the variability in the observed data is split into different sources, i.e. due to difference between groups and due to error (unobserved influences). This means total variation in the observed data is split/analysed as follows:
Total variation = variation due to group + variation due to error.
Total variation is measured by the total sum of squares (TSS), error variation is measured by the error sum of squares (SSE), and between group variation is measured by the between group sum of squares (BSS, written SST for "treatment" in the ANOVA table below). Thus the summary of analysis of variance in the data becomes as follows:
TSS = BSS + SSE
Now the F-test compares the variability between groups (BSS) and the variability due to error (SSE). If there is a difference between treatment/group means then F is large, that is, variability in data due to groups/treatments (numerator) is larger than variability due to error (denominator). Now the null hypothesis is rejected when $F \ge F_{\alpha,\,k-1,\,N-k}$, the critical value, or when the p-value $\le \alpha$, where $\alpha$ is the significance level. The following is the analysis of variance summary table when using the F-test to compare treatment/group means.
Source of variation        Sum of squares    Degrees of freedom    Mean square        F-value
Between group/treatment    SST               k-1                   SST/(k-1)=MST      F=MST/MSE
Error                      SSE               N-k                   SSE/(N-k)=MSE
Total                      TSS               N-1                   TSS/(N-1)
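The entries of the ANOVA summary table can be computed directly from grouped data. The Python sketch below is one way to do this. The function name `one_way_anova` is a hypothetical helper and the three small groups in the usage lines are made-up values chosen so that the arithmetic is easy to follow by hand.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA table entries for a list of groups."""
    k = len(groups)
    total_n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / total_n
    group_means = [sum(g) / len(g) for g in groups]
    # Between group (treatment) sum of squares
    sst = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    # Error (within group) sum of squares
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    mst = sst / (k - 1)
    mse = sse / (total_n - k)
    return {"SST": sst, "SSE": sse, "TSS": sst + sse,
            "df_between": k - 1, "df_error": total_n - k,
            "MST": mst, "MSE": mse, "F": mst / mse}

# Illustrative (hypothetical) data: three groups of three observations
table = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(table)
```

For these illustrative groups SST = 6, SSE = 6, MST = 3, MSE = 1 and F = 3 with 2 and 6 degrees of freedom. The F value is then compared with the critical value from the F Table.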
Activity 5.5
The following is an ANOVA table for four normal populations with the same variance $\sigma^2$ and means $\mu_1, \mu_2, \mu_3, \mu_4$.
Source     Sum of Square    Degree of Freedom    Mean Square    F
Between    (1)              (3)                  237.4          (6)
Within     (2)              (4)                  (5)
Total      1909.22          22
(a) Complete the above ANOVA table.
(b) Test $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ at $\alpha = 0.05$.
5.6 Testing for association in contingency tables using the Pearson chi-square
Let X1 and X2 be categorical variables with $I$ and $J$ categories/levels respectively in the cross table/contingency table.
X2
Level 1 Level 2 Level 3 … … Level J
Level 1
Level 2
X1 .
.
.
Level I
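We wish to test whether there is an association between X1 and X2. The following is the hypothesis formulation.
$H_0:$ there is no association between X1 and X2 versus
$H_1:$ there is an association between X1 and X2
Or
$H_0:$ X1 and X2 are independent versus
$H_1:$ X1 and X2 are not independent.
One of the test statistics to use is the Pearson chi-square test statistic defined as
$$\chi^2 = \sum_{i=1}^{I}\sum_{j=1}^{J}\frac{(O_{ij} - E_{ij})^2}{E_{ij}} \sim \chi^2_{(I-1)(J-1)},$$
that is, $\chi^2$ has a chi-square distribution with $(I-1)(J-1)$ degrees of freedom, where $I$ is the number of rows, $J$ is the number of columns, and $O_{ij}$ and $E_{ij}$ are the observed and expected frequencies in cell $(i, j)$. The null hypothesis is rejected when the chi-square is large, that is, when
$$\chi^2 \ge \chi^2_{\alpha,\,(I-1)(J-1)}.$$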
Solution
H0: Personality type & heart attack status are independent in the population versus
H1: Personality type & heart attack status are dependent in the population.
Before we can compute $\chi^2$ we first need to find the expected frequencies in each of our category cells. To calculate the E for a particular “cell” in the table we use the formula:
E = (cell’s column total)(cell’s row total) / n
                    Personality type
Heart status        Type A    Type B    Row Total
Heart attack        O=25      O=10      35
No heart attack     O=5       O=40      45
Column Total        30        50        80
E: type A and heart attack: (30)(35)/80 = 13.125
E: type A and no heart attack: (30)(45)/80 = 16.875
E: type B and heart attack: (50)(35)/80 = 21.875
E: type B and no heart attack: (50)(45)/80 = 28.125
Let’s put this information in our table:
                    Personality type
Heart status        Type A      Type B      Row Total
Heart attack        O=25        O=10        35
                    E=13.125    E=21.875
No heart attack     O=5         O=40        45
                    E=16.875    E=28.125
Column Total        30          50          80
Chi-square is obtained via: $\chi^2 = \sum\frac{(O - E)^2}{E}$. The degrees of freedom for this test are:
df = (number of rows –1)(number of columns –1). We have 2 rows and 2 columns, thus our degrees of freedom are: (2-1)(2-1) = 1. Now,
$$\chi^2 = \frac{(25 - 13.125)^2}{13.125} + \frac{(10 - 21.875)^2}{21.875} + \frac{(5 - 16.875)^2}{16.875} + \frac{(40 - 28.125)^2}{28.125} = 10.74 + 6.45 + 8.36 + 5.01 = 30.56.$$
This is far larger than, for example, the 5% critical value of 3.841 for 1 degree of freedom, so the null hypothesis of independence is rejected.
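The expected frequencies and the Pearson chi-square statistic for any contingency table of observed counts can be computed with a short Python sketch like the one below. The function name `pearson_chi_square` is a hypothetical helper, not a library function.

```python
def pearson_chi_square(observed_table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed_table]
    column_totals = [sum(column) for column in zip(*observed_table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(observed_table):
        for j, observed in enumerate(row):
            # E = (row total)(column total) / n
            expected = row_totals[i] * column_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
    df = (len(observed_table) - 1) * (len(observed_table[0]) - 1)
    return chi_square, df

# Personality type (columns) by heart attack status (rows)
chi_square, df = pearson_chi_square([[25, 10], [5, 40]])
print(round(chi_square, 2), df)  # 30.56 1
```

The statistic is then compared with the chi-square critical value for the chosen significance level and the returned degrees of freedom.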
The following data are the number of people who are in favor of, are not in favor of, and have no comment on some proposal:
          Favor    Not Favor    No Comment
Male      252      145          203
Female    148      105          147
Test if female and male differ in their opinions about the proposal.
5.7 Point and interval estimation of population mean and proportion
A point estimate is the value of a single statistic (e.g. the mean) while a confidence interval is an interval or range of numbers constructed around the point estimate. A confidence/interval estimation attaches an error to the estimate, unlike point estimation. Calculating the confidence interval for the population mean: when the population standard deviation is known, which is rarely the case, the 100(1-α)% confidence interval for the population mean μ is:
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \quad \text{or} \quad \left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$
where $\bar{x}$ is the sample mean, µ is the population mean, z is the value of z depending upon the level of confidence desired, 1-α is the confidence level, and $\frac{\sigma}{\sqrt{n}}$ is the standard error of the mean.
Example
Suppose that we wanted to calculate the 95% confidence interval of the mean for the approval rating of President Joyce Banda in the population where $\bar{x}$ = 56% and σ = 12.1 and the sample size was 500. The value for z is obtained from the Z Table where 1-α = 0.95 and the value of z for 0.95/2 or 0.475 = 1.96. Then our 95% CI for the mean would be
$$56 - \left(1.96 \times \frac{12.1}{\sqrt{500}}\right) \le \mu \le 56 + \left(1.96 \times \frac{12.1}{\sqrt{500}}\right)$$
or µ = 56 ± 1.06, or it is within the interval (54.94, 57.06).
When the population standard deviation is unknown, the formula is below. Note that instead of the Z Table the t Table is used in the calculations.
$$\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \quad \text{or} \quad \left(\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\right).$$
Where $\bar{x}$ is the sample mean, µ is the population mean, s is the sample standard deviation, t is the value of t depending upon the confidence level desired, α is the significance level, and n is the sample size.
Example
Suppose that we wanted to calculate the 95% confidence interval of the mean for the approval rating of President Joyce Banda in the population where $\bar{x}$ = 56%, s = 65.3, and the sample size was 1,025. With n - 1 = 1,024 degrees of freedom the t value is practically equal to 1.96, so our confidence interval for the mean would be
$$56 - \left(1.96 \times \frac{65.3}{\sqrt{1025}}\right) \le \mu \le 56 + \left(1.96 \times \frac{65.3}{\sqrt{1025}}\right)$$
or µ = 56 ± 4.0 or (52, 60).
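Both approval rating examples use the same arithmetic, which can be sketched in Python as below. The function name `mean_confidence_interval` is a hypothetical helper. The critical value (z or t) is passed in by the caller, since it is read from the Z Table or t Table.

```python
import math

def mean_confidence_interval(sample_mean, sd, n, critical_value):
    """Confidence interval: sample mean +/- critical value * standard error."""
    margin = critical_value * sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# First example: sigma = 12.1 known, n = 500, z = 1.96
low1, high1 = mean_confidence_interval(56, 12.1, 500, 1.96)
print(round(low1, 2), round(high1, 2))  # 54.94 57.06

# Second example: s = 65.3, n = 1025, critical value taken as 1.96
low2, high2 = mean_confidence_interval(56, 65.3, 1025, 1.96)
print(round(low2, 1), round(high2, 1))  # 52.0 60.0
```

The much wider second interval comes from the larger standard deviation, even though the sample is about twice as large.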
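When calculating the confidence interval for proportions, where there is a dichotomous categorical outcome, the equations are somewhat different. It is assumed that the population follows the binomial distribution and that with multiple trials the normal distribution would be approximated. The 100(1-α)% confidence interval for the population proportion is defined as:
$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \quad \text{or} \quad \left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\right)$$
Where $\hat{p}$ is the sample proportion of one group and $(1 - \hat{p})$ is the sample proportion of the other group.
Example
If in a random sample of 300 voters, 120 preferred candidate X, what is the 95% confidence interval for candidate X? Here $\hat{p} = 120/300 = 0.40$, so our 95% CI for candidate X would be
$$0.40 \pm 1.96\sqrt{\frac{0.40 \times 0.60}{300}} = 0.40 \pm 0.055,$$
that is, (0.345, 0.455).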
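The interval for candidate X can be computed with the following Python sketch. The function name `proportion_confidence_interval` is a hypothetical helper. It assumes the large-sample normal approximation described above, with 1.96 as the 95% critical value.

```python
import math

def proportion_confidence_interval(successes, n, critical_value):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    margin = critical_value * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 120 of 300 sampled voters preferred candidate X
low, high = proportion_confidence_interval(120, 300, 1.96)
print(round(low, 3), round(high, 3))  # 0.345 0.455
```

In percentage terms, support for candidate X is estimated to lie between about 34.5% and 45.5%.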
5.8 Reflection
Suppose you are testing for association between farmer farm size and access to extension and you find that two out of six expected cell frequencies are less than five. Would you proceed with the Pearson chi-square test?
Unit Summary
In this unit you have learnt about hypothesis testing. You have learnt one sample and two sample tests of hypothesis using the z and t tests. You have also learnt testing of equality of population means using the F-test. Estimation of population mean and proportion using confidence intervals has also been done. What has been introduced in this unit gives you an idea of how to carry out a test of a research hypothesis, where the research hypothesis is actually the alternative hypothesis.
End of unit test
2. Of a sample of 361 owners of retail service and business firms that had gone
into bankruptcy, 105 reported having no professional assistance prior to
opening the business. Test the null hypothesis that at most 25% of all members
of this population had no professional assistance before opening the
business.
3. A drawing training procedure’s effect is to be compared with that of a sham
(nonsensical) method and a placebo control (no training). A sample of 53
subjects were obtained, each drawing a picture prior to “training”. 19 subjects
received the training method of interest (Edwards’ method), 18 received the
sham treatment, and 16 received the placebo treatment (no training). Drawings
were obtained after the training, and difference scores obtained for each subject
(post training-pre training). Complete the following ANOVA table and test
whether the mean change scores differ among the three conditions (α=0.05).
4. With a sample size of 800, and a standard deviation of 4.3, what is the 90%
confidence interval if the sample mean is 4.5?
Thus,
$$\chi^2 = \sum\frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \frac{(252 - 240)^2}{240} + \frac{(145 - 150)^2}{150} + \frac{(203 - 210)^2}{210} + \frac{(148 - 160)^2}{160} + \frac{(105 - 100)^2}{100} + \frac{(147 - 140)^2}{140} = 2.5$$
Calculating Test Statistic:
$$t = \frac{41.3 - 50}{12.2/\sqrt{20}} = -3.189$$
2. Formulate Hypotheses:
$H_0: p = 0.25$
$H_1: p > 0.25$
Calculating Test Statistic:
$$z = \frac{0.2909 - 0.25}{\sqrt{\frac{0.25 \times 0.75}{361}}} = 1.79$$
3.
Source    df    SS          MS        F
Groups    2     291.8027    145.90    14.78
Error     50    493.3881    9.87
Total     52    785.1908
4. $\bar{X} \pm Z\frac{S_X}{\sqrt{n}} = 4.5 \pm 1.645 \times \frac{4.3}{\sqrt{800}}$, or 4.5 ± 0.25, i.e. 4.5 - 0.25 = 4.25 to 4.5 + 0.25 = 4.75. Thus the confidence interval (at 90%) is from 4.25 to 4.75.
Module Test
1. In an agricultural experiment, a large uniform field was planted with a single variety of wheat. The field was divided into many plots (each plot being 700 m2) and the yield (in kg) of grain was measured for each plot. These plot yields followed approximately a normal distribution with mean 88 kg and standard deviation 7 kg. What percentage of the plot yields were
a) 80 kg or less? (3 marks)
b) Between 75kg and 90kg? (6 marks)
2. A study of the effect of three feeding regimes (maize bran, broiler starter and fishmeal) on growth of fish was conducted. Maize bran was fed to fish in pond 1, broiler starter was fed to fish in pond 2, and fishmeal was fed to fish in pond 3. The following weights in grams of fish were measured after a period of 6 months to compare growth of fish under different feeding regimes.
Maize bran:      63 58 61 60 62 59
Broiler starter: 71 64 68 65 67 67
Fish meal:       49 52 47 51 48
SST = 56.00, SSE = 140.00
Fill in the ANOVA table. (10 marks)
SOURCE OF VARIATION         DF    SS    MS    F
Treatment/feeding regime    _     _     _     _
Error                       _     _     _
Total                       _     _
Does the data provide sufficient evidence to indicate a difference among the feeding regimes, testing at α = 0.10? (2 marks)
3. A popular traditional variety of Malawi cotton yields an average of μ = 425 kgs/acre. An international seed company has developed a new variety which they believe will provide a higher yield. To test the new variety, a student at Bunda College grows 6 plots of the new variety. The yields for the plots are shown below (in kgs/acre):
X=New Variety
431
460
430
425
435
450
Is the New Variety preferred over the Traditional Variety at α = 0.05? (6 marks)
4. A farmer wants to know if there is an association between fish death and feed type. She gives fish meal to 53 fish, and 39 fish are given maize bran, and after the end of 6 months she notes the number of fish dead and alive under each fish feed regime. The following table summarizes the observed data obtained after the period of six months.
         Fish meal    Maize bran    Total
Dead     50 ( )       28 ( )        78
Alive    3 ( )        11 ( )        14
Total    53           39            92
a) Copy the table into your answer book and fill in the expected cell frequencies in the brackets given. (4 marks)
b) Using the chi-square test, test whether there is an association between fish feeding regime and death. (4 marks)
References