Applied Statistics Course Guide ECO 452
APPLIED STATISTICS
ECO 452
Course Developers:
Dr. Adesina-Uthman Ganiyat Adejoke
Department of Economics, Faculty of Social Sciences,
National Open University of Nigeria.
and
Ogunjirin Olakunle
Yaba College of Technology
School of Liberal Studies
Department of Social Sciences.
Course Editor:
Course Content
This course will expose you to different statistical tools that economists can apply in
economic analysis. It is built on the foundations of elementary statistics and
elementary economics, applied to the understanding of real-life situations.
Course Aims
There are fourteen study units in the course, and each unit has its own objectives. You are
advised to read through the objectives of each unit and bear them in mind as you go through
it. In addition to these unit objectives, the overall aims of the course include:
- Exposing you to basic statistical tools that can be applied in economics,
- Applying these tools to real-life situations,
- Exposing you to the economic interpretation of all calculated coefficients.
Course objectives
The overall objectives of this course are:
- to expand your learning horizons
- to understand how to apply statistical tools in economics
Working through This Course
You have to work through all the study units in the course. There are four modules and
fourteen study units in all.
Course Materials
Major components of the course are:
1. Course Guide
2. Study Units
3. Textbooks
4. CDs
5. Assignments File
6. Presentation Schedule
Study Units
The breakdown of the four modules and the 14 study units is as follows:
Module 1; Statistical Inference
Unit 1; sampling distribution defined
Unit 2; sampling distribution of proportion
Unit 3; sampling distribution of difference and sum of two means
Unit 4; probability distribution
Assignment File
In this file, you will find all the details of the work you must submit to your tutor for
marking. The marks you obtain from these assignments will count towards the final mark
you obtain for this course. Further information on assignments will be found in the
Assignment File itself and later in this Course Guide in the section on assessment.
Presentation Schedule
The Presentation Schedule included in your course materials gives you the important
dates for the completion of tutor-marked assignments and attending tutorials. Remember,
you are required to submit all your assignments by the due date. You should guard
against falling behind in your work.
Assessment
Your assessment will be based on tutor-marked assignments (TMAs) and a final
examination which you will write at the end of the course.
Tutor Marked Assignments (TMA)
Every unit contains at least one assignment. You are advised to work through all
the assignments and submit them for assessment. Your tutor will assess the assignments
and select four, which will constitute 30% of your final grade. The tutor-marked
assignments may be presented to you in a separate file. Every unit has
tutor-marked assignments for you, and it is important that you do them and submit them for
assessment.
Introduction
The course advanced statistics (ECO 410) is a first-semester course which carries two
credit units for fourth-year economics students in the School of Art and Social
Sciences at the National Open University, Nigeria. The course is very useful to
you in your academic pursuit because it helps you gain in-depth insight into the underlying
statistical tools commonly used by economists.
This course guide tells you what advanced statistics entails, what course materials you
will be using and how you can work your way through these materials. It suggests some
general guidelines for the amount of time required of you on each unit in order to achieve
the course aims and objectives successfully. It also provides you some guidance on your
tutor marked assignments (TMAs) as contained herein.
Course Content
This course is built on the foundation of what you have learnt in elementary
statistics. Topics covered include statistical inference, probability distributions, analysis
of variance, multiple regression, time series and price indices.
Course Material
The major components of the course, which determine what you have to do and how you should allocate
your time to each unit in order to complete the course successfully and on time, are listed
as follows:
1. Course guide
2. Study unit
3. Textbook
4. Assignment file
5. Presentation schedule
Tutor-Marked Assignments (TMAs)
There are four tutor-marked assignments in this course. You will submit all the
assignments. You are encouraged to work all the questions thoroughly. The TMAs
constitute 30% of the total score.
Assignment questions for the units in this course are contained in the Assignment File.
You will be able to complete your assignments from the information and materials
contained in your set books, reading and study units. However, it is desirable that you
demonstrate that you have read and researched more widely than the required minimum.
You should use other references to have a broad viewpoint of the subject and also to give
you a deeper understanding of the subject.
When you have completed each assignment, send it, together with a TMA form, to your
tutor. Make sure that each assignment reaches your tutor on or before the deadline given
in the Presentation File. If for any reason, you cannot complete your work on time,
contact your tutor before the assignment is due to discuss the possibility of an extension.
Extensions will not be granted after the due date unless there are exceptional
circumstances.
Use the time between finishing the last unit and sitting for the examination to revise the
entire course material. You might find it useful to review your self-assessment exercises,
tutor-marked assignments and comments on them before the examination. The final
examination covers information from all parts of the course.
Course Marking Scheme
The table below shows how the total marks (100%) are allocated.
Assessment                                                     Marks
Assignments (best three assignments out of the four marked)    30%
Final Examination                                              70%
Total                                                          100%
Course Overview
The table presented below indicates the units, the number of weeks and the assignments you need
to take to successfully complete the course, advanced statistics (ECO 410).
Each of the study units follows a common format. The first item is an introduction to the
subject matter of the unit and how a particular unit is integrated with the other units and
the course as a whole. Next is a set of learning objectives. These objectives let you know
what you should be able to do by the time you have completed the unit.
You should use these objectives to guide your study. When you have finished the unit
you must go back and check whether you have achieved the objectives. If you make a
habit of doing this you will significantly improve your chances of passing the course and
getting the best grade.
The main body of the unit guides you through the required reading from other sources.
This will usually be either from your set books or from a readings section. Some units
require you to undertake practical overview of events. You will be directed when you
need to embark on discussion and guided through the tasks you must do.
The purpose of the practical overview of certain practical issues is twofold.
First, it will enhance your understanding of the material in the unit. Second, it will give
you practical experience and skills to evaluate economic propositions, arguments, and
conclusions. In any event, most of the critical thinking skills you will develop during
studying are applicable in normal working practice, so it is important that you encounter
them during your studies.
Self-assessments are interspersed throughout the units, and answers are given at the ends
of the units. Working through these tests will help you to achieve the objectives of the
unit and prepare you for the assignments and the examination. You should do each
self-assessment exercise as you come to it in the study unit.
The following is a practical strategy for working through the course. If you run into any
trouble, consult your tutor. Remember that your tutor's job is to help you. When you need
help, don't hesitate to call and ask your tutor to provide it.
Your tutor will mark and comment on your assignments, keep a close watch on your
progress and on any difficulties you might encounter, and provide assistance to you
during the course. You must mail your tutor-marked assignments to your tutor well
before the due date (at least two working days are required). They will be marked by your
tutor and returned to you as soon as possible.
Do not hesitate to contact your tutor by telephone, e-mail, or discussion board if you need
help. The following might be circumstances in which you would find help necessary.
Contact your tutor if:
• You do not understand any part of the study units or the assigned readings
• You have difficulty with the self-assessment exercises
• You have a question or problem with an assignment, with your tutor's comments on an
assignment or with the grading of an assignment.
You should try your best to attend the tutorials. This is the only chance to have face to
face contact with your tutor and to ask questions which are answered instantly. You can
raise any problem encountered in the course of your study. To gain the maximum benefit
from course tutorials, prepare a question list before attending them. You will learn a lot
from participating in discussions actively.
Summary
On successful completion of the course, you will have developed critical thinking skills
and the knowledge necessary for the efficient and effective use of statistical tools in economics.
However, to gain the most from the course, please try to apply what you have learnt
by doing the calculations on paper yourself. We wish you success
with the course and hope that you will find it both interesting and useful.
MODULE ONE; Statistical Inference
Unit 1; Sampling distribution defined
Unit 2; Sampling distribution of proportion
Unit 3; Sampling distribution of difference and sum of two means
Unit 4; Probability distribution
UNIT ONE; Sampling Distribution
CONTENT
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Sampling Distribution, Population and Sample Defined
3.2 Sampling Distribution of Parameter Estimate
3.3 Estimate of Sample Statistics
3.4 Estimators for Mean and Variance
3.5 The Role and Significance of Statistics in Social Sciences
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/ Further Readings
1.0 INTRODUCTION
Generally, statistical data are studied in order to learn something about the broader field
which the data represent. To make statistical work meaningful, statisticians
generalize from what they find in the figures at hand to the wider phenomenon which those figures
represent. In technical language, we regard a set of data as a sample drawn from a larger
“universe”. We analyze the data of the sample in order to draw conclusions about the
corresponding universe or population.
In one sense the universe actually exists, and it is theoretically possible to study it
completely. In another sense the universe is broader and less tangible.
This unit is one of the four units in this module. For a proper understanding of
the topics in this unit, a thorough knowledge of elementary statistics is required.
2.0 OBJECTIVES
At the end of this unit you should be able to understand the following;
- Sample
- Population
- Sampling theory
- Parameter estimation
- Estimate sample mean, population mean etc.
Table M1.1.1
Basic Descriptive Measures of Population and Sample
        Population parameters            Symbol          Sample statistics            Symbol
(i)     Population mean                  $\mu$           Sample mean                  $\bar{x}$
(ii)    Population variance              $\sigma_x^2$    Sample variance              $S_x^2$
(iii)   Population standard deviation    $\sigma_x$      Sample standard deviation    $s_x$
Note:
\[ E(x) = \mu = \frac{x_1 + x_2 + \dots + x_n}{n} \]
SELF-ASSESSMENT EXERCISE
What are descriptive measures that can be used in describing a sample or population?
Table M1.1.2
Table of Analysis for Sample Mean, Standard Deviation and Variance
$X$      $X - E(X)$         $(X - E(X))^2$
11       11 - 13 = -2       4
12       12 - 13 = -1       1
13       13 - 13 = 0        0
14       14 - 13 = 1        1
15       15 - 13 = 2        4
n = 5                       10
\[ \bar{x} = \frac{11 + 12 + 13 + 14 + 15}{5} = \frac{65}{5} = 13 \]
\[ \operatorname{Var}(X) = \frac{\sum (X - E(X))^2}{n} = \frac{\sum (x - \bar{x})^2}{n} = \frac{10}{5} = 2 \]
\[ \sigma_X = \sqrt{2} = 1.4142 \]
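The arithmetic in Table M1.1.2 can be checked with a short script. This is only a sketch using Python's standard `statistics` module; the variable names are chosen here for illustration and are not part of the course text.

```python
from statistics import mean, pvariance, pstdev

# Population values from Table M1.1.2
population = [11, 12, 13, 14, 15]

mu = mean(population)           # population mean E(X)
var_x = pvariance(population)   # sum of squared deviations divided by n
sd_x = pstdev(population)       # square root of the population variance

print(mu, var_x, round(sd_x, 4))
```

`pvariance` and `pstdev` divide by n, which matches the population formulas used in the table; `variance` and `stdev` from the same module divide by n - 1 instead.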
SELF-ASSESSMENT EXERCISE
Define standard deviation of a population
As noted earlier, the sample variance is a measure of the dispersion of the values
of x in the sample around their average value. It is denoted $S_x^2$.
(11,13): $\frac{11+13}{2} = 12$
(11,14): $\frac{11+14}{2} = 12.5$
(11,15): $\frac{11+15}{2} = 13$
(12,13): $\frac{12+13}{2} = 12.5$
(12,14): $\frac{12+14}{2} = 13$
(13,14): $\frac{13+14}{2} = 13.5$
(13,15): $\frac{13+15}{2} = 14$
(14,15): $\frac{14+15}{2} = 14.5$
n = 10
All the information about the population and the possible samples can be summarized in a
frequency distribution, as depicted in the table below.
Table M1.1.14
Table of Possible Samples
X F
11 1
11.5 1
12 1
12.5 2
13 2
13.5 2
14 1
14.5 1
15 1
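\[ \dots + \frac{(14-13)^2}{9} + \frac{(14.5-13)^2}{9} + \frac{(15-13)^2}{9} \]
\[ S_x^2 = \frac{(-2)^2 + (-1.5)^2 + (-1)^2 + 2(-0.5)^2 + 2(0)^2 + 2(0.5)^2 + (1)^2 + (1.5)^2 + (2)^2}{9} \]
\[ S_x^2 = \frac{4 + 2.25 + 1 + 2(0.25) + 2(0) + 2(0.25) + 1 + 2.25 + 4}{9} \]
\[ S_x^2 = \frac{4 + 2.25 + 1 + 0.5 + 0 + 0.5 + 1 + 2.25 + 4}{9} \]
\[ S_x^2 = \frac{15.5}{9} = 1.722 \]
\[ S_x^2 \cong 2 \]
\[ S_x = \sqrt{1.722} = 1.3123 \]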
From the foregoing analysis, it can be observed that, given $X_1, X_2, \dots, X_n$, a
random sample of size $n$ from an infinite population with mean $\mu$ and variance $\sigma^2$,
and with sample mean $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$, we have
(i) $E(\bar{x}) = \mu$
(ii) $\operatorname{Var}(\bar{x}) = \dfrac{\sigma^2}{n}$
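Results (i) and (ii) can be illustrated numerically. The sketch below assumes sampling with replacement from the five-value population used earlier, so that every draw comes from the same distribution, as in the infinite-population case; this differs from a listing of unordered pairs drawn without replacement. The variable names are illustrative only.

```python
from itertools import product
from statistics import mean, pvariance

population = [11, 12, 13, 14, 15]
n = 2  # sample size

# Every ordered sample of size n drawn with replacement (assumption of this sketch)
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mean_of_means = mean(sample_means)      # compare with the population mean
var_of_means = pvariance(sample_means)  # compare with sigma^2 / n
sigma_sq = pvariance(population)

print(mean_of_means, mean(population))
print(var_of_means, sigma_sq / n)
```

Under this assumption the mean of the sample means equals the population mean, and their variance equals the population variance divided by the sample size.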
SELF-ASSESSMENT EXERCISE
Define the sample variance of any given population
This is the general case in which the sample is taken from a normal distribution.
Worked Example
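Given a random sample of 20 taken from a normal distribution with mean 90 and
variance 25, find the probability that the sample mean is greater than 101.
Solution
$X \sim N(90, 25)$, $n = 20$
\[ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1) \]
Since the variance is 25, the standard deviation is $\sigma = \sqrt{25} = 5$, so
\[ Z = \frac{101 - 90}{5 / \sqrt{20}} = \frac{11}{5 / 4.47213} = \frac{11}{1.118034} = 9.8387 \cong 9.84 \]
A value this far above the mean lies beyond the range of the standard normal table, so
$P(\bar{x} > 101) = P(Z > 9.84) \approx 0$.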
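The standardization step in this worked example can be reproduced as follows. This is a sketch using `statistics.NormalDist` from the Python standard library; it assumes that the figure 25 is the variance, so the standard deviation entering the formula is its square root, 5.

```python
from math import sqrt
from statistics import NormalDist

mu, variance, n = 90, 25, 20

sigma = sqrt(variance)             # standard deviation = square root of the variance
standard_error = sigma / sqrt(n)   # sigma / sqrt(n)

z = (101 - mu) / standard_error
p_greater = 1 - NormalDist().cdf(z)  # upper-tail probability P(Z > z)

print(round(z, 4), p_greater)
```

Because the sample mean has a standard error of only about 1.12, a sample mean of 101 is almost ten standard errors above 90, and the upper-tail probability is effectively zero.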
SELF-ASSESSMENT EXERCISE
What are the assumptions of a normal distribution?
Self-Assessment Question
Do you think that statistics in the social sciences has a role to play in solving societal problems?
4.0 CONCLUSION
It has been established that, given a random sample $X_1, X_2, \dots, X_n$ from a population with
mean $\mu$ and variance $\sigma^2$:
(i) $E(\bar{x}) = \mu$
(ii) $\operatorname{Var}(\bar{x}) = \sigma^2 / n$
5.0 SUMMARY
In this unit, we have defined population, sample and sampling distribution
theory, and we have estimated parameters and sample statistics.
Our calculations also showed that the mean of the sample means
equals the mean of the population they represent, and that the variance of the
sample mean is $\sigma^2 / n$.
6.0 TUTOR MARKED ASSIGNMENT
Explain the descriptive measures of a sample.
- Esan F.O. and Okafor, R.O. (2010): Basis statistical method (revised edition)
Toniichristo Concept Lagos.
- Olufolabo, O.O. and Talabi, C.O (2002): Principles and practice of statistics.
HASFEM Nig Enterprises, Shomolu, Lagos.
- Oyesiku, O.K. and Omitogun, O. (1999). Statistics for social and management
sciences. Higher Education Books Publisher Lagos.
UNIT TWO; SAMPLING DISTRIBUTION OF PROPORTION
CONTENT
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Sampling Distribution of Proportion Defined
3.2 Standard Error
3.3 Sampling Distribution of differences and sum of means
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/ Further Readings
1.0 INTRODUCTION
This unit is an extension of unit one of this module. In this unit we are going to look at
the sampling distribution of proportion, the sampling distribution of sum and difference, and
the standard error. Since this unit is an offshoot of unit one of this module, most of the
statistical terms used in unit one will be assumed here.
2.0 OBJECTIVE
At the end of our discussion of this unit, you should be able to calculate:
- Sampling distribution of proportion
- Sampling distribution of sum
- Sampling distribution of difference and
- Standard error
3.0 MAIN CONTENT
3.1 Sampling Distribution of Proportion Defined
Samples are drawn from a population, and each time an attribute is sampled, the
concept of proportion comes in. The estimation here concentrates on the proportion
of the population that has a particular characteristic. This sampling distribution is like the
binomial distribution, where an event is classified either as a success, represented by p, or
as a failure, represented by q or 1 – p.
Given an infinite population from which a sample of size n is drawn, the number of successes
in the sample has a mean of $np$, and the sample proportion has variance
\[ \operatorname{var}(p) = \frac{p(1-p)}{n} = \frac{pq}{n} \]
It should be noted at this juncture that the sample proportion is also an unbiased estimator
of the population proportion, i.e. $E(p) = P$.
Example
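A coin is tossed 120 times. Find the probability that heads will appear in between 45% and
55% of the tosses.
Solution
From the above, Prob(head) = ½ = p
Prob(not obtaining a head) = ½ = q = 1 – p
\[ 45\% \text{ of tosses} = \frac{45}{100} \times 120 = 54 \]
\[ 55\% \text{ of tosses} = \frac{55}{100} \times 120 = 66 \]
\[ \text{Mean} = np = 120 \times \tfrac{1}{2} = 60 \]
Expressed as a proportion, the standard deviation is
\[ \sqrt{\frac{pq}{n}} = \sqrt{\frac{0.25}{120}} = \sqrt{0.00208333} = 0.0456 \]
Expressed as a number of heads, the standard deviation is
\[ \text{S.D.} = \sqrt{npq} = \sqrt{120 \left(\tfrac{1}{2}\right) \left(\tfrac{1}{2}\right)} = \sqrt{30} = 5.477225575 \approx 5.5 \]
\[ P(54 < X < 66) = P\left(\frac{54 - 60}{5.5} < Z < \frac{66 - 60}{5.5}\right) = P\left(-\frac{6}{5.5} < Z < \frac{6}{5.5}\right) \]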
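The quantities in this example can be computed directly. The sketch below uses the normal approximation to the number of heads with no continuity correction, which is an assumption of this illustration; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

n, p = 120, 0.5
q = 1 - p

mean_heads = n * p                # np
sd_heads = sqrt(n * p * q)        # sqrt(npq), in number of heads
se_proportion = sqrt(p * q / n)   # sqrt(pq/n), in proportion terms

# Normal approximation to the number of heads (no continuity correction)
heads = NormalDist(mean_heads, sd_heads)
prob = heads.cdf(66) - heads.cdf(54)

print(mean_heads, round(sd_heads, 4), round(se_proportion, 4), round(prob, 4))
```

Working with the unrounded standard deviation rather than 5.5 changes the z limits only slightly, so a table lookup with the rounded value gives nearly the same probability.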
SELF-ASSESSMENT EXERCISE
What is the symbolic definition of standard deviation of a sample proportion?
\[ \therefore \sqrt{\frac{p(1-p)}{n}} \]
From the example in subsection 3.2 above,
p = ½ = q
n = 120
\[ \therefore \text{S.E.} = \sqrt{\frac{0.5 (0.5)}{120}} = \sqrt{\frac{0.25}{120}} = \sqrt{0.0020833} = 0.0456 \]
SELF-ASSESSMENT EXERCISE
What does S.E. stand for?
4.0 CONCLUSION
During the course of our discussion of this unit we have talked about;
- Sampling distribution of proportion
- Standard error
5.0 SUMMARY
In the course of our discussion we defined the mean of a sampling distribution of
proportion as np,
i.e. mean = $np$
variance$(p) = \dfrac{p(1-p)}{n}$
$\sigma = \sqrt{npq}$
- Esan, F.O. and Okafor, R.O. (2010): Basis statistical method (revised edition)
Toniichristo Concept, Lagos.
- Murray, R.S. and Larry, J. S. (1998): (Schaum outlines series). Statistics (Third
edition) MCGRAW HILLS.
- Olufolabo, O.O. and Talabi, C.O. (2002): Principles and practice of statistics.
HASFEM Nig Enterprises Shomolu Lagos.
- Oyesiku, O.K. and Omitogun, O. (1999): Statistics for social and management
sciences. Higher Education Books Publisher Lagos.
UNIT THREE; SAMPLING DISTRIBUTION OF SUM AND DIFFERENCE OF
TWO MEANS
CONTENT
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Sampling Distribution of difference and sum of two means defined
3.2 Worked example sampling distribution of sum of two means
3.3 Worked example sampling distribution of sample differences of two
means
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/ Further Readings
1.0 INTRODUCTION
This unit is an extension of unit one and unit two of this module. In this unit we are going
to look at the sampling distribution of the sum and difference of two means. Since this unit is an
offshoot of unit one of this module, most of the statistical terms used in unit one will
be assumed here.
2.0 OBJECTIVE
At the end of our discussion of this unit, you should be able to calculate:
- Sampling distribution of the sum of two means
- Sampling distribution of the difference of two means
3.0 MAIN CONTENT
3.1 Sampling Distribution of Difference of Two Means and Sum ($\bar{x}_1 - \bar{x}_2$)
If two independent random samples of sizes $n_1$ and $n_2$ are selected from two different
populations of sizes $N_1$ and $N_2$, with population means $\mu_1$ and $\mu_2$ and
population variances $\sigma_1^2$ and $\sigma_2^2$ respectively, then the sampling distribution of the
difference of the two means $(\bar{x}_1 - \bar{x}_2)$ has mean $\mu_1 - \mu_2$, and the standard deviation of the
sampling distribution is written as
\[ \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
SELF-ASSESSMENT EXERCISE
Sampling distribution of the difference of two means is defined as-------
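\[ \mu_{p_1 + p_2} = 95 \]
Considering the 1st population $p_1$: (30, 50)
\[ \mu_{p_1} = \frac{30 + 50}{2} = \frac{80}{2} = 40 \]
Considering the 2nd population $p_2$: (40, 70)
\[ \mu_{p_2} = \frac{40 + 70}{2} = \frac{110}{2} = 55 \]
\[ \therefore \mu_{p_1} + \mu_{p_2} = 55 + 40 = 95 \]
Note that $\mu_{p_1 + p_2} = 95$ and $\mu_{p_1} + \mu_{p_2} = 95$,
\[ \therefore \mu_{p_1 + p_2} = \mu_{p_1} + \mu_{p_2} \]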
SELF-ASSESSMENT EXERCISE
What are the population and sample means of P1 = (70, 90) and P2 = (60, 80)?
\[ \mu_{p_1 - p_2} = \frac{-10 - 40 + 10 - 20}{4} = \frac{-60}{4} = -15 \]
\[ \therefore \mu_{p_1 - p_2} = \mu_{p_1} - \mu_{p_2} \]
\[ -15 = -15 \]
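\[ \sigma^2_{p_1} = 100 \]
Considering the 2nd population,
$\sigma^2_{p_2}$ = variance of (40, 70)
\[ \sigma^2_{p_2} = \frac{(40 - 55)^2 + (70 - 55)^2}{2} \]
where 55 = mean of the population = $\mu_{p_2}$
\[ \sigma^2_{p_2} = \frac{15^2 + 15^2}{2} = \frac{225 + 225}{2} = \frac{450}{2} = 225 \]
\[ \sigma^2_{p_1} + \sigma^2_{p_2} = 100 + 225 = 325 \]
\[ \sigma^2_{p_1 + p_2} = 325 \]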
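The results for the two small populations (30, 50) and (40, 70) can be verified by listing every pairing of one value from each. The following is a sketch using only the Python standard library; the variable names are illustrative.

```python
from itertools import product
from statistics import mean, pvariance

p1 = [30, 50]   # first population
p2 = [40, 70]   # second population

# Every pairing of one value from each population
sums = [a + b for a, b in product(p1, p2)]
diffs = [a - b for a, b in product(p1, p2)]

mean_sum = mean(sums)     # compare with mean(p1) + mean(p2)
mean_diff = mean(diffs)   # compare with mean(p1) - mean(p2)
var_sum = pvariance(sums) # compare with pvariance(p1) + pvariance(p2)

print(mean_sum, mean(p1) + mean(p2))
print(mean_diff, mean(p1) - mean(p2))
print(var_sum, pvariance(p1) + pvariance(p2))
```

The differences produced by the pairings are -10, -40, 10 and -20, which are the four terms averaged in the worked calculation of the mean difference.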
SELF-ASSESSMENT EXERCISE
Sampling distribution of the difference of 2 mean x̅1 & x̅2 is usually written as?
4.0 CONCLUSION
In the course of our discussion of this unit you have learnt about
- Sampling distribution of difference of two means
- Sampling distribution of sum of two means
5.0 SUMMARY
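In the course of our discussion of this unit, we defined the mean of the sampling distribution of the
difference of two means as $\mu_1 - \mu_2$ and the standard deviation of the difference as
\[ \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]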
- Esan F.O. and Okafor, R.O. (2010): Basic statistical method (revised edition)
Toniichristo Concept, Lagos.
- Olufolabo, O.O. and Talabi, C.O. (2002): Principles and practice of statistics.
HASFEM Nig Enterprises Shomolu, Lagos.
- Oyesiku, O.K. and Omitogun, O. (1999): Statistics for social and management
sciences. Higher Education Books Publishers, Lagos.
UNIT FOUR; PROBABILITY DISTRIBUTION
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Probability defined
3.2 Probability distribution of a random variable (Binomial distribution)
3.3 Poisson distribution
3.4 Probability distribution of a continuous variable (normal distribution)
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/ Further Readings
1.0 INTRODUCTION
Probability Defined
For a thorough understanding of this unit, it is assumed that you have familiarized
yourself with introductory statistics and unit one of this module. The main thrust of this
unit is to introduce you to the concept of probability distribution, its discussion,
calculation and the interpretation of results. This unit is fundamental to the understanding of
subsequent modules, because other units and modules will be discussed on the basis
of the fundamental concepts explained here.
2.0 OBJECTIVES
At the end of this unit you should be able to understand the following:
i. Concept of probability
ii. Different probability distribution
iii. Calculate the different probability distribution
3.0 MAIN CONTENT
3.1 Probability Defined
Statisticians spend considerable time measuring data and drawing conclusions based on those
measurements. Sometimes all the data are available to the statistician and the measurements
are bound to be accurate. In such circumstances, it can be said that the statistician has
perfect knowledge of the population.
At times, however, this will not be the situation. In most cases, the
statistician will not have the details he wants about the population and will be unable to
collect the information he wants because of the cost and labour involved.
Because the entire population has not been examined, the statistician can never
be completely sure of the result. So, when quoting conclusions based on sample evidence, it
is usual to state how confident the statistician is about the result. You will often see
estimates quoted with, for example, 85% confidence. This simply means that the probability that
the estimate is right is 85%.
The probability of a value X of a random variable is usually defined as the limiting
value of the relative frequency of that value as the total number of observations on the
variable approaches infinity. This can be written as
\[ P(x) = \lim_{n \to \infty} \frac{f}{\sum F_x} \]
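The idea of probability as a limiting relative frequency can be illustrated by simulation. The sketch below simulates tosses of a fair coin with Python's `random` module; the function name `relative_frequency`, the seed and the chosen numbers of tosses are assumptions made here for illustration.

```python
import random

random.seed(0)  # fixed seed so that a run can be repeated


def relative_frequency(n_tosses):
    """Relative frequency of heads in n_tosses simulated fair-coin tosses."""
    heads = sum(random.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses


# The relative frequency tends to settle near 0.5 as the number of tosses grows
for n_tosses in (10, 100, 10_000, 100_000):
    print(n_tosses, relative_frequency(n_tosses))
```

With few tosses the relative frequency can be far from 0.5; with many tosses it is typically very close to it, which is the sense in which the probability is the limit of the relative frequency.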
SELF-ASSESSMENT EXERCISE
What is another name that probability can be called?
Alternatively,
\[ P(x) = \frac{n!}{x!(n-x)!} \, p^x (1-p)^{n-x} \]
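Solution
Probability of obtaining a head = ½ = p
Probability of not obtaining a head = 1 – p = q = ½
x = 3, n = 5
(a)
\[ P(x) = \frac{n!}{x!(n-x)!} \, p^x (1-p)^{n-x} = \frac{n!}{x!(n-x)!} \, p^x q^{n-x} \]
\[ P(3) = \frac{5!}{3! \, 2!} \left(\tfrac{1}{2}\right)^3 \left(\tfrac{1}{2}\right)^2 = 10 \times \frac{1}{32} = 0.3125 \]
∴ The probability of obtaining 3 heads from 5 tosses of a coin = 0.3125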
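\[ P(0) = \frac{1}{32} = 0.03125 \]
\[ P(1) = \frac{5!}{1!(5-1)!} \left(\tfrac{1}{2}\right)^1 \left(\tfrac{1}{2}\right)^{5-1} = \frac{5 \times 4 \times 3 \times 2 \times 1}{4 \times 3 \times 2 \times 1 \times 1} \cdot \tfrac{1}{2} \cdot \left(\tfrac{1}{2}\right)^4 = \frac{5}{1} \times \frac{1}{2} \times \frac{1}{16} = \frac{5}{32} = 0.15625 \]
\[ P(2) = \frac{5!}{2!(5-2)!} \left(\tfrac{1}{2}\right)^2 \left(\tfrac{1}{2}\right)^{5-2} = \frac{5 \times 4 \times 3 \times 2 \times 1}{2 \times 1 \times 3 \times 2 \times 1} \cdot \tfrac{1}{4} \cdot \tfrac{1}{8} = \frac{10}{1} \times \frac{1}{32} = \frac{10}{32} = 0.3125 \]
\[ P(<3) = P(0) + P(1) + P(2) = 0.03125 + 0.15625 + 0.3125 = 0.5 \]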
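\[ \text{Mean} = \mu = np = 5 \left(\tfrac{1}{2}\right) = \frac{5}{2} = 2.5 \text{ heads} \]
\[ \text{Standard deviation} = \sigma = \sqrt{npq} = \sqrt{5 \left(\tfrac{1}{2}\right) \left(\tfrac{1}{2}\right)} = \sqrt{\frac{5}{4}} = \sqrt{1.25} = 1.1180339887499 \approx 1.12 \text{ heads} \]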
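The binomial calculations for five tosses of a fair coin can be reproduced with the formula above. This is a sketch; the helper name `binomial_pmf` is introduced here for illustration and is not part of the course text.

```python
from math import comb, sqrt


def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 5, 0.5

p_exactly_3 = binomial_pmf(3, n, p)
p_less_than_3 = sum(binomial_pmf(x, n, p) for x in range(3))  # P(0) + P(1) + P(2)
mean_heads = n * p                                            # np
sd_heads = sqrt(n * p * (1 - p))                              # sqrt(npq)

print(p_exactly_3, p_less_than_3, mean_heads, round(sd_heads, 2))
```

`math.comb(n, x)` evaluates n!/(x!(n-x)!), so the function is a direct transcription of the formula.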
SELF-ASSESSMENT EXERCISE
What do you understand by the word a random variable?
\[ P(x = 1) = \frac{6^1 \times 2.71828^{-6}}{1!} = \frac{6 \times 0.00248}{1} = 0.01488 \]
\[ P(x = 2) = \frac{6^2 \times 2.71828^{-6}}{2!} = \frac{36 \times 0.00248}{2 \times 1} = 18 \times 0.00248 = 0.04464 \]
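The two Poisson probabilities can be checked as follows. The mean of 6 is read off the worked lines (the terms 6^x and e^-6); the helper name `poisson_pmf` is an assumption made for this sketch.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with mean lam."""
    return lam**x * exp(-lam) / factorial(x)


lam = 6  # mean taken from the worked lines
prob_1 = poisson_pmf(1, lam)
prob_2 = poisson_pmf(2, lam)

print(round(prob_1, 5), round(prob_2, 5))
```

The results differ slightly in the later decimal places from the hand calculation because the hand calculation rounds e^-6 to 0.00248 before multiplying.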
SELF-ASSESSMENT EXERCISE
Define standard deviation of Poisson distribution?
3.4 Probability Distribution of a Continuous Variable (Normal Distribution)
If a variable is continuous, it can assume an infinite number of values within a given
interval. An important feature of a continuous probability distribution is that the areas under its
curve represent probabilities. The total area under the curve of a probability distribution,
being the sum of the individual probabilities, is equal to unity (1).
The normal distribution is a continuous probability distribution and the most commonly
used distribution in statistical analysis. The normal curve is bell-shaped and symmetrical
about its mean. It extends indefinitely in both directions, but most of the area
(probability) is clustered around the mean.
To find the probabilities for problems involving the normal distribution, first convert the
x value into the corresponding z value using
\[ Z = \frac{X - \mu}{\sigma} \]
Example
Given that family incomes are normally distributed with $\mu$ = N14,000 and $\sigma$ = N4,000,
what is the probability that a family picked at random will have an income:
(a) Between N13,000 and N16,000?
(b) Below N13,000?
(c) Above N16,000?
(d) Above N18,000?
\[ Z = \frac{X - \mu}{\sigma} \]
(a) When X = 13,000:
\[ Z_1 = \frac{13{,}000 - 14{,}000}{4{,}000} = \frac{-1{,}000}{4{,}000} = -0.25 \]
When X = 16,000:
\[ Z_2 = \frac{16{,}000 - 14{,}000}{4{,}000} = \frac{2{,}000}{4{,}000} = 0.5 \]
$Z_1 = -0.25$; $Z_2 = 0.5$
$Z_{T1} = 0.0987$; $Z_{T2} = 0.1915$
where $Z_{T1}$ and $Z_{T2}$ represent the table values for $Z_1$ and $Z_2$.
∴ Prob (13,000 < x < 16,000) = 0.0987 + 0.1915 = 0.2902 ≅ 29%
(b) Prob (x < 13,000) = 0.5 – 0.0987 = 0.4013 ≅ 40%
(c) Prob (x > 16,000) = 0.5 – 0.1915 = 0.3085 ≅ 30.85%
(d) Prob (x > 18,000)
When X = 18,000:
\[ Z = \frac{18{,}000 - 14{,}000}{4{,}000} = \frac{4{,}000}{4{,}000} = 1 \]
$Z_T = 0.3413$
∴ Prob (x > 18,000) = 0.5 – 0.3413 = 0.1587 ≅ 16%
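The four probabilities in this example can be obtained without a table. The sketch below uses `statistics.NormalDist` from the Python standard library with the mean and standard deviation given in the example; the variable names are illustrative.

```python
from statistics import NormalDist

# Family income: mean 14,000 and standard deviation 4,000
income = NormalDist(mu=14_000, sigma=4_000)

p_between = income.cdf(16_000) - income.cdf(13_000)   # (a) between 13,000 and 16,000
p_below_13 = income.cdf(13_000)                       # (b) below 13,000
p_above_16 = 1 - income.cdf(16_000)                   # (c) above 16,000
p_above_18 = 1 - income.cdf(18_000)                   # (d) above 18,000

for label, value in [("a", p_between), ("b", p_below_13),
                     ("c", p_above_16), ("d", p_above_18)]:
    print(label, round(value, 4))
```

The cumulative distribution function gives the area to the left of a value, so areas between two values are differences of `cdf` results and upper-tail areas are one minus the `cdf`.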
SELF-ASSESSMENT EXERCISE
Explain the attributes of a normal distribution curve
4.0 CONCLUSION
From our discussion so far you have learnt about:
- Probability
- Probability distribution
- Different probability distribution, the binomial, Poisson, and normal distribution.
5.0 SUMMARY
In the course of our discussion of this unit, we have defined the different probability
distributions. The binomial distribution is defined as
\[ P(x) = {}^{n}C_x \, p^x q^{n-x} \]
Alternatively,
\[ P(x) = \frac{n!}{x!(n-x)!} \, p^x (1-p)^{n-x} \]
$\sqrt{npq}$ = standard deviation
Normal distribution
\[ Z = \frac{X - \mu}{\sigma} \]
- Esan, E.O. and Okafor, R.O. (2010): Basic Statistical Methods (Revised
Edition) Tonichristo Concept.
- Olufolabo, O.O. and Talabi, C.O. (2002): Principles and practice of statistics,
HASFEM (NIG) ENTERPRISES, Somolu, Lagos.
- Oyesiku, O.K. and Omitogun, O. (1999): Statistics for social and management
sciences. Higher Education Book Publishers Lagos.
1.0 INTRODUCTION
A detailed knowledge and understanding of introductory statistics is assumed. It is also
expected that you will have familiarized yourself with hypothesis testing. This
unit is one of the four units in module 2 of the course.
2.0 OBJECTIVE
At the end of this unit, you should be able to understand and be able to calculate:
Total sum of square
Sum of square between groups
Sum of square within the group
Mean square
3.0 LOGIC OF ANALYSIS OF VARIANCE (ANOVA)
Analysis of variance (ANOVA) is used to test the null hypothesis that the means of two
or more populations are equal against the alternative that at least one of the means is
different. The null hypothesis ($H_0$) tested in ANOVA is that the means of the
populations from which the samples are drawn are all equal, i.e. $H_0$: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_n$,
while the alternative hypothesis says that $H_0$ taken as a whole is not true, i.e. the means
are not all equal (written here as $H_1$: $\mu_1 \neq \mu_2 \neq \mu_3$).
It should be noted that each time ANOVA is used, we analyze
the variances in order to test the null hypothesis about the means (i.e. $H_0$: $\mu_1 = \mu_2 = \mu_3$).
The ANOVA procedure is based on the mathematical theory that independent sample
data can be made to yield two independent estimates of the population variance, namely:
(i) Within-group variance (or error): the variance estimate which deals with how
different each of the values in a given sample is from the other values in the same
group.
(ii) Between-group variance: the estimate that deals with how the means of the
various samples differ from each other.
SELF-ASSESSMENT EXERCISE
State the null hypothesis of analysis of variance?
\[ SST = SSA + SSE \]
where $\bar{x}_j$ = mean of sample $j$ composed of $r$ observations $= \dfrac{\sum_i x_{ij}}{r}$
$\bar{\bar{x}}$ = grand mean of all $c$ samples $= \dfrac{\sum_i \sum_j x_{ij}}{rc}$
SSA = sum of squares explained by factor A $= r \sum_j (\bar{x}_j - \bar{\bar{x}})^2$
SSE = sum of squares of error unexplained by factor A $= \sum_i \sum_j (x_{ij} - \bar{x}_j)^2$
SST = total sum of squares $= SSA + SSE = \sum_i \sum_j (x_{ij} - \bar{\bar{x}})^2$
where $c$ = number of samples
$r$ = number of observations in each sample
SELF-ASSESSMENT EXERCISE
State the fisher ratio
3.4 Worked Example
The information below relates to the quantities of plastic produced by a plastic industry in 3
sessions (morning, afternoon and evening) for 5 weeks. The production data are normally
distributed with equal variance.
Table M2.1.2
Table showing production of a plastic industry
Weeks Morning (X1) Afternoon (X2) Evening (X3)
1 85 77 90
2 83 81 92
3 79 75 84
4 81 82 82
5 82 80 87
Is there any significant difference due to production session?
Test at 5% level of significance.
Solution
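$H_0$: $\mu_1 = \mu_2 = \mu_3$
$H_1$: $\mu_1 \neq \mu_2 \neq \mu_3$
Note: let the quantities produced in the morning be represented by $X_1$, in the afternoon by $X_2$
and in the evening by $X_3$.
\[ \sum X_1 = 410, \quad \bar{x}_1 = \frac{\sum X_1}{r} = \frac{410}{5} = 82 \]
where $r$ = number of weeks
\[ \sum X_2 = 395, \quad \bar{x}_2 = \frac{\sum X_2}{r} = \frac{395}{5} = 79 \]
\[ \sum X_3 = 435, \quad \bar{x}_3 = \frac{\sum X_3}{r} = \frac{435}{5} = 87 \]
\[ \bar{\bar{x}} = \frac{410 + 395 + 435}{(5)(3)} = \frac{1240}{15} = 82.66667 \approx 82.67 \]
\[ SSA = 5[(82 - 82.67)^2 + (79 - 82.67)^2 + (87 - 82.67)^2] \]
\[ = 5[(-0.67)^2 + (-3.67)^2 + (4.33)^2] \]
\[ = 5(0.4489 + 13.4689 + 18.7489) \]
\[ = 5(32.6667) \]
\[ = 163.3335 \]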
\[ SSE = \sum_i \sum_j (x_{ij} - \bar{x}_j)^2 \]
= (85 – 82)² + (83 – 82)² + (79 – 82)² + (81 – 82)² + (82 – 82)² + (77 – 79)²
+ (81 – 79)² + (75 – 79)² + (82 – 79)² + (80 – 79)² + (90 – 87)² + (92 – 87)²
+ (84 – 87)² + (82 – 87)² + (87 – 87)²
= (3)² + (1)² + (-3)² + (-1)² + 0² + (-2)² + (2)² + (-4)² + (3)² + (1)² + (3)² + (5)²
+ (-3)² + (-5)² + 0²
= 9 + 1 + 9 + 1 + 0 + 4 + 4 + 16 + 9 + 1 + 9 + 25 + 9 + 25 + 0
= 122
Table M2.1.3
One-Way Analysis of Variance Table
Source of variation                                Sum of squares    Degrees of freedom    Mean square                     F ratio
Explained variation (between columns)              SSA = 163.3335    3 - 1 = 2             MSA = 163.3335/2 = 81.66675     81.66675/10.167 = 8.0325
Unexplained variation or error (within columns)    SSE = 122         (5 - 1)3 = 12         MSE = 122/12 = 10.167
Total                                              285.3335          rc - 1 = 14           -
Degrees of Freedom
Explained variation = $c - 1$, where $c$ = number of samples
Unexplained variation = $(r - 1)c$, where $r$ = number of weeks
Total variation = $rc - 1$
MEAN SQUARE
\[ MSA = \frac{SSA}{c - 1} \]
\[ MSE = \frac{SSE}{(r - 1)c} \]
\[ F\text{-ratio} = \frac{MSA}{MSE} \]
$F_{0.05}(2, 12) = 3.88$ (critical value)
Source: F distribution table
Decision
Reject $H_0$ and accept $H_1$ because $F_{cal} > F_{tab}$ (8.0325 > 3.88), which implies that there is a
significant difference between the means of the production sessions.
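The one-way ANOVA quantities for the production data can be recomputed in a few lines. This is a sketch using only the Python standard library; the dictionary layout and variable names are chosen here for illustration. It does not look up the critical value, which still comes from an F table.

```python
from statistics import mean

# Weekly production by session, from Table M2.1.2
groups = {
    "morning": [85, 83, 79, 81, 82],
    "afternoon": [77, 81, 75, 82, 80],
    "evening": [90, 92, 84, 82, 87],
}

c = len(groups)   # number of samples (sessions)
r = 5             # observations per sample (weeks)

all_values = [x for g in groups.values() for x in g]
grand_mean = mean(all_values)

# Between-group, within-group and total sums of squares
ssa = sum(r * (mean(g) - grand_mean) ** 2 for g in groups.values())
sse = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
sst = sum((x - grand_mean) ** 2 for x in all_values)

msa = ssa / (c - 1)
mse = sse / ((r - 1) * c)
f_ratio = msa / mse

print(round(ssa, 4), sse, round(sst, 4), round(f_ratio, 4))
```

Because the script keeps the grand mean unrounded, SSA and the F ratio differ from the hand-calculated figures in the later decimal places; the comparison with the critical value is unaffected.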
Self-assessment exercise
State the formulae for sum of square?
4.0 CONCLUSION
In the course of our study of one-way analysis of variance you must have learnt about;
Explained variation
Unexplained variation
Total variation
5.0 SUMMARY
In the course of our discussion of one-way analysis of variance the following definitions
were used:
SSA = r Σ (x̄j – x̿)²
SSE = ΣΣ (xij – x̄j)²
SST = ΣΣ (xij – x̿)²
6.0 TUTOR MARKED ASSIGNMENT
Submit a one page essay on the definition of degree of freedom for explained variation,
unexplained variation and total variation.
- Olufolabo, O.O. and Talabi, C.O. (2002): Principles and Practice of Statistics;
HAS-FEM ENTERPRISES Somolu, Lagos.
1.0 INTRODUCTION
This unit is an extension of unit one. The difference between them is that here we
can test two (2) null hypotheses, one for factor A and the other for factor B.
2.0 OBJECTIVE
At the end of this unit, you should be able to test two null hypotheses:
Ho: μa1 = μa2 = μa3 (no difference between the factor A means)
Ho: μb1 = μb2 = μb3 (no difference between the factor B means)
SELF-ASSESSMENT EXERCISE
State the components into which total variation is divided.
The subscripted dot signifies that more than one factor is under consideration.
SST = ΣΣ (xij – x̿)²     total variation
SSA = r Σ (x̄.j – x̿)²    between column variation
SSB = c Σ (x̄i. – x̿)²    between row variation
SSE = SST – SSA – SSB
Degree of freedom of SSA = c – 1
Degree of freedom of SSB = r – 1
Degree of freedom of SSE = (r-1) (c – 1)
Degree of freedom of SST = rc – 1
Mean Square
MSA = SSA / (c – 1)
MSB = SSB / (r – 1)
MSE = SSE / ((r – 1)(c – 1))
F-statistics
F-ratio for factor A = MSA / MSE
F-ratio for factor B = MSB / MSE
Note that two (2) separate null hypotheses are considered:
(i) Ho: There is no difference between the treatment means.
(ii) Ho: There is no difference between the block means.
SELF-ASSESSMENT EXERCISE
State the formulae for column mean?
3.4 Worked Example
Samples taken on two (2) factors, A and B, in a two-way analysis of
variance experiment give the results below:
TableM2.2.2
Table showing factors A and B
             Treatment A
Block (B)    22    11    10    5
             13    10    8     6
             7     9     6     1
You are required to carry out a two-way analysis of variance at the 0.05 level of significance.
Solution
Hypothesis
1. Ho: μ.1 = μ.2 = μ.3 = μ.4 (the treatment means are equal); H1: at least one treatment mean differs
2. Ho: μ1. = μ2. = μ3. (the block means are equal); H1: at least one block mean differs
Table M2.2.3
Two-Way Classification Table
               Treatment A                                    Total    Sample mean
Block B        22          11          10          5          48       x̄1. = 12
               13          10          8           6          37       x̄2. = 9.25
               7           9           6           1          23       x̄3. = 5.75
Total          42          30          24          12         108      (sum of row means = 27)
Sample mean    42/3        30/3        24/3        12/3                x̿ = 108/12 = 9
               x̄.1 = 14   x̄.2 = 10   x̄.3 = 8    x̄.4 = 4
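SST = ΣΣ (xij – x̿)²
Column 1: (22 – 9)² = (13)² = 169; (13 – 9)² = (4)² = 16; (7 – 9)² = (-2)² = 4; sum = 189
Column 2: (11 – 9)² = (2)² = 4; (10 – 9)² = (1)² = 1; (9 – 9)² = (0)² = 0; sum = 5
Column 3: (10 – 9)² = (1)² = 1; (8 – 9)² = (-1)² = 1; (6 – 9)² = (-3)² = 9; sum = 11
Column 4: (5 – 9)² = (-4)² = 16; (6 – 9)² = (-3)² = 9; (1 – 9)² = (-8)² = 64; sum = 89
SST = 189 + 5 + 11 + 89 = 294
SSA = r Σ (x̄.j – x̿)² = 3[(14 – 9)² + (10 – 9)² + (8 – 9)² + (4 – 9)²] = 3(25 + 1 + 1 + 25) = 156
SSB = c Σ (x̄i. – x̿)² = 4[(12 – 9)² + (9.25 – 9)² + (5.75 – 9)²] = 4(9 + 0.0625 + 10.5625) = 78.5
SSE = SST – SSA – SSB = 294 – 156 – 78.5 = 59.5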
Degree of Freedom
SSA: c – 1 = 4 – 1 = 3
SSB: r – 1 = 3 – 1 = 2
SSE: (r – 1)(c – 1) = (3 – 1)(4 – 1) = (2)(3) = 6
SST: rc – 1 = (4 x 3) – 1 = 12 – 1 = 11
Mean Square
MSA = SSA / (c – 1) = 156 / (4 – 1) = 156 / 3 = 52
MSB = SSB / (r – 1) = 78.5 / (3 – 1) = 78.5 / 2 = 39.25
MSE = SSE / ((r – 1)(c – 1)) = 59.5 / ((3 – 1)(4 – 1)) = 59.5 / 6 = 9.916667
F-ratio
Factor A: MSA / MSE = 52 / 9.916667 = 5.24370
Factor B: MSB / MSE = 39.25 / 9.916667 = 3.95798
Table M2.2.4
Two-Way / Two-Factor Analysis of Variance
Sources of variation       Sum of squares    Degree of freedom     Mean square      F ratio
Explained variation by     SSA = 156         c – 1 = 3             MSA = 52         MSA / MSE = 5.24370
factor A (between column)
Explained variation by     SSB = 78.5        r – 1 = 2             MSB = 39.25      MSB / MSE = 3.95798
factor B (between rows)
Unexplained variation      SSE = 59.5        (r – 1)(c – 1) = 6    MSE = 9.91667    -
or error
Total                      294               11                    -                -
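The two-way table can be reproduced with a few lines of code. The following is a minimal sketch in Python; the function name two_way_anova and the result dictionary are illustrative assumptions, and the sketch covers only the layout used here (one observation per block and treatment combination).

```python
# Rows are blocks (factor B), columns are treatments (factor A).
table = [
    [22, 11, 10, 5],
    [13, 10, 8, 6],
    [7, 9, 6, 1],
]

def two_way_anova(rows):
    """Sums of squares and F-ratios for a two-way layout without replication."""
    r, c = len(rows), len(rows[0])
    grand = sum(map(sum, rows)) / (r * c)
    row_means = [sum(row) / c for row in rows]
    col_means = [sum(row[j] for row in rows) / r for j in range(c)]
    sst = sum((x - grand) ** 2 for row in rows for x in row)
    ssa = r * sum((m - grand) ** 2 for m in col_means)   # between columns
    ssb = c * sum((m - grand) ** 2 for m in row_means)   # between rows
    sse = sst - ssa - ssb
    mse = sse / ((r - 1) * (c - 1))
    return {
        "SST": sst, "SSA": ssa, "SSB": ssb, "SSE": sse,
        "F_A": (ssa / (c - 1)) / mse,
        "F_B": (ssb / (r - 1)) / mse,
    }

result = two_way_anova(table)
for name, value in result.items():
    print(f"{name} = {value:.5f}")
```

Each F-ratio is then compared with its own tabulated value: (c – 1) and (r – 1)(c – 1) degrees of freedom for factor A, and (r – 1) and (r – 1)(c – 1) for factor B.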
SELF-ASSESSMENT EXERCISE
State the decision criteria for accepting or rejecting hypothesis?
4.0 CONCLUSION
In the course of our discussion on two-way analysis of variance, we have learnt
about:
(i) Sum of square of Factor A
(ii) Sum of square of Factor B
(iii) Sum of square of the error term
(iv) Mean square of Factor A
(v) Mean square of Factor B
(vi) F-ratio of both Factor A and Factor B
(vii) Sum of Square of total variation.
5.0 SUMMARY
In our discussion the following definitions were used:
(i) SST = ΣΣ (xij – x̿)²
(ii) SSA = r Σ (x̄.j – x̿)²
(iii) SSB = c Σ (x̄i. – x̿)²
(iv) SSE = SST – SSA – SSB
(v) MSA = SSA / (c – 1)
(vi) MSB = SSB / (r – 1)
(vii) MSE = SSE / ((r – 1)(c – 1))
(viii) F-ratio for Factor A = MSA / MSE
F-ratio for Factor B = MSB / MSE
- Olufolabo, O.O. and Talabi, C.O. (2002): Principles and Practice of Statistics;
HAS-FEM ENTERPRISES Somolu, Lagos.
1.0 INTRODUCTION
In general, research is conducted for the purpose of explaining the effect of the
independent variable on the dependent variable, and the purpose of research design is to
provide a structure for the research. In the research design, the researcher identifies and
controls independent variables that can help to explain the observed variation in the
dependent variable, which in turn reduces the error (unexplained) variation.
In addition to controlling and explaining variation through research design, it is also
possible to use statistical control to explain the variation in the dependent variable.
Statistical control is usually used when experimental control is difficult, if not impossible.
It is achieved by measuring one or more variables in addition to the independent
variable of primary interest and by controlling the variation attributed to these variables
through statistical analysis rather than through research design. The analysis procedure
employed in this statistical control is analysis of covariance (ANCOVA).
2.0 OBJECTIVE
At the end of our discussion on analysis of covariance, you should be able to:
- Define analysis of covariance
- Define covariate
- Define the adjusted Yi values
- Develop a table of analysis of covariance
- Calculate the various terms that may be needed in the computation of the ANCOVA
table
ANCOVA works by adjusting the total sum of squares, group sum of squares and error
sum of squares of the dependent variable to remove the influence of the covariate.
ASSUMPTIONS OF ANALYSIS OF COVARIANCE
- The errors are normally distributed
- The variance is equal between groups
- All measurements are independent
- The relationship between the dependent variable and the covariate is linear
- The relationship between the dependent variable and the covariate is the same for all
groups.
Self-assessment exercise
Why analysis of covariance?
In the formulas below, Yi. and Xi. denote the treatment totals and Y.. and X.. the grand totals.
Syy = ΣΣ (Yij – Y̿)² = ΣΣ Yij² – (Y..)²/an
Sxx = ΣΣ (Xij – X̿)² = ΣΣ Xij² – (X..)²/an
Sxy = ΣΣ (Xij – X̿)(Yij – Y̿) = ΣΣ XijYij – (X..)(Y..)/an
Tyy = n Σ (Ȳi – Y̿)² = Σ (Yi.)²/n – (Y..)²/an
Txx = n Σ (X̄i – X̿)² = Σ (Xi.)²/n – (X..)²/an
Txy = n Σ (X̄i – X̿)(Ȳi – Y̿) = Σ (Xi.)(Yi.)/n – (X..)(Y..)/an
Eyy = ΣΣ (Yij – Ȳi)² = Syy – Tyy
Exx = ΣΣ (Xij – X̄i)² = Sxx – Txx
Exy = ΣΣ (Xij – X̄i)(Yij – Ȳi) = Sxy – Txy
S = T + E
Where X̄i, Ȳi = treatment means of X and Y
X̿ = grand mean of X
Y̿ = grand mean of Y
a = number of treatments
n = number of observations per treatment
The symbols S, T and E are used to denote the sums of squares and cross products for
total, treatment and error respectively.
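Table M2.3.1
Analysis of Covariance for a Single Factor Experiment with One Covariate
Source of    df          Sums of squares and products    Adjusted for regression
variation                X      XY     Y                 Y                          df              Mean square error (MSE)
Treatment    a – 1       Txx    Txy    Tyy
Error        a(n – 1)    Exx    Exy    Eyy               SSE = Eyy – (Exy)²/Exx     a(n – 1) – 1    MSE = SSE / (a(n – 1) – 1)
Test statistic for the regression slope:
Fo = ((Exy)²/Exx) / MSE
Test statistic for the treatment effect:
Fc = ((SS'E – SSE) / (a – 1)) / (SSE / (a(n – 1) – 1))
where SS'E = Syy – (Sxy)²/Sxx is the error sum of squares when no treatment effect is fitted.
Fc is distributed as F with a – 1 and a(n – 1) – 1 degrees of freedom.
Decision criteria
Reject Ho if the calculated F exceeds the tabulated F at the chosen level of significance:
F(1, a(n – 1) – 1) for the regression slope and F(a – 1, a(n – 1) – 1) for the treatment effect.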
Worked Example
A soft drink distributor is studying the effectiveness of delivery methods. Three different
types of truck have been developed, and an experiment is performed in the company's
laboratory. The variable of interest is the delivery time in minutes (Y); however, delivery
time is also strongly related to the case volume delivered (X). Each truck is used four
times and the data below are obtained.
Table M2.3.2
Table Showing Delivery Method of a Distributor
                     Truck Types
        1                   2                   3
Y           X           Y           X           Y           X
27          24          25          26          40          38
44          40          35          32          22          26
33          35          46          42          53          50
41          40          26          25          18          20
ΣY1 = 145   ΣX1 = 139   ΣY2 = 132   ΣX2 = 125   ΣY3 = 133   ΣX3 = 134
Solution
Ȳ1 = 145 / 4 = 36.25      X̄1 = 139 / 4 = 34.75
Ȳ2 = 132 / 4 = 33         X̄2 = 125 / 4 = 31.25
Ȳ3 = 133 / 4 = 33.25      X̄3 = 134 / 4 = 33.5
a = 3
n = 4
Syy = 27² + 44² + 33² + 41² + 25² + 35² + 46² + 26² + 40² + 22² + 53² + 18² – 410²/(3 x 4)
= 729 + 1936 + 1089 + 1681 + 625 + 1225 + 2116 + 676 + 1600 + 484 + 2809 + 324
– (410)²/12
Syy = 15,294 – 168,100/12
Syy = 15,294 – 14,008.33
Syy = 1,285.67
Sxx = 24² + 40² + 35² + 40² + 26² + 32² + 42² + 25² + 38² + 26² + 50² + 20² – 398²/(3 x 4)
Sxx = 576 + 1600 + 1225 + 1600 + 676 + 1024 + 1764 + 625 + 1444 + 676 + 2500 + 400
– (158,404/12)
Sxx = 14,110 – 13,200.333
Sxx = 909.6667
In the remaining formulas Xi. and Yi. are the truck totals and X.. and Y.. the grand totals.
Sxy = ΣΣ XijYij – (X..)(Y..)/an
Txx = Σ (Xi.)²/n – (X..)²/an
Txx = (139² + 125² + 134²)/4 – 398²/(3 x 4)
Txx = (19,321 + 15,625 + 17,956)/4 – 158,404/12
Txx = 52,902/4 – 158,404/12
Txx = 13,225.5 – 13,200.333
Txx = 25.1667
Txy = Σ (Xi.)(Yi.)/n – (X..)(Y..)/an
Exx = Sxx – Txx = 909.6667 – 25.1667 = 884.5
B̂ = Exy / Exx = 1037.753 / 884.5 = 1.173265461
A test of hypothesis can also be carried out on this slope by using the test statistic below.
Ho: B = 0
Fc = ((Exy)²/Exx) / MSE
Fc = ((1037.753)²/884.5) / 5.2425
Fc = 1,217.5594 / 5.2425
Fc = 232.2478588
F0.05,1,8 = 5.32
Decision
Since Fc > Ftab, reject Ho and accept Hi. This implies that there exists a linear
relationship between the delivery time and the volume delivered.
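The sums of squares and cross products behind this test can be computed directly from Table M2.3.2. The sketch below is an illustration in Python; the function name ancova and the keys of the returned dictionary are assumptions made for this example. Because the script carries full precision, its output may differ slightly in the later decimal places from the hand calculation, which works with rounded intermediate values.

```python
# (Y, X) pairs per truck type from Table M2.3.2:
# Y = delivery time in minutes, X = case volume delivered.
trucks = {
    1: [(27, 24), (44, 40), (33, 35), (41, 40)],
    2: [(25, 26), (35, 32), (46, 42), (26, 25)],
    3: [(40, 38), (22, 26), (53, 50), (18, 20)],
}

def ancova(groups):
    """Sums of squares and products for one factor with one covariate."""
    a = len(groups)                        # number of treatments
    n = len(next(iter(groups.values())))   # observations per treatment
    pairs = [p for obs in groups.values() for p in obs]
    gy = sum(y for y, _ in pairs)          # grand total of Y
    gx = sum(x for _, x in pairs)          # grand total of X
    # Total sums of squares and cross products (S)
    syy = sum(y * y for y, _ in pairs) - gy ** 2 / (a * n)
    sxx = sum(x * x for _, x in pairs) - gx ** 2 / (a * n)
    sxy = sum(x * y for y, x in pairs) - gx * gy / (a * n)
    # Treatment sums of squares and cross products (T)
    ty = [sum(y for y, _ in obs) for obs in groups.values()]
    tx = [sum(x for _, x in obs) for obs in groups.values()]
    tyy = sum(t * t for t in ty) / n - gy ** 2 / (a * n)
    txx = sum(t * t for t in tx) / n - gx ** 2 / (a * n)
    txy = sum(u * v for u, v in zip(tx, ty)) / n - gx * gy / (a * n)
    # Error terms (E = S - T)
    eyy, exx, exy = syy - tyy, sxx - txx, sxy - txy
    sse = eyy - exy ** 2 / exx             # error SS adjusted for regression
    mse = sse / (a * (n - 1) - 1)
    return {
        "Syy": syy, "Sxx": sxx, "Sxy": sxy,
        "Txx": txx, "Exx": exx, "Exy": exy, "Eyy": eyy,
        "slope": exy / exx,
        "F_slope": (exy ** 2 / exx) / mse,
    }

res = ancova(trucks)
for name, value in res.items():
    print(f"{name} = {value:.4f}")
```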
The adjusted treatment means can be computed as:
Adjusted Ȳ1 = Ȳ1 – B̂(X̄1 – X̿)
Adjusted Ȳ2 = Ȳ2 – B̂(X̄2 – X̿)
Adjusted Ȳ3 = Ȳ3 – B̂(X̄3 – X̿)
Where X̿ = grand mean of X = (X̄1 + X̄2 + X̄3)/3 = (34.75 + 31.25 + 33.5)/3 = 33.167
X̄1, X̄2, X̄3 = the respective means of X
Ȳ1, Ȳ2, Ȳ3 = the respective means of Y
Adjusted Ȳ1 = 36.25 – (1.173265461)(34.75 – 33.167)
= 36.25 – (1.173265461)(1.5833)
= 36.25 – 1.857631204
= 34.3923688
≅ 34.39
Adjusted Ȳ2 = 33 – (1.173265461)(31.25 – 33.167)
= 33 – (1.173265461)(-1.917)
= 33 + 2.249149889
= 35.24914989
≅ 35.25
Adjusted Ȳ3 = 33.25 – (1.173265461)(33.5 – 33.167)
= 33.25 – (1.173265461)(0.33)
= 33.25 – 0.387177602
= 32.8628224
≅ 32.86
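The adjustment can also be scripted. This is a small sketch in Python that takes the slope and the truck means from the text as given; the variable names are illustrative assumptions. The grand mean of X is taken as the simple average of the three truck means, which is valid here only because every truck has the same number of runs.

```python
slope = 1.173265461               # B-hat as computed in the text
y_means = [36.25, 33.0, 33.25]    # mean delivery time per truck type
x_means = [34.75, 31.25, 33.5]    # mean case volume per truck type

# Grand mean of X (equal group sizes, so a plain average of the means is enough)
grand_x = sum(x_means) / len(x_means)

# Adjusted mean = observed mean minus the part explained by the covariate
adjusted = [y - slope * (x - grand_x) for y, x in zip(y_means, x_means)]

for truck, value in enumerate(adjusted, start=1):
    print(f"Adjusted mean for truck {truck}: {value:.2f}")
```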
SELF-ASSESSMENT EXERCISE
Define 𝐒𝐲𝐲 ?
4.0 CONCLUSION
In the course of our discussion on analysis of covariance you have learnt about the
following:
- Definition of analysis of covariance
- Estimation of analysis of covariance
- Computation of analysis of covariance table
- Adjustment of the dependent variables
5.0 SUMMARY
In the course of our discussion the following definitions were used, where Yi. and Xi. are the
treatment totals and Y.. and X.. the grand totals:
Syy = ΣΣ (Yij – Y̿)² = ΣΣ Yij² – (Y..)²/an
Sxx = ΣΣ (Xij – X̿)² = ΣΣ Xij² – (X..)²/an
Sxy = ΣΣ (Xij – X̿)(Yij – Y̿) = ΣΣ XijYij – (X..)(Y..)/an
Tyy = n Σ (Ȳi – Y̿)² = Σ (Yi.)²/n – (Y..)²/an
Txx = n Σ (X̄i – X̿)² = Σ (Xi.)²/n – (X..)²/an
Txy = n Σ (X̄i – X̿)(Ȳi – Y̿) = Σ (Xi.)(Yi.)/n – (X..)(Y..)/an
Eyy = Syy – Tyy
Exx = Sxx – Txx
Exy = Sxy – Txy
6.0 TUTOR MARKED ASSIGNMENT
Submit a one page discussion on the definition of analysis of covariance and its
assumption.
7.0 REFERENCES
- Damodar N. G., Dawn C. P. and Sangetha, G. (2012): Basic Econometrics. Tata
McGraw Hill Education Private Ltd. New Delhi India.
- Dominick, S. and Derrick, R. (2011): Statistics and Econometrics, (Schaum’s
Outlines) McGraw-Hill Company, New York.
- Kuotsoyanis, A. (2003): Theory of Econometrics (second edition).Palgrave
publishers Ltd (formerly Macmillan publishers Ltd), Houndmills, Basingstoke,
New York.
- www.youtube.com
MODULE 3: Multiple Regression Analysis
Unit 1: Estimation of multiple regressions
Unit 2: Partial correlation coefficient
Unit 3: Multiple correlation coefficient and coefficient of determination
Unit 4: Overall test of significance
1.0 INTRODUCTION
Multiple Regression Defined
In introductory statistics, simple linear regression is one of the topics discussed.
A regression equation is an expression by which you may calculate a typical value of a
dependent variable, say Y, on the basis of the values of the independent variable(s).
The multiple regression model is one of the most commonly used tools in research for
understanding the functional relationship among multi-dimensional variables. The model
attempts to expose the relative and combined effects of the independent variables on the
dependent variable.
For your success in this course of study, you need a thorough knowledge
of the simple regression model and hypothesis testing, among others.
2.0 OBJECTIVE
At the end of our discussion on multiple regression you should be able to:
(i) Regress the dependent variable on the independent variables
(ii) Understand the parameter estimates involved
(iii) Calculate the values of bo, b1, b2, … bn
(iv) Carry out tests of significance
Coefficient of multiple determinations
Test of overall significance of the regression
Partial correlation coefficient
SELF-ASSESSMENT EXERCISE
Define multiple regression model of four variables?
Table M3.1.1
Table Showing Expenditure on Clothing, Total Expenditure and Price of Clothing
                                        1990   1991   1992   1993   1994   1995
Price of clothing (X2)                  3.5    9.8    8.3    7.6    9.3    7.7
Total expenditure (X1)                  3.5    2.2    2.7    1.6    2.8    4.6
Value of expenditure on clothing (Y)    2.0    1.5    1.7    1.6    2.7    3.5
Find the least square regression equation of Y on X1 and X2.
Table M3.1.2
Multiple Regression Table
(All values are multiplied by 10 to remove the decimals.)
Years   Y    X1   X2   y      x1    x2    x1x2   x1²   x2²    x1y     x2y      y²
1990    20   35   35   -1.7   6     -42   -252   36    1764   -10.2   71.4     2.89
1991    15   22   98   -6.7   -7    21    -147   49    441    46.9    -140.7   44.89
1992    17   27   83   -4.7   -2    6     -12    4     36     9.4     -28.2    22.09
1993    16   16   76   -5.7   -13   -1    13     169   1      74.1    5.7      32.49
1994    27   28   93   5.3    -1    16    -16    1     256    -5.3    84.8     28.09
1995    35   46   77   13.3   17    0     0      289   0      226.1   0        176.89
n = 6   ΣY = 130   ΣX1 = 174   ΣX2 = 462   Σy = 0   Σx1 = 0   Σx2 = 0
Σx1x2 = -414   Σx1² = 548   Σx2² = 2498   Σx1y = 341   Σx2y = -7   Σy² = 307.34
Ȳ = ΣY / n = 130 / 6 = 21.7
X̄1 = ΣX1 / n = 174 / 6 = 29
X̄2 = ΣX2 / n = 462 / 6 = 77
y = Y – Ȳ
x1 = X1 – X̄1
x2 = X2 – X̄2
b̂1 = [(Σx1y)(Σx2²) – (Σx2y)(Σx1x2)] / [(Σx1²)(Σx2²) – (Σx1x2)²]
b̂1 = [(341)(2498) – (-7)(-414)] / [(548)(2498) – (-414)²]
b̂1 = (851,818 – 2,898) / (1,368,904 – 171,396)
b̂1 = 848,920 / 1,197,508
b̂1 = 0.70891
b̂1 ≅ 0.71
b̂2 = [(Σx2y)(Σx1²) – (Σx1y)(Σx1x2)] / [(Σx1²)(Σx2²) – (Σx1x2)²]
b̂2 = [(-7)(548) – (341)(-414)] / 1,197,508
b̂2 = (-3,836 + 141,174) / 1,197,508
b̂2 = 137,338 / 1,197,508
b̂2 = 0.1147
b̂2 ≅ 0.11
b̂0 = Ȳ – b̂1X̄1 – b̂2X̄2
5.0 SUMMARY
In this unit, the multiple regression model is given as
Y = B0 + B1X1 + B2X2 + e
B̂0 = Ȳ – B̂1X̄1 – B̂2X̄2
B̂1 = [(Σx1y)(Σx2²) – (Σx2y)(Σx1x2)] / [(Σx1²)(Σx2²) – (Σx1x2)²]
B̂2 = [(Σx2y)(Σx1²) – (Σx1y)(Σx1x2)] / [(Σx1²)(Σx2²) – (Σx1x2)²]
where B̂1 measures the change in Y for a unit change in X1 while holding X2 constant, and
B̂2 measures the change in Y per unit change in X2 while holding X1 constant.
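The deviation-form formulas in this summary translate directly into code. The following is a minimal sketch in Python using the clothing data of this unit (scaled by 10, as in the working table); the helper names mean, deviations and sum_products are assumptions made for this illustration.

```python
Y  = [20, 15, 17, 16, 27, 35]   # expenditure on clothing
X1 = [35, 22, 27, 16, 28, 46]   # total expenditure
X2 = [35, 98, 83, 76, 93, 77]   # price of clothing

def mean(values):
    return sum(values) / len(values)

def deviations(values):
    m = mean(values)
    return [v - m for v in values]

def sum_products(u, v):
    return sum(a * b for a, b in zip(u, v))

y, x1, x2 = deviations(Y), deviations(X1), deviations(X2)

# Common denominator of both slope formulas
den = sum_products(x1, x1) * sum_products(x2, x2) - sum_products(x1, x2) ** 2
b1 = (sum_products(x1, y) * sum_products(x2, x2)
      - sum_products(x2, y) * sum_products(x1, x2)) / den
b2 = (sum_products(x2, y) * sum_products(x1, x1)
      - sum_products(x1, y) * sum_products(x1, x2)) / den
b0 = mean(Y) - b1 * mean(X1) - b2 * mean(X2)

print(f"b0 = {b0:.4f}, b1 = {b1:.5f}, b2 = {b2:.5f}")
```

The script uses the unrounded mean of Y (21.667 rather than 21.7), so the intercept may differ slightly from a hand calculation based on rounded means.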
6.0 TUTOR MARKED ASSIGNMENT
i. The simplest possible multiple regression model is a ---------
ii. Given that Y = B1X1 + B2X2 + B3X3 + where X1 = 1 this is an example
of –
7.0 REFERENCES
- Damodar, N. G., Dawn, C. P., and Sangetha, G. (2012): Basic Econometrics. Tata
McGraw Hill Education Private Ltd. New Delhi, India.
- Dominick, S. and Derrick, R. (2011): Statistics and econometric (Schaum outline)
(2nd edition) McGraw Hill, New York.
- Oyesiku, O.K. and Omitogun, O. (1999): Statistics for Social and Management
Sciences: Higher Education Books Publishers Lagos.
- Oyesiku, O.O., Abosede, A.J., Kajola, S.O, and Napoleon, S.G.(1999): Basics of
Operation research. CESAP Ogun State University. Ago-Iwoye, Ogun State.
UNIT TWO: PARTIAL CORRELATION COEFFICIENT
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Partial correlation defined
3.2 Assumptions of multiple regression
3.3 Estimation of multiple regression parameters
3.4 Worked example
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/ Further Readings
1.0 INTRODUCTION
It is assumed that you have read unit 1 of this module, which deals with multiple
regression. A detailed understanding of it will be assumed, because this unit builds on
unit 1. This unit gives a thorough explanation of the
parameters involved in the regression analysis.
2.0 OBJECTIVE
At the end of our discussion, you should be able to understand the following concepts
such as;
- Partial regression co-efficient
- Estimation of partial regression co-efficient
Partial correlation coefficients range in value between -1 and +1. This value(s) is usually
used to determine the relative importance of the different explanatory variables in a
multiple regression.
SELF–ASSESSMENT EXERCISE
Define rx1x2.y
Table M3.2.1
Correlation Coefficient Table
Years   Y    X1   X2   y      x1    x2    x1x2   x1²   x2²    x1y     x2y      y²
1990    20   35   35   -1.7   6     -42   -252   36    1764   -10.2   71.4     2.89
1991    15   22   98   -6.7   -7    21    -147   49    441    46.9    -140.7   44.89
1992    17   27   83   -4.7   -2    6     -12    4     36     9.4     -28.2    22.09
1993    16   16   76   -5.7   -13   -1    13     169   1      74.1    5.7      32.49
1994    27   28   93   5.3    -1    16    -16    1     256    -5.3    84.8     28.09
1995    35   46   77   13.3   17    0     0      289   0      226.1   0        176.89
n = 6   ΣY = 130   ΣX1 = 174   ΣX2 = 462   Σy = 0   Σx1 = 0   Σx2 = 0
Σx1x2 = -414   Σx1² = 548   Σx2² = 2498   Σx1y = 341   Σx2y = -7   Σy² = 307.34
ryx1 = Σx1y / (√Σx1² √Σy²)
= 341 / (√548 x √307.34)
= 341 / √(548 x 307.34)
= 341 / √168,422.32
= 341 / 410.3929
ryx1 = 0.8309
ryx1 ≅ 0.83
r²yx1 = 0.6904
r²yx1 ≅ 0.69
ryx2 = Σx2y / (√Σx2² √Σy²)
= -7 / (√2498 x √307.34)
= -7 / √(2498 x 307.34)
= -7 / √767,735.32
= -7 / 876.2051
ryx2 = -0.007989
ryx2 ≅ -0.008
r²yx2 = 0.0000638 ≅ 0.000064
rx1x2 = Σx1x2 / (√Σx2² √Σx1²)
= -414 / (√2498 x √548)
= -414 / √(2498 x 548)
= -414 / √1,368,904
= -414 / 1,170.0017
rx1x2 = -0.3538
rx1x2 ≅ -0.35
r²x1x2 = 0.1252
r²x1x2 ≅ 0.13
ryx1.x2 = (ryx1 – ryx2·rx1x2) / √((1 – r²x1x2)(1 – r²yx2))
= (0.8309 – (-0.007989)(-0.3538)) / √((1 – 0.1252)(1 – 0.0000638))
= 0.8281 / 0.9353
ryx1.x2 ≅ 0.89
ryx2.x1 = (ryx2 – ryx1·rx1x2) / √((1 – r²x1x2)(1 – r²yx1))
= (-0.007989 – (0.8309)(-0.3538)) / √((1 – 0.1252)(1 – 0.6904))
= 0.2860 / √0.2708
= 0.2860 / 0.5204
ryx2.x1 ≅ 0.55
Therefore, the calculations above show that X1 explains more than X2 and that X1 is
more important in explaining variation in Y.
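The simple and partial correlation coefficients can be computed from the raw data as a cross-check. The sketch below is written in Python for illustration; the helper names corr and partial_corr are assumptions of this example. It works from the unrounded data, so its results may differ slightly from a hand calculation that rounds intermediate values.

```python
from math import sqrt

Y  = [20, 15, 17, 16, 27, 35]
X1 = [35, 22, 27, 16, 28, 46]
X2 = [35, 98, 83, 76, 93, 77]

def deviations(values):
    m = sum(values) / len(values)
    return [v - m for v in values]

def corr(u, v):
    """Simple correlation coefficient from deviations about the means."""
    du, dv = deviations(u), deviations(v)
    num = sum(a * b for a, b in zip(du, dv))
    return num / sqrt(sum(a * a for a in du) * sum(b * b for b in dv))

def partial_corr(r_ab, r_ac, r_bc):
    """Correlation between a and b, holding c constant."""
    return (r_ab - r_ac * r_bc) / sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

r_yx1, r_yx2, r_x1x2 = corr(Y, X1), corr(Y, X2), corr(X1, X2)
r_yx1_x2 = partial_corr(r_yx1, r_yx2, r_x1x2)   # y and x1, holding x2 constant
r_yx2_x1 = partial_corr(r_yx2, r_yx1, r_x1x2)   # y and x2, holding x1 constant

print(f"ryx1 = {r_yx1:.4f}, ryx2 = {r_yx2:.4f}, rx1x2 = {r_x1x2:.4f}")
print(f"ryx1.x2 = {r_yx1_x2:.4f}, ryx2.x1 = {r_yx2_x1:.4f}")
```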
SELF-ASSESSMENT EXERCISE
Define rx1x2?
4.0 CONCLUSION
In the course of our discussion on partial correlation coefficient you must have learnt
about the following:
- Partial correlation definition
- Partial correlation between the dependent variable and the independent variables, such
as:
ryx1.x2 = partial correlation coefficient between variables y and x1, holding variable
x2 constant
ryx2.x1 = partial correlation coefficient between variables y and x2, holding x1 constant
rx1x2.y = partial correlation coefficient between variables x1 and x2, holding variable y
constant
5.0 SUMMARY
In the course of our discussion the following formulas were used:
ryx1 = Σx1y / (√Σx1² √Σy²)
ryx2 = Σx2y / (√Σx2² √Σy²)
rx1x2 = Σx1x2 / (√Σx1² √Σx2²)
ryx1.x2 = (ryx1 – ryx2·rx1x2) / √((1 – r²x1x2)(1 – r²yx2))
ryx2.x1 = (ryx2 – ryx1·rx1x2) / √((1 – r²x1x2)(1 – r²yx1))
- Oyesiku, O.K. and Omitogun, O. (1999): Statistics for Social and Management
Sciences: Higher Education Books Publishers, Lagos.
- Oyesiku, O.O., Abosede, A.J., Kajola, S.O. and Napoleon, S.G. (1999): Basics of
Operation research. CESAP Ogun State University, Ago-Iwoye, Ogun State.
UNIT THREE: Multiple Correlation Co-efficient and Coefficient of Determination
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Partial correlation defined
3.2 Assumptions of multiple regression
3.3 Estimation of multiple regression parameters
3.4 Worked example
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/ Further Readings
1.0 INTRODUCTION
This unit is an extension of unit one and two of this module. This unit requires thorough
knowledge of unit 1 and unit two. In this unit we are going to look at multiple Correlation
Coefficients (R) and multiple coefficient of determination (R2).
2.0 OBJECTIVE
At the end of this unit you should be able to:
- Estimate the multiple correlation coefficient (R)
- Estimate the coefficient of determination (R2)
- Interpret your answer, i.e. give the statistical interpretation
3.0 MAIN CONTENT
3.1 Multiple Correlation Coefficient (R) and Coefficient of Determination (R2)
Defined
The multiple correlation coefficient, represented by R, measures the degree of linear
association between the dependent variable Y and all the explanatory
variables jointly. Although a simple correlation coefficient can be positive or negative, the
multiple correlation coefficient is always taken to be positive. In practice the multiple
correlation coefficient is of little importance. The more meaningful coefficient is the
coefficient of determination, R2.
The coefficient of determination (R2) is defined as the proportion of the total variation in Y
explained by the multiple regression of Y on X1 and X2. It measures the goodness of fit of the
regression equation. In a three-variable model we are interested in knowing the
proportion of the variation in Y explained by the explanatory variables X1 and X2 jointly.
Because of its relative importance, we concentrate more on the coefficient of
determination (R2).
SELF-ASSESSMENT EXERCISE
The most important coefficient is --------
Table M3.3.1
Multiple Regression Table
Years   Y    X1   X2   y      x1    x2    x1x2   x1²   x2²    x1y     x2y      y²
1990    20   35   35   -1.7   6     -42   -252   36    1764   -10.2   71.4     2.89
1991    15   22   98   -6.7   -7    21    -147   49    441    46.9    -140.7   44.89
1992    17   27   83   -4.7   -2    6     -12    4     36     9.4     -28.2    22.09
1993    16   16   76   -5.7   -13   -1    13     169   1      74.1    5.7      32.49
1994    27   28   93   5.3    -1    16    -16    1     256    -5.3    84.8     28.09
1995    35   46   77   13.3   17    0     0      289   0      226.1   0        176.89
n = 6   ΣY = 130   ΣX1 = 174   ΣX2 = 462   Σy = 0   Σx1 = 0   Σx2 = 0
Σx1x2 = -414   Σx1² = 548   Σx2² = 2498   Σx1y = 341   Σx2y = -7   Σy² = 307.34
R² = (b̂1Σx1y + b̂2Σx2y) / Σy²
= ((0.70891)(341) + (0.11469)(-7)) / 307.34
= (241.738 – 0.803) / 307.34
= 240.935 / 307.34
R² = 0.7839
R = √0.7839 = 0.8854
R² ≅ 78%
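The coefficient of determination can be verified from the raw data. The following is a minimal sketch in Python based on the definition R² = (b̂1Σx1y + b̂2Σx2y)/Σy²; the variable names are assumptions made for this illustration, and the slopes are recomputed inside the script rather than taken from rounded hand values.

```python
from math import sqrt

Y  = [20, 15, 17, 16, 27, 35]
X1 = [35, 22, 27, 16, 28, 46]
X2 = [35, 98, 83, 76, 93, 77]

def deviations(values):
    m = sum(values) / len(values)
    return [v - m for v in values]

def sum_products(u, v):
    return sum(a * b for a, b in zip(u, v))

y, x1, x2 = deviations(Y), deviations(X1), deviations(X2)
syy, s11, s22 = sum_products(y, y), sum_products(x1, x1), sum_products(x2, x2)
s1y, s2y, s12 = sum_products(x1, y), sum_products(x2, y), sum_products(x1, x2)

# Least square slopes in deviation form
den = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / den
b2 = (s2y * s11 - s1y * s12) / den

r_squared = (b1 * s1y + b2 * s2y) / syy   # proportion of variation in Y explained
multiple_r = sqrt(r_squared)              # multiple correlation coefficient

print(f"R squared = {r_squared:.4f}, R = {multiple_r:.4f}")
```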
SELF-ASSESSMENT EXERCISE
When r2=0.85, what is the economic interpretation of this?
4.0 CONCLUSION
In the course of our discussion of this unit, you have learnt about the following:
- Concept of multiple correlation
- Coefficient of determination
- Estimation of R2 & r
- Interpretation of r & r2
5.0 SUMMARY
In our discussion of this unit we defined the coefficient of determination R² as
R² = (b̂1Σyx1 + b̂2Σyx2) / Σy²
or, in terms of the simple correlation coefficients,
R² = (r²yx1 + r²yx2 – 2·ryx1·ryx2·rx1x2) / (1 – r²x1x2)
The closer R² is to 1, the better the fit of the regression.
- Oyesiku O.K. and Omitogun O. (1999): Statistics for social and management
science (2nd edition). Higher Education Books Publisher, Lagos.
- Oyesiku, O.O., Abosede, A.J., Kajola, S.O. and Napoleon, S.G. (1999): Basics of
Operation research. CEAP OSU, Ago-Iwoye. Ogun State.
UNIT FOUR: TEST OF SIGNIFICANCE
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Test of significance defined
3.2 Estimation of test of significance
3.3 Summary of F-statistics
3.4 Worked example
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/ Further Readings:
1.0 INTRODUCTION
This unit completes this module, so a thorough knowledge of units one to
three is required. It is important to test the significance of the
regression coefficients and the level of prediction or explanation given by the
regression equation.
2.0 OBJECTIVE
At the end of this unit you should be able to:
- Calculate the F-statistic (Fcal)
- Look up the corresponding tabulated value of F (Ftab) using its degrees of freedom
- Compare Fcal with Ftab
- Interpret your answer
3.0 MAIN CONTENT
3.1 Test of Significance Defined
A test of significance is a procedure by which sample results are used to verify the truth or
falsity of a null hypothesis. The key idea behind a test of significance is that of a test
statistic (an estimator) and the sampling distribution of that statistic under the null
hypothesis.
The decision to accept or reject Ho is made on the basis of the value of the test statistic
obtained from the data at hand.
The overall significance of the regression can be tested with the ratio of the explained to
the unexplained variance. This follows an F-distribution with k – 1 and n – k degrees of
freedom, where n is the number of observations and k is the number of parameters
estimated.
The joint hypothesis can be tested by the analysis of variance (ANOVA).
SELF-ASSESSMENT EXERCISE
The decision to accept or reject Ho depends on ---------
Note
Σyi² = b̂1Σyx1 + b̂2Σyx2 + Σei²
TSS = ESS + RSS
F-ratio = [(b̂1Σyx1 + b̂2Σyx2)/2] / [Σei²/(n – 3)]
E(σ̂²) = E(Σei²/(n – 3)) = σ²
If the null hypothesis is true, the numerator and the denominator of the F-ratio give identical
estimates of the true σ². This statement should
not be surprising, because if there is only a trivial relationship between Y and X1 and X2, the
only source of variation in Y will be the random forces represented by ei.
If, however, the null hypothesis is false, that is, X1 and X2 actually influence Y, the equality
will not hold. Here, the ESS will be relatively larger than the RSS, taking due account of
their respective degrees of freedom. Therefore, the F-ratio provides a test of the null
hypothesis that the true slope coefficients are simultaneously zero.
DECISION CRITERIA: If the calculated F-ratio exceeds the critical F-value from the
table at the chosen level of significance, we reject Ho; otherwise we do not reject it.
Alternatively, if the p-value of the observed F is sufficiently low, reject Ho.
SELF-ASSESSMENT EXERCISE
State the decision criteria for test of significance?
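Method I
F3-1,6-3 = (ESS / (3 – 1)) / (RSS / (6 – 3))
ESS = b̂1Σyx1 + b̂2Σyx2 = (0.70891)(341) + (0.11469)(-7) = 240.935
RSS = Σy² – ESS = 307.34 – 240.935 = 66.405
Fcal = F2,3 = (240.935 / 2) / (66.405 / 3) = 120.4675 / 22.135 = 5.4424
Method II
Table M3.4.4
ANOVA Table for the Three-Variable Model
Sources of variation    Sum of squares    DF    MSS
ESS                     240.935           2     120.4675
RSS                     66.405            3     22.135
Total                   307.34            5
Fcal = 120.4675 / 22.135 = 5.4424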
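The overall F-test can be reproduced from the raw data with a short script. The sketch below is an illustration in Python; it recomputes the slopes, splits the total variation into ESS and RSS, and forms the F-ratio with k – 1 and n – k degrees of freedom. All names are assumptions made for this example, and small differences from hand-rounded figures are to be expected.

```python
Y  = [20, 15, 17, 16, 27, 35]
X1 = [35, 22, 27, 16, 28, 46]
X2 = [35, 98, 83, 76, 93, 77]

def deviations(values):
    m = sum(values) / len(values)
    return [v - m for v in values]

def sum_products(u, v):
    return sum(a * b for a, b in zip(u, v))

y, x1, x2 = deviations(Y), deviations(X1), deviations(X2)
s11, s22, s12 = sum_products(x1, x1), sum_products(x2, x2), sum_products(x1, x2)
s1y, s2y = sum_products(x1, y), sum_products(x2, y)

den = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / den
b2 = (s2y * s11 - s1y * s12) / den

n, k = len(Y), 3                   # observations, parameters estimated
tss = sum_products(y, y)           # total sum of squares
ess = b1 * s1y + b2 * s2y          # explained sum of squares
rss = tss - ess                    # residual sum of squares
f_cal = (ess / (k - 1)) / (rss / (n - k))

print(f"ESS = {ess:.3f}, RSS = {rss:.3f}, F({k - 1},{n - k}) = {f_cal:.4f}")
```

The printed F value is then compared with the tabulated F for k – 1 and n – k degrees of freedom at the chosen level of significance.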
SELF-ASSESSMENT EXERCISE
What will be the decision criteria if Fcal > Ftab?
4.0 CONCLUSION
From our discussion on this unit you have learnt about:
- Definition of the test of significance
- Estimation of the test of significance
- Interpretation of the results
- Estimation through the ANOVA table and otherwise
- Derivation of the error term ui or ei
5.0 SUMMARY
In the course of our discussion the following formulae were discussed:
Fk-1,n-k = [Σŷ²/(k – 1)] / [Σe²/(n – k)]
Σŷ² = b̂1Σyx1 + b̂2Σyx2
Σe² = Σ(Y – Ŷ)²
7.0 REFERENCES
- Damodar, N. G., Dawn, C. P. and Sangetha, G. (2012): Basic Econometrics. Tata
McGraw Hill Education Private Ltd. New Delhi, India.
- Oyesiku, O.K. and Omitogun, O. (1999): Statistics for Social and Management
Sciences: Higher Education Books Publishers, Lagos.
- Oyesiku, O.O., Abosede, A.J., Kajola, S.O., and Napoleon, S.G.(1999): Basics of
Operation research. (CESAP Ogun State University), Ago iwOye, Ogun State.
MODULE 4: Time series analysis
Unit 1: Time series and its components
Unit 2; Quantitative estimation of time series
Unit 3: Price index
UNIT ONE: TIME SERIES
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Time Series defined
3.2 Component of time series
3.3 Graphical representation of trends
3.4 Worked example
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/ Further Readings
1.0 INTRODUCTION
In all the social sciences, and particularly economics and business, the problem of how
conditions change with the passage of time is of utmost importance. For the study of such
problems, the appropriate kind of statistical information consists of data in the form of
time series: figures which show the magnitude of a phenomenon month after month or
year after year. The proper methods for treating such data, and thus summarizing the
experience which they represent, are an indispensable part of the practicing statistician's
equipment.
2.0 OBJECTIVE
At the end of this unit, you should be able to:
- Define time series
- Understand the component parts of a time series
- Understand methods of estimating time series
- Estimate and graphically represent the trend
SELF-ASSESSMENT EXERCISE
The essence of time series is forecast. (true/false)
Fig. M4.1.1
3.2.2 Seasonal Variation
This refers to short-term fluctuations or changes that occur at regular intervals of less than a
year. It is usually brought about by climatic and social factors, typically
an event occurring at a particular period of the year. Examples are the sale of cards
during the Valentine period and the sale of chicken during Christmas, New Year or any other festive period.
Fig. M4.1.2
Fig. M4.1.4
SELF-ASSESSMENT EXERCISE
The trend of secular trend can either be upward or downward. (true/false)
Fig. M 4.1.5
Other quantitative methods will be dealt with in the next unit
4.0 CONCLUSION
In the course of our discussion on time series analysis you have learnt about
- Time series
- Time series data
- Component of time series
- Free hand trend measurement
5.0 SUMMARY
A time series can be decomposed into the following main components:
- Secular trend or long term movement
- Seasonal variation
- Cyclical variation
- Irregular or residual variation
6.0 TUTOR MARKED ASSIGNMENT
Table Showing Sales of a Chemist
Years Sales
2000 85
2001 96
2002 108
2003 123
2004 98
- Esan, E.O. and Okafor, R.O. (2010): Basis Statistical Method. Tony Chriisto
Concept, Lagos.
- Olufolabo, O.O. & Talabi, C.O. (2002): Principles and Practice of Statistics HAS-
FEM (NIG) ENTERPRISES Somolu Lagos.
1.0 INTRODUCTION
This unit is an extension of unit one of this module. Here you are going to learn more
about the estimation of time series data, so a thorough understanding of unit one of this
module is required.
2.0 OBJECTIVE
At the end of this unit, you should be able to:
- Estimate the trend of any time series data
- Understand methods of estimating time series
- Estimate and graphically represent the trend
Formula 1
The normal equations of the least square method are given as
ΣY = na + bΣX
ΣXY = aΣX + bΣX²
Where Σ = summation term derived from the data of the problem at hand
ΣX = sum of X values
ΣY = sum of Y values
ΣXY = sum found by multiplying each Y by the corresponding X value and adding the
products
n = number of items involved in the whole time series
The least square estimates of a and b are the solution to the normal equations above, which
can be solved simultaneously.
Formula 2
The general formula is as given below
b = (nΣXY – ΣXΣY) / (nΣX² – (ΣX)²)
a = Ȳ – b̂X̄
where Ȳ = ΣY / n
X̄ = ΣX / n
Worked Example
Given the 7weeks information below about the sales of a company
Table M 4.1.1
Table Showing the Sales of a Company
Wk Sales
1 15
2 25
3 38
4 32
5 40
6 37
7 50
n = 7
ΣX = 28
ΣY = 237
ΣXY = 1079
ΣX² = 140
X̄ = 28 / 7 = 4
Ȳ = 237 / 7 = 33.857
B̂ = (nΣXY – ΣXΣY) / (nΣX² – (ΣX)²)
B̂ = (7(1079) – 28(237)) / (7(140) – 28²)
B̂ = (7553 – 6636) / (980 – 784)
B̂ = 917 / 196
B̂ = 4.67857
a = Ȳ – B̂X̄
a = 33.857 – (4.67857)4
= 33.857 – 18.7143
= 15.1427
The trend equation will be:
Y = 15.1427 + 4.6785x
This trend equation can be used to forecast the future sales of the company. For
example, the sales values for the 10th and 12th weeks can be found by simply
substituting the week number into the trend equation.
i.e. for the 10th week we have:
Y = 15.1427 + 4.6785(10)
Y = 15.1427 + 46.785
Y = 61.9277
Y ≅ 62
For the 12th week
Y = 15.1427 + 4.6785(12)
Y = 15.1427 + 56.142
Y = 71.2847
Y ≅ 71
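The least square trend and the two forecasts can be checked with a few lines of code. This is a minimal sketch in Python; the function names least_squares_trend and forecast are assumptions made for this illustration.

```python
weeks = [1, 2, 3, 4, 5, 6, 7]
sales = [15, 25, 38, 32, 40, 37, 50]

def least_squares_trend(x, y):
    """Return (a, b) of the trend line Y = a + bX using Formula 2."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b

a, b = least_squares_trend(weeks, sales)

def forecast(week):
    # Substitute the week number into the fitted trend equation
    return a + b * week

print(f"Y = {a:.4f} + {b:.4f}x")
print(f"Week 10: {forecast(10):.2f}, week 12: {forecast(12):.2f}")
```

Forecasting beyond the observed weeks assumes that the same linear trend continues to hold.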
Fig. M4.1.6
Example 2
From the time series data below, determine the trend of the sales of a company.
Table M4.1.5
Table showing sales of a company per quarter
          Quarter
Years     1      2      3      4
1982      600    820    400    720
1983      630    840    420    740
1984      670    900    430    760
Prepare a 4-quarter moving average.
Solution
Table M 4.1.6
4-point moving average table of analysis
Year   Quarter   Sales   4-quarter      4-quarter        2-point   Centred moving
                         moving total   moving average   total     average (trend)
1982   1         600     -              -                -         -
       2         820     -              -                -         -
                         2540           635
       3         400                                     1277.5    638.75
                         2570           642.5
       4         720                                     1290      645
                         2590           647.5
1983   1         630                                     1300      650
                         2610           652.5
       2         840                                     1310      655
                         2630           657.5
       3         420                                     1325      662.5
                         2670           667.5
       4         740                                     1350      675
                         2730           682.5
1984   1         670                                     1367.5    683.75
                         2740           685
       2         900                                     1375      687.5
                         2760           690
       3         430     -              -                -         -
       4         760     -              -                -         -
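The columns of the moving average table can be generated programmatically. The sketch below is an illustration in Python; the function name centred_moving_average is an assumption made for this example.

```python
# Quarterly sales for 1982 to 1984, in time order.
sales = [600, 820, 400, 720, 630, 840, 420, 740, 670, 900, 430, 760]

def centred_moving_average(series, period=4):
    """Return (moving totals, moving averages, centred moving averages)."""
    totals = [sum(series[i:i + period]) for i in range(len(series) - period + 1)]
    averages = [t / period for t in totals]
    # With an even period each average falls between two quarters,
    # so adjacent pairs are averaged to centre the trend on a quarter.
    centred = [(averages[i] + averages[i + 1]) / 2 for i in range(len(averages) - 1)]
    return totals, averages, centred

totals, averages, centred = centred_moving_average(sales)
print("Moving totals:  ", totals)
print("Moving averages:", averages)
print("Centred trend:  ", centred)
```

The first centred value belongs to the third quarter of 1982 and the last to the second quarter of 1984, which is why the first two and last two quarters have no trend value.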
Fig. M4.1.7
Example
Table M4.1.8
Semi-Average Method Table of Analysis
Years   Quarter   Y (sales)   X     Semi-total   Semi-average (trend)
1992    1         600         -6
        2         820         -5
        3         400         -4
        4         720         -3    4010         668.33
1993    1         630         -2
        2         840         -1
        3         420         1
        4         740         2
1994    1         670         3
        2         900         4
        3         430         5
        4         760         6     3,920        653.33
The semi-total column gives the totals of the 1st half and the 2nd half of the series.
The semi-average column is arrived at by dividing each semi-total by 6. This represents the trend value;
when plotted on a graph it gives the trend line.
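The semi-average calculation is simple enough to script. The following is a small sketch in Python; the function name semi_averages is an assumption made for this illustration, and it handles only a series with an even number of observations, as in the table above.

```python
sales = [600, 820, 400, 720, 630, 840, 420, 740, 670, 900, 430, 760]

def semi_averages(series):
    """Split the series into two halves and return (total, average) for each half."""
    half = len(series) // 2
    first, second = series[:half], series[half:]
    return (sum(first), sum(first) / half), (sum(second), sum(second) / half)

(first_total, first_avg), (second_total, second_avg) = semi_averages(sales)
print(f"First half:  total = {first_total}, semi-average = {first_avg:.2f}")
print(f"Second half: total = {second_total}, semi-average = {second_avg:.2f}")
```

Plotting the two semi-averages against the midpoints of their halves and joining them gives the trend line.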
Fig. M4.1.8
SELF-ASSESSMENT EXERCISE
State the least square equation of a time series data?
4.0 CONCLUSION
In the course of our discussion on estimation of time series, you have learnt about
- least square method
- moving average method
- semi average method.
5.0 SUMMARY
The least squares trend equation is written as
$Y = a + bx + e$
where
$a = \text{intercept} = \bar{Y} - \hat{b}\,\bar{x}$
$\hat{b} = \text{slope} = \dfrac{n\sum xY - \sum x \sum Y}{n\sum x^{2} - \left(\sum x\right)^{2}}$
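As an illustration of the slope and intercept formulas, the Python sketch below fits a trend line to the quarterly sales used in the moving average example. Coding the quarters as x = 1 to 12 is an assumption made only for this illustration, and the resulting coefficients are not quoted anywhere in this unit.

```python
def least_squares_trend(x, y):
    """Return (a, b) for the trend line Y = a + b*x."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    # Slope: (n*sum(xY) - sum(x)*sum(Y)) / (n*sum(x^2) - (sum(x))^2)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: mean of Y minus slope times mean of x.
    a = sum_y / n - b * sum_x / n
    return a, b

sales = [600, 820, 400, 720, 630, 840, 420, 740, 670, 900, 430, 760]
quarters = list(range(1, len(sales) + 1))  # assumed time coding 1..12

a, b = least_squares_trend(quarters, sales)
print(f"Trend equation: Y = {a:.4f} + {b:.4f}x")
```

A different time coding (for example, one centred on zero) changes the intercept but not the fitted trend values.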
For the moving average method, carry out the following steps:
- compute the n-point moving total
- determine the moving average
- plot the trend values to obtain the trend line
7.0 REFERENCES
- Esan, E.O. and Okafor, R.O. (2010): Basic Statistical Method. Tony Christo
Concept, Lagos.
- Olufolabo, O.O. and Talabi, C.O. (2002): Principles and Practice of Statistics.
HASFEM (NIG) Enterprises, Somolu, Lagos.
- Oyesiku, O.K. and Omitogun, O. (1999): Statistics for Social and Management
Sciences. Higher Education Books Publisher, Lagos.
UNIT THREE: PRICE INDEX
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Price index defined
3.2 Computation of price index
3.3 Worked example
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/ Further Readings
1.0 INTRODUCTION
In introductory statistics a great deal of weight is given to the average, yet an average
is not necessarily representative of the data it describes. Statisticians have
constructed a device that attempts to measure the magnitude of economic change over
time: the index number. This device is also used for international comparison of
economic data. In this unit we shall look at index numbers and examine the basic
principles by which they are constructed.
2.0 OBJECTIVES
At the end of this unit, you should be able to:
- define an index number
- calculate index numbers using different methods
- state the uses of index numbers
- explain the relevance of index numbers
3.0 MAIN CONTENT
3.1 Index Number Defined
In the statistical analysis of one very large and important class of problems, we must
combine different sets of data into a single measure. For example, we may wish to study
the behaviour of wholesale prices; to do this, we calculate an index number that
describes the changes not in the various individual prices in which we are interested but
in the group of prices taken as a whole.
The relevance of this statistical device is shown by the fact that governmental and other
agencies devote very substantial amounts of money every year to collecting the
appropriate data and performing the calculations necessary for the construction of index
numbers. The most widely known of these measures is the consumer price index, or cost of
living index.
In general, index numbers are used in the study of prices (wholesale, retail, farm, export,
etc.) and output (manufacturing, mining). The purpose of such measures is to give a
summary of a whole range of similar activities, so that problems can be investigated on a
relatively broad basis.
SELF-ASSESSMENT EXERCISE
What is the basis for an index number?
$\text{SPI} = \dfrac{\sum P_n}{\sum P_o} \times 100$
where $P_n$ refers to the price in the current year and $P_o$ represents the price in the
base period or reference period.
$\text{SPI} = \dfrac{1911}{665} \times 100$
= 287.37
This implies that prices in 1990 were about 287.4% of their 1986 level, that is, a rise
of about 187.4% between 1986 and 1990.
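The simple price index calculation can be checked with the Python sketch below. The totals 1911 and 665 are the ones quoted in the worked example; the function name simple_price_index is an illustrative choice.

```python
def simple_price_index(current_total, base_total):
    """Simple price index: current-period total over base-period total, times 100."""
    return current_total / base_total * 100

spi = simple_price_index(1911, 665)
print(f"SPI = {spi:.2f}")
# An index of 100 means no change, so the percentage change is the index minus 100.
print(f"Percentage change from the base period = {spi - 100:.2f}%")
```

Subtracting 100 from the index separates the level of prices relative to the base period from the percentage change since the base period.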
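$\text{LPI} = \dfrac{\sum P_n Q_o}{\sum P_o Q_o} \times 100$
$\text{LPI} = \dfrac{75{,}165}{23{,}850} \times 100$
LPI = 315.157
$\text{LQI} = \dfrac{\sum P_o Q_n}{\sum P_o Q_o} \times 100$
$\text{LQI} = \dfrac{64{,}690}{23{,}850} \times 100$
LQI = 271.2369
≅ 271%
where LPI = Laspeyres Price Index
LQI = Laspeyres Quantity Index
Using the Laspeyres price index, prices in 1990 were about 315.16% of their 1986 level,
that is, a rise of about 215.16% between 1986 and 1990.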
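Paasche method
$\text{PPI} = \dfrac{\sum P_n Q_n}{\sum P_o Q_n} \times 100$
$\text{PPI} = \dfrac{208{,}803}{64{,}690} \times 100$
PPI = 322.77
$\text{PQI} = \dfrac{\sum P_n Q_n}{\sum P_n Q_o} \times 100$
$\text{PQI} = \dfrac{208{,}803}{75{,}165} \times 100$
PQI = 277.79
PQI ≅ 278%
where PPI = Paasche Price Index
PQI = Paasche Quantity Index
Using the Paasche price index, prices in 1990 were about 322.8% of their 1986 level,
that is, a rise of about 222.8% between 1986 and 1990.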
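$\text{M.E. Price Index} = \dfrac{\sum P_n (Q_o + Q_n)}{\sum P_o (Q_o + Q_n)} \times 100$
$\text{MEPI} = \dfrac{75{,}165 + 208{,}803}{23{,}850 + 64{,}690} \times 100 = \dfrac{283{,}968}{88{,}540} \times 100$
= 320.722
≅ 320.7%
$\text{M.E. Quantity Index} = \dfrac{\sum Q_n (P_o + P_n)}{\sum Q_o (P_o + P_n)} \times 100$
$\text{MEQI} = \dfrac{273{,}493}{99{,}015} \times 100$
MEQI = 276.213
MEQI ≅ 276%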
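All the weighted indices in this worked example depend on only four column totals, so they can be cross-checked together. The Python sketch below uses the totals quoted in the calculations above. The Fisher figure is computed here as the geometric mean of the Laspeyres and Paasche price indices, which is the standard definition; that value is not quoted in the worked example, and the variable names are illustrative.

```python
import math

# Column totals quoted in the worked example (o = base year, n = current year).
sum_po_qo = 23_850
sum_pn_qo = 75_165
sum_po_qn = 64_690
sum_pn_qn = 208_803

laspeyres_price = sum_pn_qo / sum_po_qo * 100
laspeyres_quantity = sum_po_qn / sum_po_qo * 100
paasche_price = sum_pn_qn / sum_po_qn * 100
paasche_quantity = sum_pn_qn / sum_pn_qo * 100

# Fisher's ideal price index: geometric mean of Laspeyres and Paasche.
fisher_price = math.sqrt(laspeyres_price * paasche_price)

# Marshall-Edgeworth: weights are the sum of base and current period values.
me_price = (sum_pn_qo + sum_pn_qn) / (sum_po_qo + sum_po_qn) * 100
me_quantity = (sum_po_qn + sum_pn_qn) / (sum_po_qo + sum_pn_qo) * 100

results = {
    "Laspeyres price index": laspeyres_price,
    "Laspeyres quantity index": laspeyres_quantity,
    "Paasche price index": paasche_price,
    "Paasche quantity index": paasche_quantity,
    "Fisher ideal price index": fisher_price,
    "Marshall-Edgeworth price index": me_price,
    "Marshall-Edgeworth quantity index": me_quantity,
}
for name, value in results.items():
    print(f"{name}: {value:.2f}")
```

The Fisher and Marshall-Edgeworth price indices both fall between the Laspeyres and Paasche values, as expected for indices that combine base-period and current-period weights.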
SELF-ASSESSMENT EXERCISE
State Fisher's ideal formula for the price index.
4.0 CONCLUSION
In the course of our discussion in this unit, you have learnt about:
- the definition of a price index
- the calculation of:
  - the simple price index
  - weighted price indices, namely:
    - the Laspeyres index number
    - the Paasche index number
    - Fisher's ideal index number
    - the Marshall-Edgeworth index number
5.0 SUMMARY
Below is a summary of the price indices discussed in this unit.
Price relative = $\dfrac{P_n}{P_o} \times 100$
Simple price index = $\dfrac{\sum P_n}{\sum P_o} \times 100$
Laspeyres index = $\dfrac{\sum P_n Q_o}{\sum P_o Q_o} \times 100$
Paasche index = $\dfrac{\sum P_n Q_n}{\sum P_o Q_n} \times 100$
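The summary formulas can be written as small Python functions that work from item-level prices and quantities. The three-commodity data set below is hypothetical and is used only to show how the functions are called; it is not taken from this unit.

```python
def price_relative(p_n, p_o):
    """Price relative for a single commodity."""
    return p_n / p_o * 100

def simple_price_index(p_n, p_o):
    """Sum of current prices over sum of base prices, times 100."""
    return sum(p_n) / sum(p_o) * 100

def laspeyres_price_index(p_n, p_o, q_o):
    """Current and base prices both weighted by base-period quantities."""
    return sum(p * q for p, q in zip(p_n, q_o)) / sum(p * q for p, q in zip(p_o, q_o)) * 100

def paasche_price_index(p_n, p_o, q_n):
    """Current and base prices both weighted by current-period quantities."""
    return sum(p * q for p, q in zip(p_n, q_n)) / sum(p * q for p, q in zip(p_o, q_n)) * 100

# Hypothetical prices and quantities for three commodities (illustration only).
base_prices = [10, 20, 5]
current_prices = [15, 30, 10]
base_quantities = [4, 2, 10]
current_quantities = [5, 3, 8]

print(f"Price relative, item 1: {price_relative(current_prices[0], base_prices[0]):.2f}")
print(f"Simple price index:     {simple_price_index(current_prices, base_prices):.2f}")
print(f"Laspeyres price index:  {laspeyres_price_index(current_prices, base_prices, base_quantities):.2f}")
print(f"Paasche price index:    {paasche_price_index(current_prices, base_prices, current_quantities):.2f}")
```

The only difference between the Laspeyres and Paasche functions is the set of quantities used as weights, which mirrors the difference between the two formulas in the summary.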
7.0 REFERENCES
- Adedayo, O.A. (2006): Understanding statistics. JAS Publishers, Lagos.
- Edward, E.L. (1983): Statistical Analysis in Economic and Business. (2nd edition)
Houghton Mifflin Company, Boston.
- Esan, E.O. and Okafor, R.O. (2010): Basic statistical methods (Revised Edition)
Toniichristo Concept, Lagos.
- Olufolabo, O.O. and Talabi, C.O. (2002): Principle and practice of statistics.
HASFEM (NIG) Enterprises, Lagos.
- Oyesiku, O.K. and Omitogun, O. (1999): Statistics for social and management
sciences. (2nd edition) HEBP, Lagos.