Introduction To Quantitative Methods
Uses of Statistics
a) To present the data in a concise and definite form: Statistics helps in classifying and tabulating
raw data for processing and further tabulation for end users.
b) To make it easy to understand complex and large data: This is done by presenting the data in
the form of tables, graphs, diagrams etc., or by condensing the data with the help of means, dispersion
etc.
c) For comparison: Tables, measures of means and dispersion can help in comparing different
sets of data.
d) In forming policies: It helps in forming policies related to the education environment.
e) Enlarging individual experiences: Complex problems can be well understood through statistics, as
the conclusions drawn from data are more definite and precise than mere statements of fact.
f) In measuring the magnitude of a phenomenon (occurrence): Statistics has made it possible to
count the population of a country and to measure industrial growth, agricultural growth and the educational level
(all expressed in numbers).
Definitions:
• Statistics is the art and science of collecting, analyzing, presenting and interpreting data. It is a
branch of mathematics that transforms numbers into useful information for decision
makers, and it refers to methods that help reduce the uncertainty inherent in decision
making.
• Data are the facts and figures that are collected, summarized, analyzed, and interpreted.
• Raw data refers to unprocessed or unorganized data.
• Data can be broadly classified as being qualitative or quantitative.
• Quantitative data indicate either how many or how much. They are countable or numerical.
– Quantitative data that measure how many are discrete, i.e. they take specific values, e.g.
whole numbers.
[email protected] Page 1
– Quantitative data that measure how much are continuous because there is no
separation between the possible values for the data i.e. can take any value including
fractions.
• Qualitative data are labels or names used to identify an attribute of each element. They are
non-numerical and therefore not countable.
Qualitative data use either the nominal or ordinal scale of measurement.
The statistical analyses for qualitative data are rather limited.
The statistical analysis that is appropriate depends on whether the data for the variable are
qualitative or quantitative.
• Attribute: A characteristic of an elementary unit that can only be observed as to its presence or
absence.
• Variable: An observable quantitative characteristic of an elementary unit that varies from unit to
unit.
• Discrete Variable: A variable whose values are restricted to integer values only i.e. takes whole
numbers e.g. no. of students
• Continuous Variable: A variable that can assume any value within some interval i.e. can take
even fractions e.g. height or size of a building, measurements, weights, age
• Population: The entire set of possible observations that may be made in the universe.
• Sample: Any portion drawn from a population. Generally a sample consists of fewer elementary
units or observations than the population contains. Thus a sample is a subset of a population.
• Elementary unit: The physical entity on which an observation is made.
• Survey: A planned and systematic process of collecting statistical data.
• Census: A survey in which observations are made on every elementary unit of the whole
population.
• Sample survey: A survey in which observations are made on a sample of elementary units drawn
from the population.
Data
– Qualitative (Categorical)
– Quantitative (Numerical)
   – Discrete (takes only whole Nos.)
   – Continuous (takes whole Nos. & fractions)
Classification of Statistics
Broadly classified into two categories:
i. Descriptive statistics: Refers to the collection, analysis and synthesis of data in order to come up
with a better description of the situation. It is the branch of statistics concerned with
collecting, describing and summarizing a set of data so as to derive meaningful information. It
involves classification of data, presentation of data in tabular form, graphs and charts, and
calculation of averages.
ii. Inferential statistics: Divided into two:
a) Inductive statistics: Is concerned with the development of scientific criteria so that
values of a group may be meaningfully estimated by examining only a small portion of
that group. The whole group is known as the population or universe, while the portion is
known as the sample. Values in the sample are known as statistics and values in the
population are known as parameters. Thus, inductive statistics is concerned with
estimating universe parameters from the sample statistics.
A sample is chosen instead of considering the whole population because of:
Time limit: A census survey based on the entire universe requires a lot of time,
which might not be available.
Costs: A sample survey is much cheaper than a census survey.
Volatility: Since a census survey is time consuming, the findings
may no longer be relevant by the time the research is finished.
b) Deductive statistics: Is concerned with establishing laws and procedures for
choosing one course of action from alternative courses of action under situations of
uncertainty. Since deductive statistics uses probability theory, it provides a rational basis
for dealing with situations influenced by chance-related factors.
Types of Statistics
– Descriptive statistics
– Inferential statistics
   – Inductive inferential statistics
   – Deductive inferential statistics
Scales of measurement
Nominal Scale
Nominal measurement consists of assigning items to groups or categories. No quantitative information
is conveyed and no ordering of the items is implied. Nominal scales are therefore qualitative rather than
quantitative e.g. Religious preference, race, and gender. Variables measured on a nominal scale are
often referred to as categorical or qualitative variables.
Ordinal Scale
Measurements with ordinal scales are ordered in the sense that higher numbers represent higher
values. However, the intervals between the numbers are not necessarily equal. For example, consider a
five-point rating scale measuring attitudes towards whether the quality of education offered in M.K.U. is of
standard. The ratings on the scale could be: I strongly agree, I agree, I am neutral, I disagree or I strongly
disagree. The difference between a rating of 2 and a rating of 3 may not represent the same difference
as the difference between a rating of 4 and a rating of 5. There is no "true" zero point for ordinal scales
since the zero point is chosen arbitrarily.
Interval Scale
On interval measurement scales, one unit on the scale represents the same magnitude on the trait or
characteristic being measured across the whole range of the scale. Interval scales do not have a "true"
zero point, however, and therefore it is not possible to make statements about how many times higher
one score is than another.
Ratio Scale
Ratio scales are like interval scales except they have true zero points.
1.2 Organizing and Presenting Data
• It’s hard to interpret raw data in its original form. Hence it is always important to organize the
data in a systematic way.
• Organizing data refers to arranging data:
according to similarity or resemblance
according to the order of importance
in the descending or ascending order
• The purpose of organizing data:
To make the data easily understandable
To make comparisons and draw meaningful conclusions easily
To eliminate unnecessary data
Statistical Series: Refers to different ways of arranging data.
a. Time Series: This is arranging data according to when they occur. This can be in terms of
hrs, days, months or years
b. Spatial Series: This is arranging data according to their geographical characteristics.
c. Conditional Series: This is arranging data according to their specific characteristics. E.g
male or female.
NOTE: Refer to class exercises for different methods of presenting data.
$$\text{Mean } (\bar{x}) = \frac{\sum x}{n}$$
Where x = values of items
∑ = summation
n = no. of observations or items
Example
The mean of 60, 80, 90, 120:
$$\bar{x} = \frac{60 + 80 + 90 + 120}{4} = \frac{350}{4} = 87.5$$
The arithmetic mean is very useful because it represents the values of most observations in the
population.
The mean therefore describes the population quite well in terms of the values attained by most of the
members of the population.
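The calculation above can be checked with a few lines of Python. This is a minimal sketch; the function name `arithmetic_mean` is chosen here for illustration and is not part of the notes.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("need at least one observation")
    return sum(values) / len(values)


# The four observations from the worked example.
observations = [60, 80, 90, 120]
print(arithmetic_mean(observations))  # 87.5
```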
Note: Refer to class exercises for further understanding.
Calculating Arithmetic Mean for grouped data
$$\bar{x} = \frac{\sum fx}{\sum f}$$
Where f = frequency and x = class mid point
The following statistical terms are commonly used in statistical calculations. They must therefore be
clearly understood.
i) Class limits
These are numerical values which give a lower and upper limit for any given class i.e. all the
observations in a given class are expected to fall within the interval which is bounded by the class limits.
ii) Class boundaries
These are statistical boundaries, which separate one class from the other. They are usually determined
by adding the upper class limit to the next lower class limit and dividing by 2, e.g. in the above table the
class boundary between 19 and 20 is $\frac{19 + 20}{2} = 19.5$.
iii) Class mid points
These are very important values which mark the center of a given class. They are obtained by adding
together the two limits of a given class and dividing the result by 2.
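The two rules above (boundaries and mid points) can be sketched in Python. The classes 10–19, 20–29 and 30–39 are hypothetical values chosen only to reproduce the 19.5 boundary mentioned in the text; they are not taken from the original table.

```python
# Hypothetical classes given as (lower limit, upper limit).
classes = [(10, 19), (20, 29), (30, 39)]


def class_boundaries(classes):
    """Boundary between consecutive classes:
    (upper limit of one class + lower limit of the next) / 2."""
    return [
        (upper + next_lower) / 2
        for (_, upper), (next_lower, _) in zip(classes, classes[1:])
    ]


def class_midpoints(classes):
    """Mid point of each class: (lower limit + upper limit) / 2."""
    return [(lower + upper) / 2 for lower, upper in classes]


print(class_boundaries(classes))  # [19.5, 29.5]
print(class_midpoints(classes))   # [14.5, 24.5, 34.5]
```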
Example
In a social survey in which the main purpose was to establish the intelligence quotient (IQ) of residents in
a given area, the following results were obtained as tabulated below:
IQ            No. of residents
1 – 20        6
20 – 40       18
40 – 60       32
60 – 80       48
80 – 100      27
100 – 120     13
120 – 140     2
Required
Calculate the modal value of the IQ’s tabulated above using the formula method and by graphical
method.
Formula. First identify the modal class, i.e. the class with the highest frequency. Then use the following
formula.
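The formula itself is missing from this copy of the notes. The sketch below assumes the usual interpolation formula for the mode of grouped data, Mode = L + (f1 − f0) / (2·f1 − f0 − f2) × c, where L is the lower limit of the modal class, f1 its frequency, f0 and f2 the frequencies of the classes before and after it, and c the class width. The function name and argument layout are illustrative assumptions.

```python
def grouped_mode(lower_limits, frequencies, width):
    """Mode of grouped data by interpolation within the modal class
    (assumed standard formula; not confirmed by the source)."""
    # Index of the modal class, i.e. the class with the highest frequency.
    i = max(range(len(frequencies)), key=frequencies.__getitem__)
    f1 = frequencies[i]
    f0 = frequencies[i - 1] if i > 0 else 0
    f2 = frequencies[i + 1] if i + 1 < len(frequencies) else 0
    return lower_limits[i] + (f1 - f0) / (2 * f1 - f0 - f2) * width


# Lower class limits and frequencies as printed in the IQ table.
lower_limits = [1, 20, 40, 60, 80, 100, 120]
frequencies = [6, 18, 32, 48, 27, 13, 2]
mode = grouped_mode(lower_limits, frequencies, width=20)
print(round(mode, 2))
```

Under these assumptions the modal class is 60 – 80 and the mode is 60 + (16 / 37) × 20.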
Graphical method
[Figure: graph of the frequency distribution, with the frequency axis running from 0 to 50; not reproduced.]
Weighted mean
The mode
Merits
i. It can be determined from incomplete data provided the observations with the highest
frequency are already known
ii. The mode has several applications in business, e.g. stocking the best-selling good.
iii. The mode can be easily defined
iv. It can be determined easily from a graph
Demerits
i. If the data is quite large and ungrouped, determination of the mode can be quite
cumbersome
ii. Use of the formula to calculate the mode is unfamiliar to most business people
iii. The mode may sometimes be non-existent or there may be two modes for a given set of
data. In such a case therefore a single mode may not exist
The median
Merits
i. It shows the centre of a given set of data
ii. Knowledge of the determination of the median may be extended to determine the quartiles
iii. The median can easily be defined
iv. It can be obtained easily from the cumulative frequency curve
v. It can be used in determining the degree of skewness
Demerits
i. In some situations where the no. of observations is even, the value of the median obtained
is not an actual observation (it is the average of the two middle values)
ii. The computation of the median using the formulas is not well understood by most people.
iii. In the education environment the median has got very few applications
Measures of Dispersion
- Also known as Measures of variations or variability.
- They measure how much data spread out around a central measure.
- The measures of dispersion are very useful in statistical work because they indicate whether the
rest of the data are scattered around the mean or away from the mean.
- If the data are closely clustered around the mean, then the measure of dispersion
obtained will be small, indicating that the mean is a good representative of the sample
data. On the other hand, if the figures are not located close to the mean, then the
measure of dispersion obtained will be relatively big, indicating that the mean does not
represent the data sufficiently.
- The measures of dispersion are expressed in two ways:
i. Absolute measure: This is when the measures are expressed using the same units of
measure as the original data.
ii. Relative measure: This is when the measures are expressed as a fraction or percentage.
These are also known as coefficients of dispersion.
- The main disadvantage of the range is that it only uses 2 values (the highest and the lowest) in a given
series. However, the smaller the value of the range, the less dispersed the observations are from
the arithmetic mean, and vice versa.
- The main disadvantage of the quartile deviation is that it uses only two values (Q1 and Q3), ignoring
other values.
Note: Refer to class exercises for further understanding.
c) The mean deviation
- This is a deviation taken from the mean, median or mode. The deviations are taken as positive
values. This measure of dispersion makes use of all the values given.
Finding Mean deviation for ungrouped data
$$\text{Mean deviation} = \frac{\sum |x - \bar{x}|}{n}$$
E.g.1
In a given exam the scores for 10 students were as follows
Student   Mark (x)   |x − x̄|
A         60         1.8
B         45         16.8
C         75         13.2
D         70         8.2
E         65         3.2
F         40         21.8
G         69         7.2
H         64         2.2
I         50         11.8
J         80         18.2
Total     618        104.4
Required: Determine the absolute mean deviation
$$\text{Mean, } \bar{x} = \frac{618}{10} = 61.8$$
$$\text{Therefore AMD} = \frac{\sum |x - \bar{x}|}{n} = \frac{104.4}{10} = 10.44$$
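The same result can be reproduced in Python. A minimal sketch, with the function name `mean_absolute_deviation` chosen here for illustration:

```python
def mean_absolute_deviation(values):
    """Average of the absolute deviations of the values from their mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


# Marks of students A to J from the table above.
marks = [60, 45, 75, 70, 65, 40, 69, 64, 50, 80]
amd = mean_absolute_deviation(marks)
print(round(amd, 2))  # 10.44
```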
E.g. 2
The following data was obtained from a given financial institution. The data refers to the loans given out
in 2013 to several students
Required
Calculate the mean deviation for the amount of loan given
NB: If the absolute mean deviation is relatively small, it implies that the data are more compact and
therefore the arithmetic mean is a fair representative of the sample.
The main disadvantage of the mean deviation is that it treats all the deviations as positive values even when
some are negative.
d) The standard deviation
- This is one of the most accurate measures of dispersion. It has the following advantages;
i. It utilizes all the values given
ii. It makes use of both negative and positive values if they occur
iii. The standard deviation reflects an accurate impression of how much the sample data varies
from the mean.
Standard deviation for ungrouped data
$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$
Variance
The square of the standard deviation is called the variance.
Example
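A sample comprises the following observations: 14, 18, 17, 16, 25, 31
Determine the standard deviation of this sample
x       x − x̄     (x − x̄)²
14      -6.1      37.21
18      -2.1      4.41
17      -3.1      9.61
16      -4.1      16.81
25      4.9       24.01
31      10.9      118.81
121               210.86
$$\bar{x} = \frac{121}{6} \approx 20.1 \text{ (to one decimal place, as used in the table)}$$
$$\text{Standard deviation} = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} = \sqrt{\frac{210.86}{6}} = 5.93$$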
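As a cross-check, the standard deviation of the six observations can be computed directly in Python. This sketch divides by n, matching the formula used in the example, and uses the unrounded mean, so intermediate sums differ slightly from the table while the final answer agrees to two decimal places. The function name is illustrative.

```python
import math


def population_std_dev(values):
    """Square root of the mean squared deviation from the mean (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return math.sqrt(variance)


sample = [14, 18, 17, 16, 25, 31]
sd = population_std_dev(sample)
print(round(sd, 2))  # 5.93
```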
Calculate the standard deviation from the above table, showing how the hourly payments varied
from the respective mean.
∴ Standard deviation = 96.29
Example 3.1
The quality controller in a given firm had an accurate record of all the iron bars produced in May 1997.
The following data shows those records
Bar lengths (cm)   No. of bars (f)   Class mid point (x)   fx        f(x − x̄)²
201 – 250          25                225.5                 5637.5    596756.3
∴ Standard deviation,
=
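TOPIC TWO: INTRODUCTION TO PROBABILITY
Probability is the chance that something will happen.
It is the ratio of the number of favorable cases to the total number of equally likely cases, i.e.
$$P(E) = \frac{\text{number of favorable cases}}{\text{total number of equally likely cases}}$$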
Approaches to probability
ii. The relative frequency/empirical approach: This is a statistical approach to probability through
use of observation
iii. The personalistic approach: Also known as subjective approach. The probability here depends
on the individual’s beliefs, opinion, feelings and is based on one’s own experience.
iv. The axiomatic approach: Axiom means rule or law or norm. In this approach, there are three
basic rules:
a) Rule one: The probability of any event is a non-negative real number. Hence the smallest
probability is zero.
c) Rule three: For mutually exclusive events, P(E1 or E2) = P(E1) + P(E2). When the mutually
exclusive events are also collectively exhaustive, this rule gives their total probability as one.
Probability Line
0 ½ 1
E.g. of an impossible event: Getting a 7 when you throw a die. P(7 in a throw of a die) = 0
E.g. of a sure event: The rising of the sun tomorrow. Probability here = 1
Sample space:
Examples:
Tossing a coin: You can get either a head or a tail, so the sample space has 2 outcomes: {Head, Tail}
Sample Point
Examples:
Drawing a card from a deck of cards: A Queen of hearts is one of the 52 sample points.
Event
It can include one or more sample points, e.g. getting an even number after throwing a die. The
event here is an even number. There are three even numbers (2, 4, 6), hence three sample points.
Types of Events
ii. Compound Events: When two or more events occur in connection with each other. A compound event is an
aggregate of simple events.
iii. Mutually Exclusive Events: When you cannot get both events at the same time; either one or
the other but not both. Mutually exclusive events are those events which cannot happen at the
same time. E.g. getting a head and a tail at the same time when tossing a coin.
iv. Collectively Exhaustive events: Events which include all possible outcomes of an experiment.
v. Complementary Events: All outcomes that are not the event being considered. E.g. head is the
complement of tail (in tossing a coin).
Complementary law: States that the sum of the probability of an event and the
probability of its complement equals one, i.e. $P(A) + P(A^c) = 1$
vi. Equally likely Events: When one event does not occur more often than others i.e. each event of
an experiment has an equal chance of happening, just like any other.
vii. Independent Event: An event is independent if the occurrence of the event is not affected by
any other event. E.g. getting a head does not depend on getting a tail.
viii. Dependent Event: Two events are dependent if the outcome or occurrence of the first affects
the outcome or occurrence of the second. E.g. drawing a card from a deck without replacement.
Conditional probability is based on dependent events.
Probability Laws
The addition rule is used to calculate the probability of two or more mutually exclusive events. In such a case the
probabilities of the separate events must be added.
E.g. For events A and B that are mutually exclusive: P(A or B) = P(A) + P(B)
If there is any intersection between the two events, the probability of the intersection should be subtracted,
e.g. for events A and B that are not mutually exclusive: P(A or B) = P(A) + P(B) − P(A ∩ B) (∩ is the symbol
for intersection; ∪ is the symbol for union)
The multiplication rule applies to independent events, e.g. for events A and B that are independent: P(A and B) = P(A) × P(B)
Two events A and B are independent if the occurrence or non-occurrence of event A does
not affect the occurrence or non-occurrence of event B.
Conditional Probability
Conditional probability of an event B in relationship to an event A is the probability that event B occurs
given that event A has already occurred. It is written as P (B/A) (read as probability of B given A)
$$P(B/A) = \frac{P(A \cap B)}{P(A)}$$
Examples
1. A math teacher gives her class two tests. 25% of the class passed both tests and 42% of the class passed
the first test. What percentage of those who passed the first test also passed the second test?
2. In a certain company there are 550 employees. 380 employees have at least a college
education and 412 have attended a vocational training programme. 357 have both at least a
college education and vocational training. What is the probability of randomly choosing an
employee who has at least a college education or vocational training or both?
3. A consulting firm is bidding for two jobs. The probability of getting the job with firm A is 0.45, while the probability of getting the job with firm B, given
that it gets the job with firm A, is 0.9. What chance does the firm have of getting both jobs?
P (A ∩ B) = P (B/A) x P (A)
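The three examples above are left unsolved in this copy. The sketch below applies the conditional probability formula, the addition rule for events that are not mutually exclusive, and the multiplication rule P(A ∩ B) = P(B/A) × P(A). The helper names are assumptions made for illustration, and example 1 is read as asking for P(second test passed, given first test passed).

```python
def conditional(p_a_and_b, p_a):
    """P(B/A) = P(A and B) / P(A)."""
    return p_a_and_b / p_a


def union(p_a, p_b, p_a_and_b):
    """P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


def joint(p_b_given_a, p_a):
    """P(A and B) = P(B/A) * P(A)."""
    return p_b_given_a * p_a


# Example 1: 25% passed both tests, 42% passed the first test.
p_second_given_first = conditional(0.25, 0.42)

# Example 2: 550 employees; 380 college, 412 vocational, 357 both.
p_college_or_vocational = union(380 / 550, 412 / 550, 357 / 550)

# Example 3: P(A) = 0.45 and P(B/A) = 0.9.
p_both_jobs = joint(0.9, 0.45)

print(round(p_second_given_first, 4))
print(round(p_college_or_vocational, 4))
print(round(p_both_jobs, 3))
```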
Types of Correlations
Variables may be related:
i. Perfectly related: A change in the independent variable causes a proportionate change in the
dependent variable. Here we can have perfect +ve correlation (where an increase in the
independent variable causes a proportionate increase in the dependent variable) or perfect –ve correlation (where
an increase in the independent variable causes a proportionate decrease in the dependent variable).
ii. Partly correlated
iii. Uncorrelated (where no relationship exists)
[Figures: scatter diagrams illustrating the types of correlation, including a panel titled "No correlation"; not reproduced.]
Spurious Correlations
- In some rare situations, when plotting the data for x and y we may obtain a graph showing either
positive or negative correlation, but when the data for x and y are analyzed in real life
there may be no convincing evidence that such a relationship exists. This implies
that the relationship exists only in the numbers, and hence it is referred to as spurious or nonsense
correlation, e.g. when high pass rates of students show a high correlation with increased accidents.
Note: Refer to class exercise for further understanding.
Correlation coefficient
- These are numerical measures of the correlations existing between the dependent and the
independent variables
- These are better measures of correlation than scatter diagram.
- The range for correlation coefficients lies between +1 and –1. A correlation coefficient of
+1 implies that there is perfect positive correlation. A value of –1 shows that there is perfect
negative correlation. A value of 0 implies no linear correlation at all.
3.2 KARL PEARSON’S COEFFICIENT OF CORRELATION (r)
(Product moment correlation)
Example
The following data was obtained during a social survey conducted in a given urban area regarding the
annual income of given families and the corresponding expenditures.
Required
Calculate the product moment correlation coefficient and briefly comment on the value obtained.
The product moment correlation coefficient is
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$
= 0.89
Comment: The value obtained 0.89 suggests that the correlation between annual income and annual
expenditure is high and positive. This implies that the more one earns the more one spends.
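The income and expenditure table is not reproduced in this copy, so the 0.89 above cannot be recomputed here. The sketch below implements the product moment formula and runs it on small hypothetical data chosen purely for illustration; the function name `pearson_r` is also an assumption.

```python
import math


def pearson_r(xs, ys):
    """Product moment correlation coefficient from the raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


# Hypothetical data, not the survey data from the example.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))
```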
NOTE:
A high value of r (close to +0.9 or –0.9) only shows a strong association between the two variables but
doesn’t imply that there is a causal relationship, i.e. that a change in one variable causes a change in the
other. It is possible to find two variables which produce a high calculated r yet don’t have a
causal relationship. This is known as spurious or nonsense correlation, e.g. high pass rates in M.K.U.
and increased inflation in Asian countries.
Also note that a low correlation coefficient doesn’t imply lack of relation between the variables but lack
of a linear relationship between the variables, i.e. there could exist a curvilinear relation.
A further problem in interpretation arises from the fact that the r value here measures the
relationship between a single independent variable and dependent variable, whereas a particular
variable may be dependent on several independent variables (e.g. goods demanded may depend on
price, customer’s preferences/tastes, price of related goods, substitutes etc.) in which case multiple
correlation should be used instead.
Example
A group of 8 students are tested in Psychology and Statistics. Their rankings in the two tests were.
Student   Psychology ranking   Statistics ranking   d     d²
A         2                    3                    -1    1
B         7                    6                    1     1
C         6                    4                    2     4
D         1                    2                    -1    1
E         4                    5                    -1    1
F         3                    1                    2     4
G         5                    8                    -3    9
H         8                    7                    1     1
                                                    ∑d² = 22
d = Psychology ranking – Statistics ranking
$$R = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 22}{8(8^2 - 1)} = 0.74$$
Thus we conclude that there is a reasonable agreement between student’s performances in the two
types of tests.
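The rank correlation in this example can be reproduced in Python. A minimal sketch using the rankings from the table; the function name `rank_correlation` is illustrative, and it covers only the case without tied ranks.

```python
def rank_correlation(ranks_a, ranks_b):
    """R = 1 - 6 * sum(d^2) / (n * (n^2 - 1)) for untied rankings."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Rankings of students A to H.
psychology = [2, 7, 6, 1, 4, 3, 5, 8]
statistics_rank = [3, 6, 4, 2, 5, 1, 8, 7]
R = rank_correlation(psychology, statistics_rank)
print(round(R, 2))  # 0.74
```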
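$$R = 1 - \frac{6\left(\sum d^2 + \frac{t^3 - t}{12}\right)}{n(n^2 - 1)}$$
Where t = the number of rankings that share a tied place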
Example
Assume that in our previous example student E & F achieved equal marks in Psychology and were given
joint 3rd place.
Solution
Student   Psychology ranking   Statistics ranking   d      d²
A         2                    3                    -1     1
B         7                    6                    1      1
C         6                    4                    2      4
D         1                    2                    -1     1
E         3½                   5                    -1½    2¼
F         3½                   1                    2½     6¼
G         5                    8                    -3     9
H         8                    7                    1      1
                                                    ∑d² = 25½
$$R = 1 - \frac{6\left(\sum d^2 + \frac{t^3 - t}{12}\right)}{n(n^2 - 1)} = 1 - \frac{6\left(25.5 + \frac{2^3 - 2}{12}\right)}{8(8^2 - 1)} = 1 - \frac{6 \times 26}{504} = 0.69$$
NOTE: It is conventional to show the shared rankings as above, i.e. E, & F take up the 3rd and 4th rank
which are shared between the two as 3½ each.
3.4 REGRESSION
- This is a concept which refers to the changes that occur in the dependent variable as a result
of changes occurring in the independent variable.
- Knowledge of regression is particularly useful in business statistics, where it is necessary to
consider the corresponding changes in dependent variables whenever independent variables
change.
- It should be noted that most business activities involve a dependent variable and one or
more independent variables. Therefore knowledge of regression will enable a business
statistician to predict or estimate the expected value of a dependent variable when given an
independent variable, e.g. consider annual incomes and annual expenditures.
Using regression techniques one can determine the estimated expenditure of a
given family if the annual income is known, and vice versa.
- The general equation used in simple regression analysis is as follows:
ŷ = a + bx
Where ŷ = Dependent variable (estimated value)
a = Intercept on the y axis (constant)
b = Slope of the regression line
x = Independent variable
The determination of a regression equation such as the one given above is normally done by using a technique
known as “the method of least squares”.
Regression equation of y on x i.e. y = a + bx
[Figure: regression line of y on x; not reproduced.]
The following set of equations, known as the normal equations, is used to determine the equation
of the above regression line when given a set of data.
Σy = an + bΣx
Σxy = aΣx + bΣx²
Where Σy = sum of the y values
Σxy = sum of the products of x and y
Σx = sum of the x values
Σx² = sum of the squares of the x values
n = number of pairs of observations
a = the intercept on the y axis
b = slope (gradient) of the regression line of y on x
Example
An investment company advertised the sale of pieces of land at different prices. The following table
shows the pieces of land, their acreage and costs
Required
Determine the regression equation of
i. y on x and hence estimate the cost of a piece of land with 4.5 hectares
ii. Estimate the expected acreage if the piece of land costs £ 900,000
Σy = an + bΣx
Σxy = aΣx + bΣx²
$$\text{The y-intercept } a = \frac{\sum y - b\sum x}{n}$$
$$\text{Slope } b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$$
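The land price table is not reproduced in this copy, so the example cannot be completed from the source. The sketch below solves the normal equations for a and b using the two formulas above, on hypothetical data chosen for illustration; the function name `least_squares_line` is also an assumption.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the regression line y = a + b*x of y on x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


# Hypothetical data lying exactly on y = 1 + 2x.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
a, b = least_squares_line(x, y)
print(a, b)          # 1.0 2.0
print(a + b * 4.5)   # estimate of y at x = 4.5 -> 10.0
```

To estimate x from a known y (as in part ii of the example), one option is to fit the regression of x on y by swapping the arguments, `least_squares_line(y, x)`.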
This refers to the ratio of the explained variation to the total variation and is used to measure the
strength of the linear relationship. The stronger the linear relationship the closer the ratio will be to one.
2. The line of symmetry divides the curve into two equal halves
3. The two ends of the normal distribution curve continuously approach the horizontal axis
but they never cross it
4. The values of the mean, mode and median are all equal
NB: The above distribution curve is referred to as the normal probability distribution curve because if a
frequency distribution curve is plotted from measurements of a given sample drawn from a normal
population, then a graph similar to the normal curve will be obtained.
- It should be noted that about 68% of a normally distributed population lies within one standard deviation of the mean, ±1σ
- About 95% lies within two standard deviations, ±2σ
- About 99.7% lies within three standard deviations, ±3σ
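These percentages can be verified numerically. For a normal distribution, the probability of lying within k standard deviations of the mean equals erf(k / √2); the short sketch below evaluates it for k = 1, 2, 3 using Python's `math.erf`. The function name is illustrative.

```python
import math


def prob_within(k):
    """P(|Z| < k) for a standard normal variable Z."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```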
[Figure: standard normal curve centred at Z = 0; not reproduced.]
Standardization of Variables
- Before we use the normal distribution curve to determine probabilities of the continuous
variables, we need to standardize the original units of measurement, by using the following
formula.
$$Z = \frac{x - \mu}{\sigma}$$
Where x = Value to be standardized
Z = Standardized value of x
µ = Population mean
σ = Standard deviation
Example
A sample of students had a mean age of 35 years with a standard deviation of 5 years. A student was
randomly picked from a group of 200 students. Find the probability that the age of the student turned
out to be as follows
Solution
(i). The standardized value for 35 years:
$$Z = \frac{x - \mu}{\sigma} = \frac{35 - 35}{5} = 0$$
The standardized value for 40 years:
$$Z = \frac{x - \mu}{\sigma} = \frac{40 - 35}{5} = 1$$
∴ the area between Z = 0 and Z = 1 is 0.3413 (value read from the standard normal tables).
When Z = 0, p = 0
And when Z = 1, p = 0.3413
The area under the curve between Z = 0 and Z = 1 is therefore
= 0.3413 – 0 = 0.3413
∴ the probability of the age lying between 35 and 40 years is 0.3413
(ii). 30 and 40 years
$$Z = \frac{x - \mu}{\sigma} = \frac{30 - 35}{5} = \frac{-5}{5} = -1$$
$$Z = \frac{x - \mu}{\sigma} = \frac{40 - 35}{5} = 1$$
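The table lookups above can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch under the example's assumptions (mean 35, standard deviation 5); the result for part (ii) is not stated in this copy of the notes, so the value computed here is offered only as a check.

```python
from statistics import NormalDist

ages = NormalDist(mu=35, sigma=5)

# (i) Probability that the age lies between 35 and 40 years.
p_35_to_40 = ages.cdf(40) - ages.cdf(35)

# (ii) Probability that the age lies between 30 and 40 years.
p_30_to_40 = ages.cdf(40) - ages.cdf(30)

print(round(p_35_to_40, 4))  # 0.3413
print(round(p_30_to_40, 4))
```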
c) Rejecting a true hypothesis – incorrect decision. This is called type I error, with probability = α.
d) Accepting a false hypothesis – (incorrect decision) – this is called type II error, with probability = β.
Levels of significance
A level of significance is the probability of making an incorrect decision by rejecting a true null hypothesis (a type I error) after the statistical
testing has been done. The probabilities used are usually very small, e.g. 1% or 5%.
[Figures: sampling distributions showing the acceptance region, the rejection region and the critical value; not reproduced.]
5. Since –10.1 < -1.65, we reject the null hypothesis but accept the alternative hypothesis at 5%
level of significance i.e. the marriage age in this community is significantly lower than 19 years
[Figure: one tailed test with the rejection region below the critical value −1.65 and the acceptance region above it; not reproduced.]
Example 2
A foreign company which manufactures electric bulbs has assured its customers that the lifespan of the
bulbs is 28 months with a standard deviation of 4 months. Recently the company embarked on
quality improvement research for its product. After the research using new technology, a sample of 70 bulbs
was tested and they gave a mean lifespan of 30.2 months.
Does this justify the research undertaken? Use a 1% level of significance to conduct a statistical test in
order to answer the above question.
Testing procedure
1. Null hypothesis H0: µ = 28
Alternative hypothesis HA: µ > 28
2. The level of significance is 1% (one tailed test)
3. The test statistic is the sample mean lifespan, x̄ = 30.2
4. The critical value of the one tailed test at 1% level of significance is +2.33
5. The standardized value of the sample mean is
$$Z = \frac{\bar{x} - \mu}{S_{\bar{x}}} = \frac{30.2 - 28}{4/\sqrt{70}} = 4.6$$
6. Since 4.6 > 2.33, we reject the null hypothesis and accept the alternative hypothesis at 1% level
of significance, i.e. the new sample mean lifespan is significantly higher than the
population mean.
Therefore the research undertaken was worthwhile or justified
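The testing procedure for this example can be sketched in Python. The critical value is computed with `statistics.NormalDist().inv_cdf` instead of being read from tables; the function name `one_sample_z` is an illustrative assumption.

```python
import math
from statistics import NormalDist


def one_sample_z(sample_mean, population_mean, population_sd, n):
    """Standardized value of a sample mean: (x_bar - mu) / (sigma / sqrt(n))."""
    standard_error = population_sd / math.sqrt(n)
    return (sample_mean - population_mean) / standard_error


z = one_sample_z(sample_mean=30.2, population_mean=28, population_sd=4, n=70)

# Upper-tail critical value for a one tailed test at the 1% level.
critical = NormalDist().inv_cdf(1 - 0.01)

reject_null = z > critical
print(round(z, 1), round(critical, 2), reject_null)  # 4.6 2.33 True
```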
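[Figure: standard normal curve with an area of 0.4900 between 0 and the critical value 2.33, and a 1% (0.01) rejection region in the upper tail; not reproduced.]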
Example 3
A construction firm placed an order for a consignment of wires with a mean
length of 10.5 meters and a standard deviation of 1.7 m.
The company which produces the wires delivered 90 wires, which had a mean length of 9.2 m. The
construction company rejected the consignment on the grounds that the wires were different from the order
placed.
Required
Conduct a statistical test to indicate whether or not you support the action taken by the
construction company at 5% level of significance.
Solution
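Null hypothesis H0: µ = 10.5 m
Alternative hypothesis HA: µ ≠ 10.5 m
Level of significance: 5%
[Figure: two tailed test with the acceptance region between the critical values −1.96 and +1.96; not reproduced.]
The standardized value of the test statistic is
$$Z = \frac{\bar{x} - \mu}{S_{\bar{x}}} = \frac{9.2 - 10.5}{1.7/\sqrt{90}} = -7.25$$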
Since -7.25 < -1.96, reject the null hypothesis but accept the alternative hypothesis at 5% level of
significance i.e. the sample mean is statistically different from the consignment ordered by the
construction company. Therefore support the action taken by the construction company
Required
Conduct a statistical test in order to establish whether there was a significant difference between the
mean harvests under the two types of field conditions. Use 5% level of significance.
Solution
H0: µ1 = µ2
HA: µ1 ≠ µ2
Critical values of the two tailed test at 5% level of significance are ±1.96
The standardized value of the difference between sample means is given by Z where
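$$Z = \frac{\bar{X}_1 - \bar{X}_2}{S_{\bar{X}_1 - \bar{X}_2}} \quad \text{where } S_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{1.5^2}{50} + \frac{1.3^2}{60}}$$
$$Z = \frac{60 - 63}{\sqrt{0.045 + 0.028}} = \frac{-3}{0.27} = -11.11, \text{ so } |Z| = 11.11$$
[Figure: two tailed test with the acceptance region between the critical values −1.96 and +1.96; not reproduced.]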
Since |Z| = 11.11 > 1.96, we reject the null hypothesis and accept the alternative hypothesis at the 5% level of
significance, i.e. the difference between the sample mean harvests is statistically significant. This implies
that the fertilizer had a positive effect on the harvest of maize.
Note: You don’t have to illustrate your solution with a diagram.
Example 2
An observation was made about reading abilities of males and females. The observation led to a
conclusion that females are faster readers than males. The observation was based on the times taken by
both females and males when reading out a list of names during graduation ceremonies.
To investigate the observation and the consequent conclusion, a sample of 200 men was
given lists to read. On average each man took 63 seconds, with a standard deviation of 4 seconds.
A sample of 250 women was also taken and asked to read the same list of names. It was found that
on average they took 62 seconds, with a standard deviation of 1 second.
Required
By conducting a statistical hypothesis test at the 1% level of significance, establish whether or not the
sample data support the earlier observation.
Solution
H0: µ1 = µ2
HA: µ1 ≠ µ2
The critical values of the two-tailed test at the 1% level of significance are ±2.58.
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{S_{\bar{X}_1 - \bar{X}_2}} = \frac{63 - 62}{\sqrt{\frac{4^2}{200} + \frac{1^2}{250}}} = 3.45$$
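[Figure: standard normal curve; acceptance region between -2.58 and +2.58, with a rejection region in each tail]
Since 3.45 > 2.58, we reject the null hypothesis at the 1% level of significance: the difference between the
mean reading times is statistically significant. Because the women's mean time (62 seconds) is lower than
the men's (63 seconds), the sample data support the earlier observation.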
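The test statistic for the difference between two large-sample means can be sketched in Python as follows. The helper name `two_sample_z` is hypothetical; the inputs are the reading-time figures from Example 2 and the 1% two-tailed critical value 2.58 quoted in the solution.

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Z for the difference between two large-sample means."""
    # Standard error of the difference: sqrt(S1^2/n1 + S2^2/n2)
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# Example 2: 200 men (mean 63 s, s.d. 4 s) and 250 women (mean 62 s, s.d. 1 s).
z = two_sample_z(63, 4, 200, 62, 1, 250)
critical = 2.58  # two-tailed critical value at the 1% level
decision = "reject H0" if abs(z) > critical else "accept H0"
print(round(z, 2), decision)
```

Each sample's variance is divided by its own sample size, which is why the men's figures (4 and 200) and the women's figures (1 and 250) are passed as separate groups.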
Example
A member of parliament (MP) claims that in his constituency only 50% of the total youth population
lacks university education. A local media company that wanted to verify the claim conducted a survey
of a sample of 400 youths, of whom 54% lacked university education.
Required:
At the 5% level of significance, test whether the MP's claim is wrong.
Solution.
Note: This is a two-tailed test, since we wish to test whether the population proportion is different from (≠)
the claimed value and not against a specific one-sided alternative, e.g. less than (<) or more than (>).
H0: π = 0.5
HA: π ≠ 0.5
$$S_p = \sqrt{\frac{pq}{n}} = \sqrt{\frac{0.5 \times 0.5}{400}} = 0.025$$
$$Z = \frac{0.54 - 0.50}{0.025} = 1.6$$
At the 5% level of significance for a two-tailed test the critical value is 1.96. Since the calculated Z value is
less than the tabulated value, i.e. 1.6 < 1.96, we accept the null hypothesis.
Thus the sample does not provide sufficient evidence that the MP's claim is wrong.
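A short Python sketch of the single-proportion test above. The function name `one_proportion_z` is an assumption made for illustration; the standard error is computed from the claimed proportion, as in the worked solution.

```python
import math

def one_proportion_z(sample_prop, claimed_prop, n):
    """Z for a sample proportion against a claimed population proportion."""
    # Standard error uses the claimed proportion: sqrt(p * q / n)
    standard_error = math.sqrt(claimed_prop * (1 - claimed_prop) / n)
    return (sample_prop - claimed_prop) / standard_error

# MP example: claimed 50%, sample of 400 youths with 54% lacking university education.
z = one_proportion_z(0.54, 0.50, 400)
critical = 1.96  # two-tailed critical value at the 5% level
decision = "reject H0" if abs(z) > critical else "accept H0"
print(round(z, 2), decision)
```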
And
$$Z = \frac{P_1 - P_2}{S_{p_1 - p_2}}$$
Example
In a random sample of 100 persons taken from village A, 60 are found to be consuming tea. In another
sample of 200 persons taken from a village B, 100 persons are found to be consuming tea. Do the data
reveal significant difference between the two villages so far as the habit of taking tea is concerned?
Solution
Let us take the hypothesis that there is no significant difference between the two villages as far as the
habit of taking tea is concerned i.e. π1 = π2
We are given
P1 = 0.6; n1 = 100
P2 = 0.5; n2 = 200
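The pooled proportion is
$$p = \frac{p_1 n_1 + p_2 n_2}{n_1 + n_2} = \frac{0.6 \times 100 + 0.5 \times 200}{100 + 200} = 0.53$$
$$q = 1 - 0.53 = 0.47$$
$$S_{P_1 - P_2} = \sqrt{\frac{pq}{n_1} + \frac{pq}{n_2}} = \sqrt{\frac{0.53 \times 0.47}{100} + \frac{0.53 \times 0.47}{200}} = 0.0611$$
$$Z = \frac{0.6 - 0.5}{0.0611} = 1.64$$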
Since the computed value of Z is less than the critical value of Z = 1.96 at the 5% level of significance,
we accept the hypothesis and conclude that there is no significant difference in the habit of
taking tea in the two villages A and B.
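The pooled-proportion calculation for the two villages can be sketched in Python. The name `two_proportion_z` is illustrative only; the function takes the counts of tea drinkers directly so that the pooled proportion is not rounded before use.

```python
import math

def two_proportion_z(successes1, n1, successes2, n2):
    """Z for the difference between two sample proportions, using a pooled p."""
    p1 = successes1 / n1
    p2 = successes2 / n2
    pooled = (successes1 + successes2) / (n1 + n2)  # same as (p1*n1 + p2*n2)/(n1 + n2)
    q = 1 - pooled
    standard_error = math.sqrt(pooled * q / n1 + pooled * q / n2)
    return (p1 - p2) / standard_error

# Village A: 60 tea drinkers out of 100; village B: 100 out of 200.
z = two_proportion_z(60, 100, 100, 200)
critical = 1.96  # two-tailed critical value at the 5% level
decision = "reject H0" if abs(z) > critical else "accept H0"
print(round(z, 2), decision)
```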
Example
Ken industrial manufacturers have produced a perfume known as “fianchetto.” In order to test its
popularity in the market, the manufacturer carried out a random survey in Back rank city where 10,000
consumers were interviewed, of whom 7,200 showed preference. The manufacturer also moved to
Rook town where he interviewed 12,000 consumers, out of which 10,000 showed preference for the
product.
Required
Design a statistical test and hence use it to advise the manufacturer regarding the difference in the
proportions, at the 5% level of significance.
Solution
H0: π1 = π2
HA: π1 ≠ π2
The critical values for this two-tailed test at the 5% level of significance are ±1.96.
$$Z = \frac{P_1 - P_2}{S_{P_1 - P_2}}$$
Where;
                                    Sample 1                     Sample 2
Sample size                         n1 = 10,000                  n2 = 12,000
Sample proportion of success        P1 = 7,200/10,000 = 0.72     P2 = 0.83
Population proportion of success    Π1                           Π2
Now
$$S_{p_1 - p_2} = \sqrt{\frac{pq}{n_1} + \frac{pq}{n_2}}$$
Where
$$p = \frac{p_1 n_1 + p_2 n_2}{n_1 + n_2}$$
And q = 1 – p
In our case
$$p = \frac{10{,}000(0.72) + 12{,}000(0.83)}{10{,}000 + 12{,}000} = \frac{17{,}160}{22{,}000} = 0.78$$
q = 0.22
$$S_{P_1 - P_2} = \sqrt{\frac{0.78 \times 0.22}{10{,}000} + \frac{0.78 \times 0.22}{12{,}000}} = 0.00561$$
$$Z = \frac{0.72 - 0.83}{0.00561} = -19.6$$
Since |Z| = 19.6 > 1.96, we reject the null hypothesis and accept the alternative. The difference between
the proportions is statistically significant. This implies that the perfume is much more popular in
Rook town than in Back rank city.
t distribution (Student's t distribution) tests of hypothesis (tests for small samples, n < 30)
For small samples (n < 30), the method used in hypothesis testing is similar to the one for large
samples, except that t values from the t distribution at a given number of degrees of freedom v are used
instead of z scores; the standard error statistic used is also different.
Note that v = n – 1 for a single sample and v = n1 + n2 – 2 where two samples are involved.
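When the population standard deviation is unknown and is estimated by the sample standard deviation S,
the t statistic is defined as
$$t = \frac{\bar{X} - \mu}{S_{\bar{X}}}, \quad \text{where } S_{\bar{X}} = \frac{S}{\sqrt{n}}$$
It follows the Student's t distribution with (n-1) d.f., where
X̄ = Sample mean
μ = Hypothesized population mean
n = sample size
and S is the standard deviation of the sample calculated by the formula
$$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} \quad \text{for } n < 30$$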
If the absolute value of the calculated t exceeds the table value of t at a specified level of significance,
the null hypothesis is rejected.
Example
Ten oil tins are taken at random from an automatic filling machine. The mean weight of the tins is 15.8
kg and the standard deviation is 0.5 kg. Does the sample mean differ significantly from the intended
weight of 16 kg? Use the 5% level of significance.
Solution
Given that n = 10; x = 15.8; S = 0.50; μ = 16; v = 9
H0: μ = 16
HA: μ ≠ 16
$$S_{\bar{X}} = \frac{0.5}{\sqrt{10}} = 0.16$$
$$t = \frac{15.8 - 16}{0.5/\sqrt{10}} = \frac{-0.2}{0.16} = -1.25$$
The table value of t for 9 d.f. at the 5% level of significance is 2.26. The absolute value of the computed t
(1.25) is smaller than the table value of t; therefore the difference is not significant and the null
hypothesis is accepted.
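The oil-tin calculation can be reproduced with a minimal Python sketch. The function name `one_sample_t` is hypothetical, and the table value 2.26 is taken from the text rather than computed. Without rounding the standard error to 0.16, the statistic comes out at about -1.26 instead of -1.25, which does not change the decision.

```python
import math

def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """Return (t, degrees of freedom) for a single small sample."""
    standard_error = sample_sd / math.sqrt(n)
    t = (sample_mean - hypothesized_mean) / standard_error
    return t, n - 1

# Oil tins: n = 10, mean 15.8 kg, s.d. 0.5 kg, intended weight 16 kg.
t, dof = one_sample_t(15.8, 16, 0.5, 10)
table_value = 2.26  # t table, 9 d.f., 5% level (two-tailed), as quoted in the text
decision = "reject H0" if abs(t) > table_value else "accept H0"
print(round(t, 2), dof, decision)
```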
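$$t = \frac{\bar{X}_1 - \bar{X}_2}{S_{\bar{X}_1 - \bar{X}_2}} \quad \text{at } n_1 + n_2 - 2 \text{ d.f.}$$
The standard deviation is obtained by pooling the two sample standard deviations as shown below.
$$S_p = \sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}}$$
Where S1 and S2 are the standard deviations for samples 1 and 2 respectively.
Now
$$S_{\bar{X}_1} = \frac{S_p}{\sqrt{n_1}} \quad \text{and} \quad S_{\bar{X}_2} = \frac{S_p}{\sqrt{n_2}}$$
$$S_{\bar{X}_1 - \bar{X}_2} = \sqrt{S_{\bar{X}_1}^2 + S_{\bar{X}_2}^2}$$
Alternatively
$$S_{\bar{X}_1 - \bar{X}_2} = S_p \sqrt{\frac{n_1 + n_2}{n_1 n_2}}$$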
Example
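Two different types of drugs, A and B, were tried on certain patients for increasing weight: 5 persons
were given drug A and 7 persons were given drug B. The increase in weight (in pounds) is given below.
Drug A 8 12 13 9 3
Drug B 10 8 12 15 6 8 11
Do the two drugs differ significantly with regard to their effect in increasing weight? (Given that v = 10;
t0.05 = 2.23)
Solution
H0: μ1 = μ2
HA: μ1 ≠ μ2
$$t = \frac{\bar{X}_1 - \bar{X}_2}{S_{\bar{X}_1 - \bar{X}_2}}$$
X1    (X1 – X̄1)   (X1 – X̄1)²    X2    (X2 – X̄2)   (X2 – X̄2)²
8     -1          1             10    0           0
12    +3          9             8     -2          4
13    +4          16            12    +2          4
9     0           0             15    +5          25
3     -6          36            6     -4          16
                                8     -2          4
                                11    +1          1
ΣX1 = 45; Σ(X1 – X̄1) = 0; Σ(X1 – X̄1)² = 62; ΣX2 = 70; Σ(X2 – X̄2) = 0; Σ(X2 – X̄2)² = 54
$$\bar{X}_1 = \frac{\sum X_1}{n_1} = \frac{45}{5} = 9 \qquad \bar{X}_2 = \frac{\sum X_2}{n_2} = \frac{70}{7} = 10$$
$$S_1 = \sqrt{\frac{62}{4}} = 3.94 \qquad S_2 = \sqrt{\frac{54}{6}} = 3$$
$$S_p = \sqrt{\frac{4 \times 15.5 + 6 \times 9}{10}} = 3.406$$
$$S_{\bar{X}_1 - \bar{X}_2} = S_p \sqrt{\frac{n_1 + n_2}{n_1 n_2}} = 3.406 \sqrt{\frac{12}{35}} = 1.99$$
$$t = \frac{\bar{X}_1 - \bar{X}_2}{S_{\bar{X}_1 - \bar{X}_2}} = \frac{9 - 10}{1.99} = -0.50$$
Since |t| = 0.50 is less than the table value of 2.23, we accept the null hypothesis: the two drugs do not
differ significantly with regard to their effect in increasing weight.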
Example
Two salesmen A and B are working in a certain district. From a survey conducted by the head office, the
following results were obtained. State whether there is any significant difference in the average sales
between the two salesmen at 5% level of significance.
                          A      B
No. of sales              20     18
Average sales in $        170    205
Standard deviation in $   20     25
Solution
H0: μ1 = μ2
HA: μ1 ≠ μ2
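Where
$$S_p = \sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{19(20)^2 + 17(25)^2}{36}} = 22.5$$
$$S_{\bar{X}_1 - \bar{X}_2} = S_p \sqrt{\frac{n_1 + n_2}{n_1 n_2}} = 22.5 \sqrt{\frac{38}{360}} = 7.31$$
$$t = \frac{170 - 205}{7.31} = -4.79$$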
t0.05(36) = 1.96 (since d.f. > 30 we use the normal tables)
When d.f. > 30 the t distribution is approximately the same as the normal distribution, so the table value
at the 5% level of significance for 36 d.f. is 1.96. Since the absolute value of the computed t (4.79) is more
than the table value, we reject the null hypothesis. Thus, we conclude that there is a significant difference
in the average sales between the two salesmen.
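The pooled two-sample t calculation for the salesmen can be sketched in Python from the summary figures alone. The function name `pooled_t` is an illustrative assumption; it follows the pooled standard deviation and standard error formulas given earlier in this section.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, degrees of freedom, pooled s.d.) from summary statistics."""
    dof = n1 + n2 - 2
    # Pooled standard deviation Sp
    sp = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / dof)
    # Standard error of the difference between the two sample means
    standard_error = sp * math.sqrt((n1 + n2) / (n1 * n2))
    return (mean1 - mean2) / standard_error, dof, sp

# Salesman A: 20 sales, mean $170, s.d. $20; salesman B: 18 sales, mean $205, s.d. $25.
t, dof, sp = pooled_t(170, 20, 20, 205, 25, 18)
print(round(sp, 2), dof, round(t, 2))
```

The sign of t only reflects which salesman is listed first; the two-tailed decision depends on its absolute value.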
The Chi square test (χ2) is used when comparing an actual (observed) distribution with a hypothesized or
expected distribution.
It is given by
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Where O = Observed frequency
E = Expected frequency
The computed value of χ2 is compared with the tabulated value of χ2 for a given significance level and
degrees of freedom.
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
4. The characteristics of this distribution are defined by the number of degrees of freedom
(d.f.), which is given by
d.f. = (r-1) (c-1),
where r is the number of rows and c is the number of columns. Corresponding to a chosen
level of significance, the critical value is found from the chi squared table.
5. The calculated value of χ2 is compared with the tabulated value of χ2 for (r-1) (c-1) degrees
of freedom at a certain level of significance. If the computed value of χ2 is greater than
the tabulated value, the null hypothesis of independence is rejected. Otherwise we
accept it.
Example
A sample of 200 people with a particular disease was selected; of these, 100 were given a drug and the
others were not given any drug. The results are as follows
             Drug    No drug    Total
Cured        65      55         120
Not cured    35      45         80
Total        100     100        200
Solution
Let us take the null hypothesis that the drug is not effective in curing the disease.
Applying the χ2 test
The expected cell frequencies are computed as follows
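$$E_{11} = \frac{R_1 C_1}{n} = \frac{120 \times 100}{200} = 60$$
$$E_{12} = \frac{R_1 C_2}{n} = \frac{120 \times 100}{200} = 60$$
$$E_{21} = \frac{R_2 C_1}{n} = \frac{80 \times 100}{200} = 40$$
$$E_{22} = \frac{R_2 C_2}{n} = \frac{80 \times 100}{200} = 40$$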
Arranging the observed frequencies with their corresponding expected frequencies in the following table we get
O     E     (O – E)²    (O – E)²/E
65    60    25          0.417
35    40    25          0.625
55    60    25          0.417
45    40    25          0.625
                        Σ(O – E)²/E = 2.084
$$\chi^2 = \sum \frac{(O - E)^2}{E} = 2.084$$
v = (r – 1) (c – 1) = (2 – 1) (2 – 1) = 1; tabulated χ2 (0.05) = 3.841
The calculated value of χ2 is less than the table value, so the hypothesis is accepted. Hence the data do
not show that the drug is effective in curing the disease.
The χ2 test of homogeneity is concerned with the proposition that several populations are homogenous
with respect to some characteristic of interest, e.g. one may be interested in knowing whether the raw
materials available from several retailers are homogenous. A random sample is drawn from each of the
populations and the number in each sample falling into each category is determined. The sample data
are displayed in a contingency table.
The analytical procedure is the same as that discussed for the test of independence.
Example
A random sample of 400 persons in total was selected from three age groups and each person was
asked to specify which type of TV program they preferred. The results are shown in the following table
                 Type of program
Age group        A      B      C      Total
Under 30         120    30     50     200
30 – 44          10     75     15     100
45 and above     10     30     60     100
Total            140    135    125    400
Test the hypothesis that the populations are homogenous with respect to the types of television
program they prefer, at 5% level of significance.
Solution
Let us take the hypothesis that the populations are homogenous with respect to the different types of
television program they prefer.
Applying χ2 test
O      E        (O – E)²    (O – E)²/E
120    70.00    2500.00     35.7143
10     35.00    625.00      17.8571
10     35.00    625.00      17.8571
30     67.50    1406.25     20.8333
75     33.75    1701.56     50.4167
30     33.75    14.06       0.4167
50     62.50    156.25      2.5000
15     31.25    264.06      8.4500
60     31.25    826.56      26.4500
                            Σ(O – E)²/E = 180.4952
$$\chi^2 = \sum \frac{(O - E)^2}{E} = 180.49$$
v = (r – 1) (c – 1) = (3 – 1) (3 – 1) = 4; tabulated χ2 (0.05) = 9.488
The calculated value of χ2 is greater than the table value. We reject the hypothesis and conclude that
the populations are not homogenous with respect to the type of TV programs preferred; thus the
different age groups vary in their choice of TV programs.
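Both the test of independence and the test of homogeneity use the same computation, so one small Python sketch covers the two worked examples above. The function name `chi_square` is an illustrative choice. Because the expected frequencies are not rounded, the totals differ slightly in the third decimal place from the hand-worked tables.

```python
def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table.

    `observed` is a list of rows, each row a list of observed frequencies.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # row total x column total / n
            statistic += (o - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof

# Drug example (rows: cured, not cured; columns: drug, no drug).
drug_stat, drug_dof = chi_square([[65, 55], [35, 45]])
# TV program example (rows: age groups; columns: program types A, B, C).
tv_stat, tv_dof = chi_square([[120, 30, 50], [10, 75, 15], [10, 30, 60]])

print(round(drug_stat, 3), drug_dof)
print(round(tv_stat, 2), tv_dof)
```

The statistic is then compared with the tabulated χ2 value for the returned degrees of freedom, exactly as in the worked solutions.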
Summary of Formulae in Hypothesis Testing
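$$Z = \frac{\bar{X}_1 - \bar{X}_2}{S_{\bar{X}_1 - \bar{X}_2}}$$
Where
$$S_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$
At α = level of significance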
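For n < 30
$$t = \frac{\bar{X}_1 - \bar{X}_2}{S_{\bar{X}_1 - \bar{X}_2}} \quad \text{at } n_1 + n_2 - 2 \text{ d.f.}$$
where
$$S_{\bar{X}_1 - \bar{X}_2} = S_p \sqrt{\frac{n_1 + n_2}{n_1 n_2}}$$
and
$$S_p = \sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}}$$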
$$Z = \frac{P - \pi}{S_p}$$
Where:
$$S_p = \sqrt{\frac{pq}{n}}$$
P = Proportion found in sample
q = 1 – p
π = hypothetical proportion
(d) Difference between proportions
$$Z = \frac{P_1 - P_2}{S_{P_1 - P_2}}$$
Where:
$$S_{P_1 - P_2} = \sqrt{\frac{pq}{n_1} + \frac{pq}{n_2}}$$
$$p = \frac{p_1 n_1 + p_2 n_2}{n_1 + n_2}$$
q = 1 – p
(e) Chi-square test
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Where O = observed frequency
$$E = \frac{\text{Column total} \times \text{Row total}}{\text{Sample size}} = \text{expected frequency}$$
Decision making and planning in an organization involve forecasting, which is one of the applications of
time series analysis.
a) Drastic changes, e.g. the advent of a major competitor, a period of war or a sudden change of
taste.
b) For long-term forecasting, internal and external pressures make historical data less effective.
1. Moving Average
Periodical data e.g. monthly sales may have random fluctuation every month despite a general trend
being evident. Moving average helps in smoothing away these random changes.
A moving average is the forecast for a period that takes the average of the previous periods.
Example:
The table below shows a company's monthly sales. Calculate the 3-month and 6-month moving averages for the data.
Months Sales
January 1200
February 1280
March 1310
April 1270
May 1190
June 1290
July 1410
August 1360
September 1430
October 1280
November 1410
December 1390
Solution.
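3-month moving average:
April forecast = (Jan + Feb + Mar) / 3 = (1200 + 1280 + 1310) / 3 = 1263
April 1263
May 1287
June 1257
And so on…
6-month moving average:
July forecast = (Jan + Feb + Mar + Apr + May + Jun) / 6 = (1200 + 1280 + 1310 + 1270 + 1190 + 1290) / 6 = 1257
And so on…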
1) The greater the number of periods in the moving average, the greater the smoothing effect.
3) The more random the data (with the underlying trend being constant), the more periods
should be included in the moving average.
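The moving-average forecasts for the sales table can be generated with a short Python sketch. The function name `moving_average_forecasts` and the month abbreviations are illustrative; the forecast for a period is taken as the mean of the preceding `window` periods, as defined above.

```python
sales = [1200, 1280, 1310, 1270, 1190, 1290,
         1410, 1360, 1430, 1280, 1410, 1390]
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def moving_average_forecasts(values, window):
    """Map each period index to the mean of the `window` periods before it."""
    forecasts = {}
    for t in range(window, len(values)):
        forecasts[t] = sum(values[t - window:t]) / window
    return forecasts

three_month = moving_average_forecasts(sales, 3)
six_month = moving_average_forecasts(sales, 6)

for t, forecast in three_month.items():
    print(months[t], "3-month forecast:", round(forecast))
for t, forecast in six_month.items():
    print(months[t], "6-month forecast:", round(forecast))
```

The first 3-month forecast is for April and the first 6-month forecast is for July, because earlier months do not have enough preceding data.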
2. Exponential smoothing
This method involves automatic weighting of past data with weights that decrease exponentially with
time.
Example
Using the previous example and a smoothing constant of 0.3, generate monthly forecasts.
Solution
Since there were no forecasts before January, we take the January sales figure (1200) to be the forecast for February.
February forecast = 1200
For March:
March forecast = February forecast + 0.3 × (February sales – February forecast)
= 1200 + 0.3(1280 – 1200)
= 1224
Note:
The higher the value of the smoothing constant, the more sensitive the forecast is to the most recent data.
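A minimal Python sketch of exponential smoothing on the same sales data. It assumes the usual updating rule, new forecast = previous forecast + α × (latest actual – previous forecast), which reproduces the February and March figures above; the function name `exponential_smoothing` is illustrative.

```python
sales = [1200, 1280, 1310, 1270, 1190, 1290,
         1410, 1360, 1430, 1280, 1410, 1390]

def exponential_smoothing(values, alpha):
    """Return one forecast per period; the first period has no forecast (None)."""
    # The first actual value seeds the forecast for the second period.
    forecasts = [None, values[0]]
    for t in range(1, len(values) - 1):
        previous = forecasts[t]
        forecasts.append(previous + alpha * (values[t] - previous))
    return forecasts

forecasts = exponential_smoothing(sales, 0.3)
print("Feb:", forecasts[1], "Mar:", round(forecasts[2], 1), "Apr:", round(forecasts[3], 1))
```

Raising the smoothing constant towards 1 makes each forecast follow the latest actual figure more closely; lowering it towards 0 gives a smoother series.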
An index number is a statistical device used to measure the change in the level of prices, wages, output and
other variables at given times, relative to their level at an earlier time which is taken as the base for
comparison purposes.
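$$\text{Price index} = \frac{p_n}{p_o} \times 100 \qquad \text{Quantity index} = \frac{q_n}{q_o} \times 100$$
Where pn is the price of a commodity in the current year (the year for which the price index is to be
calculated) and po is the price of the same commodity in the base year (the year used for comparison
purposes); qn and qo are the corresponding quantities.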
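LASPEYRE'S INDEX
$$\text{Price index} = \frac{\sum p_n q_o}{\sum p_o q_o} \times 100 \qquad \text{Quantity index} = \frac{\sum q_n p_o}{\sum q_o p_o} \times 100$$
PAASCHE'S INDEX
$$\text{Price index} = \frac{\sum p_n q_n}{\sum p_o q_n} \times 100 \qquad \text{Quantity index} = \frac{\sum q_n p_n}{\sum q_o p_n} \times 100$$
$$\text{Value index} = \frac{\sum p_n q_n}{\sum p_o q_o} \times 100$$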
$$\text{Laspeyre's price index} = \frac{\sum w_o \frac{p_n}{p_o}}{\sum w_o} \times 100$$
Where w0 are the proportions of the total expenditure in the base period. This formula is frequently used
to calculate the retail price index.
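The Laspeyres, Paasche and value index formulas can be illustrated with a small Python sketch. The prices and quantities for the two commodities below are hypothetical figures invented purely for illustration, and the function names are not part of these notes.

```python
def weighted_sum(a, b):
    """Sum of pairwise products, e.g. the sum of price x quantity."""
    return sum(x * y for x, y in zip(a, b))

def laspeyres_price(p0, q0, pn):
    # Current and base prices, both weighted by base-year quantities
    return weighted_sum(pn, q0) / weighted_sum(p0, q0) * 100

def paasche_price(p0, pn, qn):
    # Current and base prices, both weighted by current-year quantities
    return weighted_sum(pn, qn) / weighted_sum(p0, qn) * 100

def value_index(p0, q0, pn, qn):
    # Current-year value relative to base-year value
    return weighted_sum(pn, qn) / weighted_sum(p0, q0) * 100

# Hypothetical data for two commodities (not taken from the notes).
p0, q0 = [10, 20], [5, 2]   # base-year prices and quantities
pn, qn = [12, 25], [6, 3]   # current-year prices and quantities

print(round(laspeyres_price(p0, q0, pn), 2))
print(round(paasche_price(p0, pn, qn), 2))
print(round(value_index(p0, q0, pn, qn), 2))
```

The only difference between the two price indices is the set of quantities used as weights: base-year quantities for Laspeyres and current-year quantities for Paasche.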
For comparison purposes if two series have different base years, it is difficult to compare them directly.
In such cases, it is necessary to change the base year of one of the series (or both) so that both have the
same base.
It is also necessary to keep the index relevant to current conditions hence the need to change the base
from time to time.
Example;
Price index 100 104 108 109 112 120 125 140
When changing the base year, it is advisable to update the weights used in the base year.
A chain based index is one where the index is calculated every year using the previous year as the base
year. This type of index measures rate of change from year to year.
This method is suitable where weights are changing rapidly and items are constantly being brought into
the index and unwanted items taken out. It can be a price or quantity index
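Changing the base of an index series and deriving a chain-based index can both be sketched in Python. The series is the price index row from the example above; positions are used instead of years because the year labels are not available, and the choice of the fifth entry as the new base is a hypothetical one. The function names `rebase` and `chain_index` are illustrative.

```python
price_index = [100, 104, 108, 109, 112, 120, 125, 140]

def rebase(series, new_base_position):
    """Re-express every index value relative to the value at the new base (= 100)."""
    base_value = series[new_base_position]
    return [value / base_value * 100 for value in series]

def chain_index(series):
    """Index of each period with the immediately preceding period as base."""
    return [current / previous * 100
            for previous, current in zip(series, series[1:])]

rebased = rebase(price_index, 4)      # fifth entry (index value 112) becomes 100
chained = chain_index(price_index)    # one value fewer than the original series

print([round(value, 1) for value in rebased])
print([round(value, 1) for value in chained])
```

The rebased series keeps the same relative movements as the original, while the chain-based series shows only the change from one period to the next.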