Statistics I Notes

This document outlines the syllabus for a statistics course. It covers descriptive statistics, probability, random variables, and common probability distributions. Descriptive statistics involves numerical and graphical techniques for summarizing data, such as measures of central tendency and dispersion. Probability concepts are introduced, including sample spaces, events, and the axioms of probability. Random variables are defined as variables that take numerical values determined by the outcomes of a random phenomenon. The distributions covered include the discrete binomial and Poisson distributions and the continuous normal distribution.


BSMA 301 Statistics

Dr. Eyram Kwame

October 12, 2020

Outline
Recommended Textbooks
Introduction
Descriptive Statistics
Numerical Descriptive Techniques
Graphical Descriptive Techniques
Elements of Probability
Random Variables
Discrete Probability Distribution
Binomial Distribution
Poisson Distribution
Normal Distribution
Parameter Estimation
Hypothesis Testing
Introduction to Regression
Recommended Textbooks

• William Navidi, Statistics for Engineers and Scientists, 4th Edition, McGraw-Hill Education, 2015. [1]
• Sheldon M. Ross, Introduction to Probability and Statistics for Engineers and Scientists, 4th Edition, Elsevier Academic Press, 2009. [2]

Introduction

• Statistics is the science of learning from data. It comprises data collection, description, and analysis for the purpose of making inferences.
• The two main branches of statistics are Descriptive Statistics and Inferential Statistics.
• Information is processed data.
• A population is the set of all items of interest. It is often very large and may be infinite. A descriptive measure of a population is called a parameter.
• A sample is a set of data drawn from a population. A descriptive measure of a sample is called a statistic.
• Descriptive statistics deals with methods of organizing, summarizing, and presenting data in a convenient and informative way.
• Statistical inference is the process of making an estimate, prediction, or decision about a population based on sample data.
• Because populations are usually large, a sample of reasonable size is adequate for drawing conclusions or making estimates about the population from the information the sample provides.
• Simple random, stratified random, cluster, and convenience sampling are common sampling techniques [1].

Numerical Descriptive Techniques

• Central location is measured with the mean, median, and mode. The arithmetic mean of a data set is
$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i \quad \text{or} \quad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Note that $\mu$ is the population mean while $\bar{x}$ is the sample mean.
• Dispersion (or variability) is measured with the range and the variance. The variances of a population, $\sigma^2$, and of a sample, $s^2$, are
$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$$

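The mean and variance formulas can be checked numerically. The sketch below is only an illustration: the data set and the helper name `describe` are assumptions made for this example and do not come from the lecture. It shows that the two variances differ only in the divisor.

```python
from statistics import pvariance, variance

# Hypothetical data set, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

def describe(values):
    """Return the mean, population variance and sample variance of values."""
    n = len(values)
    centre = sum(values) / n
    squared_deviations = sum((x - centre) ** 2 for x in values)
    return {
        "mean": centre,
        "population_variance": squared_deviations / n,    # divisor N
        "sample_variance": squared_deviations / (n - 1),  # divisor n - 1
    }

summary = describe(data)
print(summary)
# Cross-check against the standard library implementations.
print(pvariance(data), variance(data))
```

The sample variance is always the larger of the two, because it divides the same sum of squared deviations by n - 1 instead of N.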
Numerical Descriptive Techniques (contd.)

• Relative standing measures the position of a data value relative to the rest of the data. It is measured with percentile rankings and quartiles. The lower quartile $Q_L$ is the 25th percentile of a data set. The middle quartile $Q_M$ is the median, or 50th percentile. The upper quartile $Q_U$ is the 75th percentile.

Graphical Descriptive Techniques

• Histograms, bar charts, and pie charts are common graphical representations of data.
• In this lecture, we shall use Matlab and Microsoft Excel to present processed data graphically.

Elements of Probability

• Sample space: the set of all possible outcomes of an experiment. It is denoted by $S$. Let $n(S)$ denote the number of outcomes in the sample space.
• Event: any subset $A$ of the sample space. Let $n(A)$ denote the number of outcomes in the event.
• Mutually exclusive events: two or more events are said to be mutually exclusive if no two of them can occur at the same time.
• Probability: the measure of the likelihood of an event occurring during an experiment.
• When all outcomes in $S$ are equally likely, the probability of $A$ is
$$P(A) = \frac{n(A)}{n(S)}. \tag{1}$$

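Equation (1) amounts to counting outcomes. As a minimal sketch, assuming an experiment of rolling two fair dice (the experiment and the helper name `probability` are chosen here for illustration), the sample space can be enumerated and the favourable outcomes counted directly:

```python
from fractions import Fraction
from itertools import product

# Sample space S: all ordered outcomes of rolling two fair dice.
sample_space = list(product(range(1, 7), repeat=2))

def probability(event, space):
    """P(A) = n(A) / n(S), valid only when all outcomes are equally likely."""
    favourable = [outcome for outcome in space if event(outcome)]
    return Fraction(len(favourable), len(space))

# Event A: the two dice sum to 7.
p_sum_is_7 = probability(lambda o: o[0] + o[1] == 7, sample_space)
print(len(sample_space), p_sum_is_7)
```

Using `Fraction` keeps the result exact, so the ratio n(A)/n(S) is visible rather than a rounded decimal.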
Axioms of Probability

For any event $A$ of an experiment with sample space $S$, the following axioms hold:
• A1: $0 \le P(A) \le 1$
• A2: $P(S) = 1$
• A3: For any sequence of mutually exclusive events $A_1, A_2, A_3, \dots, A_n$, we have
$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i)$$

Conditional Probability

• The intersection of events $A$ and $B$ is the event that both $A$ and $B$ occur at the same time. It is denoted by $A \cap B$.
• Conditional probability is the probability of an event (say $A$) given that an event (say $B$) has occurred:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \tag{2}$$

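Equation (2) can be illustrated by enumeration. The sketch below assumes two fair dice and two events picked purely for this example (A: the sum is 8, B: the first die is even); none of these names come from the lecture.

```python
from fractions import Fraction
from itertools import product

sample_space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event under equally likely outcomes."""
    hits = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(hits, len(sample_space))

def event_a(outcome):
    return outcome[0] + outcome[1] == 8   # A: the sum is 8

def event_b(outcome):
    return outcome[0] % 2 == 0            # B: the first die is even

p_b = prob(event_b)
p_a_and_b = prob(lambda o: event_a(o) and event_b(o))
p_a_given_b = p_a_and_b / p_b             # equation (2)
print(prob(event_a), p_a_given_b)
```

Here knowing that B occurred changes the probability of A from 5/36 to 1/6, which shows that the two events are not independent.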
Random Variables

• A random variable (r.v.) is a variable (say $X$) that takes numerical values determined by the outcomes of a random phenomenon.
• Consider the rolling of two fair dice. If our interest lies only in the sum of the values and not in the individual values of the dice, then our r.v. $X$ is the sum, with possible values $\{2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12\}$.
• There are two types of random variables.
• A discrete r.v. is any r.v. that can take on a countable number of values. E.g.: $X$ = number of heads observed in an experiment that flips a coin 3 times.
• A continuous random variable is any r.v. that takes uncountably many values. A good example of a continuous r.v. is the time to complete a task ($X$ = time to write a statistics exam).
• A probability distribution is a table, formula, or graph that describes the values of a random variable and the probability associated with these values.
• The probability distribution of the r.v. $X$, the number of heads obtained in an experiment that flips a fair coin 3 times, is given as

x      0      1      2      3
p(x)   0.125  0.375  0.375  0.125

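The coin-flip table can be reproduced by listing all eight equally likely sequences and counting heads. This is a small sketch, assuming a fair coin; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2^3 equally likely sequences of three flips, e.g. ('H', 'T', 'H').
outcomes = list(product("HT", repeat=3))

# X = number of heads in each sequence; count how often each value occurs.
counts = Counter(sequence.count("H") for sequence in outcomes)
distribution = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x, p in distribution.items():
    print(x, float(p))
```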
Discrete Probability Distribution (D.P.D)

• A distribution $p(x)$ is a discrete probability distribution if it satisfies
$$0 \le p(x) \le 1 \quad \text{for all } x \tag{3}$$
$$\sum_{x} p(x) = 1 \tag{4}$$

The population mean $\mu$ of a discrete r.v. is the weighted average of all its values. This parameter is called the expected value (expectation) of $X$ and is given by
$$E(X) = \mu = \sum_{x} x\,p(x) \tag{5}$$

The population variance of a discrete r.v. is defined as
$$\mathrm{Var}(X) = \sigma^2 = \sum_{x} (x - \mu)^2 p(x) = \sum_{x} x^2 p(x) - \mu^2$$

Properties of expected value and variance, for any constant $c$:
1. $E(c) = c$
2. $\mathrm{Var}(c) = 0$
3. $E(X + c) = E(X) + c$
4. $\mathrm{Var}(X + c) = \mathrm{Var}(X)$
5. $E(cX) = cE(X)$
6. $\mathrm{Var}(cX) = c^2\,\mathrm{Var}(X)$

An Example

Using historical records, the manager of a company has determined the following probability distribution of $X$, the number of employees absent per day:

x      0      1      2     3     4     5     6
p(x)   0.005  0.025  0.31  0.34  0.22  0.08  0.02

(a) Compute the following probabilities:
(i) $P(2 \le X \le 5)$
(ii) $P(X > 5)$
(iii) $P(X < 4)$
(b) Calculate the mean and the standard deviation of the population.

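One way to work through the absenteeism example is to store the table as a dictionary and apply equations (3) to (5) and the variance shortcut directly. This is a sketch rather than a model answer; the helper name `prob` is an assumption made for the example.

```python
from math import sqrt

# p(x) for X = number of employees absent per day (from the table above).
p = {0: 0.005, 1: 0.025, 2: 0.31, 3: 0.34, 4: 0.22, 5: 0.08, 6: 0.02}

# Requirement (4): the probabilities must sum to 1.
assert abs(sum(p.values()) - 1) < 1e-9

def prob(condition):
    """Add up p(x) over every value x that satisfies the condition."""
    return sum(px for x, px in p.items() if condition(x))

p_2_to_5 = prob(lambda x: 2 <= x <= 5)
p_gt_5 = prob(lambda x: x > 5)
p_lt_4 = prob(lambda x: x < 4)

mu = sum(x * px for x, px in p.items())                # E(X), equation (5)
var = sum(x ** 2 * px for x, px in p.items()) - mu ** 2  # shortcut formula
sigma = sqrt(var)

print(round(p_2_to_5, 3), round(p_gt_5, 3), round(p_lt_4, 3))
print(round(mu, 3), round(sigma, 3))
```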
Bivariate Distribution

A bivariate probability distribution of random variables $X$ and $Y$ is a table or formula that gives the joint probabilities $p(x, y)$ for all pairs of values $x$ and $y$.

A discrete bivariate distribution must satisfy
$$0 \le p(x, y) \le 1 \tag{6}$$
$$\sum_{x}\sum_{y} p(x, y) = 1 \tag{7}$$

Given $p(x, y)$, we have
$$E(X) = \mu_x = \sum_{x} x\,p(x) \tag{8}$$
$$E(Y) = \mu_y = \sum_{y} y\,p(y) \tag{9}$$
where $p(x) = \sum_{y} p(x, y)$ and $p(y) = \sum_{x} p(x, y)$ are known as the marginal probabilities of $x$ and $y$ respectively.

The variance and covariance are given as
$$\mathrm{Var}(X) = \sigma_x^2 = \sum_{x} (x - \mu_x)^2 p(x) = \sum_{x} x^2 p(x) - \mu_x^2 \tag{10}$$
$$\mathrm{Var}(Y) = \sigma_y^2 = \sum_{y} (y - \mu_y)^2 p(y) = \sum_{y} y^2 p(y) - \mu_y^2 \tag{11}$$
$$\mathrm{COV}(X, Y) = \sigma_{xy} = \sum_{x}\sum_{y} (x - \mu_x)(y - \mu_y)\,p(x, y) = \sum_{x}\sum_{y} xy\,p(x, y) - \mu_x \mu_y \tag{12}$$

The coefficient of correlation, which measures the strength of the linear relationship between two random variables, is defined as
$$r = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \tag{13}$$
Note that $|r| \le 1$.

Properties of expected values and variance:
1. $E(X + Y) = E(X) + E(Y)$
2. $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{COV}(X, Y)$
If $X$ and $Y$ are independent, then $\mathrm{COV}(X, Y) = 0$ and thus
3. $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$

Example

Given the bivariate distribution of r.v.s $X$ and $Y$ as

          x = 1   x = 2
y = 1     0.28    0.42
y = 2     0.12    0.18

(a) Compute the marginal probabilities $p(x)$ and $p(y)$.
(b) Compute $\mu_x$, $\mu_y$, $\sigma_x$ and $\sigma_y$.
(c) Compute $\mathrm{COV}(X, Y)$ and $r$.
(d) Compute $E(X + Y)$ and $\mathrm{Var}(X + Y)$.

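The steps of this example follow equations (8) to (13) mechanically, so they can be sketched in a few lines. The code assumes the table is read with $x$ across the columns and $y$ down the rows; the dictionary layout and the helper name `marginal` are choices made for illustration.

```python
from math import sqrt

# Joint probabilities keyed by (x, y), read from the table above.
joint = {(1, 1): 0.28, (2, 1): 0.42, (1, 2): 0.12, (2, 2): 0.18}

def marginal(table, axis):
    """Sum the joint probabilities over the other variable (axis 0 = x, 1 = y)."""
    totals = {}
    for pair, prob in table.items():
        totals[pair[axis]] = totals.get(pair[axis], 0.0) + prob
    return totals

p_x = marginal(joint, 0)
p_y = marginal(joint, 1)

mu_x = sum(x * p for x, p in p_x.items())
mu_y = sum(y * p for y, p in p_y.items())
var_x = sum((x - mu_x) ** 2 * p for x, p in p_x.items())
var_y = sum((y - mu_y) ** 2 * p for y, p in p_y.items())

cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())
r = cov / sqrt(var_x * var_y)

mean_sum = mu_x + mu_y               # E(X + Y)
var_sum = var_x + var_y + 2 * cov    # Var(X + Y)
print(p_x, p_y)
print(round(cov, 6), round(mean_sum, 3), round(var_sum, 3))
```

For this table every joint probability equals the product of its marginals, so the covariance comes out as zero up to floating-point rounding.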
Example

After analysing several months of sales data, the owner of an appliance store produced the following joint probability distribution of the numbers of refrigerators and stoves sold daily:

           Refrigerators
Stoves     0      1      2
0          0.08   0.14   0.12
1          0.09   0.17   0.13
2          0.05   0.18   0.04

(a) Compute $\mathrm{COV}(X, Y)$ and $r$.
(b) Compute the following conditional probabilities:
(i) P(1 refrigerator | 0 stoves)
(ii) P(2 refrigerators | 2 stoves)

Binomial Distribution

• A binomial experiment consists of a fixed number of trials. We denote the number of trials by $n$.
• Each trial has 2 possible outcomes. We label one outcome as success and the other as failure.
• The probability of success is $p$ and that of failure is $1 - p$.
• The trials are independent. Thus, the outcome of one trial does not affect the outcome of any other trial.
• If $X$ represents the number of successes that occur in the $n$ trials, then $X$ is said to be a binomial r.v. with parameters $(n, p)$.

$$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x} = \frac{n!}{x!\,(n - x)!}\, p^x (1 - p)^{n - x} \tag{14}$$
$$\mu = E(X) = np \tag{15}$$
$$\sigma^2 = \mathrm{Var}(X) = np(1 - p) \tag{16}$$

Examples of Bin(n, p)

Given a binomial r.v. $X$ with $n = 10$ and $p = 0.6$, compute
(a) $P(X = 3)$
(b) $P(X = 5)$
(c) $P(X \le 4)$
(d) $P(6 \le X \le 9)$

A certain type of tomato seed germinates 80% of the time. A backyard farmer planted 25 seeds.
(a) What is the probability that exactly 20 of the seeds germinated?
(b) What is the probability that more than 20 of the seeds germinated?
(c) What is the probability that 24 or fewer of the seeds germinated?
(d) What is the expected number of germinated seeds?
(e) What is the standard deviation?

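Equations (14) to (16) translate directly into code. The sketch below uses only the Python standard library; the function names `binomial_pmf` and `binomial_cdf` are assumptions for this example. It evaluates a few of the quantities asked for above, without claiming to be the intended solution method.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p), equation (14)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """P(X <= x), obtained by summing the pmf from 0 to x."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

# First example: n = 10, p = 0.6.
n, p = 10, 0.6
p_eq_3 = binomial_pmf(3, n, p)
p_le_4 = binomial_cdf(4, n, p)
p_6_to_9 = binomial_cdf(9, n, p) - binomial_cdf(5, n, p)

# Tomato seeds: n = 25, p = 0.8.
seeds_mean = 25 * 0.8                # equation (15)
seeds_sd = sqrt(25 * 0.8 * 0.2)      # square root of equation (16)
p_24_or_fewer = 1 - binomial_pmf(25, 25, 0.8)

print(round(p_eq_3, 4), round(p_le_4, 4), round(p_6_to_9, 4))
print(seeds_mean, seeds_sd, round(p_24_or_fewer, 4))
```

Computing "24 or fewer" as one minus the probability of all 25 germinating avoids summing 25 terms.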
Poisson Distribution

A Poisson experiment is characterized by the following properties:
• The number of successes that occur in any interval is independent of the number that occur in any other interval.
• The probability of a success in an interval is the same for all equal-size intervals.
• The probability of a success in an interval is proportional to the size of the interval.
• The probability of more than one success in an interval approaches zero as the interval becomes smaller.
• The Poisson r.v. is the number of successes that occur in a period of time or an interval of space in a Poisson experiment.
• The probability that a Poisson r.v. $X$ with parameter $\lambda > 0$ assumes a value $x$ in a specific interval is
$$P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!} \quad \text{for } x = 0, 1, 2, \dots \tag{17}$$

Poisson Distribution (contd.)

The expectation and variance of $X \sim \mathrm{Poisson}(\lambda)$ are given as
$$E(X) = \lambda \tag{18}$$
$$\mathrm{Var}(X) = \lambda \tag{19}$$

Example: Given $X \sim \mathrm{Poisson}(2)$, compute
(a) $P(X = 0)$
(b) $P(X \le 3)$
(c) $P(X \ge 5)$

Example: The number of accidents that occur at a busy intersection follows a Poisson distribution with a mean of 3.5 per week. Find the probability of the following events:
(a) No accident occurs in one week.
(b) Five or more accidents occur in one week.
(c) Two accidents occur today.

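Equation (17) is easy to evaluate with the standard library. The sketch below also illustrates the proportionality property: for a one-day interval the weekly mean is rescaled, here under the assumption of a 7-day week with a uniform accident rate. The function names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam), equation (17)."""
    return lam ** x * exp(-lam) / factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x), obtained by summing the pmf from 0 to x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

# First example: X ~ Poisson(2).
p_eq_0 = poisson_pmf(0, 2)
p_le_3 = poisson_cdf(3, 2)
p_ge_5 = 1 - poisson_cdf(4, 2)        # complement of P(X <= 4)

# Accident example: mean 3.5 per week.
weekly_rate = 3.5
daily_rate = weekly_rate / 7          # assumption: 7-day week, uniform rate
p_no_accident_week = poisson_pmf(0, weekly_rate)
p_two_today = poisson_pmf(2, daily_rate)

print(round(p_eq_0, 4), round(p_le_3, 4), round(p_ge_5, 4))
print(round(p_no_accident_week, 4), round(p_two_today, 4))
```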
Normal Distribution

A r.v. $X$ is said to be normally distributed with parameters $\mu$ and $\sigma^2$, written $X \sim N(\mu, \sigma^2)$, if its p.d.f. is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right) \quad \text{for all } x \in \mathbb{R} \tag{20}$$

The normal p.d.f. $f(x)$ is a bell-shaped curve that is symmetric about $\mu$.

Figure 1: The bell-shaped Normal Distribution curve is symmetric about µ

The parameters $\mu$ and $\sigma^2$ represent the mean and variance of the distribution respectively. Thus
$$E[X] = \mu, \qquad \mathrm{Var}(X) = \sigma^2$$
Given a normal r.v. $X$ and constants $\alpha$, $\beta$, the r.v. defined as $Y = \alpha X + \beta$ is normally distributed with mean $\alpha\mu + \beta$ and variance $\alpha^2\sigma^2$.

Therefore, if $X \sim N(\mu, \sigma^2)$, then
$$Z = \frac{X - \mu}{\sigma} \tag{21}$$
is a normal r.v. with mean 0 and variance 1. The r.v. $Z$ is called the standard normal r.v. We shall obtain probabilities of a normal r.v. by converting it into the standard normal r.v.

Thus,
$$P(X < b) = P\left(\frac{X - \mu}{\sigma} < \frac{b - \mu}{\sigma}\right) = P\left(Z < \frac{b - \mu}{\sigma}\right)$$
Similarly,
$$P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{b - \mu}{\sigma}\right) = P\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right) = P\left(Z < \frac{b - \mu}{\sigma}\right) - P\left(Z < \frac{a - \mu}{\sigma}\right)$$

Figure 2: P(Z < −a) and P(Z > a)

Note that due to symmetry, we have
$$P(Z < -a) = P(Z > a) \tag{22}$$

Example of Normal distribution

Q1. If $X \sim N(3, 16)$, compute
(a) $P(X < 12)$
(b) $P(X < -2)$
(c) $P(3 < X < 8)$

Q2. The power $W$ dissipated in a resistor is proportional to the square of the voltage $V$ (i.e. $W = rV^2$). If $r = 2.5$ and $V$ can be assumed to be normally distributed with mean 5 and standard deviation 1, compute
(a) $E[W]$
(b) $P(W > 150)$

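The standardization in equation (21) can be mirrored in code for Q1. The sketch below uses `statistics.NormalDist` from the Python standard library in place of a printed z-table; the helper name `prob_less_than` is an assumption for this example. Note that $N(3, 16)$ has variance 16, so the standard deviation is 4.

```python
from statistics import NormalDist

mu, sigma = 3, 4            # X ~ N(3, 16): variance 16, standard deviation 4
standard_normal = NormalDist()   # Z ~ N(0, 1)

def prob_less_than(b):
    """P(X < b) computed as P(Z < (b - mu) / sigma)."""
    return standard_normal.cdf((b - mu) / sigma)

p_a = prob_less_than(12)                        # P(Z < 2.25)
p_b = prob_less_than(-2)                        # P(Z < -1.25)
p_c = prob_less_than(8) - prob_less_than(3)     # P(0 < Z < 1.25)

# The same probability without manual standardization, as a cross-check.
p_a_direct = NormalDist(mu, sigma).cdf(12)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```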
Parameter Estimation

We can use sample data to estimate a population parameter in two ways.

A point estimator draws inferences about a population by estimating the value of an unknown parameter using a single value or point.

An interval estimator draws inferences about a population by estimating the value of an unknown parameter using an interval.

An unbiased estimator of a population parameter is an estimator whose expected value is equal to that parameter.

Note: The sample mean $\bar{x}$ is an unbiased estimator of the population mean $\mu$ (i.e. $E[\bar{x}] = \mu$).

An unbiased estimator is said to be consistent if the difference between the estimator and the parameter grows smaller as the sample size grows larger. For the sample mean,
$$\mathrm{Var}(\bar{x}) = \frac{\sigma^2}{n} \tag{23}$$

If there are two or more unbiased estimators of a parameter, the estimator with the smallest variance is said to be relatively efficient.

Confidence Intervals for the Mean of a Normally Distributed Population

Known $\sigma^2$

Suppose that $x_1, x_2, \dots, x_n$ is a sample from a normally distributed population having an unknown mean $\mu$ but a known variance $\sigma^2$. Though $\bar{x}$ is an unbiased estimator of $\mu$, we do not expect $\bar{x} = \mu$ but rather $\bar{x} \approx \mu$.

Based on the Central Limit Theorem (see page 204 of the textbook), we have
$$z = \frac{\sqrt{n}\,(\bar{x} - \mu)}{\sigma} \tag{24}$$
Therefore, for a chosen $\alpha$,
$$P\left(-z_{\alpha/2} < \frac{\sqrt{n}\,(\bar{x} - \mu)}{\sigma} < z_{\alpha/2}\right) = 1 - \alpha$$

The probability $1 - \alpha$ is called the confidence level. Therefore, the confidence interval estimate of $\mu$ for known variance is given as
$$\mu \in \left(\bar{x} \pm \frac{\sigma}{\sqrt{n}}\, z_{\alpha/2}\right) \tag{25}$$

Confidence Intervals for Normal Mean with an Unknown $\sigma^2$

Suppose that $x_1, x_2, \dots, x_n$ is a sample from a normally distributed population having an unknown mean $\mu$ and variance $\sigma^2$. To construct a $(1 - \alpha) \times 100\%$ confidence interval, we define a new random variable
$$t = \frac{\sqrt{n}}{s}\,(\bar{x} - \mu) \tag{26}$$
with $n - 1$ degrees of freedom.

Note that
$$s^2 = \frac{1}{n - 1}\sum_{i=1}^{n} (x_i - \bar{x})^2$$
and
$$P\left(-t_{\alpha/2,\,n-1} < \frac{\sqrt{n}\,(\bar{x} - \mu)}{s} < t_{\alpha/2,\,n-1}\right) = 1 - \alpha$$

Therefore, the confidence interval estimate of $\mu$ for an unknown variance is given as
$$\mu \in \left(\bar{x} \pm \frac{s}{\sqrt{n}}\, t_{\alpha/2,\,n-1}\right) \tag{27}$$

Example

Suppose that when a signal having value $\mu$ is transmitted from location A, the value received at location B is normally distributed with mean $\mu$ and variance 4. To reduce the error, the same value is sent 9 times. If the sequence of values received is
5, 8.5, 12, 15, 7, 9, 7.5, 6.5, 10.5
construct
• a 95% confidence interval for $\mu$
• a 99% confidence interval for $\mu$
• 95% and 99% confidence intervals for $\mu$ assuming the variance is unknown.

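The signal example can be sketched with equations (25) and (27). The z critical values come from `statistics.NormalDist.inv_cdf`. The Python standard library has no t distribution, so the t critical value below is hard-coded from a standard t-table; treat it as an assumption to verify against your own table. The function name `z_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

received = [5, 8.5, 12, 15, 7, 9, 7.5, 6.5, 10.5]
n = len(received)
x_bar = mean(received)

def z_interval(centre, sigma, size, confidence):
    """Known-variance interval, equation (25)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z * sigma / sqrt(size)
    return centre - half_width, centre + half_width

# Known variance 4, so sigma = 2.
ci_95 = z_interval(x_bar, 2, n, 0.95)
ci_99 = z_interval(x_bar, 2, n, 0.99)

# Unknown variance: equation (27) with the sample standard deviation s.
T_CRIT_95_DF8 = 2.306   # assumed table value of t_{0.025, 8}
s = stdev(received)     # uses divisor n - 1
half_width_t = T_CRIT_95_DF8 * s / sqrt(n)
ci_95_t = (x_bar - half_width_t, x_bar + half_width_t)

print(x_bar, ci_95, ci_99, ci_95_t)
```

The t-based interval is wider than the z-based one at the same confidence level, partly because the sample standard deviation here (about 3.08) exceeds the assumed $\sigma = 2$ and partly because the t critical value is larger.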
Confidence Intervals for Variance ($\sigma^2$) of a Normal Distribution

Suppose that $x_1, x_2, \dots, x_n$ is a sample from a normally distributed population having an unknown mean $\mu$ and variance $\sigma^2$. We can construct a confidence interval for $\sigma^2$ by using the fact that the sample variance $s^2$ is an unbiased and consistent estimator of $\sigma^2$.

We define a new random variable
$$(n - 1)\frac{s^2}{\sigma^2} \sim \chi^2_{n-1} \tag{28}$$
where $\chi^2_{n-1}$ is known as the chi-squared distribution with $n - 1$ degrees of freedom.

Hence
$$P\left(\chi^2_{1-\alpha/2,\,n-1} \le (n - 1)\frac{s^2}{\sigma^2} \le \chi^2_{\alpha/2,\,n-1}\right) = 1 - \alpha$$
$$P\left(\frac{(n - 1)s^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}\right) = 1 - \alpha \tag{29}$$

Example

The weights of a random sample of cereal boxes that are supposed to weigh 1 kg are listed below.
1.05, 1.03, 0.98, 1.0, 0.99, 0.97, 1.01, 0.96
Estimate the variance of the entire population of cereal box weights with 90% confidence.

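Equation (29) can be applied to the cereal-box data as follows. The Python standard library has no chi-squared quantile function, so the two critical values for 7 degrees of freedom are hard-coded from a standard chi-squared table; they are assumptions to check against your own table, and the constant names are illustrative.

```python
from statistics import variance

weights = [1.05, 1.03, 0.98, 1.0, 0.99, 0.97, 1.01, 0.96]
n = len(weights)
s2 = variance(weights)   # sample variance s^2, divisor n - 1

# 90% confidence means alpha = 0.10, with n - 1 = 7 degrees of freedom.
# Assumed table values (upper-tail areas 0.05 and 0.95):
CHI2_UPPER_005_DF7 = 14.067   # chi^2_{alpha/2, n-1}
CHI2_UPPER_095_DF7 = 2.167    # chi^2_{1-alpha/2, n-1}

# Equation (29): the larger critical value gives the lower limit.
lower = (n - 1) * s2 / CHI2_UPPER_005_DF7
upper = (n - 1) * s2 / CHI2_UPPER_095_DF7

print(round(s2, 6), round(lower, 6), round(upper, 6))
```

The interval is not symmetric about $s^2$, because the chi-squared distribution is skewed.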
Hypothesis Testing

Instead of constructing a confidence interval for a parameter of a population with a known distribution, we shall make a definite statement about the parameter and then use the available sample data to test whether the statement is valid.

H.T for the Mean of a Normally Distributed Population

Suppose that $x_1, x_2, \dots, x_n$ is a sample of size $n$ from a population that is normally distributed with an unknown mean $\mu$ and a known (or unknown) variance $\sigma^2$.

Suppose we are interested in testing the null hypothesis
$$H_0 : \mu = \mu_0 \tag{30}$$
against the alternative hypothesis
$$H_1 : \mu \ne \mu_0 \tag{31}$$
where $\mu_0$ is a specified constant.

Decision            H0 is True         H0 is False
Reject H0           Type I Error       Correct Decision
Do not reject H0    Correct Decision   Type II Error

The significance level $\alpha$ test is to reject $H_0$ if
$$|\bar{x} - \mu_0| > z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \tag{32}$$
and accept it otherwise. That is to say:
• Reject $H_0$ if $\frac{\sqrt{n}}{\sigma}\,|\bar{x} - \mu_0| > z_{\alpha/2}$
• Accept $H_0$ if $\frac{\sqrt{n}}{\sigma}\,|\bar{x} - \mu_0| \le z_{\alpha/2}$

One-Sided Hypothesis Tests

When the hypotheses are stated as
$$H_0 : \mu = \mu_0 \quad \text{against} \quad H_1 : \mu > \mu_0 \tag{33}$$
we reject $H_0$ when $\bar{x}$, the point estimate of $\mu$, is much greater than $\mu_0$. Thus:
• Reject $H_0$ if $\frac{\sqrt{n}}{\sigma}\,(\bar{x} - \mu_0) > z_{\alpha}$
• Accept $H_0$ if $\frac{\sqrt{n}}{\sigma}\,(\bar{x} - \mu_0) \le z_{\alpha}$
This decision criterion is called a one-sided critical region.

Summary

H0         H1         Test Statistic (TS)    Significance Level α Test
µ = µ0     µ ≠ µ0     √n (x̄ − µ0) / σ        Reject H0 if |TS| > z_{α/2}
µ ≤ µ0     µ > µ0     √n (x̄ − µ0) / σ        Reject H0 if TS > z_α
µ ≥ µ0     µ < µ0     √n (x̄ − µ0) / σ        Reject H0 if TS < −z_α

Examples

Test the following:
• $H_0 : \mu = 100$ against $H_1 : \mu \ne 100$, with $\sigma = 10$, $n = 100$, $\bar{x} = 100$ and $\alpha = 0.05$
• $H_0 : \mu = 50$ against $H_1 : \mu < 50$, with $\sigma = 15$, $n = 100$, $\bar{x} = 48$ and $\alpha = 0.05$
• $H_0 : \mu = 50$ against $H_1 : \mu > 50$, with $\sigma = 5$, $n = 9$, $\bar{x} = 51$ and $\alpha = 0.03$

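The three decision rules for a known $\sigma$ can be collected into one function. This is a sketch under stated assumptions: the function name `z_test` and the labels "two-sided", "greater" and "less" are chosen for this example, and the critical values come from `statistics.NormalDist.inv_cdf` instead of a printed table.

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu0, sigma, n, alpha, alternative="two-sided"):
    """Return (test statistic, critical value, reject H0?) for known sigma."""
    ts = sqrt(n) * (x_bar - mu0) / sigma
    standard_normal = NormalDist()
    if alternative == "two-sided":        # H1: mu != mu0
        critical = standard_normal.inv_cdf(1 - alpha / 2)
        reject = abs(ts) > critical
    elif alternative == "greater":        # H1: mu > mu0
        critical = standard_normal.inv_cdf(1 - alpha)
        reject = ts > critical
    elif alternative == "less":           # H1: mu < mu0
        critical = standard_normal.inv_cdf(1 - alpha)
        reject = ts < -critical
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return ts, critical, reject

result_1 = z_test(100, 100, 10, 100, 0.05, "two-sided")
result_2 = z_test(48, 50, 15, 100, 0.05, "less")
result_3 = z_test(51, 50, 5, 9, 0.03, "greater")

for ts, critical, reject in (result_1, result_2, result_3):
    print(round(ts, 4), round(critical, 4), "reject H0" if reject else "do not reject H0")
```

In all three cases the test statistic falls short of the critical value, so the data do not provide enough evidence to reject $H_0$ at the stated significance levels.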
Example

A random sample of 18 young adult men (20 - 30 yrs old) was taken. Each person was asked how many minutes of sport he watched on TV daily. The responses are listed below.
64, 50, 48, 65, 74, 66, 37, 45, 68, 65, 58, 55, 52, 63, 59, 57, 74, 65
Test at the 5% significance level whether there is enough statistical evidence to infer that the mean amount of sport watched on TV daily by all young adult men is greater than 50 minutes.

Introduction to Regression

Regression analysis is a technique for developing a mathematical model that describes the relationship between a set of variables. In many situations, there is a single variable $Y$ that depends on a set of other variables $x_1, x_2, \dots, x_r$.

The simplest type of relationship between $Y$ and the input variables $x_1, x_2, \dots, x_r$ is a linear relationship. Thus
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_r x_r \tag{34}$$

However, (34) is almost never attainable. The best that can be obtained is a relationship subject to random error. Thus a linear regression equation is
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_r x_r + \xi \tag{35}$$
with $E[\xi] = 0$. Hence
$$E[Y \mid x] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_r x_r$$

The constants $\beta_i$ for $i = 0, 1, \dots, r$ are called the regression coefficients and are usually estimated from a data set. A regression equation containing a single independent variable (i.e. $r = 1$) is called a simple regression equation, whereas an equation containing many independent variables (i.e. $r > 1$) is called a multiple regression equation.

Least Squares Estimators of Regression Parameters

Consider a simple linear regression equation
$$Y = \beta_0 + \beta_1 x + \xi \tag{36}$$
Then we can rewrite the equation as
$$Y - \beta_0 - \beta_1 x = \xi$$

We define the sum of squared errors as
$$SS = \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 x_i)^2 \tag{37}$$
The least squares method chooses estimators of $\beta_0$ and $\beta_1$ that minimize $SS$.

The least squares method gives the estimates
$$\beta_0 = \bar{y} - \beta_1 \bar{x} \tag{38}$$
$$\beta_1 = \frac{S_{xy}}{S_x^2} \tag{39}$$
where
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$
$$S_{xy} = \frac{1}{n - 1}\left(\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}\right), \qquad S_x^2 = \frac{1}{n - 1}\left(\sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2\right)$$

Introduction to Regression (contd.)

Attempting to analyze the relationship between advertising and sales, the owner of a furniture store recorded the monthly advertising budget ($) and sales ($1,000) for 8 months as follows:

Advert   23    46    60    28    33    25    31    36
Sales    9.6   11.3  12.8  8.9   12.5  12.0  11.4  12.6

How much should the store spend on advertising if a sales value of $50,000 is desired?

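The least squares estimates (38) and (39) can be computed for the furniture-store data with a few lines of Python. This is a sketch, not a model answer; the function name `least_squares` is an assumption for this example. The fitted line is then inverted to find the advertising budget that corresponds to sales of 50 (that is, $50,000 in units of $1,000).

```python
advert = [23, 46, 60, 28, 33, 25, 31, 36]
sales = [9.6, 11.3, 12.8, 8.9, 12.5, 12.0, 11.4, 12.6]

def least_squares(x, y):
    """Return (b0, b1) from equations (38) and (39)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = (sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar) / (n - 1)
    s_xx = (sum(xi ** 2 for xi in x) - n * x_bar ** 2) / (n - 1)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

b0, b1 = least_squares(advert, sales)

# Solve target = b0 + b1 * x for x.
target_sales = 50.0
advert_needed = (target_sales - b0) / b1

print(round(b0, 4), round(b1, 4), round(advert_needed, 1))
```

The required budget lies far outside the observed advertising range of 23 to 60, so the result is an extrapolation and should be read with caution.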
Thank you

References
[1] Navidi, W. Statistics for Engineers and Scientists, fourth ed. McGraw-Hill, 2015.
[2] Ross, S. Introduction to Probability and Statistics for Engineers and Scientists, fourth ed. Elsevier Academic Press, 2009.
