
Department of Distance and Continuing Education

University of Delhi
दूरस्थ एवं सतत् शिक्षा विभाग
दिल्ली विश्वविद्यालय

B.A. (Hons.) Economics
Semester-II
Discipline Specific Course (DSC-6)
Course Credit: 4

INTERMEDIATE STATISTICS FOR ECONOMICS
(Department of Economics)

As per the UGCF-2022 and National Education Policy 2020
Intermediate Statistics for Economics

Editorial Board
Prof. J. Khuntia, V.A.Rama Raju,
Vajala Ravi, Devender

Content Writers
Dr. Pooja Sharma, Taramati, Ashish Kumar Garg

Academic Coordinator
Deekshant Awasthi

© Department of Distance and Continuing Education


ISBN: 978-81-19169-57-3
1st edition: 2023
E-mail: [email protected]
[email protected]

Published by:
Department of Distance and Continuing Education under
the aegis of Campus of Open Learning/School of Open Learning,
University of Delhi, Delhi-110 007

Printed by:
School of Open Learning, University of Delhi


• Corrections/Modifications/Suggestions proposed by Statutory Body, DU/Stakeholder/s in the Self Learning Material (SLM) will be incorporated in the next edition. However, these corrections/modifications/suggestions will be uploaded on the website https://sol.du.ac.in. Any feedback or suggestions can be sent to the email: [email protected]


INTERMEDIATE STATISTICS FOR ECONOMICS


Study Material: Lesson 1-10

TABLE OF CONTENTS

Lesson | Name of Lesson | Content Writer
LESSON 1 | Sample Space, Events, and Probability | Pooja Sharma
LESSON 2 | Sampling Distribution and Central Limit Theorem | Taramati
LESSON 3 | Characteristics of Estimators | Ashish Kumar Garg
LESSON 4 | Methods of Point Estimation | Ashish Kumar Garg
LESSON 5 | Cramer Rao Inequality | Ashish Kumar Garg
LESSON 6 | Interval Estimation | Ashish Kumar Garg
LESSON 7 | Interval Based Distribution | Ashish Kumar Garg
LESSON 8 | Statistical Hypothesis | Ashish Kumar Garg
LESSON 9 | Error in Hypothesis Testing and Power of Test | Ashish Kumar Garg
LESSON 10 | Testing of Equality of Mean and Variance | Ashish Kumar Garg

About Contributors
Contributor's Name Designation
Dr. Pooja Sharma Associate Professor, Daulat Ram College, University of Delhi
Taramati Guest Faculty, Kirori Mal College, University of Delhi
Ashish Kumar Garg Assistant Professor, Ramjas College, University of Delhi


LESSON 1

SAMPLE SPACE, EVENTS, AND PROBABILITY

STRUCTURE

1.1 Learning Objectives


1.2 Introduction
1.3 Sample and Population
1.3.1 Statistical or Random Experiments
1.3.2 Sample Point, Event
1.3.3 Population or Sample Space of an Experiment
1.3.4 Events, Set Theory and Venn Diagrams
1.3.5 De Morgan’s laws
1.4 Probability
1.4.1 Classical Definition of Probability
1.4.2 Relative Definition of Probability (by Von Mises)
1.4.3 Axiomatic Definition of Probability
1.5 Summary
1.6 Glossary
1.7 Answers to In-Text Questions
1.8 Self-Assessment Questions
1.9 References
1.10 Suggested Readings

1.1 LEARNING OBJECTIVES


After reading this lesson, students will be able:
1. To understand the concept of sample space and population and their significance.
2. To comprehend the need for the concept of sample and population in the context of
probability.
3. To understand the concept of probability in the context of random experiments.
4. To visualize the applications of probability in real life and understand the definition
of probability.


5. To be able to differentiate between the sample space, events, sample points, and
random experiments.
6. To get familiarized with the technique of the Venn diagram and its usage in defining
events and types of events.
7. To understand the properties of probability and various operations to comprehend the
working of probabilities.
1.2 INTRODUCTION

This unit introduces the concept of ‘probability’ to the students. The phenomenon of
probability indicates the presence of randomness and the existence of some element of
uncertainty. Whenever we face a situation in which there is more than one possible outcome
that can occur, the concept of probability renders a technique for quantifying the chances or
likelihood associated with every possible outcome. There are several instances that involve
chances and thus the notion of probability is applicable. For example, in political elections,
based on exit polls it is plausible to predict that a certain political party could come into power.
By deploying a database of the previous days and considering various parameters such as
temperature, humidity, pressure, etc., the meteorologists use specific tools or techniques to
predict weather forecasts and determine that there are 60 out of 100 chances that it would rain
today.
Another example from day-to-day life is: 'since it is supposed to rain tomorrow, it is very
likely I will use my raincoat when I go to work'. Similarly, flipping a coin involves a
probability of 0.5 of getting either a head or a tail, and playing with dice involves one chance in six
that the required number will come up. Thus, the concept of probability can be applied
to several interesting events.
Probability is a mathematical term, and the study of probability as a branch of mathematics is
over 300 years old. This chapter enables the students to understand and estimate the likelihood of
various possibilities of events and outcomes. Various elementary concepts used in
comprehending the concept of probability will be discussed and explained, such as Sample,
population, random experiments, Venn diagram, sample points, events, types of events etc.

1.3 SAMPLE AND POPULATION

The discipline of Statistics deals with organizing and summarizing data and with drawing
conclusions based on the information collected in the form of data. An investigation or
experiment typically concerns a well-defined collection of objects, which constitutes what is known as
the 'Population'.
There can be several types of population. One study on a particular type of medicine might
involve the population of all capsules of that medicine produced during a specified period.
Another investigation might involve a population consisting of all students enrolled in
B.A. (Hons.) Economics. If the desired information is available for all the objects in the
population, it is called a 'census'.
A subset of the population is considered a 'sample'. A sample is selected in some prescribed
manner: 'a sample is a means to an end rather than the end itself'. The technique of
generalizing from a sample to a population is gathered within the branch of our discipline called
'Inferential Statistics'.

[Figure 1: Relationship between population and sample, a two-way process. Probability (deductive reasoning) runs from the population to the sample; inferential statistics (inductive reasoning) runs from the sample back to the population.]


It can be visualized from the figure above that a sample and a population can both be deployed
to examine and assess data, a process also called 'inference'. There are two fundamental approaches
to inference: deductive and inductive reasoning.
When a sample is to be derived from a given population, the concept of probability is used to
reason from the population to the sample. This method of inference is called deductive
reasoning. However, when the sample is used to infer something about the population, inferential
statistics is deployed; the technique is referred to as 'inductive reasoning'. Thus, the role of
probability is explicit and well-defined, and it is crucial in the deductive method of
statistical inference or research.
Having understood the difference between a sample and a population, the relationship
between them, and their roles in statistical inference, it is important to comprehend the kinds
of experiments through which data are collected.


1.3.1 Statistical or Random Experiments


Any activity or process whose outcome is subject to uncertainty is considered an experiment.
The word 'experiment' generally suggests carefully controlled testing of a situation or planned testing in
a laboratory. However, in the discipline of statistics, experiments refer to a wider scope of trials,
such as tossing a coin once or several times, selecting a card from a deck, obtaining a
particular blood type from a group of individuals, etc.
Any process of observation or measurement that has more than one possible outcome and for
which there is uncertainty about which outcome will actually materialize is referred to as a
‘random experiment’. For example, tossing a coin, throwing a pair of dice, drawing a card from
the deck of cards.
1.3.2 Sample Point, Event
Each member or outcome of a sample space or population is called a sample point; it is also
called an element of the sample space. Let us consider the example of the toss of a coin,
for which the sample space is S = {H, T}. The number of elements in the sample space or
population is n(S) = 2. Each element of the sample space, here H and T, is known as a sample
point. In general, n(S) denotes the number of sample points in the sample space.
Consider an event B defined as 'Tail appears': B = {T}. The number of
elements in event B is 1, denoted by n(B) = 1.
In a random experiment of the toss of a coin, suppose event A denotes the event that Head
appears: A = {H}. The number of elements in event A is 1, denoted by n(A) = 1.
A ∪ B = S: {H} ∪ {T} = {H, T} = S
Let us consider another example of tossing two fair coins. The sample space or population for
this experiment is given by Sample Space: {HH, HT, TH, TT}
The number of elements in the sample space is 4, denoted by n (S) = 4.
Consider an event B that at least the head appears on one of the coins in the toss of two coins
simultaneously. Event B can be represented as
B= {HH, HT, TH},
The number of elements in event B is 3, represented by n(B) = 3
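To make this counting concrete, here is a minimal Python sketch (an illustrative addition; the names S and B simply mirror the notation above) that enumerates the two-coin sample space and the event 'at least one head appears':

```python
from itertools import product

# Sample space for tossing two fair coins: n(S) = 2**2 = 4
S = [''.join(p) for p in product('HT', repeat=2)]   # ['HH', 'HT', 'TH', 'TT']

# Event B: at least one head appears
B = [s for s in S if 'H' in s]

print("S =", S, " n(S) =", len(S))   # n(S) = 4
print("B =", B, " n(B) =", len(B))   # n(B) = 3
```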
Trial & Events: When an experiment is repeated under essentially identical conditions, it does not
give unique results; it may result in any one of several possible outcomes. Each repetition is called a trial
and the outcomes are called events. For example, throwing a coin once is an experiment, and

getting a Head or Tail is an event. Planting a sapling is a Trial and whether it survives, or dies
is an Event. Sitting for an examination is a Trial and getting grades such as A, B, C, D, and E
are events.
Exhaustive Events: All possible outcomes of an experiment together constitute the collectively exhaustive
events. For example, tossing a coin results in two exhaustive cases, which are Head and Tail.
Planting a sapling leads to two exhaustive cases, which are survival and death. Sitting for an
examination where a student can be awarded only 5 grades results in that many exhaustive
cases.
Favourable Events: All those outcomes of an experiment that serve the
objective of the experimenter are favourable events. For example, a gambler betting
on an Ace in a game of cards, where every draw of a card decides the winner or loser, has 4
favourable events, while betting on a black card gives 13 + 13 = 26 favourable events.
Mutually Exclusive Events: Events are said to be mutually exclusive if the happening of one
event prevents the occurrence of the other events at the same time. Such events are also referred
to as disjoint events since they have no element in common. For example, in an athletics meet
involving 10 challengers, if any one of them wins then none of the remaining 9 can win; the
10 winning events are hence mutually exclusive. Similarly, in a toss of a coin, the occurrences of Head and Tail are
mutually exclusive.
Equally Likely Events: Two events are said to be equally likely if one of them is as likely to
happen as the other. For example, in tossing a fair coin once, the outcomes Head and Tail are
equally likely. In a throw of a six-faced die, all six numbers 1, 2, 3, 4, 5, 6 are equally likely. If
a person suffers a minor heart attack, the outcomes death and survival are not equally likely.
Independent Events: If the happening of one event is not affected by the happening (or not
happening) of another event, such events are said to be independent. For example, in successively
throwing a dart at the dartboard, the events of getting a perfect score in different throws are independent.
However, if a person throws the dart once, practices, and then throws it a second time, the
events of getting a perfect score in the two throws are not independent.
Example 1: Trial: tossing one fair coin.
Events: occurrence of Head, occurrence of Tail.
Exhaustive events: occurrence of Head, occurrence of Tail.
Mutually exclusive events: Head and Tail.
Equally likely events: Head and Tail.


Example 2: Trial: tossing two fair coins.
Events: occurrence of two Heads, occurrence of one Head, occurrence of zero Heads.
Exhaustive events: HH, HT, TH, TT.
Favourable events (for the three events above, respectively): a) HH; b) HT, TH; c) TT.
Mutually exclusive events: occurrence of two Heads and occurrence of two Tails.
Equally likely events: a) getting at least one Head (HH, HT, TH) is equally
likely as getting at least one Tail (TT, TH, HT);
b) getting both Heads is not equally likely as getting at least one Head.
Independent events: getting a Head in the second toss is independent of
getting a Head in the first toss.
1.3.3 Population or Sample Space of an Experiment
The set of all possible outcomes of an experiment is called Population or simply Sample
Space, denoted by S. Let us consider an example of tossing one fair coin. This is an example
of a random experiment since this involves two plausible outcomes. A head or a tail can appear
in a single toss of a fair coin. For such an experiment the total number of outcomes is two,
therefore the sample space is denoted by
The sample space: S = {H, T},
The number of elements in the sample space or population is n(S) = 2
Let us consider another example of tossing two fair coins. Either both tosses result in a Head,
or both result in a Tail, or the first toss results in a Head while the second results in a Tail, or
the first results in a Tail and the second in a Head. The sample space or population for
this experiment is therefore given by


Sample Space: {HH, HT, TH, TT}


The number of elements in the sample space or population is 4, n (S) = 4.
Consider another example of rolling a die,
Sample space S = {1,2,3,4,5,6},
The number of elements in the sample space is n (S) = 6
If the same die is rolled twice, the number of sample points is n(S) = 6² = 36;
if rolled thrice, n(S) = 6³ = 216.

A CASE STUDY

Consider again the random experiment of rolling a die, whose sample space is
given by S = {1, 2, 3, 4, 5, 6}.
The number of elements in the sample space is 6, denoted by n (S) = 6,
Let event E be an event that reflects even numbers that appear on dice, as represented by
E= {2, 4, 6},
The number of elements in event E is 3, represented by n(E)=3
There are several varieties of events as described in the next section.

IN-TEXT QUESTIONS

1. Events are said to be _____________ if the occurrence of one event prevents the
occurrence of another event at the same time.
2. If event A represents an event that at least a head appears, and event B represents an
event that only the tail appears, events A and B are equally likely. (True/False)
3. In the occurrence of the event {Head} in a single throw of a coin, the occurrence of
the event {Tail} is disjoint. The two events are called
a) Mutually exclusive b) Equally likely c) Both
4. In an experiment consisting of tossing two coins, if event A represents an Event that at
least a Head occurs and event B represents that at least a Tail occurs, then
a) Events A and B are equally likely (True/False)
b) Events A and B are mutually exclusive (True/False)
c) Events A and B together form an exhaustive set (True/False)

1.3.4 Events, Set theory and Venn Diagrams


An event can be considered a set; therefore the relationships and results from elementary set
theory can be used to study the events of any random experiment. Some of the fundamental
operations of set theory can be applied to events as follows.
1. The complement of an event A, denoted by A′, is the set of all outcomes in the
sample space S that are not contained in A.
2. The union of the two events A and B is denoted by A ∪ B and read as "A or B". The union of
two events includes outcomes for which both A and B occur as well as outcomes
for which exactly one of them occurs; in other words, all outcomes in at least one of the events.
3. The intersection of the two events, A and B, denoted by A ∩ B is read as “A and B”.
The intersection of two events indicates an event consisting of all outcomes that are in
both A and B.
4. A null event is an event consisting of no outcomes whatsoever and is denoted by ∅.
Suppose there are two events A and B, and it is given that A ∩ B = ∅; then A and B are
said to be mutually exclusive or disjoint events.
1.3.5 De Morgan’s laws
a. The complement of the union of events A and B is equal to the intersection of the
complement of A and the complement of B.
(A ∪ B)′ = A′ ∩ B′

b. The complement of the intersection of event A and B is equal to the union of the
complement of A and the complement of B.
(A ∩ B)′ = A′ ∪ B′
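Both laws can be checked mechanically with elementary set operations. The following minimal Python sketch (an illustrative addition; the particular events on a die are arbitrary choices) verifies the two identities:

```python
# Checking De Morgan's laws on the sample space of a single die throw.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}     # event: an even number appears
B = {1, 2, 3}     # event: a number less than 4 appears

def complement(E):
    return S - E

# (A ∪ B)' = A' ∩ B'
assert complement(A | B) == complement(A) & complement(B)
# (A ∩ B)' = A' ∪ B'
assert complement(A & B) == complement(A) | complement(B)
print("Both De Morgan identities hold for these events.")
```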
The events can be represented by using the Venn diagram as shown in the diagrams below.

[Fig 1: Population or sample space. Fig 2: Events A and B are disjoint.]


[Fig 3: Events A and B are not disjoint; the overlap is A ∩ B.]


All elements in the sample space belong to the rectangle, which represents the entire population,
as shown in Fig 1. Event A is represented by the oval in orange colour and event B by the
oval in blue inside the rectangle; the rectangle is the population or sample space, and events A
and B are subsets of the sample space. In Fig 2, events A and B have nothing in common; such
events are referred to as disjoint, or mutually exclusive, events. In Fig 3, events A and B have
common elements, and therefore are not disjoint.

IN-TEXT QUESTIONS

5. Consider an experiment in which each of three vehicles taking a particular freeway
exit turns left (L) or right (R) at the end of the exit ramp. Outline the sample space and
events.
6. In a single throw of a die, the two events E1 and E2 are mutually exclusive, where E1 is the event
consisting of the numbers less than 3 and E2 is the event consisting of the numbers greater than 4.
(True/False)
7. If two events have some common elements, the two events are not ____________.

1.4 PROBABILITY

In the realm of random experiments, the key objective is to assign to every event A a number
P(A), called the probability of event A, which gives a unique measure of the chance that the
event will occur.

In other words, the probability is the chance of happening or occurrence of an event such as it
might rain today, team X will probably win today, or I may win the lottery. Largely, probability
is a measure of uncertainty.
1.4.1 Classical Definition of Probability
It is also called the a priori or mathematical definition of probability. The probabilities are derived
from purely deductive reasoning: one does not need to throw a coin to state that the
probability of obtaining a head, or a tail, is ½. However, there are cases where the possibilities that
arise cannot be regarded as equally likely, for example, the probability of a recession next
year, or the probability of a particular value of GDP next year. Similarly, the possibilities of whether it will rain,
or of the outcome of an election, are not equally likely.
If an experiment results in n mutually exclusive, exhaustive and equally likely outcomes, of which
m are favourable to event A, then
P(A) = m/n = (number of outcomes favourable to A)/(total number of outcomes)
= (favourable number of events)/(exhaustive number of events) = n(A)/n(S)
In a single throw of a die, the total number of outcomes is n = 6; all are mutually
exclusive and equally likely.
1.4.2 Relative Definition of Probability (by Von Mises)
If a trial is repeated a large number of times under essentially homogeneous and identical
conditions, then the limiting value of the relative frequency, the ratio of the number of
occurrences of the event to the total number of trials, is called the probability of happening of the event:
P(A) = lim(n→∞) m/n,
where m is the number of times event A occurs in n trials.
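A small simulation illustrates this definition: the relative frequency m/n of heads in n tosses of a fair coin settles near the classical value 0.5 as n grows. The Python sketch below is an illustrative addition; the seed and the sample sizes are arbitrary choices:

```python
import random

# Relative frequency m/n of heads in n simulated tosses of a fair coin.
# As n grows, m/n approaches the classical probability 0.5.
random.seed(1)
for n in (10, 100, 10_000, 1_000_000):
    m = sum(random.random() < 0.5 for _ in range(n))
    print(f"n = {n:>9}:  m/n = {m / n:.4f}")
```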

IN-TEXT QUESTIONS

8. In a toss of two coins simultaneously, find the probability P(E) of getting exactly 2 heads,
using P(E) = number of favourable outcomes / total number of outcomes.
9. In a toss of 3 coins simultaneously, find the probability of getting exactly two heads.
10. What is the probability of getting at least 1 head when two coins are tossed
simultaneously?
11. Find the probability of getting at most 2 tails when three coins are tossed simultaneously.


12. Find the probability of getting at least 2 heads when three coins are tossed simultaneously.
13. Find the probability of getting a greater number of tails than heads when three coins are tossed
simultaneously.
1.4.3 Axiomatic Definition of Probability
The axiomatic approach to probability was provided by the Russian mathematician A. N.
Kolmogorov and includes both of the above definitions. In order to ensure that the assignment
of values P(A) to events in the sample space S is consistent with the
intuitive notion of probability, all assignments of probability values P(A) must satisfy the
following properties or axioms.
1. For any event A, the probability of event A, given by P(A), is non-negative: P(A) ≥ 0. In
other words, the probability that event A will occur can either be zero or some positive
number; the probability of event A can never be negative.
The Axiom 1 reflects the intuitive notion that the chance of A occurring should be non-
negative and is known as the Axiom of non-negativity.
2. The probability of the entire sample space is 1, that is P(S) = 1. In other words, the
probability that the entire sample space will occur is 100 percent, which means it will
surely occur. This is known as the Axiom of Certainty.
The sample space by definition is the event that must occur when the experiment is
performed. The sample space S contains all possible outcomes, therefore the maximum
possible probability is assigned to sample space S.
3. If A₁, A₂, A₃, … is an infinite collection of disjoint events, then
P(A₁ ∪ A₂ ∪ A₃ ∪ …) = Σ(i=1 to ∞) P(Aᵢ)

This indicates that the probability of the union of all disjoint events belonging to the sample
space sums the chances of all individual events.
The third axiom formalizes the idea that if we want the probability that at least one of a number
of events will occur, and no two of the events can occur simultaneously, then the chance of at
least one occurring is the sum of the chances of the individual events. Since the collection of
disjoint events may be infinite, this is known as the axiom of countable additivity.
4. The probability of an event always lies between 0 and 1:
0 ≤ P(A) ≤ 1,
P(A) = 0 means event A will not occur;
P(A) = 1 means event A will certainly occur.

5. Let ∅ be the null event, the event containing no outcomes whatsoever. This property
follows from Axiom 3 applied to a collection of disjoint events.
Therefore, P(∅) = 0: the probability of the null event is zero.
6. If A, B, C, … are mutually exclusive events, the probability that any one of them will
occur is equal to the sum of the probabilities of their individual occurrences:
P(A + B + C + …) = P(A ∪ B ∪ C ∪ …) = P(A) + P(B) + P(C) + …
7. If A, B, C, … are a mutually exclusive and collectively exhaustive set of events, the
sum of the probabilities of their individual occurrences is 1. If A, B, C, …
are any events, they are said to be statistically independent if the probability of their
occurring together is equal to the product of their individual probabilities. P(A ∩ B ∩ C)
is the probability of events A, B, and C occurring together, jointly or simultaneously,
also referred to as the joint probability.
P(A), P(B), and P(C) are called unconditional, marginal, or individual probabilities.
8. If events A and B are not mutually exclusive, then
P(A + B) = P(A ∪ B) = P(A) + P(B) − P(A ∩ B),
where P(A ∩ B), also written P(AB), is the joint probability that the two events occur
simultaneously. If A and B are mutually exclusive, then
P(A ∩ B) = P(∅) = 0
For every event A, there is an event A′, called the complement of A, for which
P(A + A′) = P(A ∪ A′) = P(S) = 1
P(AA′) = P(A ∩ A′) = P(∅) = 0
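Property 8, the addition rule, can be verified by direct counting on a small sample space. The Python sketch below is an illustrative addition; the events 'even sum' and 'sum at least 10' for two dice are arbitrary choices, and exact fractions are used to avoid rounding:

```python
from itertools import product
from fractions import Fraction

# Sample space for throwing two dice: 36 equally likely outcomes.
S = set(product(range(1, 7), repeat=2))

A = {s for s in S if sum(s) % 2 == 0}   # event: the sum is even
B = {s for s in S if sum(s) >= 10}      # event: the sum is at least 10

def P(E):
    return Fraction(len(E), len(S))

# Addition rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
assert P(A | B) == P(A) + P(B) - P(A & B)
print(P(A), P(B), P(A & B), P(A | B))   # 1/2 1/6 1/9 5/9
```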

IN-TEXT QUESTIONS

14. An unbiased die is thrown. What is the probability of getting


(i) a multiple of 3
(ii) a number less than 5
(iii) an even prime number
(iv) a prime number
(v) a factor of 6
15. A die is thrown once. Find the probability of getting
a) An odd number

b) A multiple of 3
c) A factor of 5
16. Two dice are thrown together. Find the probability of getting
a) An even number on both
b) Sum as a perfect square
c) Different numbers on both
d) A total of at least 10
e) Sum as a multiple of 3
f) A multiple of 2 on one and a multiple of 3 on other
g) Sum as an even number
1.5 SUMMARY

This lesson familiarized the students with the basic concepts of sample space and population,
along with their significance. The notion of probability was introduced with the help of random
experiments, and various applications of probability in real life were presented in the chapter.
Certain important concepts related to probability, such as sample space, events, sample points, and
random experiments, are described in the chapter. The basic differences between sample,
population, sample points, and events have been emphasized. The types of events, such as
disjoint (mutually exclusive), collectively exhaustive, equally likely, and independent events, have been
explained. Further, the technique of the Venn diagram is also presented in the chapter. The notion of probability
was introduced using the classical and relative-frequency definitions, and the properties of probability
were also discussed in the chapter.

1.6 GLOSSARY

1. Sample: A subset of the population, selected in some prescribed manner; "a sample is a means to an end rather than the end itself".


2. Population: An investigation or experiment that results in a well-defined collection
of objects, constitutes what is known as ‘Population’.
3. Deductive Reasoning: When a sample is derived from the given population, then the
concept of probability is used to infer anything regarding the population. This method
of inference is called deductive reasoning.
4. Inductive Reasoning: When the sample is used to infer something about the population,
inferential statistics is deployed for inferring the population. The technique is referred
to as 'inductive reasoning'.

5. Random Experiment: Any process of observation or measurement that has more than
one possible outcome and for which there is uncertainty about which outcome will
actually materialize. Such an experiment is referred to as ‘random experiment’.
6. Sample Point or Event: Each member or outcome of sample space or population is
called Sample Point. It is also called an element of sample space.
7. Mutually Exclusive: Events are said to be mutually exclusive if the occurrence of one
event prevents the occurrence of another event at the same time. Such events are also
referred to as disjoint events since they have no element in common.
8. Equally Likely: Two events are said to be equally likely if one event is as likely to
occur as the other.
9. Collectively Exhaustive: The events are collectively exhaustive if the events exhaust
all possible outcomes of an experiment.
10. De Morgan’s Law: The complement of the union of two sets A and B is equal to the
intersection of the complement of the sets A and B. This is De Morgan’s first law.
1.7 ANSWERS TO IN-TEXT QUESTIONS
1. Mutually Exclusive
2. False
3. Both
4. a) True: P(A) = P(B) = ¾. b) False: A and B have the outcomes HT and TH in common. c) True: A ∪ B = S.
5. The sample space S: {LLL, RLL, LRL, LLR, LRR, RLR, RRL, RRR}
The event that exactly one of the three vehicles turns right: A = {RLL, LRL, LLR}
The event that at most one of the vehicles turns right: B = {LLL, RLL, LRL, LLR}
The event that all three vehicles turn in the same direction: C = {LLL, RRR}
6. E1 = {1, 2}, E2 = {5, 6}. The two events are mutually exclusive. True
7. Disjoint
8. ¼
9. 3/8
10. ¾
11. 7/8
12. ½


13. ½
14. Total number of possible outcomes = n(S) = 6
(i) a multiple of 3
Number of favorable outcomes = 2 {3 and 6}
Hence P (getting multiple of 3) = 2/6 = 1/3
ii) a number less than 5
Number of favorable outcomes = 4 {1, 2, 3, 4}
Hence, P (getting number less than 5) = 4/6 = 2/3
iii) an even prime number
Number of favorable outcomes = 1 {2}
Hence, P (getting an even prime number) = 1/6
iv) a prime number
Number of favorable outcomes = 3 {2,3,5}
Hence the P (getting a prime number) = 3/6 = 1/2
v) a factor of 6
Number of favorable outcomes= 4 {1, 2, 3, 6}
Hence, P (getting a factor of 6) = 4/6 = 2/3
15. a) 1/2 b) 1/3 c) 1/3
16. a) 1/4 b) 7/36 c) 5/6 d) 1/6 e) 1/3 f) 11/36 g) ½

1.8 SELF-ASSESSMENT QUESTIONS

1. Two six-faced dice are rolled together (equivalently, one die is rolled twice). Show that the
total number of possible outcomes is 36.
2. (i) Prove that the probability of the null event is zero, P(∅) = 0.
(ii) Prove that for any two events A and B,
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)


1.9 REFERENCES

• Devore, J. (2012). Probability and statistics for engineers, 8th ed. Cengage Learning.
• John A. Rice (2007). Mathematical Statistics and Data Analysis, 3rd ed. Thomson
Brooks/Cole
• Larsen, R., Marx, M. (2011). An introduction to mathematical statistics and its
applications. Prentice Hall.
• Miller, I., Miller, M. (2017). John E. Freund's mathematical statistics with applications, 8th ed. Pearson.
• Kantarelis, D. and Asadoorian, M. O. (2009). Essentials of inferential statistics, 5th ed. University Press of America.
• Hogg, R., Tanis, E., Zimmerman, D. (2021). Probability and statistical inference, 10th ed. Pearson.
1.10 SUGGESTED READINGS

• Rice, J. A. (2006). Mathematical statistics and data analysis. Cengage Learning.

• Larsen, R. J., & Marx, M. L. (2005). An introduction to mathematical statistics. Prentice Hall.


LESSON 2

SAMPLING DISTRIBUTION AND CENTRAL LIMIT THEOREM

STRUCTURE
2.1 Learning Objectives
2.2 Introduction
2.3 Central Limit Theorem
2.4 Sampling Distribution
2.5 Summary
2.6 Answer to Intext Questions
2.7 Self Assessment Questions
2.8 References
2.1 LEARNING OBJECTIVES
After reading this chapter you will be familiar with the following topics.
1. Central limit theorem
2. Distribution of x̄ and ∑xᵢ
3. Sampling distributions of x̄ and S²
2.2 INTRODUCTION
This chapter will familiarize you with the most celebrated statistical theorem, the central limit
theorem. It will also help you learn the importance of large sample sizes: if the sample
size is large enough then, irrespective of the distribution of the population, the distributions of x̄ and
∑xᵢ tend to the normal distribution. Similarly, it is important to understand the sampling distribution, as we
can estimate the population parameters from the samples. Different methods of drawing
samples, i.e. with replacement and without replacement, are discussed in the chapter. The
mean and standard deviation of the sampling distribution are also explained in the chapter.
2.3 CENTRAL LIMIT THEOREM
According to the central limit theorem, when the sample size n is large enough, the distribution
of the sample mean tends to the normal distribution irrespective of the distribution of the population. Let us
assume random samples, each of size n, are obtained by random sampling from the population;
if n is large enough, then the sample mean follows a normal distribution with mean
µ and variance σ²/n. As the sample size increases, the precision of the estimates increases.


For large sample size n, if X₁, X₂, X₃, …, Xₙ are independently and identically distributed random samples with
identical mean µ and variance σ², then we can find the distributions of x̄ and ∑Xᵢ.
2.3.1 Distribution of x̄
To find the distribution of x̄ we must know the mean and variance of x̄. For large sample size
n, if X₁, X₂, X₃, …, Xₙ are independently and identically distributed random samples with
identical mean µ and variance σ², then the mean and variance of x̄ are as follows.
x̄ = (sum of observations)/(total number of observations) = (X₁ + X₂ + X₃ + … + Xₙ)/n
Mean of x̄ = E(x̄) = E[(X₁ + X₂ + X₃ + … + Xₙ)/n]
We use the rules E(aX) = aE(X), where a is a constant and X is a random variable, and E(X + Y) = E(X) + E(Y). Therefore,
E(x̄) = [E(X₁) + E(X₂) + E(X₃) + … + E(Xₙ)]/n
Since all the random samples X₁, X₂, X₃, …, Xₙ are independently and identically distributed
random variables with mean µ, each of E(X₁), E(X₂), E(X₃), …, E(Xₙ) equals µ. There are n samples,
each with mean µ, so E(X₁) + E(X₂) + E(X₃) + … + E(Xₙ) = nµ.
E(x̄) = (µ + µ + µ + … + µ)/n = nµ/n = µ
So we have proved that the mean of x̄ is E(x̄) = µ.
Variance of x̄ = V(x̄) = V[(X₁ + X₂ + X₃ + … + Xₙ)/n]
Using V(aX) = a²V(X) and the independence of the Xᵢ:
V(x̄) = V(X₁/n) + V(X₂/n) + V(X₃/n) + … + V(Xₙ/n)
V(x̄) = (1/n²)V(X₁) + (1/n²)V(X₂) + (1/n²)V(X₃) + … + (1/n²)V(Xₙ)
V(x̄) = (1/n²)[V(X₁) + V(X₂) + V(X₃) + … + V(Xₙ)]
Since V(X₁), V(X₂), V(X₃), …, V(Xₙ) each equal σ² and there are n samples, the total
σ² + σ² + … + σ² equals nσ²:
V(x̄) = (1/n²)(σ² + σ² + … + σ²) = (1/n²) · nσ² = σ²/n
Therefore, the mean and variance of x̄ are µ and σ²/n respectively. Standardizing the distribution of
x̄ with this mean and variance, we get
Z = (x̄ − µ)/√(σ²/n) = (x̄ − µ)/(σ/√n) ~ N(0, 1)

Example: The birth rate in a country is believed to be 1.57 per woman. Assume the population
standard deviation is 0.4. If a random sample of 160 women is selected, what is the probability
that the sample mean will fall between 1.52 and 1.62?
Solution: X: birth rate per woman; µ = 1.57, σ = 0.4, n = 160, so σ/√n = 0.4/√160 ≈ 0.0316.
P(1.52 < x̄ < 1.62) = P[(1.52 − 1.57)/0.0316 < (x̄ − µ)/(σ/√n) < (1.62 − 1.57)/0.0316]
= P(−1.58 < Z < 1.58) = P(Z < 1.58) − P(Z < −1.58)
= 0.9429 − 0.0571 = 0.8858
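The same computation can be carried out numerically. The Python sketch below is an illustrative addition: it evaluates the normal-approximation probability through the error function, and then checks it by simulating from a skewed (shifted exponential) population chosen, arbitrarily, to have the stated mean and standard deviation, which is exactly the situation the central limit theorem covers:

```python
import math
import random

def phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n = 1.57, 0.4, 160
se = sigma / math.sqrt(n)                    # standard error of the mean

# Normal-approximation probability P(1.52 < xbar < 1.62)
p = phi((1.62 - mu) / se) - phi((1.52 - mu) / se)
print(f"CLT probability       ~ {p:.4f}")    # about 0.8858

# Simulation check: skewed population with mean 1.57 and sd 0.4
random.seed(0)
trials = 50_000
hits = sum(
    1.52 < sum(sigma * random.expovariate(1.0) + (mu - sigma)
               for _ in range(n)) / n < 1.62
    for _ in range(trials)
)
print(f"Simulated probability ~ {hits / trials:.4f}")
```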
2.3.2 Distribution of ∑Xᵢ
To find the distribution of ∑Xᵢ, we again use the rule E(X + Y) = E(X) + E(Y).
Since X₁, X₂, X₃, …, Xₙ ~ N(µ, σ²), we have E(X₁) = µ, E(X₂) = µ, and so on.
∑Xᵢ = X₁ + X₂ + … + Xₙ
E(∑Xᵢ) = E(X₁ + X₂ + … + Xₙ) = E(X₁) + E(X₂) + … + E(Xₙ) = µ + µ + … + µ
Adding µ n times gives the result
E(∑Xᵢ) = nµ     …(1)
To find the variance, V(∑Xᵢ) = V(X₁ + X₂ + X₃ + … + Xₙ). Since the Xᵢ are independent with
V(X₁) = σ², V(X₂) = σ², …, V(Xₙ) = σ²,
V(∑Xᵢ) = V(X₁) + V(X₂) + V(X₃) + … + V(Xₙ) = σ² + σ² + … + σ²
Adding σ² n times gives the result
V(∑Xᵢ) = nσ²     …(2)
So ∑Xᵢ ~ N(nµ, nσ²) and
z = (∑Xᵢ − nµ)/√(nσ²) ~ N(0, 1)

Example: The mean cost of repairing a car after an accident is $6,200, with a standard deviation of $650.
A study was carried out on 65 vehicles that had been involved in accidents. Calculate the
probability that the total repair bill for the vehicles exceeded $400,000.
Solution: By the central limit theorem, the total repair bill T for the 65 cars is approximately normally
distributed with mean 65 × 6200 = 403,000 and variance 65 × 650², so the standard deviation is
650√65 ≈ 5,240.
P(T > 400,000) = P(Z > (400,000 − 403,000)/5,240)
= P(Z > −0.572) = P(Z < 0.572) = 0.7163
IN-TEXT QUESTIONS
1. Consider a random sample of size 30 taken from a normal distribution with
mean 60 and variance 25. Let the sample mean be denoted by X̄. Calculate the
probability that X̄ assumes a value greater than 62.
2. In a large population the distribution of a variable has mean 165 and standard
deviation 25 units. If a random sample of size 35 is chosen, find the approximate
probability that the sample mean lies between 162 and 170.
2.4 SAMPLING DISTRIBUTION
From a population of N observations, k samples of size n can be drawn, with or without replacement.
The sample statistic computed for each sample, together with the probability of obtaining it, gives the
sampling distribution of that statistic. The population parameters can be estimated from the sampling
distribution. This gives the sampling distribution of the random variable.
If the samples are drawn with replacement from the population with sample size n, then the total
number of samples obtained will be k = Nⁿ.
If the samples are drawn without replacement from the population with sample size n, then the total
number of samples obtained will be k = NCn, the number of combinations of N items taken n at a time.
If k samples are drawn with replacement, then the distribution of x̄ can be obtained.
Steps required to construct the sampling distribution of x̄:
1. Draw the k samples from the population.
2. Obtain the sample statistic x̄ from each sample.
3. Determine the probability of occurrence of each x̄.
4. Represent each x̄ with its associated probability in tabular format.
5. Hence, the sampling distribution of x̄ is obtained.
Example: XYZ insurance company deals in term life insurance policies and sells three tenures
of insurance policies: 25 years, 40 years and 65 years. 20% of all purchasers select the 25-year
policy, 50% select the 40-year policy, and the remaining 30% choose the 65-year policy. Let x₁ and x₂
denote the insurance tenures selected by two independently selected policy holders. Derive the
sampling distribution of the sample mean x̄.
Solution: Since no information is given in the question regarding the selection procedure
of the samples, we will assume a with-replacement random sample.
The total number of samples is Nⁿ = 3² = 9, since there are three types of policies available and we
have to select samples of size 2.
So: (25,25), (25,40), (25,65), (40,40), (40,25), (40,65), (65,65), (65,25), (65,40).
By obtaining the mean of each sample we get:
Samples Mean of samples probability
(25,25) 25 0.2*0.2
(25,40) 32.5 0.2*0.5
(25,65) 45 0.2*0.3
(40,40) 40 0.5*0.5
(40,25) 32.5 0.5*0.2
(40,65) 52.5 0.5*0.3
(65,65) 65 0.3*0.3
(65,25) 45 0.3*0.2
(65,40) 52.5 0.3*0.5


Some values of x̄ are repeated in the distribution; by adding the respective probabilities we obtain
the sampling distribution as follows.
x̄ P(x̄)
25 0.04
32.5 0.20
40 0.25
45 0.12
52.5 0.30
65 0.09

E(x̄) = Σ x̄ · P(x̄)
E(x̄) = 25(0.04) + 32.5(0.20) + 40(0.25) + 45(0.12) + 52.5(0.30) + 65(0.09)
E(x̄) = 44.5
The mean of x̄ is µ.
The standard deviation of the sampling distribution of x̄ is known as the standard error of the
distribution, represented by σ_x̄, and is calculated as σ/√n.
Similarly, the sampling distribution of S² can be obtained from the samples.
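The tabulation above can be reproduced mechanically. The Python sketch below is an illustrative addition (the variable names are ours): it enumerates all Nⁿ = 9 with-replacement samples of size 2, accumulates P(x̄) for each distinct sample mean, and confirms that E(x̄) equals the population mean 44.5:

```python
from itertools import product
from collections import defaultdict

# Policy tenures and selection probabilities from the example above.
pop = {25: 0.2, 40: 0.5, 65: 0.3}

# Enumerate all N**n = 3**2 = 9 with-replacement samples of size 2 and
# accumulate the probability of each distinct sample mean.
dist = defaultdict(float)
for x1, x2 in product(pop, repeat=2):
    dist[(x1 + x2) / 2] += pop[x1] * pop[x2]

for xbar in sorted(dist):
    print(f"xbar = {xbar:5.1f}   P = {dist[xbar]:.2f}")

# E(xbar) equals the population mean, 44.5
print("E(xbar) =", round(sum(x * p for x, p in dist.items()), 2))
```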
IN-TEXT QUESTION
3. For the XYZ insurance company example above, with policy tenures of 25, 40 and 65 years
chosen by 20%, 50% and 30% of purchasers respectively, let x₁ and x₂ denote the tenures
selected by two independently selected policy holders. Derive the sampling distribution of the
sample variance S².
2.5 SUMMARY
This chapter has familiarized you with the basic concepts related to the sampling distribution and the
central limit theorem, including the distributions of x̄ and ∑xᵢ. The concept of a sampling distribution
with and without replacement has been explained, and the mean of x̄ and the standard error of x̄
have been discussed in the chapter. We have tried to explain the concepts with examples to create a
better understanding; this will help you think about economic theories in a much better way.
2.6 ANSWERS TO IN-TEXT QUESTIONS
1. n = 30, µ = 60, σ² = 25
X̄ ~ N(µ, σ²/n), i.e. X̄ ~ N(60, 25/30)
P(X̄ > 62) = P[ (X̄ − µ)/(σ/√n) > (62 − 60)/(5/√30) ]
= P(Z > 2.19)
= 0.5 − P(0 < Z < 2.19)
= 0.5 − 0.4857 = 0.0143
2. X ~ N(165, 25²), where the sample size is n = 35.
We have to find the distribution of the sample mean:
X̄ ~ N(µ, σ²/n), with µ = 165 and σ² = 25², so X̄ ~ N(165, 25²/35)
Note that 25²/35, i.e. σ²/n, is the variance; the standard deviation is σ/√n = 25/√35.
To find P(162 < X̄ < 170):
P(162 < X̄ < 170) = P[ (162 − 165)/(25/√35) < Z < (170 − 165)/(25/√35) ]
= P(−0.70 < Z < 1.18)
= P(0 < Z < 1.18) + P(0 < Z < 0.70), since P(−0.70 < Z < 0) = P(0 < Z < 0.70)
= 0.3810 + 0.2580
= 0.639
As the sample size increases, the distribution of X̄ tends to the normal distribution. For a
distribution to be well approximated by the normal distribution, the sample size must be at least 30, in
other words n ≥ 30. As the sample size increases, even a discrete distribution approximates the
normal distribution.
3. Solution: Since no information is given in the question regarding the selection
procedure of the samples, we will assume a with-replacement random sample.
The total number of samples is Nⁿ = 3² = 9, since there are three types of policies available and we
have to select samples of size 2.
So, (25,25), (25,40),(25,65),(40,40),(40,25),(40,65),(65,65),(65,25),(65,40).
Samples Variance of samples probability
(25,25) 0 0.2*0.2
(25,40) 112.5 0.2*0.5
(25,65) (25-45)^2+(65-45)^2=800 0.2*0.3
(40,40) 0 0.5*0.5
(40,25) 112.5 0.5*0.2
(40,65) (40-52.5)^2+(65-52.5)^2=312.5 0.5*0.3
(65,65) 0 0.3*0.3
(65,25) 800 0.3*0.2
(65,40) 312.5 0.3*0.5


So, the sampling distribution of S² is:

S² P(S²)
0 0.38
112.5 0.20
312.5 0.30
800 0.12

2.7 SELF ASSESSMENT QUESTIONS


1. It is assumed that the number of claims arriving at an insurance company per working
day has a mean of 50 and standard deviation of 15. A survey was conducted over 50
working days. Calculate the probability that the sample mean number of claims arriving
per working day was less than 40.
2. Let Y denote the engine power of a new car that is launched in three models that differ
in power. A market survey shows that 30% and 40% of customers want to buy the car with
5 bhp power and 2 bhp respectively. The rest prefer the 4 bhp model. Derive the sampling
distribution of the average engine power using a sample size of 2, if the samples are obtained
through random sampling.
3. Let X denote the storage capacity of a new pen drive that is launched in three models
that differ in storage space. A market survey shows that 20% of customers buy the drive
with 16GB storage space, while 30% buy the drive with 32GB space. The rest prefer
the model with 64GB space. Derive the sampling distribution of the average storage space
in a pen drive using a sample size of 2 with random sampling.
2.8 REFERENCES
• Devore, J. (2012). Probability and statistics for engineers, 8th ed. Cengage Learning.
• John A. Rice (2007). Mathematical Statistics and Data Analysis, 3rd ed. Thomson
Brooks/Cole
• Larsen, R., Marx, M. (2011). An introduction to mathematical statistics and its
applications. Prentice Hall.
• Miller, I., Miller, M. (2017). John E. Freund's mathematical statistics with applications, 8th ed. Pearson.
• Kantarelis, D. and Asadoorian, M. O. (2009). Essentials of inferential statistics, 5th ed. University Press of America.


LESSON 3
CHARACTERISTICS OF ESTIMATORS

STRUCTURE
3.1 Learning Objectives
3.2 Introduction
3.3 Characteristics of Estimators
3.3.1 Unbiasedness
3.3.2 Consistency
3.3.3 Efficiency
3.3.4 Sufficiency
3.4 In-Text Questions
3.5 Summary
3.6 Glossary
3.7 Answer to in-text questions
3.8 References
3.9 Suggested Readings
3.1 LEARNING OBJECTIVES
One of the main objectives of Statistics is to draw inferences about a population from the
analysis of a sample drawn from that population. Two important problems in statistical
inference are
(i) Estimation
(ii) Testing of Hypothesis.
The theory of estimation was founded by Prof. R. A. Fisher in a series of fundamental papers
around 1930.
3.2 INTRODUCTION
Let us consider a random variable X with p.d.f. f(x, θ). In most common applications, though
not always, the functional form of the population distribution is assumed to be known except
for the value of some unknown parameter(s) θ, which may take any value on a set Θ. This is
expressed by writing the p.d.f. in the form f(x, θ), θ ∈ Θ. The set Θ, which is the set of all
possible values of θ, is called the parameter space. Such a situation gives rise not to one
probability distribution but to a family of probability distributions, which we write as
{f(x, θ), θ ∈ Θ}. For example, if X ~ N(µ, σ²), then the parameter space is
Θ = {(µ, σ²): −∞ < µ < ∞; 0 < σ < ∞}.
In particular, for σ² = 1, the family of probability distributions is given by:
{N(µ, 1); µ ∈ Θ}, where Θ = {µ: −∞ < µ < ∞}
In the following discussion we shall consider a general family of distributions:
{f(x; θ₁, θ₂, …, θₖ): θᵢ ∈ Θ, i = 1, 2, …, k}.
Let us consider a random sample x₁, x₂, …, xₙ of size n from a population with probability
function f(x; θ₁, θ₂, …, θₖ), where θ₁, θ₂, …, θₖ are the unknown population parameters. There
will then always be an infinite number of functions of the sample values, called statistics, which
may be proposed as estimates of one or more of the parameters.
Evidently, the best estimate would be one that falls nearest to the true value of the parameter
to be estimated. In other words, the statistic whose distribution concentrates as closely as
possible near the true value of the parameter may be regarded as the best estimate. Hence the
basic problem of estimation in the above case can be formulated as follows: we wish to
determine functions of the sample observations
T₁ = θ̂₁(x₁, x₂, …, xₙ), T₂ = θ̂₂(x₁, x₂, …, xₙ), …, Tₖ = θ̂ₖ(x₁, x₂, …, xₙ),
such that their distribution is concentrated as closely as possible near the true value of the
parameter. The estimating functions are then referred to as estimators.
3.3 CHARACTERISTICS OF ESTIMATORS
The following are some of the criteria that should be satisfied by a good estimator.
(i) Unbiasedness
(ii) Consistency
(iii) Efficiency
(iv) Sufficiency.
We shall now briefly explain these terms one by one.
3.3.1 Unbiasedness:
An estimator Tₙ = T(x₁, x₂, …, xₙ) is said to be an unbiased estimator of γ(θ) if
E(Tₙ) = γ(θ), for all θ ∈ Θ.
We have seen that in sampling from a population with mean 𝜇 and variance 𝜎 2


E(x̄) = µ and E(s²) ≠ σ², but E(S²) = σ², where
s² = (1/n) Σᵢ₌₁ⁿ (xᵢ − x̄)² and S² = (1/(n − 1)) Σᵢ₌₁ⁿ (xᵢ − x̄)².
Hence there is a reason to prefer S² to the sample variance s².
Note: If 𝐸(𝑇𝑛 ) > 𝜃, 𝑇𝑛 is said to be positively biased and if 𝐸(𝑇𝑛 ) < 𝜃, it is said to be
negatively biased, the amount of bias 𝑏(𝜃) being given by
𝑏(𝜃) = 𝐸(𝑇𝑛 ) − 𝛾(𝜃), 𝜃 ∈ Θ
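The contrast between s² and S² shows up clearly in a small Monte Carlo experiment. The Python sketch below is an illustrative addition; the population N(0, 4) and the sample size n = 5 are arbitrary choices:

```python
import random

# Monte Carlo comparison of the two variance estimators for N(0, 4):
#   s^2 uses divisor n      -> biased:   E(s^2) = (n - 1) * sigma^2 / n = 3.2
#   S^2 uses divisor n - 1  -> unbiased: E(S^2) = sigma^2              = 4.0
random.seed(42)
n, reps = 5, 100_000
sum_s2 = sum_S2 = 0.0
for _ in range(reps):
    x = [random.gauss(0.0, 2.0) for _ in range(n)]
    xbar = sum(x) / n
    ss = sum((xi - xbar) ** 2 for xi in x)
    sum_s2 += ss / n
    sum_S2 += ss / (n - 1)
print(f"average s^2 = {sum_s2 / reps:.3f}, average S^2 = {sum_S2 / reps:.3f}")
```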
Example 1. Let x₁, x₂, …, xₙ be a random sample from a normal population N(µ, 1).
Show that t = (1/n) Σᵢ₌₁ⁿ xᵢ² is an unbiased estimator of µ² + 1.
Solution. We are given: E(xᵢ) = µ, V(xᵢ) = 1 for all i = 1, 2, …, n.
Now E(xᵢ²) = V(xᵢ) + (E(xᵢ))² = 1 + µ².
∴ E(t) = E[(1/n) Σᵢ₌₁ⁿ xᵢ²] = (1/n) Σᵢ₌₁ⁿ E(xᵢ²) = (1/n) Σᵢ₌₁ⁿ (1 + µ²) = 1 + µ²
Hence t is an unbiased estimator of 1 + µ².


Example 2. If T is an unbiased estimator for θ, show that T² is a biased estimator for θ².
Solution. Since T is an unbiased estimator for θ, we have E(T) = θ.
Also Var(T) = E(T²) − {E(T)}² = E(T²) − θ² ⇒ E(T²) = θ² + Var(T) > θ², since Var(T) > 0.
As E(T²) ≠ θ², T² is a biased estimator for θ².
Example 3. Let X have a Poisson distribution with parameter θ. Show that the only unbiased
estimator of exp(−(k + 1)θ), k > 0, is T(X) = (−k)^X, so that T(x) > 0 if x is even and
T(x) < 0 if x is odd.
Solution.
E{T(X)} = E{(−k)^X}, k > 0 = Σ(x=0 to ∞) (−k)^x · e^(−θ) θ^x / x!
= e^(−θ) Σ(x=0 to ∞) (−kθ)^x / x! = e^(−θ) · e^(−kθ) = e^(−(1+k)θ)
⇒ T(X) = (−k)^X is an unbiased estimator for exp{−(1 + k)θ}, k > 0.
3.3.2 Consistency:
An estimator Tₙ = T(x₁, x₂, …, xₙ), based on a random sample of size n, is said to be a
consistent estimator of γ(θ), θ ∈ Θ, the parameter space, if Tₙ converges to γ(θ) in probability,
i.e. if Tₙ → γ(θ) in probability as n → ∞. In other words, Tₙ is a consistent estimator of γ(θ) if for every
ε > 0, η > 0, there exists a positive integer m(ε, η) such that
P{|Tₙ − γ(θ)| < ε} → 1 as n → ∞, i.e. P{|Tₙ − γ(θ)| < ε} > 1 − η for all n ≥ m,
where m is some (very large) value of n.


Note 1. If X₁, X₂, …, Xₙ is a random sample from a population with finite mean
E(Xᵢ) = µ, then by Khinchine's weak law of large numbers (WLLN), we have
X̄ₙ = (1/n) Σᵢ₌₁ⁿ Xᵢ → E(Xᵢ) = µ in probability, as n → ∞.
Hence the sample mean (X̄ₙ) is always a consistent estimator of the population mean (µ).
Note 2. Obviously, consistency is a property concerning the behaviour of an estimator for
indefinitely large values of the sample size n, i.e., as n → ∞; nothing is said about its
behaviour for finite n. Moreover, if there exists a consistent estimator, say Tₙ, of γ(θ), then infinitely many such
estimators can be constructed, e.g.,
Tₙ′ = [(n − a)/(n − b)] Tₙ = [(1 − a/n)/(1 − b/n)] Tₙ → γ(θ) in probability, as n → ∞,
and hence, for different values of a and b, Tₙ′ is also consistent for γ(θ).
Invariance Property of Consistent Estimators
Theorem 1: If Tₙ is a consistent estimator of γ(θ) and ψ(γ(θ)) is a continuous function of
γ(θ), then ψ(Tₙ) is a consistent estimator of ψ(γ(θ)).
Proof. Since Tₙ is a consistent estimator of γ(θ), Tₙ → γ(θ) in probability as n → ∞, i.e. for every ε >
0, η > 0, there exists a positive integer m(ε, η) such that
P{|Tₙ − γ(θ)| < ε} > 1 − η, for all n ≥ m.
Since ψ(·) is a continuous function, for every ε₁ > 0, however small, there exists ε > 0
such that |ψ(Tₙ) − ψ(γ(θ))| < ε₁ whenever |Tₙ − γ(θ)| < ε, i.e.,
|Tₙ − γ(θ)| < ε ⇒ |ψ(Tₙ) − ψ(γ(θ))| < ε₁
For two events A and B, if A ⇒ B, then
A ⊆ B ⇒ P(A) ≤ P(B), or P(B) ≥ P(A)
So
P[|ψ(Tₙ) − ψ(γ(θ))| < ε₁] ≥ P[|Tₙ − γ(θ)| < ε]
⇒ P[|ψ(Tₙ) − ψ(γ(θ))| < ε₁] > 1 − η, for all n ≥ m
⇒ ψ(Tₙ) → ψ(γ(θ)) in probability as n → ∞, i.e. ψ(Tₙ) is a consistent estimator of ψ(γ(θ)).


Sufficient Conditions for Consistency
Theorem 2. Let {Tₙ} be a sequence of estimators such that, for all θ ∈ Θ,
(i) E_θ(Tₙ) → γ(θ) as n → ∞, and
(ii) Var_θ(Tₙ) → 0 as n → ∞.
Then Tₙ is a consistent estimator of γ(θ).
Example 1. If 𝑋1 , 𝑋2 … , 𝑋𝑛 are random observations on a Bernoulli variate 𝑋 taking the value
𝑋 with probability 𝑝 and the value 0 with probability
(1 − 𝑝), show that :
Σ𝑥𝑖 Σ𝑥𝑖
(1 − ) is a consistent estimator of 𝑝(1 − 𝑝).
𝑛 𝑛
Solution. Since 𝑋1 , 𝑋2 , … , 𝑋𝑛 are i.i.d Bernoulli variates with parameter ' 𝑝 ',
𝑛

𝑇 = ∑ 𝑥𝑖 − 𝐵(𝑛, 𝑝) ⇒ 𝐸(𝑇) = 𝑛𝑝 and Var (𝑇) = 𝑛𝑝𝑞


𝑖=1
𝑛
1 𝑇 1 1
𝑋‾ = ∑ 𝑥𝑖 = ⇒ 𝐸(𝑋‾) = 𝐸(𝑇‾) = , 𝑛𝑝 = 𝑝
𝑛 𝑛 𝑛 𝑛
𝑖=1

𝑇 1 𝑝𝑞
Var (𝑋‾) = Var (𝑛) = 𝑛2 , Var (𝑇) = 𝑛 → 0 as 𝑛 → ∞.

Since 𝐸(𝑋‾) → 𝑝 and Var (𝑋‾) → 0, as 𝑛 → ∞; 𝑋‾ is a consistent estimator of 𝑝. Also


Σ𝑥𝑖 Σ𝑥
(1 − 𝑖 ) = 𝑋‾(1 − 𝑋‾), being a polynomial in 𝑋‾, is a continuous function of 𝑋‾.
𝑛 𝑛
Since 𝑋‾ is consistent estimator of 𝑝, by the invariance property of consistent estimators
(Theorem 17.1), 𝑋‾(1 − 𝑋‾) is a consistent estimator of 𝑝(1 − 𝑝).

3.3.3 Efficient Estimators:


Even if we confine ourselves to unbiased estimates, there will, in general, exist more than one
consistent estimator of a parameter. For example, in sampling from a normal population
𝑁(𝜇, 𝜎 2 ), when 𝜎 2 is known, sample mean 𝑥‾ is an unbiased and consistent estimator of 𝜇 .
From symmetry it follows immediately that sample median (𝑀𝑑) is an unbiased estimate of
𝜇, which is same as the population median. Also, for large 𝑛,
1
𝑉(𝑀𝑑) =
4𝜋𝑓12
Here 𝑓1 = Median ordinate of the parent distribution. = Modal ordinate of the parent
distribution.

30 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

1 1
=[ exp {−(𝑥 − 𝜇)2 /2𝜎 2 }] =
𝜎√2𝜋 𝑥=𝜇 𝜎√2𝜋
2
1 𝜋𝜎
∴ 𝑉(𝑀𝑑) = ⋅ 2𝜋𝜎 2 =
4𝑛 2𝑛

median is also an unbiased and consistent estimator of 𝜇. Thus, there is a necessity of some
further criterion which will enable us to choose between the estimators with the common
property of consistency. Such a criterion which is based on the variances of the sampling
distribution of estimators is usually known as efficiency. If, of the two consistent estimators
𝑇1 , 𝑇2 of a certain parameter 𝜃, we have
𝑉(𝑇1 ) < 𝑉(𝑇2 ), for all 𝑛 then 𝑇1 is more efficient than 𝑇2 for all samples sizes.

We have seen above :


𝜎2 𝜋𝜎2 𝜎2
For all 𝑛, 𝑉(𝑥‾) = and for large 𝑛, 𝑉(𝑀𝑑) = = 1.57
𝑛 2𝑛 𝑛

Since 𝑉(𝑥‾) < 𝑉(𝑀𝑑), we conclude that for normal distribution, sample mean is more efficient
estimator for 𝜇 than the sample median, for large samples at least.
Most Efficient Estimator: If in a class of consistent estimators for a parameter, there exist
Vone whose sampling variance is less than that of any such estimator, it is called the most
efficient estimator. Wheneoer such an estimator exists, it provides a criterion for measurement
of efficiency of the other estimators.
Efficiency If 𝑇1 is the most efficient estimator with variance 𝑉1 and 𝑇2 is ary other estimator
with variance 𝑉2, then the efficiency 𝐸 of 𝑇2 is defined as :
𝑉1
𝐸=
𝑉2
Obvionsly, E cannot exceed unity.
If 𝑇, 𝑇1 , 𝑇2 , … , 𝑇𝑛 are all estimators of 𝛾(𝜃) and Var (𝑇) is minimum, then the efficiency 𝐸𝑖 of
𝑇i , (𝑖 = 1,2, … , 𝑛) is defined as :
Var 𝑇
𝐸𝑖 = ; 𝑖 = 1,2, … , 𝑛
Var 𝑇𝑖
Obviously 𝐸𝑖 ≤ 1; 𝑖 = 1,2, … 𝑛. For example, in the normal samples, since sample mean 𝑥‾ is
the most efficient estimator of 𝜇 , the efficiency E of 𝑀𝑑 for such samples, (for large 𝑛 ), is :
𝑉(𝑥‾) 𝜎 2 /𝑛 2
𝐸= = = = 0.637.
𝑉(𝑀𝑑) 𝜋𝜎 2 /(2𝑛) 𝜋

31 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

Example 1. A random sample (𝑋1 , 𝑋2 , 𝑋3 , 𝑋4 , 𝑋5 ) of size 5 is drawn from a normal population


with unknowon mean 𝜇. Consider the following estimators to estimate 𝜇 :
𝑋1 +𝑋2 +𝑋1 +𝑋4 +𝑋5
(i) 𝑡1 = ,
5
𝑋1 +𝑋2
(ii) 𝑡2 = + 𝑋3
2
2𝑋1 +𝑋2 +𝜆𝑋3
(iii) 𝑡3 = . where 𝜆 is such that 𝑡3 is an unbiased estimator of 𝜇.
3

Find 𝜆. Are 𝑡1 and 𝑡2 mbiased? State giving reasons, the estimator which is best among 𝑡1 , 𝑡2
and 𝑡3
Solution. We are given :
𝐸(𝑋𝑖 ) = 𝜇, Var (𝑋𝑖 ) = 𝜎 2 , ( say ); Cov (𝑋𝑖 , 𝑋𝑗 ) = 0, (𝑖 ≠ 𝑗 = 1,2, … , 𝑛)
1 1 1
(i) 𝐸(𝑡1 ) = 5 ∑5𝑖=1 𝐸(𝑋𝑖 ) = 5 ∑5𝑖=1 𝜇 = 5 , 5𝜇 = 𝜇 ⇒ 𝑡1 is an unbiased estimator of 𝜇.
1 1
(ii) 𝐸(𝑡2 ) = 2 𝐸(𝑋1 + 𝑋2 ) + 𝐸(𝑋3 ) = 2 (𝜇 + 𝜇) + 𝜇 = 2𝜇

⇒ 𝑡2 is not an unbiased estimator of 𝜇.


1
(iii) 𝐸(𝑡3 ) = 𝜇 ⇒ 3 𝐸(2𝑋1 + 𝑋2 + 𝜆𝑋3 ) = 𝜇
(∵ 𝑡3 is unbiased estimator of 𝜇)
∴ 2𝐸(𝑋1 ) + 𝐸(𝑋2 ) + 𝜆𝐸(𝑋3 ) = 3𝜇 ∴ 2𝜇 + 𝜇 + 𝜆𝜇 = 3𝜇 ⇒ 𝜆 = 0

1 1
𝑉(𝑡1 ) = {𝑉(𝑋1 ) + 𝑉(𝑋2 ) + 𝑉(𝑋3 ) + 𝑉(𝑋4 ) + 𝑉(𝑋5 )} = 𝜎 2
25 5
1 1 2 3
𝑉(𝑡2 ) = {𝑉(𝑋1 ) + 𝑉(𝑋2 )} + 𝑉(𝑋3 ) = 𝜎 + 𝜎 2 = 𝜎 2
4 2 2
1 1 5
𝑉(𝑡3 ) = {4𝑉(𝑋1 ) + 𝑉(𝑋2 )} = (4𝜎 2 + 𝜎 2 ) = 𝜎 2 (∵ 𝜆 = 0)
9 9 9
Since 𝑉(𝑡1 ) is least, 𝑡1 is the best estimator (in the sense of least variance) of 𝜇.
Example 2. 𝑋1 , 𝑋2, and 𝑋3 is a random sample of size 3 from a population with mean value 𝜇
ald variance 𝜎 2 , 𝑇1 , 𝑇2 , 𝑇3 are the estimators used to estimate mean value 𝜇, where 𝑇1 = 𝑋1 +
1
𝑋2 − 𝑋3 , 𝑇2 = 2𝑋1 + 3𝑋3 − 4𝑋2 , and 𝑇3 = 3 (𝜆𝑋1 + 𝑋2 + 𝑋3 )/3.

(i) Are 𝑇1 and 𝑇2 unbiased estimators ?


(ii) Find the value of 𝜆 such that 𝑇3 is unbiased estimator for 𝜇.
(iii) With this value of 𝜆 is 𝑇3 a consistent estimator ?
(iv) Which is the best estimator ?

32 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Solution. Since 𝑋1 , 𝑋2 , 𝑋3 is a random sample from a population with mean 𝜇 and variance
𝜎 2 , 𝐸(𝑋𝑖 ) = 𝜇, Var (𝑋𝑖 ) = 𝜎 2 and Cov (𝑋𝑖 , 𝑋𝑗 ) = 0, (𝑖 ≠ 𝑗 = 1,2, … , 𝑛)
(i) We have
𝐸(𝑇1 ) = 𝐸(𝑋1 ) + 𝐸(𝑋2 ) − 𝐸(𝑋3 ) = 𝜇 ⇒ 𝑇1 is an unbiased estimator of 𝜇
𝐸(𝑇2 ) = 2𝐸(𝑋1 ) + 3𝐸(𝑋3 ) − 4𝐸(𝑋2 ) = 𝜇 ⇒ 𝑇2 is an unbiased estimator of 𝜇.
1
(ii) We are given: 𝐸(𝑇3 ) = 𝜇 ⇒ 3 {𝜆𝐸(𝑋1 ) + 𝐸(𝑋2 ) + 𝐸(𝑋3 )} = 𝜇
1
⇒ (𝜆𝜇 + 𝜇 + 𝜇) = 𝜇 ⇒ 𝜆 + 2 = 3 ⇒ 𝜆 = 1
3
1
(iii) With 𝜆 = 1, 𝑇3 = 3 (𝑋1 + 𝑋2 + 𝑋3 ) = 𝑋‾. Since sample mean is a consistent estimator of
population mean 𝜇, by Weak Law of Large Numbers, 𝑇3 is a consistent estimator of 𝜇.
(iv) We have
Var (𝑇1 ) = Var (𝑋1 ) + Var (𝑋2 ) + Var (𝑋3 ) = 3𝜎 2
Var (𝑇2 ) = 4Var (𝑋1 ) + 9Var (𝑋3 ) + 16Var (𝑋2 ) = 29𝜎 2
1 1
Var (𝑇3 ) = [Var (𝑋1 ) + Var (𝑋2 ) + Var (𝑋3 )] = 𝜎 2 (∵ 𝜆 = 1)
9 3
Since Var (𝑇3 ) is minimum, 𝑇3 is the best estimator of 𝜇 in the sense of minimum variance.
Minimum Variance Unbiased (M.V.U.) Estimators:
If a statistic 𝑇 = 𝑇(𝑥1 , 𝑥2 , … , 𝑥𝑛 ), based on sample of size 𝑛 is such that
(i) 𝑇 is unbiased for 𝛾(𝜃), for all 𝜃 ∈ Θ and
(ii) It has the smallest variance among the class of all unbiased estimators of 𝛾(𝜃), then 𝑇
is called the minimum variance unbiased estimator (𝑀𝑉𝑈𝐸) of 𝛾(𝜃).
More precisely, 𝑇 is MVUE of 𝛾(𝜃) if
𝐸𝜃 (𝑇) = 𝛾(𝜃) for all 𝜃 ∈ Θ and Var𝜃 (𝑇) ≤ Var𝜃 (𝑇 ′ ) for all 𝜃 ∈ Θ
where 𝑇 ′ is any other unbiased estimator of 𝛾(𝜃).

We give below some important Theorems concerning MVU estimators.


Theorem 1. An M.V.U. is unique in the sense that if 𝑇1 and 𝑇2 are M.V.U. estimator for
𝛾(𝜃), then 𝑇1 = 𝑇2 , almost surely.

33 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

Proof. We are given that


𝐸𝜃 (𝑇1 ) = 𝐸0 (𝑇2 ) = 𝛾(𝜃), for all 𝜃 ∈ Θ
and }
Var𝜃 (𝑇1 ) = Var0 (𝑇2 ), for all 𝜃 ∈ Θ
1
Consider a new estimator, 𝑇 = 2 (𝑇1 + 𝑇2 ) which is also unbiased since,

1
𝐸(𝑇) = {𝐸(𝑇1 ) + 𝐸(𝑇2 )} = 𝛾(𝜃)
2
1 1
Var (𝑇) = Var { (𝑇1 + 𝑇2 )} = Var (𝑇1 + 𝑇2 ) [∵ Var (𝐶𝑋) = 𝐶 2 Var (𝑋)]
2 4
1
= {Var (𝑇1 ) + Var (𝑇2 ) + 2Cov (𝑇1 , 𝑇2 )}
4
1
= {Var (𝑇1 ) + Var (𝑇2 ) + 2𝜌√Var (𝑇1 )Var (𝑇2 )}
4
1
= Var (𝑇1 )(1 + 𝜌),
2
where 𝜌 is Karl Pearson's co-efficient of correlation between 𝑇1 and 𝑇2 .
Since 𝑇1 is the 𝑀𝑉𝑈 estimator, Var (𝑇) ≥ Var (𝑇1 )
1 1
⇒ Var (𝑇1 )(1 + 𝜌) ≥ Var (𝑇1 ) ⇒ (1 + 𝜌) ≥ 1 ⇒ 𝜌 ≥ 1
2 2
Since |𝜌| ≤ 1, we must have 𝜌 = 1, i.e., 𝑇1 and 𝑇2 must have a linear relation of the form:
𝑇1 = 𝛼 + 𝛽𝑇2 where 𝛼 and 𝛽 are constants independent of 𝑥1 , 𝑥2 , … , 𝑥𝑛 but may depend on
𝜃, i.e., we may have 𝛼 = 𝛼(𝜃) and 𝛽 = 𝛽(𝜃).
Taking expectation of both sides then we get
𝜃 = 𝛼 + 𝛽𝜃
Also we get Var (𝑇1 ) = Var (𝛼 + 𝛽𝑇2 ) = 𝛽 2 Var (𝑇2 )
⇒ 1 = 𝛽 2 ⇒ 𝛽 = ±1
But since 𝜌(𝑇1 , 𝑇2 ) = +1, the coefficient of regression of 𝑇1 on 𝑇2 must be positive.
∴ 𝛽=1⇒𝛼=0
so we get 𝑇1 = 𝑇2 as desired.
Theorem 2. Let 𝑇1 and 𝑇2 be unbiased estimators of 𝛾(𝜃) with efficiencies 𝑐1 and 𝑐2
respectively anf 𝜌 = 𝜌𝜃 be the correlation coefficient between them. Then
√𝑒1 𝑒2 − √(1 − 𝑒1 )(1 − 𝑒2 ) ≤ 𝜌 ≤ √𝑒1 𝑒2 + √(1 − 𝑒1 )(1 − 𝑒2 )
Proof. Let 𝑇 be minimum variance unbiased estimator of 𝛾(𝜃). Then we are given :

34 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

𝐸𝜃 (𝑇1 ) = 𝛾(𝜃) = 𝐸𝜃 (𝑇2 ) ∀𝜃 ∈ Θ


𝑉𝜃 (𝑇) 𝑉 𝑉
𝑒1 = = , ( say ) ⇒ 𝑉1 =
𝑉𝜃 (𝑇1 ) 𝑉1 𝑐1
𝑉𝜃 (𝑇) 𝑉 𝑉
𝑒2 = = , ( say ) ⇒ 𝑉2 =
𝑉𝜃 (𝑇2 ) 𝑉2 𝑐2
Let us consider another estimator: 𝑇3 = 𝜆𝑇1 + 𝜇𝑇2 ,
which is also unbiased estimator of 𝛾(𝜃) i.e.,
𝐸(𝑇3 ) = (𝜆 + 𝜇)𝛾(𝜃) = 𝛾(𝜃) ⇒ 𝜆 + 𝜇 = 1
𝑉𝜃 (𝑇3 ) = 𝑉(𝜆𝑇1 + 𝜇𝑇2 ) = 𝜆2 𝑉(𝑇1 ) + 𝜇 2 𝑉(𝑇2 ) + 2𝜆𝜇Cov (𝑇1 , 𝑇2 )
𝜆2 𝜇 2 𝜆𝜇𝜌
= 𝑉( + +2⋅ )
𝑒1 𝑐2 √𝑒1 𝑒2

But 𝑉𝜃 (𝑇3 ) ≥ 𝑉, since 𝑉 is the minimum variance.


𝜆2 𝜇 2 2𝜌𝜆𝜇
∴ + + ≥ 1 = (𝜆 + 𝜇)2
𝑒1 𝑒2 √𝑒1 𝑒2
1 1 𝜌
⇒ ( − 1) 𝜆2 + ( − 1) 𝜇 2 + 2𝜆𝜇 ( − 1) ≥ 0
𝑐1 𝑐2 √𝑒1 𝑒2
1 𝜆 2 𝑝 𝜆 1
⇒ ( − 1) ( ) + 2 ( − 1) ( ) + ( − 1) ≥ 0,
𝑐1 𝜇 √𝑐1 𝑒2 𝜇 𝑐2

which is quadratic expression in (𝜆/𝜇).


1 1
Note that: 𝑒𝑖 < 1 ⇒ 𝑐 > 1 or (𝑐 − 1) > 0, 𝑖 = 1,2
𝑖 𝑖
We know that 𝐴𝑋 2 + 𝐵𝑋 + 𝐶 ≥ 0∀𝑥, 𝐴 > 0, 𝐶 > 0; if and only if
Discriminant = 𝐵 2 − 4𝐴𝐶 ≤ 0

𝜌 2 1 1 2
( − 1) − ( − 1) ( − 1) ≤ 0 ⇒ (𝜌 − √𝑒1 𝑒2 ) − (1 − 𝑒1 )(1 − 𝑒2 ) ≤ 0
√𝑒1 𝑒2 𝑒1 𝑒2
∴ 𝜌2 − 2√𝑒1 𝑒2 𝜌 + (𝑒1 + 𝑒2 − 1) ≤ 0
This implies that 𝜌 lies between the roots of the equation :
𝜌2 − 2√𝑒1 𝑐2 𝜌 + (𝑒1 + 𝑒2 − 1) = 0
1
which are given by : 2 {2√𝑒1 𝑒2 ± 2√𝑒1 𝑒2 − (𝑒1 + 𝑒2 − 1)} =

35 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

√𝑒1 𝑒2 ± √(𝑒1 − 1)(𝑒2 − 1) Hence we have:


√𝑒1 𝑒2 − √(𝑒1 − 1)(𝑒2 − 1) ≤ 𝜌 ≤ √𝑒1 𝑒2 + √(𝑒1 − 1)(𝑒2 − 1)
⇒ √𝑒1 𝑒2 − √(1 − 𝑒1 )(1 − 𝑒2 ) ≤ 𝜌 ≤ √𝑒1 𝑒2 + √(1 − 𝑒1 )(1 − 𝑒2 )
3.3.4 Sufficient Estimators:
An estimator is said to be sufficient for a parameter, if it contains alf the information in the
sample regarding the parameter.
If T = t(𝑥1 , 𝑥2 … , 𝑥𝑛 ) is an estimator of a parameter 𝜃, based on a sample 𝑥1 , 𝑥2 , … , 𝑥𝑛 of size
𝑛 from the population with density 𝑓(𝑥, 𝜃) such that the conditional distribution of
𝑥1 , 𝑥2 , … , 𝑥𝑛 given T is independent of 𝜃 then T is sufficient estimator for 𝜃.
Example Let 𝑥1 , 𝑥2 , … , 𝑥𝑛 be a random sample from a Bernoulli population with pararaeter '
𝑝 ', 0 < 𝑝 < 1, i.e.,
1, with probability 𝑝
𝑥i = {
0, with probability 𝑞 = (1 − 𝑝)
Then
𝑇 = 𝑡(𝑥1 , 𝑥2 , … , 𝑥𝑛 ) = 𝑥1 + 𝑥2 + ⋯ + 𝑥𝑛 follow 𝐵(𝑛, 𝑝)
𝑛
∴ 𝑃(𝑇 = 𝑘) = ( ) 𝑝k (1 − 𝑝)𝑛−𝑘 ; 𝑘 = 0,1,2, … , 𝑛
𝑘
The conditional distribution of (𝑥1 , 𝑥2 , … , 𝑥n ) given 𝑇 is :
𝑃(𝑥1 ∩ 𝑥2 ∩ … ∩ 𝑥𝑛 ∩ 𝑇 = 𝑘)
𝑃(𝑥1 ∩ 𝑥2 ∩ … ∩ 𝑥𝑛 ∣ 𝑇 = 𝑘) =
𝑃(𝑇 = 𝑘)
𝑝 (1 − 𝑝)𝑛−𝑘
k
1
𝑛 = 𝑛
( ) 𝑝𝑘 (1 − 𝑝)𝑛−𝑘 ( )
= 𝑘 𝑘
𝑛

0, if ∑ 𝑥𝑖 ≠ 𝑘
{ 𝑖=1

Since this does not depend on ' 𝑝 , 𝑇 = ∑𝑛𝑖=1 𝑥𝑖 is sufficient for ' 𝑝 '.
FACTORIZATION THEOREM (Neymann).
The necessary and sufficient condition for a distribution to admit sufficient statistic is
provided by the 'factorization theorem' due to Neymann.
Statement 𝑇 = 𝑡(𝑥) is sufficient for 𝜃 if and only if the joint density function 𝐿 (say), of the
sample values can be expressed in the form:
𝐿 = 𝑔𝜃 [𝑡(𝑥)] ⋅ ℎ(𝑥)

36 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

where (as indicated) 𝑔𝜃 [𝑡(𝑥)] depends on 𝜃 and 𝑥 only through the value of 𝑡(𝑥) and ℎ(𝑥) is
independent of 𝜃.
Remarks 1. It should be clearly understood that by 'a function independent of 𝜃 , we not
only mean that it does not involve 𝜃 but also that its domain does not contain 𝜃. For example,
the function:
1
𝑓(𝑥) = , 𝑎 − 𝜃 < 𝑥 < 𝑎 + 𝜃; −∞ < 𝜃 < ∞
2𝑎
depends on 𝜃.
2. It should be noted that the original sample 𝑋 = (𝑋1 , 𝑋2 , … , 𝑋𝑛 ), is always a sufficient
statistic.
3. The most general form of the distributions admitting sufficient statistic is Koopman's
form and is given by: 𝐿 = 𝐿(𝐱, 𝜃) = 𝑔(𝑥) ⋅ ℎ(𝜃). exp {𝑎(𝜃)𝜓(𝑥)] where ℎ(𝜃) and
𝑎(𝜃) are functions of the parameter 𝜃 only and 𝑔(𝑥) and 𝜓(𝑥) are the functions of the
sample observations only.
4. Invariance Property of Sufficient Estimator: If 𝑇 is a sufficient estimator for the
parameter 𝜃 ayd if 𝜓(𝑇) is a one to one function of 𝑇, then 𝜓(𝑇) is sufficient for 𝜓(𝜃).
5. Fisher-Neyman Criterion. A statistic 𝑡1 = 𝑡(𝑥1 , 𝑥2 , … , 𝑥𝑛 ) is sufficient estimator of
parimeter 𝜃 if and only if the likelihood function (joint p.d.f of the sample) can be
expressed as :
𝑛

𝐿 = ∏ 𝑓(𝑥𝑖 , 𝜃) = 𝑔1 (𝑡1 , 𝜃) ⋅ 𝑘(𝑥1 , 𝑥2 , … , 𝑥𝑛 )


𝑖=1

where 𝑔1 (𝑡1 , 𝜃) is the p.d.f. of the statistic 𝑡1 and 𝑘(𝑥1 , 𝑥2 … . 𝑥𝑛 ) is a function of sample
observations only, independent of 𝜃.
Note that this method requires the working out of the p.d.f. (p.m.f.) of the statistic 𝑡1 =
𝑡(𝑥1 , 𝑥2 , … , 𝑥𝑛 ), which is not always easy.

Example 1. Let 𝑥1 , 𝑥2 , … , 𝑥1 be a random sample from a uniform population on [0, 𝜃]. Find
asufficient estimator for 𝜃.
1
, 0 ≤ 𝑥𝑖 ≤ 𝜃
Solution. We are given: 𝑓𝜃 (𝑥𝑖 ) = { 𝜃
0, otherwise
1, if 𝑎 ≤ 𝑏 𝑘(0,𝑥𝑖 )𝑘(𝑥𝑖 ,𝜃)
Let 𝑘(𝑎, 𝑏) = }. then 𝑓0 (𝑥𝑖 ) = ,
0, if 𝑎 > 𝑏 𝜃

37 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

𝑛 𝑛
𝑘(0, 𝑥𝑖 )𝑘(𝑥𝑖 , 𝜃) 𝑘 (0, min 𝑥𝑖 ) ⋅ 𝑘 ( max 𝑥𝑖 , 𝜃)
1≤𝑖≤𝑛 1≤𝑖≤𝑛
𝐿 = ∏ 𝑓𝜃 (𝑥𝑖 ) = ∏ [ ]= = 𝑔0 (𝑡(𝑥) ∣ ℎ(𝑥)
𝜃 𝜃𝑛
𝑖=1 𝑖=1

where
𝑘{𝑡(𝐱), 𝜃}
𝑔0 [𝑡(𝐱)] = , 𝑡(𝑥) = max 𝑥𝑖 and ℎ(𝑥) = 𝑘 (0, min 𝑥𝑖 )
𝜃𝑛 1≤𝑖≤𝑛 1≤𝑖≤𝑛

Hence by Factorization theorem, 𝑇 = max1≤𝑖≤𝜋 𝑥𝑖 , is sufficient statistic for 𝜃.

Aliter. We have
𝑛
1
𝐿 = ∏ 𝑓(𝑥𝑖 , 𝜃) = ; 0 < 𝑥𝑖 < 𝜃
𝜃𝑛
𝑖=1

If 𝑡 = max(𝑥1 , 𝑥2 , … , 𝑥𝑛 ) = 𝑥(𝑛) , then 𝑝. 𝑑. 𝑓. of 𝑡 is given by :


= 𝑛{𝐹(𝑥𝑛 )]𝑛−1 ⋅ 𝑓(𝑥(𝑛) )
𝑔(𝑡, 𝜃)
𝑥 ∗
1 𝑥
We have 𝐹(𝑥) = 𝑃(𝑋 ≤ 𝑥) = ∫ 𝑓(𝑥, 𝜃)𝑑𝑥 = ∫ , 𝑑𝑥 =
0 0 𝜃 𝜃
𝑥(𝑛) 𝑛−1 1 𝑢 𝑛−1
∴ 𝑔(𝑡, 𝜃) = 𝑛 { } ( ) = 𝑛 [𝑥(𝑛) ]
𝜃 𝜃 𝜃
Hence by Fisher - Neymann criterion, the statistic 𝑡 = 𝑥(𝑛) , is sufficient estimator for 𝜃.
Exampla 17.14. Let 𝑥1 , 𝑥2 , … , 𝑥1 be a random sample from 𝑁(𝜇, 𝜎 2 ) population. Find
sufficient esfimators for 𝜇 and 𝜎 2 .
Solution. Let us write 𝜃 = (𝜇, 𝜎 2 ); −∞ < 𝜇 < ∞, 0 < 𝜎 2 < ∞.
Then
𝑛 𝑛 𝑛
11
𝐿 = ∏ 𝑓0 (𝑥𝑖 ) = { } ⋅ exp {− 2 ∑ (𝑥𝑖 − 𝜇)2 }
𝜎√2𝜋 2𝜎
𝑖=1 𝑖=1
𝑛 𝑛
1 1
=( ) exp {− 2 (∑ 𝑥𝑖2 − 2𝜇 ∑ 𝑥𝑖 + 𝑛𝜇 2 )}
𝜎√2𝜋 2𝜎
𝑖=1
= 𝑔𝜃 [𝑡(𝑥)] ⋅ ℎ(𝑥)

1 𝑛 1
where 𝑔𝜃 [𝑓(𝑥)] = (𝜎√2𝜋) exp [− 2𝜎2 {𝑓2 (𝑥) − 2𝜇𝜇1 (𝑥) + 𝑛𝜇 2 }]

𝑡(𝑥) = |𝑡1 (𝑥), 𝑡2 (𝑥)| = (Σ𝑥1 , Σ𝑥𝑖2 ) and ℎ(𝑥) = 1


38 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Thus 𝑡1 (𝑥) = Σ𝑥1 is sufficient for 𝜇 and 𝑡2 (𝑥) = ∑𝑥12 , is sufficient for 𝜎 2 .

Example 3. Let 𝑋1 , 𝑋2 , … , 𝑋𝑛 be a random sample from a distribution with p.d.f.:


𝑓(𝑥, 𝜃) = 𝑒 −(𝑥−𝜃) , 𝜃 < 𝑥 < ∞; −∞ < 𝜃 < ∞
Obtain sufficient statistic for 𝜃.
Solution. Here
𝑛 𝑛 𝑛

𝐿 = ∑ 𝑓(𝑥𝑖 𝜃) = ∑ {𝑒 −(𝑥𝑖 −𝜃) } = exp (− ∑ 𝑥𝑖 ) × exp (𝑛𝜃)


𝑖=1 𝑖=1 𝑖=1

Let 𝑌1 , 𝑌2 , … , 𝑌𝑛 denote the order statistics of the random sample such that 𝑌1 < 𝑌2 < ⋯ <
𝑌𝑛 . The p.d.f. of the smallest observation 𝑌1 is given by:
𝑔1 (𝑦1 , 𝜃) = 𝑛[1 − 𝐹(𝑦1 )]𝑛−1 𝑓(𝑦1 , 𝜃),
where 𝐹(⋅) is the distribution function corresponding to 𝑝 ⋅ 𝑑. 𝑓. 𝑓(⋅).
Thus the likelihood function (") of X1 , X2 , … , X𝑛 may be expressed as
𝑛
𝑛𝜃
exp (− ∑𝑛𝑖=1 𝑥𝑖 )
𝐿 =𝑒 exp (− ∑ 𝑥𝑖 ) = 𝑛exp (−𝑛(𝑦1 − 𝜃)) { }
𝑛exp (−𝑛𝑦𝑖 )
𝑖=1
exp (− ∑𝑛𝑖=1 𝑥𝑖 )
= 𝑔1 (m 𝑥𝑖 , 𝜃) { }
𝑛exp (−𝑛𝑦𝑖 )
Hence by Fisher-Neymann criterion, the first order statistic 𝑌1 = min(𝑋1 , 𝑋2 , … , 𝑋𝑛 ) is a
sufficient statistic for 𝜃.
3.4 IN-TEXT QUESTIONS
Question: 1
Let 𝑋1 , 𝑋2 , … , 𝑋𝑁

be identically distributed random variable with mean 2 and variance 1. Let N be a random
variable follows Poisson distribution with mean 2 and independent of
𝑋i′ S. Let 𝑆𝑁 = 𝑋1 + 𝑋2 + ⋯ + 𝑋𝑁 , then Var (SN ) is equals
A. 4
B. 10
C. 2
D. 1

39 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

Question: 2
Let 𝐴 and 𝐵 be independent Random Variables each having the uniform distribution on
[0,1].
Let 𝑈 = min{𝐴, 𝐵} and 𝑉 = max{𝐴, 𝐵}, then Cov (𝑈, 𝑉) is equals
A. -1/36
B. 1/36
C. 1
D. 0
Question: 3
Let 𝑋1 , 𝑋2 , 𝑋3 be random sample from uniform (0, 𝜃 2 ), 𝜃 > 1 Then maximum likelihood
estimation (mle) of 𝜃
2
A. 𝑋(1)
B. √X(3)
C. √X(1)
D. 𝛼𝑋(1) + (1 − 𝛼)𝑋(3) ; 0 < 𝛼 < 1
Question: 4
For the discrete variate with density:

1 6 1
𝑓(𝑥) = 𝐼(−1) (𝑥) + 𝐼(0) (𝑥) + 𝐼(1) (𝑥).
8 8 8

Which of the following is TRUE?


1
A. 𝐸(𝑋) = 2
1
B. 𝑉(𝑋) = 2
1
C. 𝑃{|𝑋 − 𝜇𝑥 | ≥ 2𝜎𝑥 } ≤ 4
1
D. 𝑃{|𝑋 − 𝜇𝑥 | ≥ 2𝜎𝑥 } ≥ 4
Question: 5
Lęt 𝑋𝑖 , 𝑌𝑖 ; (𝑖 = 1,2)
be a i.i.d random sample of size 2 from a standard normal distribution. What is the
distribution W is given by

40 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

√2(𝑋1 + 𝑋2 )
𝑊=
√(𝑋2 − 𝑋1 )2 + (𝑌2 − 𝑌1 )2

A. t-distribution with 1 d.f


B. t-distribution with 2 d.f
C. Chi-square distribution with 2 d.f
D. Does not determined
Question: 6
The moment generating function of a random variable X is
given by
1 1 𝑡 1 2𝑡 1 3𝑡
𝑀𝑋 (𝑡) = + 𝑒 + 𝑒 + 𝑒 , −∞ < 𝑡 < ∞
6 3 3 6

Then P(X ≤ 2) equals

1
A. 3

1
B. 6

1
C. 2

5
D. 6

Question: 7
Let 𝑋1 , 𝑋2 , … , 𝑋𝑛 be a random sample from
1 1
𝐔 (𝜃 − 2 , 𝜃 + 2) distribution, where 𝜃 ∈ ℝ. If
𝑋(1) = min{𝑋1 , 𝑋2 , … , 𝑋𝑛 } and
𝑋(𝑛) = max{𝑋1 , 𝑋2 , … , 𝑋𝑛 }. Define
1 1
𝑇1 = (𝑋(1) + 𝑋(𝑛) ), 𝑇2 = (3𝑋(1) + 𝑋(𝑛) + 1)
2 4

41 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

1
and 𝑇3 = 2 (3𝑋(𝑛) − 𝑋(1) − 2)
an estimator for 𝜃, then which of the following is/are
TRUE?
A. 𝑇1 and 𝑇2 are MLE for 𝜃 but 𝑇3 is not MLE for 𝜃

B. 𝑇1 is MLE for 𝜃 but 𝑇2 and 𝑇3 are not MLE for 𝜃

C. 𝑇1 , 𝑇2 and 𝑇3 are MLE for 𝜃

D. 𝑇1 , 𝑇2 and 𝑇3 are not MLE for 𝜃


Question: 8
Let 𝑋 and 𝑌 be random variable having joint probability density function
𝑘
𝑓(𝑥, 𝑦) = ; −∞ < (𝑥, 𝑦) < ∞
(1 + 𝑥 2 )(1 + 𝑦 2 )
Where k is constant, then which of the following is/are TRUE?
1
A. k = 𝜋2
1 1
B. 𝑓(𝑥) = 𝜋 1+𝑥 2 ; −∞ < 𝑥 < ∞
C. P(X = Y) = 0
D. all of the above
Question : 9
Lę 𝑋1 , 𝑋2 , … , 𝑋𝑛
be sequence of independently and identically distributed random variables with the
probability density function
1 2 −𝑥
𝑓(𝑥) = {2 𝑥 𝑒 , if 𝑥 > 0 and let
0, otherwise
𝑆𝑛 = 𝑋1 + 𝑋2 + ⋯ + 𝑋𝑛
Then which of the following statement is/are TRUE?

𝑆𝑛 −3𝑛
A. ∼ 𝑁(0,1) for all 𝑛 ≥ 1
√3𝑛

42 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

𝑆
B. For all 𝜀 > 0, 𝑃 (| 𝑛𝑛 − 3| > 𝜀) → 0 as
n→∞

𝑆𝑛
C. → 1 with probability 1
𝑛

D. Both A and B
Question : 10
Let 𝑋, 𝑌 are i.i.d Binomial (𝑛, 𝑝) random variables. Which of the following are true?

A. 𝑋 + 𝑌 ∼ Bin (2𝑛, 𝑝)

B. (X, Y) ∼ Multinomial (2n; p, p)

C. Var (X − Y) = E(X − Y)2

D. option A and C are correct.


Question: 11
Let 𝑋 and 𝑌 be continuous random variables with the joint probability density function
1 2 2
1
𝑓(𝑥, 𝑦) = 𝑒 −2(𝑥 +𝑦 ) ; (𝑥, 𝑦) ∈ ℝ2
2𝜋
Which of the following statement is/are TRUE?
1
A. 𝑃(𝑋 > 0) = 2
1
B. P(X > 0 ∣ Y < 0) = 2
1
C. P(X > 0, Y < 0) = 4
D. All of the above
Question: 12
Let X and 𝑌 are random variable with 𝐸[𝑋] = 𝐸[𝑌], then which of the following is NOT
TRUE?
A. E{E[X ∣ Y]} = E[Y]
B. V(𝑋 − 𝑌) = 𝐸(𝑋 − 𝑌)2
C. 𝐸[𝑉(𝑋 ∣ 𝑌)] + 𝑉[𝐸(𝑋 ∣ 𝑌)] = 𝑉(𝑋)
D. X and Y have same distribution

43 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

Question : 13
Let 𝑋1 , 𝑋2 , … , 𝑋𝑛 be a random sample from Exp (𝜃 ) distribution, where 𝜃 ∈ (0, ∞).
1
If 𝑋‾ = 𝑛 ∑𝑛𝑖=1 𝑋𝑖 , then a 95% confidence interval for 𝜃 is

2
𝜒2𝑛,0.95
A. (0, ]
𝑛𝑋‾

2
𝜒2𝑛,0.95
B. [ , ∞)
𝑛𝑋‾

2
𝜒2𝑛,0.95
C. (0, ]
2𝑛𝑋‾

2
𝜒2𝑛,0.95
D. [ , ∞)
2𝑛𝑋‾

Question: 14
𝑋𝑖 , 𝑖 = 1,2, …
be independent random variables all distributed according to the PDF 𝑓𝑥 (𝑥) = 1,0 ≤ 𝑥 ≤ 1.
Define
𝑌𝑛 = 𝑋1 𝑋2 𝑋3 … 𝑋𝑛 , for some integer n. Then Var (𝑌𝑛 ) is equal to

𝑛
A. 12

1 1
B. − 22𝑛
3𝑛

1
C. 12𝑛

1
D. 12

Question : 15
Let 𝑋1 , 𝑋2 , … , 𝑋4
be i.i.d random variables having continuous distribution.
Then

44 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

𝑃(𝑋3 < 𝑋2 < max(𝑋1, 𝑋4 )) equal


A. 1/2
B. 1/3
C. 1/4
D. 1/6

Question : 16
Let 𝑋1 , 𝑋2 , … , 𝑋𝑛 be a random sample from
1 1
U (𝜃 − 0 , 𝜃 + 2) distribution, where 𝜃 ∈ ℝ. If
𝑋(1) = min{𝑋1 , 𝑋2 , … , 𝑋𝑛 } and
𝑋(𝑛) = max{𝑋1 , 𝑋2 , … , 𝑋𝑛 }

Consider the following statement on above:


1
1. 𝑇1 = 2 (𝑋(1) + 𝑋(𝑛) ) is consistent for 𝜃
1
2. 𝑇2 = (3𝑋(1) + 𝑋(𝑛) + 1)
4
is unbiased consistent for 𝜃
Select the correct answer using code given below:

A. 1 only
B. 2 only
C. Both 1 and 2
D. Neither 1 nor 2
Question: 17
Lęt 𝐹𝑛 be a sequence of DFs defined by
0, 𝑥<0
1
𝐹𝑛 (𝑥) = {1 − , 0 ≤ 𝑥 ≤ 𝑛 and let
𝑛
1, 𝑛≤𝑥
lim𝑛→∞ 𝐹𝑛 (𝑥) = 𝐹(𝑥)
then which of the following is/are TRUE?

45 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

0, 𝑥<0
A. 𝐹(𝑥) = {
1, 𝑥≥0

B. 𝐸(𝑋𝑛𝑘 ) → 𝐸(𝑋 𝑘 ) for any k ≥ 1

C. F is distribution function of the RVX degenerate at 𝑥 = 0

D. all of the above


Question : 18
Lę 𝑋𝑖 (𝑖 = 1,2, … , 𝑛)
be a random sample drawn from uniform distribution on [𝜃, 𝜃 + 1],
then which of the following statement is/are correct?

1
A. (𝑋‾ − 2) is an unbiased estimate of 𝜃
𝑋(1) +𝑋(𝑛) 1
B. − 2 is an unbiased estimate of 𝜃
2

C. (𝑋(1) , 𝑋(𝑛) ) is jointly sufficient but not complete for 𝜃


D. all options are correct.

Question: 19
Lęt {𝑋𝑛 , 𝑛 ≥ 1} be i.i.d uniform (−1,2) random variables and let 𝑆𝑛 = ∑𝑛𝑘=1 𝑋𝑘 ⋅
Then, as n → ∞

𝑆𝑛 1
A. → 2 in probability
𝑛

𝑠𝑛 1
B. → 2 in distribution
𝑛

C. P(𝑆𝑛 ≤ 𝑛) → 1 as n → ∞
D. all options are correct

46 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Question : 20
Let 𝑋1 , 𝑋2 , … , 𝑋𝑛
denote random sample of size n from a uniform population with probability density function
1 1
𝑓(𝑥, 𝜃) = 1; 𝜃 − 2 ≤ 𝑥 ≤ 𝜃 + 2 , −∞ < 𝑥 < ∞

𝑋(𝑛) +𝑋(1)
Define 𝑇𝑛 = .
2
A. 𝑇𝑛 is consistent for 𝜃
B 𝑇𝑛 is MLE for 𝜃
C. 𝑇𝑛 is unbiased consistent for 𝜃
D. all options are correct
Question : 21
The cumulative distribution function of a random variable X given by
0, if 𝑥 < 0
4
, if 0 ≤ 𝑥 < 1
𝐹(𝑥) = 9
8
, if 1 ≤ 𝑥 < 2
9
{1, if 𝑥 ≥ 2
Which of the following statements is (are) TRUE?

A. The random variable 𝑋 takes positive probability only at least two points

5
B. 𝑃(1 ≤ 𝑋 ≤ 2) = 9

21
C. E(X) = 3

4
D. 𝑃(0 < 𝑋 < 1) = 9
Question: 22
Let A and B be events in a sample space S such that
1 1
𝑃(𝐴) = 2 = 𝑃(𝐵) and 𝑃(𝐴𝑐 ∩ 𝐵 𝑐 ) = 3. Which of the following is correct?

47 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

5
A. 𝑃(𝐴 ∪ 𝐵 𝑐 ) = 6

5
B. 𝑃(𝐴 ∪ 𝐵 𝑐 ) ≤ 6

C. 𝑃(𝐴 ∩ 𝐵) ≥ min{𝑃(𝐴), 𝑃(𝐵)}

D. A and B are independent


Question: 23
Let X and 𝑌 be i.i.d random variable having distribution
function
1 1
+ tan−1 (𝑥) − ∞ < 𝑥 < ∞
𝐹(𝑥) =
2 𝜋
Which of the following is NOT TRUE?

1 1
A. 𝑓(𝑥, 𝑦) = 𝜋2 (1+𝑥 2)(1+𝑦 2) ; −∞ < (𝑥, 𝑦) < ∞
1 1
B. 𝑓(𝑥) = 𝜋 1+𝑥 2 ; −∞ < 𝑥 < ∞

C. Φ𝑋+𝑌 (𝑡) = 𝑒 −2𝑖|𝑡|


D. E(X) does not exist
Question : 24
Suppose 𝑟1.23 and 𝑟1.234 are sample multiple correlation coefficient of 𝑋1 on 𝑋2 , 𝑋3 and 𝑋1 on
𝑋2 , 𝑋3 , 𝑋4 respectively. Which of the following is possible?

A. 𝑟1.23 = −0.3, 𝑟1.234 = 0.7

B. 𝑟1.23 = −0.5, 𝑟1.234 = −0.7

C. 𝑟1.23 = 0.3, 𝑟1.234 = 0.7

D. 𝑟1.23 = 0.7, 𝑟1.234 = −0.3

48 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Question : 25
If 𝑋1 , 𝑋2 , … , 𝑋𝑛 is a random sample from a population with density 𝑓(𝑥, 𝜃) =
𝜃−1
if 0<𝑥 < 1
{𝜃𝑥
0, otherwise
Where 𝜃 > 0 is an unknown parameter, what is a 100(1 − 𝑎)%
confidence interval for 𝜃?

2
𝜒𝛼 (2𝑛) 𝜒2 𝛼 (2𝑛)
1−
2 2
A. [ , ]
2∑𝑛
𝑖=1 ln 𝑋𝑖 2∑𝑛
𝑖=1 ln 𝑋𝑖

2
𝜒𝛼 (𝑛) 𝜒2 𝛼 (𝑛)
1−
2 2
B. [ , ]
−2∑𝑛
𝑖=1 ln 𝑋𝑖 −2∑𝑛
𝑖=1 ln 𝑋𝑖

2
𝜒𝛼 (2𝑛) 𝜒2 𝛼 (2𝑛)
1−
2 2
C. [ , ]
−2∑𝑛
𝑖=1 ln 𝑋𝑖 −2∑𝑛
𝑖=1 ln 𝑋𝑖

2
𝜒𝛼 (𝑛) 𝜒2 𝛼 (𝑛)
1−
2 2
D. [2∑𝑛 ,𝑛 ]
𝑖=1 ln 𝑋𝑖 2∑𝑖=1 ln 𝑋𝑖

Question : 26
Suppose that 𝑟 ball are drawn one at time without replacement from a bag containing n white
and m black balls. Let 𝑆𝑟 be the number of black balls drawn, then var (𝑆𝑟 )
is equal to
𝑚𝑛𝑟
A. (𝑚 + 𝑛 − 𝑟)
(𝑚+𝑛)2 (𝑚+𝑛+1)
𝑚𝑛𝑟
B. (𝑚 + 𝑛 − 𝑟)
(𝑚+𝑛)2 (𝑚+𝑛)
𝑚𝑛𝑟
C. (𝑚 + 𝑛 − 𝑟)
(𝑚+𝑛)2 (𝑚+𝑛−1)
𝑚𝑛𝑟
D. (𝑚 + 𝑛 − 𝑟)
(𝑚+𝑛)2 (𝑚−𝑛)

Question: 27
Let 𝐹𝑛 be a sequence of DFs defined by

49 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

0, 𝑥<0
1
𝐹𝑛 (𝑥) = {1 − , 0 ≤ 𝑥 ≤ 𝑛
𝑛
1, 𝑛≤𝑥
and let lim𝑛→∞ 𝐹𝑛 (𝑥) = 𝐹(𝑥)

then which of the following is NOT TRUE?

0, 𝑥<0
A. 𝐹(𝑥) = {
1, 𝑥≥0

B. 𝐸(𝑋𝑛𝑘 ) → 𝐸(𝑋 𝑘 ) for any k ≥ 1

C. 𝑋𝑛 converge in probability to 0

D. 𝐹 is distribution function of the RVX degenerate at 𝑥 = 0

Question : 28
The Cumulative distribution function of a random variable 𝑋 is given by
0, 𝑥<2
1 7
𝐹(𝑥) = { (𝑥 2 − ) , 2≤𝑥<3
10 3
1, 𝑥≥3

Which of the following statements is(are) TRUE?

A. 𝐹(𝑥) is continuous everywhere

B. F(x) increases only by jumps


1
C. 𝑃(𝑋 = 2) = 16
5
D. 𝑃 (𝑋 = ∣ 2 ≤ 𝑋 ≤ 3) = 0
2

50 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Question: 29
Let 𝑋 and 𝑌 be two independent standard normal random variables. Then the probability
|𝑋|
density function of 𝑍 = |𝑌| is

√1/2
𝑓(𝑧) = { √𝜋 𝑒 −𝑧/2 𝑧 −1/2 if 𝑧 > 0
0, otherwise
2 −𝑧 2/2
𝑒 if 𝑧 > 0
𝑓(𝑧) = { √𝜋
0, otherwise
−𝑧
𝑒 if 𝑧 > 0
𝑓(𝑧) = {
0, otherwise
2 1
⋅ if 𝑧 > 0
𝑓(𝑧) = {√𝜋 (1 + 𝑧 2 )
0, otherwise

Question: 30
If the joint moment generating function of the random variables X and Y is
2 +18𝑡 2 +12𝑠𝑡)
M(s, t) = 𝑒 (𝑠+3𝑡+2𝑠
Which of the following is/are correct?
A. 𝐸(𝑋) < 𝐸(𝑌)
B. Corr (𝑋, 𝑌) > 0
C. Cov (𝑋, 𝑌) = 12
D. all of the above

3.5 SUMMARY
The main points which we have covered in this lessons are what is estimator and what is
consistency, efficiency and sufficiency of the estimator and how to get best estimator.
3.6 GLOSSARY
Motivation: These Problems are very useful in real life and we can use it in data science ,
economics as well as social sciemce.
Attention: Think how the best estimator are useful in real world problems.

51 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

3.7 ANSWER TO IN-TEXT QUESTIONS


Answer 1 : B
Explanation:
Let 𝑋1 , 𝑋2 , …

be identically distributed random variable and let N be a random variable.


Define 𝑆𝑁 = 𝑋1 + 𝑋2 + ⋯ + 𝑋𝑁
Then E(SN ) = E(Xi ) ⋅ E(N) = 4
𝑉(𝑆𝑁 ) = 𝐸(𝑁)Var (𝑋𝑖 ) + [𝐸(𝑋𝑖 )]2 Var (𝑁) = 10

Answer 2 : B
Explanation:
If 𝐴 and 𝐵 be independent 𝑅𝑎𝑛𝑑𝑜𝑚 𝑉𝑎𝑟𝑖𝑎𝑏𝑙𝑒 each having the uniform distribution on [0,1].
Let 𝑈 = min{𝐴, 𝐵} and 𝑉 = max{𝐴, 𝐵},
then
𝐸(𝑈) = 1/3, 𝐸(𝑉) = 2/3 and 𝑈𝑉 = 𝐴𝐵 and 𝑈 + 𝑉 = 𝐴 + 𝐵
Thus Cov (𝑈, 𝑉) = 𝐸(𝑈𝑉) − 𝐸(𝑈)
𝐸(𝑉) = 𝐸(𝐴𝐵) − 𝐸(𝑈)
1 2 1
E(V) = E(A) ⋅ E(B) − E(U) ⋅ E(V) = − =
4 9 36
Answer 3 : B
Explanation:
1
𝑋𝑖 ∼ 𝑈(0, 𝜃 2 ) 𝑓(𝑥) = ; 0 < 𝑥𝑖 < 𝜃 2
𝜃2

𝑋(3) ≤ 𝜃 2 ⇒ 𝜃ˆ ∈ [√𝑋(3) , ∞)
3
1
𝐿(𝑋, 𝜃) = ∏ 𝑓(𝑥𝑖 , 𝜃) =
𝜃6
𝑖=1
∂𝐿
⇒ ∂𝜃 < 0 there fore given function is decreasing then 𝜃ˆ = √𝑋(3)

Answer 4 : C
Explanation:
52 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

X −1 0 1

P(x) 1/8 6/8 1/8

1 6 1
E(X) = −1 × + 0 × + 1 × = 0
8 8 8

1 6 1 1
E(𝑋 2 ) = 1 × + 0 × + 1 × =
8 8 8 4

1 1
𝑉(𝑋) = 𝐸(𝑋 2 ) − {𝐸(𝑋)}2 = ⇒ 𝜎𝑋 =
4 2

𝑃{|𝑋 − 𝜇𝑥 | ≥ 2𝜎𝑥 } = 𝑃{|𝑋| ≥ 1} = 1 − 𝑃(|𝑋| < 1)

= 1 − 𝑃(−< 𝑋 < 1) = 1 − 𝑃(𝑋 = 0) = 1/4

1
𝑃{|𝑋 − 𝜇𝑥 | ≥ 2𝜎𝑥 } ≤ 4 [By Chebychev’s inequality]
Answer 5 : B
Explanation:
Let 𝑋𝑖 , 𝑌𝑖 ; (𝑖 = 1,2)
be a i.i.d random sample of size 2 from a standard normal

√2(X1 +X2 )
distribution. Then W = ∼ 𝑡(2)
√(X2 −X1 )2 +(Y2 −Y1 )2

Hence option (b) is correct.


Answer 6 : D
Explanation:

53 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

Let 𝑋 be Random Variable with 𝑀𝑋 (𝑡) = 𝐸(𝑒 𝑡𝑋 ) = ∑etx P(X = x)


1
; 𝑥=0
6
1
; 𝑥=1
3
Then 𝑃(𝑋 = 𝑥) = 1
; 𝑥=2
3
1
{6 ; 𝑥 = 3

𝑃(𝑋 ≤ 2) = 𝑃(𝑋 = 0) + 𝑃(𝑋 = 1) + 𝑃(𝑋 = 2)


1 1 1 5
= + + =
6 3 3 6

Answer 7 : A
Explanation:
1 1
𝑋1 , 𝑋2 , … , 𝑋𝑛 be a random sample from U (𝜃 − 2 , 𝜃 + 2)
1 1
𝑓(𝑥) = 1; 𝜃 − < 𝑥𝑖 < 𝜃 +
2 2
1 1
𝜃ˆ ∈ [𝑋(𝑛) − , 𝑋(1) + ]
2 2
distribution of 𝑋 free from parameter, then
1 1
𝜃ˆ = 𝜆 (𝑋(𝑛) − ) + (1 − 𝜆) (𝑋(1) + ) ; 0 < 𝜆 < 1
2 2

1 1 3
Take 𝜆 = 2 , 4 and 4 then we obtained mle of 𝜃 are

1 1 1
(𝑋(1) + 𝑋(𝑛) ); 4 (3𝑋(1) + 𝑋(𝑛) + 1); 4 (3𝑋(1) + 𝑋(𝑛) + 1) respectively.
2

Hence option (a) is correct.


Answer 8 : D
Explanation:
Let 𝑋 and 𝑌 be random variable having joint probability
𝑘
density function 𝑓(𝑥, 𝑦) = (1+𝑥 2)(1+𝑦 2) ; −∞ < (𝑥, 𝑦) < ∞

54 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

∞ ∞ 1
∫−∞ ∫−∞ 𝑓(𝑥, 𝑦)𝑑𝑥𝑑𝑦 = 1 ⇒ 𝑘 =
𝜋2
1 1
Since 𝑋 and 𝑌 are independent, then 𝑋 ∼ 𝑓(𝑥) = 𝜋 1+𝑥 2 ; −∞ < 𝑥 < ∞

P(X = Y) = 0{ There is no region occur corresponding to X = Y, then probability


corresponding to this region will be zero}

Answer 9 : D
Explanation:
Clearly, 𝑋1 , 𝑋2 , … , 𝑋𝑛
are i.i.d 𝐺(3,1) random variables. Then, 𝐸(𝑋𝑖 ) = 3 and Var (𝑋𝑖 ) = 3, 𝑖 = 1,2, …
Let 𝑆𝑛 = 𝑋1 + 𝑋2 + ⋯ + 𝑋𝑛 , then E(𝑆𝑛 ) = 3𝑛 and Var (𝑆𝑛 ) = 3𝑛

Now For option (a)

𝑆𝑛 −3𝑛
Using CLT ∼ 𝑁(0,1) for all 𝑛 ≥ 1
√3𝑛

For option (b)

𝑆 3𝑛 𝑆 3𝑛
lim𝑛→∞ 𝐸 ( 𝑛𝑛) = lim𝑛→∞ = 3; lim𝑛→∞ 𝑉 ( 𝑛𝑛) = lim𝑛→∞ 𝑛2 = 0
𝑛

By Using Convergence in probability condition


(Consistency Properties)

𝑆
For all 𝜀 > 0, 𝑃 (| 𝑛𝑛 − 3| > 𝜀) → 0 as n → ∞

For option (c)


𝑆𝑛
→3
𝑛
with probability 1 (By using convergent in probability condition)

55 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

For option (d)


𝑠𝑛 −𝐸(𝑠𝑛 ) 3(𝑛−√𝑛)−𝐸(𝑆𝑛 )
lim𝑛→∞ 𝑃 ( ≥ ) = 𝑃(𝑍 ≥ −√3) = 1 −
√Var (𝑆2 ) √Var (𝑆𝑤 )
𝑃(𝑍 ≤ −√3)
1
= 1 − Φ(−√3) ≥
2
Answer 10 : D
Explanation:
(A) Sum of independent binomial variate is also a binomial variate if corresponding
probability will be same
Then 𝑋 + 𝑌 ∼ Bin (2𝑛, 𝑝)

(B) When there are more than two variables include, the observation lead to multinomial
distribution.
(𝑋, 𝑌) not follows Multinomial (2𝑛; 𝑝, 𝑝)

(C) Var (X − Y) = E(X − Y)2 − {E(X − Y)}2 = E(X − 𝑌)2

(D) Cov (𝑋 + 𝑌, 𝑋 − 𝑌) = 𝑉(𝑋) − Cov (𝑋, 𝑌) + Cov (𝑌, 𝑋) − 𝑉(𝑌) = 0

{∴ X and Y are independent Cov (X1 Y) = Cov (Y, X) = 0}

Hence option D is correct.


Answer 11 : D
Explanation:
1 2 +𝑦 2 ) 1 2 1 2
1 1 1
The joint pdf of 𝑋 and 𝑌 is 𝑓(𝑥, 𝑦) = 2𝜋 𝑒 −2(𝑥 = 𝑒 −2(𝑥 ) × 𝑒 −2(𝑦 ) ; (𝑥, 𝑦)
√2𝜋 √2𝜋
∈ ℝ2
It is easy to see that 𝑋 and 𝑌 are i.i.d 𝑁(0,1) random
variables, and therefore,

1
𝑃(𝑋 > 0) = 2
1 1 1
𝑃(𝑋 > 0)𝑃(𝑌 < 0) = × =
2 2 4

56 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

𝑃(𝑋 > 0, 𝑌 < 0) 1


𝑃(𝑋 > 0 ∣ 𝑌 < 0) = =
𝑃(𝑌 < 0) 2
Answer 12 : D
Explanation:
E{E[ X ∣ Y ]} = E[X] = E[Y] {Given that E[X] = E[Y]}

V(𝑋 − 𝑌) = 𝐸(𝑋 − 𝑌)2 − {𝐸(𝑋 − 𝑌)}2 = 𝐸(𝑋 − 𝑌)2


{ Since 𝐸[𝑋 − 𝑌] = 0}
𝐸[ V(X ∣ Y)] + V[E(X ∣ Y)] = V(X)
𝑋 and 𝑌 may or may not be same distribution.
Hence option (D) is correct.
Answer 13 : C
Explanation:
𝑋1 , 𝑋2 , … , 𝑋𝑛 be a random sample from Exp (𝜃)
distribution, where 𝜃 ∈ (0, ∞)
Then 2𝜃∑𝑛𝑖=1 𝑋𝑖 ∼ 𝜒2𝑛 2
⇒ P(0 < 2𝜃∑𝑛𝑖=1 𝑋𝑖 ≤ 𝜒2𝑛,0.95
2
)=
0.95
2
𝜒2𝑛,0.95
0 < 2𝜃∑𝑛𝑖=1 𝑋𝑖 ≤ 𝜒2𝑛,0.95
2
⇒ 𝜃 ∈ (0, ]
2𝑛𝑥‾
Hence option C is correct.
Answer 14 : B
Explanation:
𝑋1 , 𝑋2 , … , 𝑋𝑛 are independent, we have that E(𝑌𝑛 ) = E(𝑋1 ) × … × 𝐸(𝑋2 ). Similarly,
𝐸(𝑌𝑛2 ) = E(𝑋12 ) × … × 𝐸(𝑌𝑛2 ). Since
E(𝑋𝑖 ) = 1/2 and E(𝑌𝑖2 ) = 1/3 for i = 1,2, … , n
it follows that
1 1
Var (𝑌𝑛 ) = 𝐸(𝑌𝑛2 ) − [E(𝑌𝑛 )]2 = 𝑛 − 2𝑛
3 2
Hence option (B) is correct.
Answer 15 : C
Explanation:
Note that 𝑃(𝑋1 < 𝑋2 ) + 𝑃(𝑋2 < 𝑋1 ) + 𝑃(𝑋1 = 𝑋2 ) = 1
since the corresponding events are disjoint and exhaust

57 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

all the probabilities. But 𝑃(𝑋1 < 𝑋2 ) = 𝑃(𝑋2 < 𝑋1 )


by symmetry. Furthermore, 𝑃(𝑋1 = 𝑋2 ) = 0
1
since the random variables are continuous. Therefore, 𝑃(𝑋1 < 𝑋2 ) = 2. From above results
1
𝑃(𝑋3 < 𝑋2 < max(𝑋1, 𝑋4 )) = 4
Hence option C is correct.
Answer 16 : A
Explanation:
1 1
𝑋1 , 𝑋2 , … , 𝑋𝑛 be a random sample from U (𝜃 − , 𝜃 + )
2 2
1 1
𝑓(𝑥) = 1; 𝜃 − < 𝑥𝑖 < 𝜃 +
2 2
1 1
𝜃ˆ ∈ [𝑋(𝑛) − 2 , 𝑋(1) + 2] ; distribution of 𝑋
1 1
free from parameter, then 𝜃ˆ = 𝜆 (𝑋(𝑛) − 2) + (1 − 𝜆) (𝑋(1) + 2) ; 0 < 𝜆 < 1
1 1
Take 𝜆 = 2 , 4 we get
1
𝑇1 = 2 (𝑋(1) + 𝑋(𝑛) ) is MLE as well as consistent for 𝜃
1
𝑇2 = (3𝑋(1) + 𝑋(𝑛) + 1)
4
is MLE as well as consistent for 𝜃 but not unbiased…

Hence option (A) is correct.


Answer 17 : D
Explanation:
0, 𝑥 < 0
lim𝑛→∞ 𝐹𝑛 (𝑥) = 𝐹(𝑥) = {
1, 𝑥 ≥ 0
Note that 𝐹𝑛 is the DF of the RV𝑋𝑛 with PMF
1 1
𝑃(𝑋𝑛 = 0) = 1 − , 𝑃(𝑋𝑛 = 𝑛) =
𝑛 𝑛
And F is the distribution function of the RVX degenerate at
1
X = 0. We have 𝐸(𝑋𝑛𝑘 ) = 𝑛𝑘 (2) = 𝑛𝑘−1 where k
is a positive integer. Also 𝐸(𝑋 𝑘 ) = 0 so that
𝐸(𝑋𝑛𝑘 ) → 𝐸(𝑋 𝑘 ) for any k ≥ 1
𝑋𝑛 does not converge in probability to 0 because 𝑋𝑛
converges in probability to 1 .

58 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Hence option (d) is correct.


Answer 18 : D
Explanation:
Given that

𝑋 ∼ 𝑈[𝜃, 𝜃 + 1] ⇒ 𝑓(𝑥) = 1; 𝜃 ≤ 𝑋𝑖 ≤ 𝜃 + 1

For option (A)

1 2𝜃+1 1
𝐸 (𝑋‾ − 2) = −2=𝜃
2
1
(𝑋‾ − 2) is an unbiased estimate of 𝜃
For option (B)

𝑋(1) + 𝑋(𝑛) 1 2𝜃 + 1 1
𝐸( − )= − =𝜃
2 2 2 2
𝑋(1) +𝑋(𝑛) 1
− 2 is an unbiased estimate of 𝜃
2

For option (C)


The joint pdf of 𝑋1 , 𝑋2 , … , 𝑋𝑛 is given by 𝑓𝜃 (𝑋1 , 𝑋2 , … , 𝑋𝑛 ) = 1. 𝐼𝐴

Where 𝐴 = {(𝑥1 , 𝑥2 , … , 𝑥𝑛 ); 𝜃 ≤ 𝑋(1) ≤ 𝑋(𝑛) ≤ 𝜃 + 1}


(𝑋(1) , 𝑋(𝑛) ) is jointly sufficient for 𝜃
𝑋(1) +𝑋(𝑛) 1
Construct a non-zero function 𝑔(𝑇) = − 𝜃 − 2 ∀𝜃 > 0
2
𝑋(1) + 𝑋(𝑛) 1
E[g(T)] = 𝐸 [ −𝜃− ]=0
2 2
By-completeness property it is not complete sufficient statistics for 𝜃.

Hence option (d) is correct.


Answer 19 : D
Explanation:

59 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

𝑆 1 𝑆
lim𝑛→∞ 𝐸 ( 𝑛𝑛) → 2 and lim𝑛→∞ 𝑉 ( 𝑛𝑛) → 0

𝑆𝑛 1
this implies → 2 in probability
𝑛
𝑆𝑛 1 𝑆𝑛 1
If → 2 in probability then → 2 in distribution
𝑛 𝑛

1
𝑆𝑛 −
2
Using CLT: 3
∼ 𝑁(𝟎, 1)as 𝐧 → ∞

4

𝑠𝑛 −𝐸(𝑆𝑛 ) 𝑛−𝐸(𝑆𝑛 )
𝐏( ≤ ) → 1 as 𝐧 → ∞
√𝑆𝑛 √𝑆𝑛

Hence option 𝑑 is correct.


Answer 20 : D
Explanation:
1 1
1, 𝜃 − 2 ≤ 𝑥𝑖 ≤ 𝜃 + 2
Here 𝐿 = 𝐿(𝜃; 𝑋1 , 𝑋2 , … , 𝑋𝑛 ) = {
0, otherwise
1 1
If 𝑋(1) , 𝑋(2) , … , 𝑋(𝑛) is order sample, then 𝜃 − 2 ≤ 𝑥(1) ≤ 𝑥(2 ) ≤ ⋯ ≤ 𝜃 + 2
1 1
Thus 𝐿 attains the maximum if 𝜃ˆ ∈ [𝑋(𝑛) − 2 , 𝑋(1) + 2]
Therefore the convex linear combination is also a MLE for 𝜃

1 1
𝜃ˆ = 𝜆 (𝑋(𝑛) − ) + (1 − 𝜆) (𝑋(1) + )
2 2

1 𝑋(𝑛) +𝑋(1)
take 𝜆 = 2 ⇒ 𝜃ˆ = 𝑇𝑛 = and also
2

𝑋(𝑛) + 𝑋(1)
𝐸( )=𝜃
2
𝑋(𝑛) +𝑋(1)
By property of MLE𝑇𝑛 = 2
is consistent for 𝜃
From above 𝑇𝑛 are unbiased, consistent and also MLE for 𝜃

60 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Hence option (D) is correct.


Answer 21 : B
Explanation:
Since F is continuous on R ∖ {0,1,2}, it follows that P(X = x) = 0∀x ∈ ℝ ∖ {0,1,2}
4
Further, P(X = 0) = F(0) − F(0−) = 9
4
P(X = 1) = F(1) − F(1−) =
9
1
and 𝑃(𝑋 = 2) = 𝐹(2) − 𝐹(2−) = 9
Therefore, we conclude that 𝑋 is a discrete random
variable positive probabilities at three points 0,1,2.
4 1 5
P(1 ≤ X ≤ 2) = P(X = 1) + P(X = 2) = + =
9 9 9
4 4 1 2
𝐸(𝑋) = ∑2𝑥=0 𝑥𝑃(𝑋 = 𝑥) = 0 × + 1 × + 2 × =
9 9 9 3
and 𝑃(0 < 𝑋 < 1) = 0.
Hence option B is correct.
Answer 22 : A
Explanation:
Notice that 𝐴 ∪ 𝐵 𝑐 = 𝐴 ∪ (𝐴𝑐 ∩ 𝐵 𝑐 )
1 1 5
Thus, 𝑃(𝐴 ∪ 𝐵 𝑐 ) = 𝑃(𝐴) + 𝑃(𝐴𝑐 ∩ 𝐵 𝑐 ) = 2 + 3 = 6
Hence option (A) is correct.
Answer 23 : C
Explanation:
Let 𝑋 and 𝑌 be i.i.d random variable having distribution function
1 1
𝐹(𝑥) = + tan−1 (𝑥) − ∞ < 𝑥 < ∞
2 𝜋
1 1
𝑋 ∼ 𝑓(𝑥) = ; −∞ < 𝑥 < ∞
𝜋 1 + 𝑥2
Φ𝑋 (𝑡) = 𝑒 −|𝑡| ; Φ𝑋+𝑌 (𝑡) = 𝐸(𝑒 𝑖𝑡(𝑥+𝑦) ) = 𝑒 −2|𝑡|

Both 𝑋 and 𝑌 are i.i.d random variable.

Hence option C is correct.


61 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

Answer 24 : C
Explanation:
Since sample multiple correlation lies between 0 to 1
0 ≤ 𝑟1.23,…,𝑛 ≤ 1
So option C is only hold this condition

Hence option C is correct.


Answer 25 : C
Explanation:
We use the random variable
𝑄 = −2𝜃∑𝑛𝑖=1 ln 𝑋𝑖 ∼ 𝜒(2𝑛)
2

As the pivotal quantity. The 100(1 − 𝑎)% confidence interval


for 𝜃 can be constructed from
𝜒𝛼2 (2𝑛) 2
𝜒1− 𝛼 (2𝑛)
1−𝛼 = 𝑃 (𝜒𝛼2 (2𝑛) ≤𝑄≤ 2
𝜒1− α (2𝑛)) = 𝑃[ 2
≤𝜃≤ 2
]
2 2 −2∑𝑛𝑖=1 ln 𝑋𝑖 𝑛
−2∑𝑖=1 ln 𝑋𝑖
2
𝜒𝛼 (2𝑛) 𝜒2 𝛼 (2𝑛)
1−
2 2
Thus, 100(1 − 𝑎)% confidence interval for 𝜃 is given by [ , ]
−2∑𝑛
𝑖=1 ln 𝑋𝑖 −2∑𝑛
𝑖=1 ln 𝑋𝑖

Hence option C is correct.

Answer 26 : A
Explanation:
1, if the kth ball drawn is black
Let us define 𝑋𝑘 = { 𝑘 = 1,2, … , 𝑟
0, if the kth ball drawn is white

Then 𝑆𝑟 = 𝑋1 + 𝑋2 + ⋯ + 𝑋𝑟

𝑚 𝑛
Also, P(𝑋𝑘 = 1) = 𝑚+𝑛′ , and P(𝑋𝑘 = 0) = 𝑚+𝑛

𝑚 𝑚𝑛
Thus E(𝑋𝑘 ) = 𝑚+𝑛 and V(𝑋𝑘 ) = (𝑚+𝑛)2

62 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

To compute cov (𝑋𝑗 , 𝑋𝑘 ), 𝑗 ≠ 𝑘

note that the random variable 𝑋𝑗 𝑋𝑘 = 1id

then jth and 𝑘th balls drawn are black, and = 0 otherwise.

𝑚 𝑚−1
Thus 𝐸(𝑋𝑗 , 𝑋𝑘 ) = 𝑃(𝑋𝑗 = 1, 𝑋𝑘 = 1) = 𝑚+𝑛 𝑚+𝑛−1

𝑚𝑛
and Cov (𝑋𝑗 , 𝑋𝑘 ) = (𝑚+𝑛)2(𝑚+𝑛−1)

𝑚𝑟 𝑚𝑛𝑟
Thus E(𝑆𝑟 ) = ∑𝑟𝐾=1 𝐸(𝑋𝑘 ) = 𝑚+𝑛 and 𝑉(𝑆𝑟 ) = (𝑚+𝑛)2 (𝑚+𝑛+1) (𝑚 + 𝑛 − 𝑟)

Hence option A is correct.


Answer 27 : C
Explanation:
0, 𝑥 < 0
lim𝑛→∞ 𝐹𝑛 (𝑥) = 𝐹(𝑥) = {
1, 𝑥 ≥ 0
Note that 𝐹𝑛 is the 𝐷𝐹 of the 𝑅𝑉𝑋𝑛 with PMF
1 1
𝑃(𝑋𝑛 = 0) = 1 − , 𝑃(𝑋𝑛 = 𝑛) =
𝑛 𝑛
And F is the distribution function of the RVX degenerate at X = 0.
1
We have 𝐸(𝑋𝑛𝑘 ) = 𝑛𝑘 (𝑛) = 𝑛𝑘−1
where k is a positive integer. Also 𝐸(𝑋 𝑘 ) = 0,
so that 𝐸(𝑋𝑛𝑘 ) → 𝐸(𝑋 𝑘 ) for any k ≥ 1

𝑋𝑛 does not converge in probability to 0 because 𝑋𝑛 converges in probability to 1 .


Hence option C is correct.
Answer 28 : D
Explanation:

63 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

1 7 1
We have 𝐹(2) = 10 (4 − 3) = 6 and
𝐹(2− ) = 0. since 𝐹(2) ≠ 𝐹(2− ), the function 𝐹
is not continuous at 2
It is direct to see that 𝐹 is increasing in 𝑥 ∈ [2,3)
without any jump
1
P(X = 2) = F(2) − F(2− ) = 6
5
Since F is continuous at x = 5/2, we have P (X = 2) = 0, and therefore,
5
5 𝑃 (𝑋 = 2 , 2 ≤ 𝑋 ≤ 3)
𝑃 (𝑋 = ∣ 2 ≤ 𝑋 ≤ 3) =
2 𝑃(2 ≤ 𝑋 ≤ 3)
5
𝑃 (𝑋 = 2)
= =0
𝑃(2 ≤ 𝑋 ≤ 3)

Hence option D is correct.


Answer 29 : D
Explanation:
We know that the ratio of two independent standard normal random variables has Cauchy
𝑋
distribution, and therefore, the pdf of U = 𝑉 is
|𝑋|
Now, it is to verify that the pdf of 𝑍 = |𝑌|

2 1
⋅ if 𝑧 > 0
𝑓(𝑧) = {√𝜋 (1 + 𝑧 2 )
0, otherwise
B. Answer 30 : D
Explanation:
2 2
M(s, t) = 𝑒 (𝑠+3𝑡+2𝑠 +18𝑡 +12𝑠𝑡)
∂𝑀 ∂𝑀
𝐸(𝑌) = [ ] = 3; 𝐸(𝑋) = [ ] =1
∂𝑡 (0,0) ∂𝑠 (0,0)
∂2 𝑀
𝐸(𝑋𝑌) = [ ] = 15
∂𝑠 ∂𝑡 (0,0)
𝐸(𝑋) < 𝐸(𝑌)
Cov (𝑋, 𝑌) = 𝐸(𝑋𝑌) − 𝐸(𝑋) ⋅ 𝐸(𝑌) = 15 − 1 × 3 = 12
Cov (𝑋, 𝑌) > 0 this implies Corr (𝑋, 𝑌) > 0

64 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Hence option D is correct.


3.8 REFERENCES
• Devore, J. (2012). Probability and statistics for engineers, 8th ed. Cengage Learning.
• John A. Rice (2007). Mathematical Statistics and Data Analysis, 3rd ed. Thomson
Brooks/Cole
• Larsen, R., Marx, M. (2011). An introduction to mathematical statistics and its
applications. Prentice Hall.
• Miller, I., Miller, M. (2017). J. Freund’s mathematical statistics with applications, 8th
ed. Pearson.
• Demetri Kantarelis, D. and Malcolm O. Asadoorian, M. O. (2009). Essentials of
Inferential Statistics, 5th edition, University Press of America.
• Hogg, R., Tanis, E., Zimmerman, D. (2021) Probability and Statistical inference,
10TH Edition, Pearson
3.9 SUGGESTED READINGS
• S. C Gupta , V.K Kapoor, Fundamentals of Mathematical Statistics,Sultan Chand
Publication, 11th Edition.
• B.L Agarwal, Programmed Statistics ,New Age International Publishers, 2nd Edition.

65 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

LESSON 4
METHODS OF POINT ESTIMATION

STRUCTURE
4.1 Learning Objectives
4.2 Introduction
4.3 Mathods of Point Estimation
4.3.1 Method of Moments
4.3.2 Method of Maximum Likelihood
4.4 In-Text Questions
4.5 Summary
4.6 Glossary
4.7 Answer to In-Text Questions
4.8 References
4.9 Suggested Readings
4.1 LEARNING OBJECTIVES
Point estimators are functions that are used to find an approximate value of a population
parameter from random samples of the population. They use the sample data of a population
to calculate a point estimate or a statistic that serves as the best estimate of an
unknown parameter of a population.
4.2 INTRODUCTION
Most often, the existing methods of finding the parameters of large populations are unrealistic.
For example, when finding the average age of kids attending kindergarten, it will be impossible
to collect the exact age of every kindergarten kid in the world. Instead, a statistician can use
the point estimator to make an estimate of the population parameter.
Properties of Point Estimators
The following are the main characteristics of point estimators:
1. Bias
The bias of a point estimator is defined as the difference between the expected value of the
estimator and the value of the parameter being estimated. When the estimated value of the

66 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

parameter and the value of the parameter being estimated are equal, the estimator is considered
unbiased.
Also, the closer the expected value of a parameter is to the value of the parameter being
measured, the lesser the bias is.
2. Consistency
Consistency tells us how close the point estimator stays to the value of the parameter as it
increases in size. The point estimator requires a large sample size for it to be more consistent
and accurate.
You can also check if a point estimator is consistent by looking at its corresponding expected
value and variance. For the point estimator to be consistent, the expected value should move
toward the true value of the parameter.
3. Most efficient and unbiased
The most efficient point estimator is the one with the smallest variance of all the unbiased and
consistent estimators. The variance measures the level of dispersion from the estimate, and the
smallest variance should vary the least from one sample to the other.
Generally, the efficiency of the estimator depends on the distribution of the population. For
example, in a normal distribution, the mean is considered more efficient than the median, but
the same does not apply in asymmetrical distributions.
Point Estimation and Interval Estimation
The two main types of estimators in statistics are point estimators and interval estimators. Point
estimation is the opposite of interval estimation. It produces a single value while the latter
produces a range of values.
A point estimator is a statistic used to estimate the value of an unknown parameter of a
population. It uses sample data when calculating a single statistic that will be the best estimate
of the unknown parameter of the population.
On the other hand, interval estimation uses sample data to calculate the interval of the
possible values of an unknown parameter of a population. The interval of the parameter is
selected in a way that it falls within a 95% or higher probability, also known as the confidence
interval.
The confidence interval is used to indicate how reliable an estimate is, and it is calculated from
the observed data. The endpoints of the intervals are referred to as the upper and lower
confidence limits.
4.3 METHODS OF POINT ESTIMATORS
The following are some of the methods of point estimators.
67 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

1. Method of Moments
2. Method of Maximum Likelihood
Now we shall now, briefly, explain these terms one by one.
4.3.1 Method of Moments :
The method of moments essentially amounts to equating sample moments and corresponding
population moments and solving the resulting equations for the parameters to be determined.
In most of the problems the parameter to be estimated is some known function of a given
(finite) number of population moments.
Suppose 𝐹(𝑥, 𝜃), 𝜃 ∈ Θ, (𝜃 may he vector valued i.e., 𝜃 = (𝜃1 , 𝜃2 , … . , 𝜃𝑘 ) is a distribution
function of a random variable 𝑋 with unknown parameter 𝜃. Let 𝑟 th moment about origin of
population be denoted by:
𝜇𝑟′ = 𝐸(𝑋 r ) 𝑟 = 1,2, … . , 𝑘
It is supposed to exist for 1 ≤ 𝑟 ≤ 𝑘.
Let 𝑋1 , 𝑋2 , … . , 𝑋𝑛 be a random sample of size 𝑛 from 𝐹(𝑥, 𝜃). Let the 𝑟 th sample moment
about origin be
𝑛
1
𝑚𝑟′ = ∑ 𝑋𝑖𝑟 , 𝑟 = 1,2, … , 𝑘
𝑛𝑖
𝑖=1

Equating the moments of population to the corresponding moments of the sample, we have
𝜇𝑟′ = 𝑚𝑟′ , 𝑟 = 1,2, … , 𝑘
The solution of these 𝑘 equations will give us the required estimators. We may also compare
the central moments of the population to the central moments of the sample to case the
problem.
Thus to estimate the 𝑘 parameters (𝜃1 , 𝜃2 , …𝑘 , 𝜃𝑘 ) we first obtain the mean (first moment about
origin) and the next (𝑘 − 1) moments (Central or Row).
In other words, let us suppose 𝜃 = 𝑔(𝜇1′ , 𝜇2′ , … , 𝜇𝑘′ ) where 𝑔 is some known numerical
function, then the muthod of moments consists in estimating 𝜃 by the statistic.
1 1 1
𝑇(𝑋1 , 𝑋2 , … , 𝑋𝑛 ) = 𝑔 ( Σ𝑋𝑖 , Σ𝑋𝑙 , … , Σ𝑋𝑖 )
𝑛 𝑛 𝑛
′ ′ ′ ),
= 𝑔(𝑚1 , 𝑚2 , … . 𝑚𝑟 say.
Note 1. The number of equations, (𝜇𝑟′ = 𝑚𝑟′ ) is taken equal to the number of unknown
parameters. If we must estimate 𝑘 parameters (𝜃1 , 𝜃2 , … , 𝜃𝑘 ) then we equate the population
and sample moments to obtain enough equations to provide the unique solutions for 𝜃𝑗 , 𝑗 =
1,2, 𝑘, where we may write.

68 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

𝜇𝑟′ = 𝑚𝑟′ (𝜃1 , 𝜃2 , … , 𝜃𝑘 )


2. Sometimes the number of equations required varies with the form of the distribution of 𝑋
and with the specific parameter of interest.
3. The method of moments is also applicable for estimation of joint moments. That is, we may
1
use 𝑛 Σ𝑋𝑖 𝑌i; to estimate 𝐸(𝑋𝑌); and so on.
Example 1. Estimate the parameter 𝜆 in sampling from a Poisson population by the method of
moments.
Suppose the five observations 8,9,10,12 and 15 are taken. Show that moment estimate of 𝜆 is
10.8.
Solution. Here, the probability (mass) function of a random variable 𝑋 is.
𝜆𝑥
𝑓(𝑥, 𝜆) = 𝑒 −𝜆 𝑥! , 𝑥 = 0,1,2, …
The first moment about origin is.
𝜇1′ = 𝐸(𝑋) = 𝜆
Let 𝑋1 , 𝑋2 , … , 𝑋𝑛 denote a random sample of size 𝑛 from this distribution. Then the first sample
moment about origin is given by
𝑛
1
𝑚1′ = ∑ 𝑋𝑖 = 𝑋‾
𝑛
𝑖=1

According to the method of moments, we equate population and sample moment, i.e., we set
𝜇1′ = 𝑚1
which gives
𝜆 = 𝑋‾ ⇒ 𝜆ˆ = 𝑋‾
Hence 𝑋‾ is the required method of moments estimator for 𝜆.
8+9+10+12+15
Numerical. 𝜆ˆ = 𝑋‾ = = 10 ⋅ 8.
5
Example 2. Obtain the estimator for parameter 𝑝 in Binomial distribution by the method of
moments.
Solution. The probability mass function of Binomial distribution is given by
𝑛
𝑓(𝑥, 𝑛, 𝑝) = 𝐶𝑥 𝑝 𝑥 𝑞 𝑛−𝑥 , 𝑥 = 0,1,2, … , 𝑛
The first moment about origin is

69 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

𝜇1′ = 𝐸(𝑋) = ∑ 𝑥 ⋅ 𝑛 𝐶𝑥 𝑝 𝑥 𝑞 𝑛−𝑥 = 𝑛𝑝


𝑥=0

Let 𝑋1 , 𝑋2 , … . , 𝑋𝑛 be a random sample of size 𝑚 from 𝑓(𝑥; 𝑛, 𝑝). The corresponding sample
moment about origin is
𝑚
1
𝑚1′ = ∑ 𝑋𝑖 = 𝑋‾
𝑚
𝑖=1

Equating the population moment to the corresponding sample moment, we get


= 𝑚1′ ⇒ 𝑛𝑝 = 𝜆‾
𝜇1′
𝑋‾
𝑝 =
𝑛
Example 3. Estimate the parameters 𝜇 and 𝜎 2 by the method of moments in the sampling
from normal population.
1 1 𝑥−𝜇 2
𝑒 2 𝜎 ) ,𝜎
2) − (
𝑓(𝑥, 𝜇, 𝜎 = > 0, −∞ ≤ 𝑥 ≤ ∞
𝜎√2𝜋

Solution. Here first moment of 𝑋 about origin, 𝜇1′ = 𝜇


Second moment of 𝑋 about origin,
𝜇2′ = 𝜎 2 + 𝜇 2 ⇒ 𝜎 2 = 𝜇2′ − (𝜇1′ )2
1
first sample moment, 𝑚1′ = 𝑛 ∑𝑛𝑖=1 𝑋𝑖 = 𝑋‾
1
Second sample moment 𝑚2′ = 𝑛 ∑𝑥𝑖2
According to the method of moments, we equate population and sample moments, i.e., we set
𝜇1′ = 𝑚𝑖 ⇒ 𝜇 = 𝑋‾
And
𝑛
1
𝜇2′ = 𝑚4′ ⇒ 𝜎 + 𝜇 = ∑ 𝑋𝑖2
2 2
n
𝑖=1

or
1 2
𝜎2 = Σ𝑋 − 𝜇 2
𝑛 𝑖
Solving for 𝜇 and 𝜎 we have
and

70 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

𝜇ˆ = 𝑋‾
1
𝜎ˆ2 = Σ𝑋𝑖2 − 𝑋‾ 2
𝑛
𝑛
1
= ∑ (𝑋𝑖 − 𝑋‾)2
𝑛
𝑖=1

4.3.2: Method of Maximum Likelihood :


from theoretical point of view, the most general method of estimation known is the method of
Maximum Likelihood Estimators (M.L.E.) which was initially formulated by C.F. Gauss but
as a general method of estimation was first introduced by Prof. R.A. Fisher and later developed
by him in a series of papers. Before introducing the method, we will first define Likelihood
Function.
Likelihood Function. Definition. Let 𝑥1 , 𝑥2 , … , 𝑥𝑛 be a random sample of size 𝑛 from a
population with density function 𝑓(𝑥, 𝜃). Then the likelihood function of the sample values
𝑥1 , 𝑥2 , … , 𝑥𝑛 , usually denoted by 𝐿 = 𝐿(𝜃) is their joint density function, given by:
𝑛

𝐿 = 𝑓(𝑥1 , 𝜃)𝑓(𝑥2 , 𝜃) … 𝑓(𝑥𝑛 , 𝜃) = ∏ 𝑓(𝑥𝑖 , 𝜃)


𝑖=1

𝐿 gives the relative likelihood that the random variables assume a particular set of values
𝑥1 , 𝑥2 , … , 𝑥𝑛 , For a given sample 𝑥1 , 𝑥2 , … , 𝑥𝑛 , 𝐿 becomes a function of the variable 𝜃, the
parameter.
The principle of maximum likelihood consists in finding an estimator for the unknown
parameter 𝜃 = (𝜃1 , 𝜃2 , … , 𝜃𝑘 ), say, which maximises the likelihood function 𝐿(𝜃) for
variations in parameter, i.e., we wish to find.
𝜃ˆ = (𝜃ˆ1 , 𝜃ˆ2 , … , 𝜃ˆ𝑘 ) so that
𝐿(𝜃ˆ) > 𝐿(𝜃) ∀𝜃 ∈ Θ, 𝑖. 𝑒, 𝐿(𝜃ˆ) = Sup 𝐿(𝜃)∀𝜃 ∈ Θ
Thus, if there exists a function 𝜃ˆ = 𝜃ˆ(𝑥1 , 𝑥2 , … , 𝑥𝑛 ) of the sample values which maximises 𝐿
for variations in 𝜃, then 𝜃ˆ is to be taken as an estimator of 𝜃. 𝜃ˆ is usually called Maximum
Likelihood Estimator (M.L.E.). Thus 𝜃ˆ is the solution, if any, of
∂𝐿 ∂2 𝐿
= 0 and <0
∂𝜃 ∂𝜃 2
Since 𝐿 > 0, and log 𝐿 is a non-decreasing function of 𝐿; 𝐿 and log 𝐿 attain their extreme
values (maxima or minima) at the same value of 𝜃ˆ. The first of the two equations can be written
as

71 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

1 ∂𝐿 ∂log 𝐿
⋅ =0 ⇒ =0
𝐿 ∂𝜃 ∂𝜃
a form which is much more convenient from practical point of view.
If 𝜃 is vector valued parameter, then 𝜃ˆ = (𝜃ˆ1 , 𝜃ˆ2 , … , 𝜃ˆ𝑘 ), is given by the solution of
simultaneous equations:
∂ ∂
log 𝐿 = log 𝐿(𝜃1 , 𝜃2 , … , 𝜃𝑘 ) = 0; 𝑖 = 1,2, … , 𝑘
∂𝜃𝑖 ∂𝜃𝑖
The above equations are usually referred to as the Likelihood Equations for estimating the
parameters.
Properties of Maximum Likelihood Estimators. We make the following assumptions,
known as the Regularity Conditions :
∂log 𝐿 ∂2 log 𝐿
(i) The first and second order derivatives, viz., ∂𝜃 and ∂𝜃2 exist and are continuous
functions of 𝜃 in a range 𝑅 (including the true value 𝜃0 of the parameter) for almost all 𝑥. For
∂ ∂2
every 𝜃 in 𝑅, |∂𝜃 log 𝐿| < 𝐹1 (𝑥) and |∂𝜃2 log 𝐿| < 𝐹2 (𝑥) where 𝐹1 (𝑥) and 𝐹2 (𝑥) are
integrable functions over (−∞, ∞).
∂3 ∂3
(ii) The third order derivative ∂𝜃3 log 𝐿 exists such that |∂𝜃3 , log 𝐿| < 𝑀(𝑥), where
𝐸[𝑀(𝑥)] < 𝐾, a positive quantity.
(iii) For every 𝜃 in 𝑅,

∂2 ∂2
𝐸 (− log 𝐿) = ∫ (− log 𝐿) 𝐿𝑑𝑥 = 𝐼(𝜃), is finite and non-zero.
∂𝜃 2 −∞ ∂𝜃 2
(iv) The range of integration is independent of 𝜃. But if the range of integration depends on 𝜃,
then 𝑓(𝑥, 𝜃) vanishes at the extremes depending on 𝜃.
This assumption is to make the differentiation under the integral sign valid.
Under the above assumptions M.L.E. possesses a number of important properties, which will
be stated in the form of theorems.
Theorem 1 (Cramer-Rao Theorem). "With probability approaching unity as 𝑛 → ∞, the

likelihood equation ∂𝜃 log 𝐿 = 0, has a solution which converges in probability to the true
value 𝜃0 ". In other words M.L.E.'s are consistent.
Note. MLE's are always consistent estimators but need not be unbiased. For example in
sampling from 𝑁(𝜇, 𝜎 2 ) population.
MLE (𝜇) = 𝑥‾ (sample mean), which is both unbiased and consistent estimator of 𝜇.
MLE (𝜎 2 ) = 𝑠 2 (sample variance), which is consistent but not unbiased estimator of 𝜎 2 .
Theorem 2. (Hazoor Bazar's Theorem). Any consistent solution of the likelihood equation
72 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

provides maximum of the likelihood with probability tending to unity as the sample size (n)
tends to infinity.
Theorem 3 (ASYMPTOTIC NORMALITY OF MLE'S). A consistent solution of the
likelihood equation is asymptotically normally distributed about the true value 𝜃0 . Thus, 𝜃ˆ is
1
asymptotically 𝑁 (𝜃0 , 𝐼(𝜃 )), as 𝑛 → ∞,
0
1 1
Note. Variance of M.L.E. is given by : 𝑉(𝜃ˆ) = 𝐼(𝜃) = ∂2
{𝐸(− 2 log 𝐿}
∂𝜃

Theorem 4. If M.L.E. exists, it is the most efficient in the class of such estimators. Theorem
17.15. If a sufficient estimator exists, it is a function of the Maximum Likelihood Estimator.
Proof. If 𝑡 = 𝑡(𝑥1 , 𝑥2 , … , 𝑥𝑛 ) is a sufficient estimator of 𝜃, then Likelihood Function can be
written as
𝐿 = 𝑔(𝑡, 𝜃)ℎ(𝑥1 , 𝑥2 , 𝑥3 , … , 𝑥𝑛 ∣ 𝑡), where 𝑔(𝑡, 𝜃) is the density function of 𝑡 and 𝜃 and
ℎ(𝑥1 , 𝑥2 , … , 𝑥𝑛 ∣ 𝑡) is the density function of the sample, given 𝑡, and is independent of 𝜃.
∴ log 𝐿 = log 𝑔(𝑡, 𝜃) + log ℎ(𝑥1 , 𝑥2 , … , 𝑥𝑛 ∣ 𝑡)
∂log 𝐿 ∂
Differentiating w.r.t to 𝜃, we get: = ∂𝜃 log 𝑔(𝑡, 𝜃) = 𝜓(𝑡, 𝜃), (say),
∂𝜃

which is a function of 𝑡 and 𝜃 only.


∂ log 𝐿
M.L.E. of 𝜃 is given by = 0 ⇒ 𝜓(𝑡, 𝜃) = 0
∂𝜃

∴ 𝜃ˆ = 𝜂(𝑡) = Some function of sufficient statistic


⇒ 𝑡ˆ = 𝜉(𝜃ˆ) = Some function of M.L.E.
Hence the theorem.
Example 1. For random sampling from a normal population, find maximum likelihood
estimators for
(i) the population mean, when the population variance is known,
(ii) the population variance, when the population mean is known,
(iii) the simultaneous estimation of both the population means and variance.
1 𝑥−𝜇 2
1
Solution. Let the p.d.f. of Normal distribution be 𝑓(𝑥, 𝜇, 𝜎 2 ) = 𝜎√2𝜋 𝑒 −2( )
𝜎

Let 𝑥1 , 𝑥2 , … , 𝑥𝑛 denote a random sample from this distribution.

73 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

1 1 𝑥 −𝜇 2 1 1 𝑥𝑗 −𝜇 2
− ( 1 )
𝑒 2 𝜎 )
− (
𝐿 = 𝑒 2 𝜎
….
𝜎√2𝜋 𝜎√2𝜋
𝑛 𝑛
1 −
1
=( ) 𝑒 2𝜎2 ∑ (𝑥𝑖 − 𝜇)2
𝜎√2𝜋 𝑖=1

Then the likelihood function is


𝑛
1
log 𝐿 = −𝑛log √2𝜋 − 𝑛log 𝜎 − 2 ∑ (𝑥𝑖 − 𝜇)2
2𝜎
𝑖=1
2
(i) 𝜎 is known, then 𝐿 is a function of 𝜇 only. Thus, to estimate 𝜇, we have, the likelihood
equation.
∂log 𝐿 2𝑛
= 2 (𝑥‾ − 𝜇) = 0
∂𝜇 2𝜎
Solving the equation for 𝜇, we get
𝜇ˆ = 𝑥‾
(ii) If 𝜇 is known, then 𝐿 is a function of 𝜎 2 only. To find m.l.e. for 𝜎 2 we differentiate 𝐿
w.r.t. 𝜎 2 .
The likelihood equation is
∂log 𝐿 𝑛 1
= − + Σ(𝑥𝑖 − 𝜇)2 = 0.
∂𝜎 2 2𝜎 2 2𝜎 4
Solving this equation for 𝜎 2 , we get
𝑛
1
𝜎 = ∑ (𝑥𝑖 − 𝜇)2
2
𝑛
𝑖=1
2
(iii) To estimate 𝜇 and 𝜎² simultaneously, the likelihood equations are
∂log 𝐿/∂𝜇 = (1/𝜎²) Σ(𝑥ᵢ − 𝜇) = 0, and
∂log 𝐿/∂𝜎² = −𝑛/(2𝜎²) + (1/(2𝜎⁴)) Σ(𝑥ᵢ − 𝜇)² = 0
Solving these equations, we get
𝜇̂ = 𝑥̄ and 𝜎̂² = (1/𝑛) Σᵢ₌₁ⁿ (𝑥ᵢ − 𝜇̂)² = (1/𝑛) Σᵢ₌₁ⁿ (𝑥ᵢ − 𝑥̄)².
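As an illustrative check of part (iii) — not part of the original derivation — the following short Python sketch computes the joint MLEs from a hypothetical sample; the data values are invented purely for illustration.

```python
import numpy as np

# Hypothetical normal sample (values chosen only for illustration).
x = np.array([4.2, 5.1, 3.8, 6.0, 5.5, 4.9])

mu_hat = x.mean()                        # MLE of the mean: the sample mean
sigma2_hat = ((x - mu_hat) ** 2).mean()  # MLE of the variance: divisor n, not n-1

print(mu_hat, sigma2_hat)
# np.var uses divisor n by default, so it coincides with the MLE:
assert np.isclose(sigma2_hat, np.var(x))
```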

Example 2. For the binomial distribution
𝑓(𝑥, 𝑝) = ⁿ𝐶ₓ 𝑝ˣ 𝑞ⁿ⁻ˣ, 𝑥 = 0, 1, 2, …, 𝑛,
find the maximum likelihood estimator of 𝑝 and its variance.

Solution. Let 𝑥1 , 𝑥2 , … . 𝑥𝑚 be a random sample of size 𝑚 from 𝑓(𝑥, 𝑝). Then


The likelihood function is
𝐿(𝑝) = ∏ᵢ₌₁ᵐ 𝑓(𝑥ᵢ, 𝑝) = ∏ᵢ₌₁ᵐ ⁿ𝐶ₓᵢ 𝑝^{𝑥ᵢ} 𝑞^{𝑛−𝑥ᵢ}
     = 𝑝^{Σ𝑥ᵢ} (1 − 𝑝)^{𝑛𝑚 − Σ𝑥ᵢ} (∏ᵢ₌₁ᵐ ⁿ𝐶ₓᵢ)

Taking logarithms, we have
log 𝐿(𝑝) = Σ𝑥ᵢ log 𝑝 + (𝑛𝑚 − Σ𝑥ᵢ) log(1 − 𝑝) + log(∏ᵢ₌₁ᵐ ⁿ𝐶ₓᵢ)
Setting ∂log 𝐿(𝑝)/∂𝑝 = 0 gives
Σ𝑥ᵢ/𝑝 − (𝑛𝑚 − Σ𝑥ᵢ)/(1 − 𝑝) = 0
⇒ (1 − 𝑝) Σ𝑥ᵢ = 𝑝(𝑛𝑚 − Σ𝑥ᵢ)
⇒ Σ𝑥ᵢ − 𝑝 Σ𝑥ᵢ = 𝑛𝑚𝑝 − 𝑝 Σ𝑥ᵢ
⇒ 𝑛𝑚𝑝 = Σ𝑥ᵢ
∴ 𝑝̂ = Σ𝑥ᵢ/(𝑛𝑚) = 𝑥̄/𝑛   [∵ 𝑥̄ = (1/𝑚) Σᵢ₌₁ᵐ 𝑥ᵢ]
Further, since each 𝑥ᵢ has variance 𝑛𝑝𝑞,
Var(𝑝̂) = Var(Σ𝑥ᵢ)/(𝑛𝑚)² = 𝑚𝑛𝑝𝑞/(𝑛𝑚)² = 𝑝𝑞/(𝑛𝑚).
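A minimal simulation sketch (an illustration added here, with an arbitrary seed and hypothetical values of 𝑛, 𝑚 and 𝑝) shows 𝑝̂ = Σ𝑥ᵢ/(𝑛𝑚) recovering the true 𝑝:

```python
import numpy as np

rng = np.random.default_rng(0)           # hypothetical seed
n, m, p_true = 10, 500, 0.3              # n trials per observation, m observations
x = rng.binomial(n, p_true, size=m)      # simulated Bin(n, p) sample

p_hat = x.sum() / (n * m)                # the MLE derived above: x-bar / n
print(p_hat)                             # close to 0.3; Var(p-hat) = pq/(nm)
```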

Example 3. Find the M.L.E. for the parameter 𝜆 of a Poisson distribution from 𝑛 = 6 sample
values, where the six observed values are 2,8,0,6,2 and 3.

Solution. The probability function of a Poisson distribution with parameter 𝜆 is
𝑓(𝑥, 𝜆) = 𝑒^{−𝜆} 𝜆ˣ/𝑥!, 𝑥 = 0, 1, 2, …, ∞; = 0 elsewhere.
Let 𝑥₁, 𝑥₂, …, 𝑥ₙ denote a random sample of size 𝑛 from this distribution; then the joint probability function of (𝑥₁, 𝑥₂, …, 𝑥ₙ) is
𝑓(𝑥₁, 𝑥₂, …, 𝑥ₙ, 𝜆) = 𝑓(𝑥₁, 𝜆) 𝑓(𝑥₂, 𝜆) ⋯ 𝑓(𝑥ₙ, 𝜆) = (𝑒^{−𝜆} 𝜆^{𝑥₁}/𝑥₁!)(𝑒^{−𝜆} 𝜆^{𝑥₂}/𝑥₂!) ⋯ (𝑒^{−𝜆} 𝜆^{𝑥ₙ}/𝑥ₙ!)
For fixed values of 𝑥₁, 𝑥₂, …, 𝑥ₙ this is a function of 𝜆 only. Thus the likelihood function is
𝐿(𝜆) = ∏ᵢ₌₁ⁿ 𝑒^{−𝜆} 𝜆^{𝑥ᵢ}/𝑥ᵢ! = 𝑒^{−𝑛𝜆} 𝜆^{Σ𝑥ᵢ} / ∏ᵢ₌₁ⁿ(𝑥ᵢ!)
∴ The natural logarithm is
log 𝐿(𝜆) = −𝑛𝜆 + (Σᵢ₌₁ⁿ 𝑥ᵢ) log 𝜆 − log ∏ᵢ₌₁ⁿ(𝑥ᵢ!)
Differentiating w.r.t. 𝜆 and setting the derivative equal to zero, we get the likelihood equation
∂log 𝐿(𝜆)/∂𝜆 = −𝑛 + Σ𝑥ᵢ/𝜆 = 0
which gives 𝜆̂ = Σ𝑥ᵢ/𝑛 = 𝑥̄, the sample mean.
Numerically, 𝜆̂ = 𝑥̄ = (2 + 8 + 0 + 6 + 2 + 3)/6 = 21/6 = 3.5.
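The numerical answer can be verified directly (a small Python sketch using the six observed counts from the example):

```python
data = [2, 8, 0, 6, 2, 3]          # observed Poisson counts from Example 3
lam_hat = sum(data) / len(data)    # MLE of lambda is the sample mean
print(lam_hat)                     # 3.5, as obtained above
```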

Example 4. Let 𝑋 be a random variable with the probability density function 𝑓(𝑥, 𝛽) = (𝛽 + 1)𝑥^𝛽 for 0 < 𝑥 < 1, 𝛽 > −1. Obtain the m.l.e. of 𝛽 based on a sample 𝑋₁, …, 𝑋ₙ from 𝑓(𝑥, 𝛽). State the invariance property of the m.l.e. and use it to write the m.l.e. of 2𝛽² + 1.
Solution. Let 𝑋1 , 𝑋2 , … , 𝑋𝑛 denote a random sample from this distribution. Then the
likelihood function is

𝐿(𝛽) = (1 + 𝛽)𝑥₁^𝛽 · (1 + 𝛽)𝑥₂^𝛽 ⋯ (1 + 𝛽)𝑥ₙ^𝛽 = (1 + 𝛽)ⁿ ∏ᵢ₌₁ⁿ 𝑥ᵢ^𝛽
∴ The natural logarithm is
log 𝐿(𝛽) = 𝑛 log(1 + 𝛽) + log ∏ᵢ₌₁ⁿ 𝑥ᵢ^𝛽 = 𝑛 log(1 + 𝛽) + 𝛽 Σᵢ₌₁ⁿ log 𝑥ᵢ
Differentiating w.r.t. 𝛽 and setting the derivative equal to zero, we get the likelihood equation
∂log 𝐿(𝛽)/∂𝛽 = 𝑛/(1 + 𝛽) + Σᵢ₌₁ⁿ log 𝑥ᵢ = 0
or 𝑛 + (1 + 𝛽) Σᵢ₌₁ⁿ log 𝑥ᵢ = 0
𝛽̂ = −(𝑛 + Σ log 𝑥ᵢ)/(Σ log 𝑥ᵢ) = −(1 + 1/log 𝐺), since 𝑛/Σ log 𝑥ᵢ = 1/log 𝐺,
where 𝐺 is the geometric mean of the sample values. (Note that 0 < 𝑥ᵢ < 1 implies log 𝑥ᵢ < 0 for all 𝑖, so 𝛽̂ > −1.)
By the invariance property, the m.l.e. of a function of 𝛽 is the same function of 𝛽̂; hence the m.l.e. of 2𝛽² + 1 is 2𝛽̂² + 1.
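The following sketch (with a hypothetical sample from (0, 1)) evaluates 𝛽̂ and, via the invariance property, the m.l.e. of 2𝛽² + 1:

```python
import math

x = [0.9, 0.7, 0.85, 0.95, 0.6]        # hypothetical sample values in (0, 1)

s = sum(math.log(xi) for xi in x)      # sum of log x_i (negative here)
beta_hat = -(1 + len(x) / s)           # MLE of beta derived above
print(beta_hat, 2 * beta_hat**2 + 1)   # invariance: MLE of 2*beta^2 + 1
```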
4.4 IN-TEXT QUESTIONS
Self-Assessment Questions: MCQ Problems
Question: 1
Let 𝑋 and 𝑌 be two independent 𝑁(0,1) random variables. Then 𝑃(0 < 𝑋 2 + 𝑌 2 < 4) equals
A. 1 − 𝑒 −2

B. 1 − 𝑒 −4
C. 1 − 𝑒 −1
D. 𝑒 −2

Question: 2
Let 𝑋₁, 𝑋₂, …, 𝑋ₙ denote a random sample from a normal distribution with variance 𝜎² > 0. If the first percentile of the statistic 𝑊 = Σᵢ₌₁ⁿ (𝑋ᵢ − 𝑋̄)²/𝜎² is 1.24, with 𝑃(𝜒₇² ≤ 1.24) = 0.01 and 𝑃(𝜒₇² > 1.24) = 0.99, where 𝑋̄ denotes the sample mean, what is the sample size 𝑛?
A. 7
B. 8
C. 6
D. 5
Question: 3
Consider the simple linear regression model 𝑦ᵢ = 𝛼 + 𝛽𝑥ᵢ + 𝜖ᵢ, 𝑖 = 1, 2, …, 𝑛,
where the 𝜖ᵢ's are i.i.d. random variables with mean 0 and variance 𝜎² ∈ (0, ∞).
Suppose that we have a data set (𝑥₁, 𝑦₁), …, (𝑥ₙ, 𝑦ₙ) with 𝑛 = 20,
∑𝑛𝑖=1 𝑥𝑖 = 100, ∑𝑛𝑖=1 𝑦𝑖 = 50, ∑𝑛𝑖=1 𝑥𝑖2 = 600, ∑𝑛𝑖=1 𝑦𝑖2 = 500 and ∑𝑛𝑖=1 𝑥𝑖 𝑦𝑖 = 400
Then the least squares estimates of 𝛼 and 𝛽 are, respectively,
A. 5 and 3/2
B. −5 and 3/2

C. 5 and −3/2
D. −5 and −3/2

Question: 4
If 𝑋₁, 𝑋₂, …, 𝑋ₙ is a random sample from a population with density
𝑓(𝑥, 𝜃) = 𝜃𝑥^{𝜃−1} if 0 < 𝑥 < 1; 0 otherwise,
Where 𝜃 > 0 is an unknown parameter, what is 100(1 − 𝛼)% confidence interval for 𝜃 ?
A. [𝜒²_{𝛼/2}(2𝑛)/(2 Σᵢ₌₁ⁿ ln 𝑋ᵢ), 𝜒²_{1−𝛼/2}(2𝑛)/(2 Σᵢ₌₁ⁿ ln 𝑋ᵢ)]

B. [𝜒²_{𝛼/2}(𝑛)/(−2 Σᵢ₌₁ⁿ ln 𝑋ᵢ), 𝜒²_{1−𝛼/2}(𝑛)/(−2 Σᵢ₌₁ⁿ ln 𝑋ᵢ)]
C. [𝜒²_{𝛼/2}(2𝑛)/(−2 Σᵢ₌₁ⁿ ln 𝑋ᵢ), 𝜒²_{1−𝛼/2}(2𝑛)/(−2 Σᵢ₌₁ⁿ ln 𝑋ᵢ)]
D. [𝜒²_{𝛼/2}(𝑛)/(2 Σᵢ₌₁ⁿ ln 𝑋ᵢ), 𝜒²_{1−𝛼/2}(𝑛)/(2 Σᵢ₌₁ⁿ ln 𝑋ᵢ)]

Question: 5
Let 𝑋1 , … , 𝑋𝑛 be a random sample from a 𝑁(2𝜃, 𝜃 2 ) population, 𝜃 > 0. A consistent
estimator for 𝜃 is
A. (1/𝑛) Σᵢ₌₁ⁿ 𝑋ᵢ
B. ((5/𝑛) Σᵢ₌₁ⁿ 𝑋ᵢ²)^{1/2}
C. (1/(5𝑛)) Σᵢ₌₁ⁿ 𝑋ᵢ²
D. ((1/(5𝑛)) Σᵢ₌₁ⁿ 𝑋ᵢ²)^{1/2}
Question: 6
What is the arithmetic mean of the data set: 4, 5, 0, 10, 8, and 3?
A. 4
B. 5
C. 6
D. 7
Question: 7
Which of the following cannot be the probability of an event?
A. 0.0
B. 0.3
C. 0.9
D. 1.2
Question: 8
If a random variable X has a normal distribution, then e^X has a _____ distribution.
A. lognormal
B. exponential
C. Poisson
D. binomial

Question : 9
What is the geometric mean of: 1, 2, 8, and 16?
A. 4
B. 5
C. 6
D. 7

Question: 10
Which test is applied to Analysis of Variance (ANOVA)?
A. t test
B. z test
C. F test
D. χ2 test

Question : 11
The arithmetic mean of all possible outcomes is known as
A. expected value
B. critical value
C. variance
D. standard deviation

Question: 12
Which of the following cannot be the value of a correlation coefficient?

A. –1
B. –0.75
C. 0
D. 1.2

Question: 13
Var (X) = ?
A. E[X2]
B. E[X2] – E[X]
C. E[X2] + E[X]2
D. E[X2] – E[X]2

Question: 14
Var (X + Y) = ?
A. E[X/Y] + E[Y]
B. E[Y/X] + E[X]
C. Var(X) + Var(Y) + 2 Cov(X, Y)
D. Var(X) + Var(Y) – 2 Cov(X, Y)
Question: 15
What is variance of the data set: 2, 10, 1, 9, and 3?
A. 15.5
B. 17.5
C. 5.5
D. 7.5

Question: 16
In a module, quiz contributes 10%, assignment 30%, and final exam contributes 60% towards
the final result. A student obtained 80% marks in quiz, 65% in assignment, and 75% in the
final exam. What are average marks?
A. 64.5%
B. 68.5%

C. 72.5%
D. 76.5%

Question: 17
In a university, average height of students is 165 cm. Now, consider the following Table,
HEIGHT 160-162 162-164 164-166 166-168 168-170
STUDENTS 16 20 24 20 16

What type of distribution is this?


A. Normal
B. Uniform
C. Poisson
D. Binomial

Question: 18
What is the average of 3%, 7%, 10%, and 16% ?
A. 8%
B. 9%
C. 10%
D. 11%

Question: 19
The error of rejecting the null hypothesis when it is true is known as
A. Type-I error
B. Type-II error
C. Type-III error
D. Type-IV error

Question: 20
The mean and variance of a Poisson distribution with parameter 𝜆 are both

A. 0
B. 1
C. λ
D. 1/λ

Questions : 21
Let the discrete random variables 𝑋 and 𝑌 have the joint probability mass function
𝑃(𝑋 = 𝑚, 𝑌 = 𝑛) = 𝑒^{−1}/((𝑛 − 𝑚)! 𝑚! 2ⁿ) for 𝑚 = 0, 1, 2, …, 𝑛; 𝑛 = 0, 1, 2, …; and 0 otherwise.
Which of the following statements is(are) TRUE?
A. The marginal distribution of 𝑋 is Poisson with mean 1/2
B. The random variables 𝑋 and 𝑌 are independent
C. The conditional distribution of 𝑋 given 𝑌 = 5 is Bin(6, 1/2)
D. 𝑃(𝑌 = 𝑛) = (𝑛 + 1)𝑃(𝑌 = 𝑛 + 2) for 𝑛 = 0, 1, 2, …
Question 22.
Consider the trinomial distribution with the probability mass function
𝑃(𝑋 = 𝑥, 𝑌 = 𝑦) = (2!/(𝑥! 𝑦! (2 − 𝑥 − 𝑦)!)) (1/6)ˣ (2/6)ʸ (3/6)^{2−𝑥−𝑦},
𝑥 ≥ 0, 𝑦 ≥ 0, and 0 ≤ 𝑥 + 𝑦 ≤ 2. Then Corr(𝑋, 𝑌) is equal to…
(correct up to two decimal places)
A) -0.31
B) 0.31
C) 0.35
D) 0.78
Question 23.
Let 𝑥1 = 1.1, 𝑥2 = 0.5, 𝑥3 = 1.4, 𝑥4 = 1.2 be the observed values of a random sample of size
four from a distribution with the probability density function
𝑓(𝑥 ∣ 𝜃) = 𝑒^{𝜃−𝑥} if 𝑥 ≥ 𝜃, 𝜃 ∈ (−∞, ∞); 0 otherwise.
Then the maximum likelihood estimate of 𝜃² + 𝜃 + 1 is equal to … (up to two decimal places).
A) 1.75
B) 1.89

C) 1.74
D) 0.87
Question : 24
Let 𝑈 ∼ 𝐹₅,₈ and 𝑉 ∼ 𝐹₈,₅. If 𝑃[𝑈 > 3.69] = 0.05, then the value of 𝑐 such that
𝑃[𝑉 > 𝑐] = 0.95 equals… (round off to two decimal places)
A) 0.27
B) 1.27
C) 2.27
D) 2.29
Question 25.
Let P be a probability function that assigns the same weight to each of the points of the sample
space Ω = {1,2,3,4}. Consider the events E = {1,2}, F = {1,3} and G = {3,4}. Then which of
the following statement(s) is (are) TRUE?
1. E and F are independent
2. E and G are independent
3. E, F and G are independent
Select the correct answer using code given below:
A. 1 only
B. 2 only
C. 1 and 2 only
D. 1,2 and 3
Question : 26
Let 𝑋₁, 𝑋₂, 𝑋₃, 𝑋₄ and 𝑌₁, 𝑌₂, …, 𝑌₅ be two random samples of sizes 4 and 5 respectively, from a standard normal population. Define the statistic
𝑇 = (5/4)(𝑋₁² + 𝑋₂² + 𝑋₃² + 𝑋₄²)/(𝑌₁² + 𝑌₂² + 𝑌₃² + 𝑌₄² + 𝑌₅²)
then which of the following is TRUE?
A. Expectation of 𝑇 is 0.6
B. Variance of T is 8.97

C. T has an F-distribution with degrees of freedom 5 and 4
D. T has an F-distribution with degrees of freedom 4 and 5
Question : 27
Let 𝑋, 𝑌 and 𝑍 be independent random variables with respective moment generating functions 𝑀_𝑋(𝑡) = 1/(1 − 𝑡), 𝑡 < 1, and 𝑀_𝑌(𝑡) = 𝑒^{𝑡²/2} = 𝑀_𝑍(𝑡),
𝑡 ∈ ℝ. Let 𝑊 = 2𝑋 + 𝑌² + 𝑍²; then 𝑃(𝑊 > 2) equals
A. 2𝑒 −1
B. 2𝑒 −2
C. 𝑒 −1
D. 𝑒 −2
Question : 28
Let 𝑥₁ = 3, 𝑥₂ = 4, 𝑥₃ = 3.5, 𝑥₄ = 2.5 be the observed values of a random sample from the probability density function
𝑓(𝑥 ∣ 𝜃) = (1/3)[(1/𝜃)𝑒^{−𝑥/𝜃} + (1/𝜃²)𝑒^{−𝑥/𝜃²} + 𝑒^{−𝑥}], 𝑥 > 0, 𝜃 ∈ (0, ∞)
Then the method of moment estimate (MME) of 𝜃 is
A. 1.5
B. 2.5
C. 3.5
D. 4.5
Question : 29
Let 𝑋 and 𝑌 be random variables with the joint probability mass function
𝑃(𝑋 = 𝑛, 𝑌 = 𝑘) = (1/2)^{𝑛+2𝑘+1}; 𝑛 = −𝑘, −𝑘 + 1, …; 𝑘 = 1, 2, …
Then E(Y) equals
A. 1
B. 2
C. 3
D. 4
Question : 30
Let 𝑋 be a random variable with the cumulative distribution function
𝐹(𝑥) = 0 for 𝑥 < 0; (1 + 𝑥²)/10 for 0 ≤ 𝑥 < 1; (3 + 𝑥²)/10 for 1 ≤ 𝑥 < 2; 1 for 𝑥 ≥ 2.
Which of the following statements is (are) TRUE?
A. 𝑃(1 < 𝑋 < 2) = 3/10
B. 𝑃(1 < 𝑋 ≤ 2) = 3/5
C. 𝑃(1 ≤ 𝑋 < 2) = 1/2
D. 𝑃(1 ≤ 𝑋 ≤ 2) = 4/5

SELF-ASSESSMENT TRUE/FALSE QUESTIONS


Objective Type Questions
1 State True or False:
(i) The number of equations 𝜇ᵣ′ = 𝑚ᵣ′ in the method of moments is taken equal to the number of unknown parameters. (True)
(ii) Method of moments estimates are biased. (False)
(iii) Method of moments estimates are, in general, not efficient. (True)
(iv) Method of moments estimates and maximum likelihood estimates often coincide. (True)
(v) The method of moments fails in the case of some distributions, where the moments may not exist. (True)
(vi) The method of maximum likelihood for estimation of parameters was
introduced by Prof. R.A. Fisher. (True)
(vii) Maximum likelihood estimators are consistent. (True)
(viii) Maximum likelihood estimators are not necessarily unbiased. (True)
(ix) Maximum likelihood estimators are asymptotically normal. (True)

(x) Minimum variance unbiased estimators are unique under certain general
conditions. (True)
(xi) The minimum variance unbiased estimator for 𝜃 does not exist in the
Cauchy distribution
𝑓(𝑥, 𝜃) = (1/𝜋) · 1/(1 + (𝑥 − 𝜃)²), −∞ < 𝑥 < ∞ (True)

2. Fill in the blanks:


(i) The method of moments estimate is asymptotically .......... distributed. [Ans. normally]
(ii) If a sufficient estimator 𝑇 for 𝜃 exists, then any solution of the likelihood equation will be a .......... of 𝑇. [Ans. function]
(iii) A random sample of size 3 is taken from
𝑓(𝑥, 𝜃) = 1/𝜃, 0 ≤ 𝑥 ≤ 𝜃, 𝜃 > 0.
The sample values are 𝑥₁ = 13, 𝑥₂ = 6, 𝑥₃ = 22. The maximum likelihood estimate of 𝜃 is .......... [Ans. 𝑥₍₃₎ = 22]
(iv) The necessary and sufficient condition for the existence of the MVUE is ∂log 𝐿/∂𝜃 = (𝑇 − 𝜃)/𝜆. Then the variance of 𝑇 is .......... [Ans. 𝜆]
(v) The importance of the method of minimum variance over other methods is that it also gives the .......... of the estimator. [Ans. variance]
3. In each of the following questions, four alternative answers are given, of which only one is correct. Select the correct answer and write the letter (a), (b), (c) or (d):

(i) The method of moments for determining point estimators of the population
parameters was discovered by
(a) Karl Pearson
(b) R.A. Fisher
(c) Cramer-Rao
(d) Rao-Blackwell
Ans. (a)
(ii) Let 𝑥₁, 𝑥₂, …, 𝑥ₙ be a random sample from 𝑓(𝑥, 𝛽) = (2/𝛽²)(𝛽 − 𝑥), 0 ≤ 𝑥 ≤ 𝛽.

The estimate of 𝛽 obtained by the method of moments is


(a) 𝑋‾
(b) 2𝑋‾
(c) 3𝑋‾
(d) 4𝑋‾ Ans. (c)
(iii) If 𝑓(𝑥, 𝜃) = (1/2)𝑒^{−|𝑥−𝜃|}, −∞ < 𝑥 < ∞, then the m.l.e. of 𝜃 is:

(a) sample mean


(b) sample mode
(c) sample median
(d) none of these Ans. (c)
(iv) Let 𝑋₁, 𝑋₂, …, 𝑋ₙ be a random sample from the p.d.f. 𝑓(𝑥, 𝜃) = 1/𝜃, 0 < 𝑥 < 𝜃. The m.l.e. for 𝜃 is
(a) Min(𝑋ᵢ)
(b) Max(𝑋ᵢ)
(c) (1/𝑛) Σ𝑋ᵢ
(d) Σ𝑋ᵢ
Ans. (b)

(v) If the likelihood function of the sample values 𝑥₁, 𝑥₂, …, 𝑥ₙ is denoted by 𝐿 and 𝜃̂ is the maximum likelihood estimator, then 𝜃̂ is the solution of
(a) ∂𝐿/∂𝜃 = 0 and ∂²𝐿/∂𝜃² > 0
(b) ∂𝐿/∂𝜃 = 0 and ∂²𝐿/∂𝜃² < 0
(c) ∂𝐿/∂𝜃 ≠ 0 and ∂²𝐿/∂𝜃² = 0
(d) ∂𝐿/∂𝜃 = 0 and ∂²𝐿/∂𝜃² = 0
Ans. (b)
(vi) If ∂𝐿/∂𝜃 = 0 at 𝜃 = 𝑇 and 𝑙 = [∂²log 𝐿/∂𝜃²]_{𝜃=𝑇}, then 𝑇 is the maximum likelihood estimate of 𝜃 for

(a) 𝑙<0
(b) 𝑙>0
(c) 𝑙=0
(d) 𝑙 ≠ 0 Ans. (a)

(vii) The necessary and sufficient condition for the existence of a minimum variance unbiased estimator 𝑇 of 𝜓(𝜃) is ∂log 𝐿/∂𝜃 = 𝑘(𝜃, 𝑛)[𝑇 − 𝜓(𝜃)]. Then Var(𝑇) is
(a) 𝑘(𝜃, 𝑛)
(b) 1/𝑘(𝜃, 𝑛)
(c) 𝜓′(𝜃)/𝑘(𝜃, 𝑛)
(d) 𝜓(𝜃) · 𝑘(𝜃, 𝑛)
Ans. (c)
(viii) If 𝑇₁ and 𝑇₂ are unbiased estimators of 𝜃 and 𝜃² respectively (0 < 𝜃 < 1) and 𝑇 is a sufficient statistic, then 𝐸[𝑇₁ ∣ 𝑇] − 𝐸[𝑇₂ ∣ 𝑇] is
(a) the minimum variance unbiased estimator of 𝜃
(b) always an unbiased estimator of 𝜃(1 − 𝜃)
(c) the maximum likelihood estimator for 𝜃 + 𝜃²
(d) not an unbiased estimator of 𝜃(1 − 𝜃).
Ans. (b)

(ix) The statement "If 𝑇₁ is an unbiased estimator of 𝜃 and 𝑇₂ is sufficient for 𝜃, then Var[𝐸(𝑇₁ ∣ 𝑇₂)] ≤ Var(𝑇₁)" belongs to
(a) Cramer-Rao Inequality
(b) Rao-Blackwell Theorem
(c) Maximum Likelihood Estimators
(d) None of these.
Ans. (b)

(x) Let 𝑋₁, 𝑋₂, …, 𝑋ₙ be a random sample from a uniform distribution with probability density function
𝑓(𝑥) = 1/𝜃 for 0 < 𝑥 < 𝜃; 0 otherwise.
The minimum variance unbiased estimator for 𝜃 is
(a) Max(𝑋₁, 𝑋₂, …, 𝑋ₙ)
(b) (𝑋₁ + 𝑋₂ + ⋯ + 𝑋ₙ)/𝑛
(c) Min(𝑋₁, 𝑋₂, …, 𝑋ₙ)
(d) ((𝑛 + 1)/𝑛)[Max(𝑋₁, 𝑋₂, …, 𝑋ₙ)]
Ans. (d)

(xi) The maximum likelihood estimators are necessarily
(a) unbiased
(b) sufficient
(c) most efficient
(d) unique
Ans. (c)

(xii) If a sufficient estimator exists, it is a function of


(a) MLE
(b) Unbiased estimator
(c) consistent estimator
(d) All of these
Ans. (a)

(xiii) If 𝑡 = 𝑡(𝑋₁, 𝑋₂, …, 𝑋ₙ) is a sufficient statistic for a parameter 𝜃 and a unique MLE 𝜃̂ of 𝜃 exists, then
(a) 𝜃̂ = 𝑓(𝑋₁, 𝑋₂, …, 𝑋ₙ)
(b) 𝜃̂ is a function of 𝑡
(c) 𝜃̂ is independent of 𝑡
(d) none of the above

Ans. (b)

(xiv) A minimum variance unbiased estimator 𝑇ₙ is said to be unique if, for any other minimum variance unbiased estimator 𝑇ₙ′,
(a) Var(𝑇ₙ) = Var(𝑇ₙ′)
(b) Var(𝑇ₙ) ≤ Var(𝑇ₙ′)
(c) both (a) and (b)
(d) neither (a) nor (b)
Ans. (a)

4.5 SUMMARY
The main points covered in this lesson are: what an estimator is; the consistency, efficiency and sufficiency of an estimator; and how to obtain the best estimator.
4.6 GLOSSARY
Motivation: These problems are very useful in real life, and we can use them in data science and economics as well as social science.
Attention: Think about how the methods of estimation are useful in real-world problems.
4.7 ANSWER TO IN-TEXT QUESTIONS
Answer 1 : A
Explanation:
Since 𝑋² + 𝑌² ∼ 𝜒²₍₂₎, and the 𝜒²₍₂₎ distribution is the same as the exponential distribution with mean 2, we have
𝑃(0 < 𝑋² + 𝑌² < 4) = ∫₀⁴ (1/2)𝑒^{−𝑡/2} 𝑑𝑡 = 1 − 𝑒^{−2}
2

Hence option A is the correct choice.
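For readers who wish to verify this numerically, a one-line check with SciPy (assuming scipy is available) gives the same value:

```python
import math
from scipy.stats import chi2

# X^2 + Y^2 ~ chi-square with 2 d.f.; compare the cdf value with 1 - e^{-2}.
print(chi2.cdf(4, df=2), 1 - math.exp(-2))   # both are about 0.8647
```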


Answer 2 : B
Explanation:

0.01 = 𝑃(𝑊 ≤ 1.24) = 𝑃(Σᵢ₌₁ⁿ (𝑋ᵢ − 𝑋̄)²/𝜎² ≤ 1.24) = 𝑃(𝜒²ₙ₋₁ ≤ 1.24)
Thus, from the given value 𝑃(𝜒₇² ≤ 1.24) = 0.01, we get
𝑛 − 1 = 7
and hence the sample size 𝑛 is 8.
Answer 3 : B
Explanation:
𝑦ᵢ = 𝛼 + 𝛽𝑥ᵢ + 𝜖ᵢ, 𝑖 = 1, 2, …, 𝑛
𝛽̂ = (Σᵢ₌₁ⁿ 𝑥ᵢ𝑦ᵢ − 𝑛𝑥̄𝑦̄)/(Σᵢ₌₁ⁿ 𝑥ᵢ² − 𝑛𝑥̄²) = (400 − 20 × 5 × 5/2)/(600 − 20 × 5²) = 150/100 = 3/2;
𝛼̂ = 𝑦̄ − 𝑥̄𝛽̂ = 5/2 − 5 × 3/2 = −5
Hence option B is correct.
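The arithmetic can be reproduced with a short Python sketch using only the summary statistics given in the question:

```python
n = 20
sx, sy, sxx, sxy = 100, 50, 600, 400
xbar, ybar = sx / n, sy / n

beta_hat = (sxy - n * xbar * ybar) / (sxx - n * xbar ** 2)  # = 3/2
alpha_hat = ybar - xbar * beta_hat                          # = -5
print(alpha_hat, beta_hat)
```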
Answer 4 : C
Explanation:
We use the random variable 𝑄 = −2𝜃 Σᵢ₌₁ⁿ ln 𝑋ᵢ ∼ 𝜒²₍₂ₙ₎ as the pivotal quantity. The 100(1 − 𝛼)% confidence interval for 𝜃 can be constructed from
1 − 𝛼 = 𝑃(𝜒²_{𝛼/2}(2𝑛) ≤ 𝑄 ≤ 𝜒²_{1−𝛼/2}(2𝑛)) = 𝑃[𝜒²_{𝛼/2}(2𝑛)/(−2 Σᵢ₌₁ⁿ ln 𝑋ᵢ) ≤ 𝜃 ≤ 𝜒²_{1−𝛼/2}(2𝑛)/(−2 Σᵢ₌₁ⁿ ln 𝑋ᵢ)]
Thus the 100(1 − 𝛼)% confidence interval for 𝜃 is
[𝜒²_{𝛼/2}(2𝑛)/(−2 Σᵢ₌₁ⁿ ln 𝑋ᵢ), 𝜒²_{1−𝛼/2}(2𝑛)/(−2 Σᵢ₌₁ⁿ ln 𝑋ᵢ)]
Hence option C is correct.

Answer 5 : D
Explanation:
We have 𝐸(𝑋ᵢ²) = 𝑉(𝑋ᵢ) + (𝐸(𝑋ᵢ))² = 𝜃² + (2𝜃)² = 5𝜃², 𝑖 = 1, 2, …, 𝑛.
Then 𝑋₁²/5, 𝑋₂²/5, … is a sequence of i.i.d. random variables with 𝐸(𝑋₁²/5) = 𝜃². Using the WLLN, we get
(1/𝑛) Σᵢ₌₁ⁿ 𝑋ᵢ²/5 = (1/(5𝑛)) Σᵢ₌₁ⁿ 𝑋ᵢ² → 𝜃² in probability
as 𝑛 → ∞, which implies that ((1/(5𝑛)) Σᵢ₌₁ⁿ 𝑋ᵢ²)^{1/2} converges in probability to 𝜃 as 𝑛 → ∞. Thus ((1/(5𝑛)) Σᵢ₌₁ⁿ 𝑋ᵢ²)^{1/2} is a consistent estimator for 𝜃.
Hence option D is the correct choice.
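A quick simulation sketch (hypothetical 𝜃 and seed, added only for illustration) shows the consistency of the estimator in option D:

```python
import numpy as np

rng = np.random.default_rng(1)
theta = 2.0                                    # hypothetical true value
for n in (10, 1_000, 100_000):
    x = rng.normal(2 * theta, theta, size=n)   # N(2*theta, theta^2) sample
    t = np.sqrt((x ** 2).mean() / 5)           # option D's estimator
    print(n, t)                                # approaches theta = 2 as n grows
```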
Answer 6 : B
Explanation :
There are 6 numbers in total. So,
AM = (4 + 5 + 0 + 10 + 8 + 3)/6 = 30/6 = 5

Answer 7 : D
Explanation :
The probability of an event is always between 0 and 1 (including 0 and 1 ). So, 1.2 cannot be
the probability of an event.
Answer 8 : A
Explanation :
A lognormal distribution is a probability distribution of a random variable whose logarithm is
normally distributed. So, if X is lognormal then Y = ln (X) is normal. Similarly, if Y is
normal then X = eY is lognormal.
Answer 9 : A
Explanation :

There are 4 numbers in total. So, using the formula for the geometric mean, we have
G.M. = (1 × 2 × 8 × 16)^{1/4} = (256)^{1/4} = (4⁴)^{1/4} = 4   [∵ 4⁴ = 256]

Answer 10 : C
Explanation :
In ANOVA we use the F test because the test statistic is the ratio of two independent chi-square statistics (each divided by its degrees of freedom).
Answer 11 : A
Explanation :
Expectation (or expected value) is the arithmetic mean of all possible outcomes of a random
variable.
Answer 12 : D
Explanation :
The value of a correlation coefficient is always between −1 and 1, including −1 and 1.
Answer 13 : D
Explanation :
By definition, Var (𝑋) = 𝐸[𝑋 2 ] − 𝐸[𝑋]2
Answer 14 : C
Explanation :
By definition, Var (𝑋 + 𝑌) = Var (𝑋) + Var (𝑌) + 2Cov (𝑋, 𝑌)
Answer 15 : B
Explanation :
First calculate the mean:
Mean = (2 + 10 + 1 + 9 + 3)/5 = 25/5 = 5
Now calculate the sample variance, dividing by 𝑛 − 1 = 4:
Variance = [(2 − 5)² + (10 − 5)² + (1 − 5)² + (9 − 5)² + (3 − 5)²]/4 = 70/4 = 17.5
Answer 16 : C
Explanation :
Using the formula for the weighted average, we have
Weighted Average = 𝑤₁𝑥₁ + 𝑤₂𝑥₂ + 𝑤₃𝑥₃ = 0.1 × 0.8 + 0.3 × 0.65 + 0.6 × 0.75 = 0.725 = 72.5%

Answer 17 : A
Explanation :
The frequencies rise to a peak at the middle class (164–166) and fall away symmetrically on either side, which is the shape of a normal distribution.
Answer 18 : B
Explanation :
3%, 7%, 10%, 16%
Average = (3 + 7 + 10 + 16)/4 % = 36/4 % = 9%
4
Answer 19 : A
Explanation :
Type I error = P(reject H0| when H0 is true )
Answer 20 : C
Explanation :
We know that if a random variable 𝑋 follows a Poisson distribution with parameter 𝜆, then 𝐸(𝑋) = 𝑉(𝑋) = 𝜆.

Answer 21 : A
Explanation :
The marginal probability mass function of 𝑋 is given by
𝑃(𝑋 = 𝑚) = Σₙ₌ₘ^∞ 𝑃(𝑋 = 𝑚, 𝑌 = 𝑛) = 𝑒^{−1/2}(1/2)ᵐ/𝑚!, 𝑚 = 0, 1, 2, …
Thus the marginal distribution of 𝑋 is Poisson with mean 1/2.
The marginal probability mass function of 𝑌 is given by
𝑃(𝑌 = 𝑛) = Σₘ₌₀ⁿ 𝑃(𝑋 = 𝑚, 𝑌 = 𝑛) = 𝑒^{−1}/𝑛!, 𝑛 = 0, 1, 2, …
Thus the marginal distribution of 𝑌 is Poisson with mean 1.
Since 𝑃(𝑋 = 𝑚, 𝑌 = 𝑛) ≠ 𝑃(𝑋 = 𝑚)𝑃(𝑌 = 𝑛), the variables 𝑋 and 𝑌 are not independent.
𝑃(𝑋 = 𝑚 ∣ 𝑌 = 5) = 𝑃(𝑋 = 𝑚, 𝑌 = 5)/𝑃(𝑌 = 5) = (5!/(𝑚!(5 − 𝑚)!))(1/2)⁵, 𝑚 = 0, 1, …, 5
Thus the conditional distribution of 𝑋 given 𝑌 = 5 is Bin(5, 1/2), so statement C (which claims Bin(6, 1/2)) is false.
Finally, 𝑃(𝑌 = 𝑛)/𝑃(𝑌 = 𝑛 + 1) = 𝑛 + 1 for 𝑛 = 0, 1, 2, …, so statement D as given does not hold.

Answer 22 : 𝑨
Explanation :
The trinomial distribution of two r.v.'s 𝑋 and 𝑌 is given by
𝑓_{𝑋,𝑌}(𝑥, 𝑦) = (𝑛!/(𝑥! 𝑦! (𝑛 − 𝑥 − 𝑦)!)) 𝑝ˣ 𝑞ʸ (1 − 𝑝 − 𝑞)^{𝑛−𝑥−𝑦}
for 𝑥, 𝑦 = 0, 1, 2, …, 𝑛 and 𝑥 + 𝑦 ≤ 𝑛, where 𝑝 + 𝑞 ≤ 1. Here 𝑛 = 2, 𝑝 = 1/6 and 𝑞 = 2/6.
Var(𝑋) = 𝑛𝑝(1 − 𝑝) = 2 × (1/6)(1 − 1/6) = 10/36; Var(𝑌) = 𝑛𝑞(1 − 𝑞) = 2 × (2/6)(1 − 2/6) = 16/36
Cov(𝑋, 𝑌) = −𝑛𝑝𝑞 = −2 × (1/6) × (2/6) = −4/36
Corr(𝑋, 𝑌) = Cov(𝑋, 𝑌)/(√Var(𝑋) √Var(𝑌)) = −4/(4√10) ≈ −0.31
Hence −0.31 is the correct answer.
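The computation can be checked in a few lines of Python:

```python
import math

n, p, q = 2, 1 / 6, 2 / 6
var_x = n * p * (1 - p)            # 10/36
var_y = n * q * (1 - q)            # 16/36
cov_xy = -n * p * q                # -4/36

corr = cov_xy / math.sqrt(var_x * var_y)
print(corr)                        # about -0.316, i.e. option A
```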
Answer 23 : 𝐀
Explanation :
Let 𝑥₁ = 1.1, 𝑥₂ = 0.5, 𝑥₃ = 1.4, 𝑥₄ = 1.2 and
𝑓(𝑥 ∣ 𝜃) = 𝑒^{𝜃−𝑥} if 𝑥 ≥ 𝜃, 𝜃 ∈ (−∞, ∞); 0 otherwise.
The likelihood is positive only for 𝜃 ∈ (−∞, 𝑋₍₁₎], and since (𝑑/𝑑𝜃)𝐿(𝜃) > 0 for all 𝜃 ∈ (−∞, 𝑋₍₁₎], the likelihood is strictly increasing there. So 𝜃̂ = 𝑋₍₁₎ = 0.5; therefore, by the invariance property, the MLE of 𝜃² + 𝜃 + 1 is (0.5)² + 0.5 + 1 = 1.75.
Hence the MLE of 𝜃² + 𝜃 + 1 is 1.75.

Answer 24 : 𝐀
Explanation :
If 𝑋 ∼ 𝐹(𝑚, 𝑛), then 1/𝑋 ∼ 𝐹(𝑛, 𝑚).
𝑃[𝑈 > 3.69] = 0.05 ⇒ 𝑃[𝑈 ≤ 3.69] = 0.95 ⇒ 𝑃[1/𝑈 ≥ 1/3.69] = 0.95.
Taking 𝑉 = 1/𝑈 ∼ 𝐹(8, 5), we get 𝑐 = 1/3.69 = 0.27.
Hence c = 0.27 is the correct answer.
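The reciprocal relation between F quantiles can be verified with SciPy (assuming scipy is available):

```python
from scipy.stats import f

# If U ~ F(5, 8) then 1/U ~ F(8, 5), so the 0.05-quantile of V = 1/U
# is the reciprocal of the 0.95-quantile of U.
print(1 / f.ppf(0.95, 5, 8))    # about 0.27
print(f.ppf(0.05, 8, 5))        # the same value, computed directly
```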
Answer 25 : C
Explanation :
Clearly, P({𝜔}) = 1/4 ∀𝜔 ∈ Ω = {1,2,3,4}. We have E = {1,2}, F = {1,3} and G = {3,4}

Then P(E) = P(F) = P(G) = 2/4 = 1/2.


Using this result, we see that E and F are independent and also E and G are independent.
Hence option C is correct.
Answer 26 : D
Explanation :
𝑇 = (5/4)(𝑋₁² + 𝑋₂² + 𝑋₃² + 𝑋₄²)/(𝑌₁² + 𝑌₂² + 𝑌₃² + 𝑌₄² + 𝑌₅²) ∼ 𝐹(4, 5); 𝐸(𝑇) = 𝑛/(𝑛 − 2) = 5/3
Var(𝑇) = 2(5)²(7)/(4(3)²(1)) = 350/36 = 9.72
Hence option D is correct.
Answer 27 : A
Explanation :
Since 𝑊 = 2𝑋 + 𝑌² + 𝑍² ∼ 𝜒²₍₄₎,
𝑓_𝑊(𝑤) = (1/4)𝑤𝑒^{−𝑤/2} for 𝑤 > 0; 0 otherwise
𝑃(𝑊 > 2) = ∫₂^∞ (1/4)𝑤𝑒^{−𝑤/2} 𝑑𝑤 = 2𝑒^{−1}
Hence option A is correct.
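A numerical check with SciPy (assuming scipy is available) confirms the tail probability:

```python
import math
from scipy.stats import chi2

# W ~ chi-square(4): 2X ~ chi2(2) since X ~ Exp(1), and Y^2 + Z^2 ~ chi2(2).
print(chi2.sf(2, df=4), 2 * math.exp(-1))   # both are about 0.7358
```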
Answer 28 : C
Explanation :
𝑥̄ = (3 + 4 + 3.5 + 2.5)/4 = 3.25
𝐸(𝑋) = (1/3)[𝜃 + 𝜃² + 1]Γ(2) = (1/3)[𝜃 + 𝜃² + 1] = 3.25
𝜃² + 𝜃 − 8.75 = 0, so 𝜃 = 2.5 or −3.5. Since 𝜃 ∈ (0, ∞), 𝜃̂ = 2.5.
Hence option C is correct.
Answer 29 : B
Explanation :
𝑃(𝑌 = 𝑘) = Σₙ₌₋ₖ^∞ 𝑃(𝑋 = 𝑛, 𝑌 = 𝑘) = (1/2)(1/2)^{𝑘−1}, 𝑘 = 1, 2, …   {put 𝑚 = 𝑛 + 𝑘}
which is the pmf of the geometric distribution with parameter 1/2. Hence
𝐸(𝑌) = Σₖ₌₁^∞ 𝑘(1/2)(1/2)^{𝑘−1} = 2
Hence option B is correct.
Answer 30 : A, B, C and D
Explanation :
Here 𝐹(1⁻) = 2/10, 𝐹(1) = 4/10, 𝐹(2⁻) = 7/10 and 𝐹(2) = 1, so 𝑃(𝑋 = 1) = 2/10 and 𝑃(𝑋 = 2) = 3/10.
𝑃(1 < 𝑋 < 2) = 𝐹(2) − 𝐹(1) − 𝑃(𝑋 = 2) = 3/10
𝑃(1 < 𝑋 ≤ 2) = 𝐹(2) − 𝐹(1) = 3/5
𝑃(1 ≤ 𝑋 < 2) = 𝐹(2) − 𝐹(1) − 𝑃(𝑋 = 2) + 𝑃(𝑋 = 1) = 1/2
𝑃(1 ≤ 𝑋 ≤ 2) = 𝐹(2) − 𝐹(1) + 𝑃(𝑋 = 1) = 4/5
All four computed probabilities match the stated values, so every statement is TRUE.
4.8 REFERENCES
• Devore, J. (2012). Probability and Statistics for Engineers, 8th ed. Cengage Learning.
• Rice, J. A. (2007). Mathematical Statistics and Data Analysis, 3rd ed. Thomson Brooks/Cole.
• Larsen, R. and Marx, M. (2011). An Introduction to Mathematical Statistics and Its Applications. Prentice Hall.
• Miller, I. and Miller, M. (2017). J. Freund's Mathematical Statistics with Applications, 8th ed. Pearson.
• Kantarelis, D. and Asadoorian, M. O. (2009). Essentials of Inferential Statistics, 5th ed. University Press of America.
4.9 SUGGESTED READINGS
• Gupta, S. C. and Kapoor, V. K., Fundamentals of Mathematical Statistics, Sultan Chand Publication, 11th ed.
• Agarwal, B. L., Programmed Statistics, New Age International Publishers, 2nd ed.

LESSON 5
CRAMER RAO INEQUALITY

STRUCTURE
5.1 Learning Objectives
5.2 Introduction
5.3 Cramer Rao Inequality
5.3.1 Simple form of C-R Inequality
5.3.2 Regularity Condition
5.3.3 Alternative form of C-R Inequality
5.3.4 Equality Sign in C-R Inequality
5.3.5 Uses of C-R Inequality
5.4 In-Text Questions
5.5 Summary
5.6 Glossary
5.7 Answer to In-Text Questions
5.8 References
5.9 Suggested Readings
5.1 LEARNING OBJECTIVES
The Cramér–Rao inequality gives a lower bound for the variance of an unbiased estimator of
a parameter. It is named after work by Cramér (1946) and Rao (1945). The inequality and the
corresponding lower bound in the inequality are stated for various situations.
5.2 INTRODUCTION
Point estimation is the use of a statistic to estimate the value of some parameter of a population
having a particular type of density. The statistic we use is called the point estimator and its
value is the point estimate. A desirable property for a point estimator T for a parameter 𝜃 is
that the expected value of 𝑇 is 𝜃. If 𝑇 is a random variable with density 𝑓 and values 𝜃̂, this is equivalent to saying
𝔼[𝑇] = ∫₋∞^∞ 𝜃̂ 𝑓(𝜃̂) 𝑑𝜃̂ = 𝜃

An estimator having this property is said to be unbiased.


Often in the process of making a point estimate, we must choose among several unbiased
estimators for a given parameter. Thus, we need to consider additional criteria to select one of
the estimators for use. For example, suppose that 𝑋1 , 𝑋2 , … , 𝑋𝑚 are a random sample from a
normal population of mean 𝜇 and variance 𝜎 2 with 𝑛 an odd integer, 𝑚 = 2𝑛 + 1. Let the
density of this function be given by 𝑓(𝑥; 𝜇, 𝜎 2 ). Suppose we wish to estimate the mean, 𝜇, of
this population. It is well-known that both the sample mean, and the sample median are
unbiased estimators of the mean
Often, we will take the unbiased estimator having the smallest variance. The variance of T is,
as for any random variable, the second moment about the mean:

var(𝑇) = ∫₋∞^∞ (𝜃̂ − 𝜇_𝑇)² 𝑓(𝜃̂) 𝑑𝜃̂
Here, 𝜇_𝑇 is the mean of the random variable 𝑇, which is 𝜃 in the case of an unbiased estimator.
Choosing the estimator with the smaller variance is a natural thing to do, but by no means is it
the only possible choice. If two estimators have the same expected value, then while their
average values will be equal the estimator with greater variance will have larger fluctuations
about this common value.
An estimator with a smaller variance is said to be relatively more efficient because it will tend
to have values that are concentrated more closely about the correct value of the parameter;
thus, it allows us to be more confident that our estimate will be as close to the actual value as
we would like. Furthermore, the quantity
var 𝑇̂₁ / var 𝑇̂₂
is used as a measure of the efficiency of T̂2 relative to T̂1 . We hope to maximize efficiency by
minimizing variance.
In our example, the sample mean has variance 𝜎²/𝑚 = 𝜎²/(2𝑛 + 1). If the population median is 𝜇̃, that is, 𝜇̃ is such that
∫₋∞^{𝜇̃} 𝑓(𝑥; 𝜇, 𝜎²) 𝑑𝑥 = 1/2
then the sampling distribution of the median is approximately normal with mean 𝜇̃ and variance
1/(8𝑛 · 𝑓(𝜇̃)²)

Since the normal distribution of our example is symmetric, we must have 𝜇˜ = 𝜇, which makes
it easy to show that 𝑓(𝜇˜) = 1/√2𝜋𝜎 2 . The variance of the sample median is therefore
𝜋𝜎 2 /4𝑛.
Certainly, in our example, the mean has the smaller variance of the two estimators, but we
would like to know whether an estimator with smaller variance exists. More precisely, it would
be very useful to have a lower bound on the variance of an unbiased estimator. Clearly, the
variance must be non-negative, but it would be useful to have a less trivial lower bound. The
Cramér-Rao Inequality is a theorem that provides such a bound under very general conditions.
It does not, however, provide any assurance that any estimator exists that has the minimum
variance allowed by this bound.
5.3 CRAMER-RAO INEQUALITY
The following are some of the main points about Cramer- Rao Inequality.
1. Simple form of C-R inequality
2. Regularity Condition
3. Alternative form of C-R Inequality
4. Equality Sign in C-R Inequality
5. Uses of C-R Inequality

We shall now briefly explain these terms one by one.


5.3.1 Simple form of C-R Inequality :
Let 𝑓(𝑥, 𝜃) be the pdf or pmf of a random variable 𝑋, 𝜃 ∈ Θ being a parameter. Let 𝑥₁, 𝑥₂, …, 𝑥ₙ be a random sample of size 𝑛 from this distribution and let 𝐿 = 𝐿(𝑥₁, 𝑥₂, …, 𝑥ₙ, 𝜃) = ∏ᵢ₌₁ⁿ 𝑓(𝑥ᵢ, 𝜃) be the joint pdf or pmf of 𝑥₁, 𝑥₂, …, 𝑥ₙ (the likelihood function). Let 𝑇 = 𝑡(𝑥₁, 𝑥₂, …, 𝑥ₙ) be an unbiased estimator of 𝜃. Then the variance of this unbiased estimator 𝑇 satisfies the inequality
Var(𝑇) ≥ 1/𝐸[(∂log 𝐿/∂𝜃)²], where 𝐿 = ∏ᵢ₌₁ⁿ 𝑓(𝑥ᵢ, 𝜃).
5.3.2 Regularity Conditions of C-R Inequality :
Regularity Conditions (in the case of a continuous random variable):
I. Θ, the parameter space, is a non-degenerate open interval on the real line.
II. For almost all (𝑥₁, 𝑥₂, …, 𝑥ₙ) and for all 𝜃 ∈ Θ, ∂𝐿/∂𝜃 exists, the exceptional set, if any, being independent of 𝜃 (the case of random sampling from 𝑅(0, 𝜃), for example, is excluded). That is, the range of 𝑥 is independent of 𝜃.
III. Differentiation under the integral sign is valid. This means, among other things, that the domain of positive pdf does not depend upon 𝜃. Thus
(i) (∂/∂𝜃) ∫_𝐴 𝐿 𝑑𝑥₁𝑑𝑥₂ … 𝑑𝑥ₙ = ∫_𝐴 (∂𝐿/∂𝜃) 𝑑𝑥₁𝑑𝑥₂ … 𝑑𝑥ₙ, where 𝐴 denotes the domain of positive pdf;
(ii) (∂/∂𝜃) ∫_𝐴 𝑡𝐿 𝑑𝑥₁𝑑𝑥₂ … 𝑑𝑥ₙ = ∫_𝐴 𝑡 (∂𝐿/∂𝜃) 𝑑𝑥₁𝑑𝑥₂ … 𝑑𝑥ₙ (this condition makes the result applicable to a certain class of estimators).
IV. 𝐸[(∂log 𝐿/∂𝜃)²] exists and is positive for every 𝜃 ∈ Θ.
In other words, the regularity conditions are:
1. The range of the distribution is independent of 𝜃.
2. The first two derivatives of 𝐿 with respect to 𝜃 exist.
3. Conditions of uniform convergence are fulfilled, so that differentiation under the integral sign is valid.
Proof. We have
1 = ∫_𝐴 𝐿 𝑑𝑥₁𝑑𝑥₂ … 𝑑𝑥ₙ for all 𝜃 ∈ Θ   …(1)
Differentiating (1) with respect to 𝜃 on both sides, we get
0 = (∂/∂𝜃) ∫_𝐴 𝐿 𝑑𝑥₁𝑑𝑥₂ … 𝑑𝑥ₙ
⇒ 0 = ∫_𝐴 (∂𝐿/∂𝜃) 𝑑𝑥₁𝑑𝑥₂ … 𝑑𝑥ₙ   (regularity condition)
⇒ 0 = ∫_𝐴 (1/𝐿)(∂𝐿/∂𝜃) 𝐿 𝑑𝑥₁𝑑𝑥₂ … 𝑑𝑥ₙ   (dividing and multiplying by 𝐿)
⇒ 0 = ∫_𝐴 (∂log 𝐿/∂𝜃) 𝐿 𝑑𝑥₁𝑑𝑥₂ … 𝑑𝑥ₙ
⇒ 0 = 𝐸[∂log 𝐿/∂𝜃]   …(2)
Again, 𝑇 being an unbiased estimator of 𝜃, we have
𝜃 = 𝐸(𝑇) = ∫_𝐴 𝑡𝐿 𝑑𝑥₁𝑑𝑥₂ … 𝑑𝑥ₙ
Differentiating with respect to 𝜃, we obtain
1 = (∂/∂𝜃) ∫_𝐴 𝑡𝐿 𝑑𝑥₁𝑑𝑥₂ … 𝑑𝑥ₙ = ∫_𝐴 𝑡 (∂𝐿/∂𝜃) 𝑑𝑥₁𝑑𝑥₂ … 𝑑𝑥ₙ = ∫_𝐴 𝑡 (∂log 𝐿/∂𝜃) 𝐿 𝑑𝑥₁𝑑𝑥₂ … 𝑑𝑥ₙ   …(3)
Multiplying (2) by 𝜃 and subtracting from (3), we have
1 = ∫_𝐴 (𝑡 − 𝜃)(∂log 𝐿/∂𝜃) 𝐿 𝑑𝑥₁𝑑𝑥₂ … 𝑑𝑥ₙ = 𝐸[(𝑇 − 𝜃)(∂log 𝐿/∂𝜃)]   …(4)
Squaring (4) and applying the Cauchy-Schwarz inequality [𝐸(𝑈𝑉)]² ≤ 𝐸(𝑈²)𝐸(𝑉²) with 𝑈 = 𝑇 − 𝜃 and 𝑉 = ∂log 𝐿/∂𝜃, we get
1 ≤ 𝐸(𝑇 − 𝜃)² 𝐸[(∂log 𝐿/∂𝜃)²]
That is,
Var(𝑇) ≥ 1/𝐸[(∂log 𝐿/∂𝜃)²], where 𝐿 = ∏ᵢ₌₁ⁿ 𝑓(𝑥ᵢ, 𝜃).   …(5)

Note 1. The R.H.S. in (5) gives the lower bound for the variance of 𝑇 and is sometimes known as the Minimum Variance Bound (MVB).
Note 2. 𝐼ₙ(𝜃) or 𝐼(𝜃) or 𝐼 = 𝐸[(∂log 𝐿/∂𝜃)²] has been called by Fisher the amount of information about 𝜃 in the random sample, and its reciprocal the information limit to the variance of 𝑇.
Note 3. 𝐸[(∂log 𝐿/∂𝜃)²] = −𝐸(∂²log 𝐿/∂𝜃²)
Under the assumption that 1 = ∫_𝐴 𝐿 𝑑𝑥₁𝑑𝑥₂ … 𝑑𝑥ₙ = ∫_𝐴 𝐿 𝑑𝑥 (say) is differentiable with respect to 𝜃 under the integral sign twice, we get 0 = ∫_𝐴 (∂log 𝐿/∂𝜃) 𝐿 𝑑𝑥
⇒ 0 = ∫_𝐴 [(∂²log 𝐿/∂𝜃²) 𝐿 + (∂log 𝐿/∂𝜃)(∂𝐿/∂𝜃)] 𝑑𝑥
⇒ 0 = ∫_𝐴 [(∂²log 𝐿/∂𝜃²) 𝐿 + (∂log 𝐿/∂𝜃)² 𝐿] 𝑑𝑥   {∵ ∂𝐿/∂𝜃 = (∂log 𝐿/∂𝜃) 𝐿}
⇒ 0 = 𝐸(∂²log 𝐿/∂𝜃²) + 𝐸[(∂log 𝐿/∂𝜃)²]
⇒ 𝐸[(∂log 𝐿/∂𝜃)²] = −𝐸(∂²log 𝐿/∂𝜃²)
Note 4. 𝐸[(∂log 𝐿/∂𝜃)²] = 𝑛𝐸[(∂log 𝑓(𝑥, 𝜃)/∂𝜃)²]
L.H.S. = 𝐸[(∂ log ∏ᵢ₌₁ⁿ 𝑓(𝑥ᵢ, 𝜃)/∂𝜃)²] = 𝐸[(∂ Σᵢ₌₁ⁿ log 𝑓(𝑥ᵢ, 𝜃)/∂𝜃)²]
= 𝐸[Σᵢ₌₁ⁿ (∂log 𝑓(𝑥ᵢ, 𝜃)/∂𝜃)² + Σᵢ≠ⱼ (∂log 𝑓(𝑥ᵢ, 𝜃)/∂𝜃)(∂log 𝑓(𝑥ⱼ, 𝜃)/∂𝜃)]
= Σᵢ₌₁ⁿ 𝐸[(∂log 𝑓(𝑥, 𝜃)/∂𝜃)²] + 0
{∵ 𝑥₁, 𝑥₂, …, 𝑥ₙ are independent and identically distributed and 𝐸(∂log 𝑓(𝑥, 𝜃)/∂𝜃) = 0}
= 𝑛𝐸[(∂log 𝑓(𝑥, 𝜃)/∂𝜃)²] = R.H.S.
Remark. Since 𝐸[(∂log 𝐿/∂𝜃)²] = −𝑛𝐸[∂²log 𝑓(𝑥, 𝜃)/∂𝜃²] and 𝐸[(∂log 𝐿/∂𝜃)²] = 𝑛𝐸[(∂log 𝑓(𝑥, 𝜃)/∂𝜃)²], consequently 𝐸[(∂log 𝑓(𝑥, 𝜃)/∂𝜃)²] = −𝐸[∂²log 𝑓(𝑥, 𝜃)/∂𝜃²].

Note 5. The minimum variance bound for the variance of an unbiased estimator of 𝜃 may thus be written in any of the equivalent forms:
Var(𝑇) ≥ 1/𝐸[(∂log 𝐿/∂𝜃)²]
Var(𝑇) ≥ 1/[−𝐸(∂²log 𝐿/∂𝜃²)]
Var(𝑇) ≥ 1/(𝑛𝐸[(∂log 𝑓(𝑥, 𝜃)/∂𝜃)²])
Var(𝑇) ≥ 1/[−𝑛𝐸(∂²log 𝑓(𝑥, 𝜃)/∂𝜃²)]

5.3.3 Alternative Form of C-R Inequality :
For a continuous random variable: let 𝑥₁, 𝑥₂, …, 𝑥ₙ be a random sample of size 𝑛 from a probability distribution with probability density function 𝑓(𝑥, 𝜃), 𝜃 ∈ Θ. Let 𝑇 = 𝑇(𝑥₁, 𝑥₂, …, 𝑥ₙ) be an unbiased estimator of 𝛾(𝜃), where 𝛾(𝜃) is a function of the parameter 𝜃. Then, under certain regularity conditions, the variance of this unbiased estimator 𝑇 satisfies the inequality
Var(𝑇) ≥ [𝛾′(𝜃)]²/𝐸[(∂log 𝐿/∂𝜃)²]
where 𝛾′(𝜃) = ∂𝛾(𝜃)/∂𝜃 and 𝐿 = ∏ᵢ₌₁ⁿ 𝑓(𝑥ᵢ, 𝜃) is the joint pdf of (𝑥₁, 𝑥₂, …, 𝑥ₙ).

5.3.4 Equality Sign in C-R Inequality :
The necessary and sufficient condition for the Cauchy-Schwarz inequality [𝐸(𝑈𝑉)]² ≤ 𝐸(𝑈²)𝐸(𝑉²) to become an equality is that the variables 𝑈 and 𝑉 are proportional to each other, that is, 𝑈 ∝ 𝑉 or 𝑉 ∝ 𝑈.
In the Cramer-Rao inequality we have
Var(𝑇) ≥ [𝛾′(𝜃)]²/𝐸[(∂log 𝐿/∂𝜃)²]
⇒ [𝛾′(𝜃)]² ≤ Var(𝑇) 𝐸[(∂log 𝐿/∂𝜃)²]
⇒ {𝐸[(𝑇 − 𝐸(𝑇))(∂log 𝐿/∂𝜃 − 0)]}² ≤ 𝐸[𝑇 − 𝛾(𝜃)]² 𝐸[(∂log 𝐿/∂𝜃 − 0)²]
Hence the sign of equality holds if and only if
{𝑇 − 𝐸(𝑇)} ∝ (∂log 𝐿/∂𝜃 − 0) or (∂log 𝐿/∂𝜃 − 0) ∝ {𝑇 − 𝐸(𝑇)}
⇒ ∂log 𝐿/∂𝜃 = 𝐴(𝜃)[𝑇 − 𝛾(𝜃)]
where 𝐴(𝜃) is a constant depending on 𝜃 (it may depend on 𝑛), but independent of the observations (𝑥₁, 𝑥₂, …, 𝑥ₙ).
Thus a necessary and sufficient condition that an unbiased estimator 𝑇 attain the minimum variance bound (MVB) is
∂log 𝐿/∂𝜃 = 𝐴(𝜃)[𝑇 − 𝛾(𝜃)]

Note 1. 𝐸(∂log 𝐿/∂𝜃) = 0 ⇒ 𝐸(𝑇) = 𝛾(𝜃).
Note 2. In another form, we can write
∂log 𝐿/∂𝜃 = [𝑇 − 𝛾(𝜃)]/𝜆(𝜃) or ∂log 𝐿/∂𝜃 = 𝑘(𝜃, 𝑛)[𝑇 − 𝛾(𝜃)]
where 𝜆(𝜃) or 𝑘(𝜃, 𝑛) is a constant depending on 𝜃, independent of (𝑥₁, 𝑥₂, …, 𝑥ₙ), but possibly depending on 𝑛.
Note 3. ∂log 𝐿/∂𝜃 = ∂ log ∏ᵢ₌₁ⁿ 𝑓(𝑥ᵢ, 𝜃)/∂𝜃 = Σᵢ₌₁ⁿ ∂log 𝑓(𝑥ᵢ, 𝜃)/∂𝜃 = 𝐴(𝜃)[𝑇 − 𝛾(𝜃)]
Note 4. Var(𝑇) = 𝛾′(𝜃) · 𝜆(𝜃), i.e. Var(𝑇) = 𝛾′(𝜃)/𝐴(𝜃) or Var(𝑇) = 𝛾′(𝜃)/𝑘(𝜃, 𝑛).
To see this, note that the Cramer-Rao inequality holds with the sign of equality, i.e.
Var(𝑇) = [𝛾′(𝜃)]²/𝐸[(∂log 𝐿/∂𝜃)²],
if and only if ∂log 𝐿/∂𝜃 = [𝑇 − 𝛾(𝜃)]/𝜆(𝜃). In that case
𝐸[(∂log 𝐿/∂𝜃)²] = 𝐸{[𝑇 − 𝛾(𝜃)]/𝜆(𝜃)}² = 𝐸[𝑇 − 𝛾(𝜃)]²/[𝜆(𝜃)]² = Var(𝑇)/[𝜆(𝜃)]²
so we have
Var(𝑇) = [𝛾′(𝜃)]²/{Var(𝑇)/[𝜆(𝜃)]²}
⇒ [Var(𝑇)]² = [𝛾′(𝜃)]²[𝜆(𝜃)]²
∴ Var(𝑇) = 𝛾′(𝜃) · 𝜆(𝜃)
When 𝛾(𝜃) = 𝜃, then Var(𝑇) = 𝜆(𝜃), or 1/𝐴(𝜃), or 1/𝑘(𝜃, 𝑛).
Thus we conclude that if the equation ∂log 𝐿/∂𝜃 = [𝑇 − 𝛾(𝜃)]/𝜆(𝜃) is satisfied, then 𝑇 is the MVB estimator of 𝛾(𝜃).

5.3.5 Uses of C-R Inequality :
The Cramer-Rao inequality gives a lower bound for the variance of an estimator. With the help of the Cramer-Rao inequality we can check whether a given estimator is a minimum variance bound estimator. The estimator which attains the lower bound given by the C-R inequality is often called the M.V.B. estimator.
If an unbiased estimator does not attain the Cramer-Rao lower bound for any value of 𝜃, then the efficiency of the estimator is expressed as the ratio of the C-R lower bound to the actual variance of the estimator. Thus the efficiency of 𝑇 is given by
𝜂 = 𝑀𝑉𝐵/Var(𝑇)
If 𝜂 = 1, then 𝑇 is efficient; if 𝜂 < 1, then 𝑇 is less efficient. This is why some authors define 𝑇 to be an efficient (unbiased) estimator of 𝜃 if
Var(𝑇) = Cramer-Rao lower bound = 1/𝐼(𝜃),
i.e., when the lower bound is attained. Logically this does not appear to be a good definition, for the expression 1/𝐼(𝜃) is only one possible lower bound, and there do exist many sharper lower bounds, any of which could have been equally well chosen to define efficiency.

Example 1. Let 𝑥₁, 𝑥₂, …, 𝑥ₙ be a random sample from 𝑁(𝜇, 𝜎²). Show that the sample mean 𝑥̄ is the minimum variance bound estimator (MVBE) of 𝜇.
Solution. [Use the C-R inequality, Var(𝑇) ≥ [𝛾′(𝜃)]²/𝐸[(∂log 𝐿/∂𝜃)²].]
Here 𝛾(𝜃) = 𝜇, so 𝛾′(𝜃) = ∂𝜇/∂𝜇 = 1. Since 𝑋 ∼ 𝑁(𝜇, 𝜎²), 𝐸(𝑋) = 𝜇 and Var(𝑋) = 𝜎².
We know that
Var(𝑥̄) = Var((1/𝑛) Σᵢ₌₁ⁿ 𝑥ᵢ) = (1/𝑛²) Σᵢ₌₁ⁿ Var(𝑥ᵢ) = (1/𝑛²) · 𝑛𝜎² = 𝜎²/𝑛
Also
𝑓(𝑥, 𝜃) = [1/(𝜎√2𝜋)] exp{−(1/2)((𝑥 − 𝜇)/𝜎)²}
𝐿 = ∏ᵢ₌₁ⁿ 𝑓(𝑥ᵢ, 𝜃) = [1/(𝜎ⁿ(2𝜋)^{𝑛/2})] exp{−(1/(2𝜎²)) Σᵢ₌₁ⁿ (𝑥ᵢ − 𝜇)²}
log 𝐿 = −𝑛 log 𝜎 − (𝑛/2) log(2𝜋) − (1/(2𝜎²)) Σᵢ₌₁ⁿ (𝑥ᵢ − 𝜇)²
∂log 𝐿/∂𝜇 = (1/𝜎²) Σ(𝑥ᵢ − 𝜇) = (𝑛/𝜎²)(𝑥̄ − 𝜇)
(∂log 𝐿/∂𝜇)² = (𝑛²/𝜎⁴)(𝑥̄ − 𝜇)²
𝐸[(∂log 𝐿/∂𝜇)²] = (𝑛²/𝜎⁴) 𝐸(𝑥̄ − 𝜇)² = (𝑛²/𝜎⁴) Var(𝑥̄) = (𝑛²/𝜎⁴) · (𝜎²/𝑛) = 𝑛/𝜎²   (∵ 𝐸(𝑥̄) = 𝜇)

Thus, by the C-R inequality, Var(𝑥̄) ≥ 1/(𝑛/𝜎²), i.e., Var(𝑥̄) ≥ 𝜎²/𝑛.
Since this lower bound equals the actual variance of 𝑥̄, the bound is attained; hence 𝑥̄ is the MVB estimator of 𝜇.
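A Monte Carlo sketch (hypothetical parameter values and seed, added only as an illustration) shows the variance of 𝑥̄ sitting exactly at the Cramer-Rao bound 𝜎²/𝑛:

```python
import numpy as np

rng = np.random.default_rng(42)
mu, sigma, n, reps = 1.0, 2.0, 25, 200_000   # hypothetical settings

xbar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
print(xbar.var())        # simulated variance of the sample mean
print(sigma ** 2 / n)    # Cramer-Rao bound; the two agree closely
```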
Example 2. Let 𝑋₁, 𝑋₂, …, 𝑋ₙ be a random sample of size 𝑛 > 1 from 𝑁(𝜇, 1). Show that 𝑇 = 𝑋̄² − 1/𝑛 is an unbiased estimator of 𝛾(𝜇) = 𝜇², and find its efficiency.
Solution.
𝑓(𝑥, 𝜇) = (1/√(2𝜋)) 𝑒^{−(𝑥−𝜇)²/2}, −∞ < 𝑥 < ∞; 𝐸(𝑋) = 𝜇, Var(𝑋) = 1, −∞ < 𝜇 < ∞.
Here 𝑋̄ = (1/𝑛) Σᵢ₌₁ⁿ 𝑋ᵢ ∼ 𝑁(𝜇, 1/𝑛), so 𝐸(𝑋̄²) = Var(𝑋̄) + [𝐸(𝑋̄)]² = 1/𝑛 + 𝜇²
⇒ 𝑋̄² − 1/𝑛 is an unbiased estimator of 𝛾(𝜇) = 𝜇².
Var(𝑋̄² − 1/𝑛) = Var(𝑋̄²) = 𝐸(𝑋̄⁴) − [𝐸(𝑋̄²)]²
= (3/𝑛² + 6𝜇²/𝑛 + 𝜇⁴) − (1/𝑛 + 𝜇²)²
{∵ 𝜇₄′ = 𝜇₄ + 4𝜇₃𝜇₁′ + 6𝜇₂(𝜇₁′)² + (𝜇₁′)⁴ = 3(1/𝑛)² + 0 + 6(1/𝑛)𝜇² + 𝜇⁴ for 𝑋̄, whose s.d. is 1/√𝑛}
= 2/𝑛² + 4𝜇²/𝑛
The minimum variance bound for an unbiased estimator of 𝛾(𝜇) = 𝜇² is [𝛾′(𝜇)]²/(𝑛𝐼₁(𝜇)) = (2𝜇)²/𝑛 = 4𝜇²/𝑛, since the information per observation is 𝐼₁(𝜇) = 1.
The efficiency of 𝑇 is therefore
𝜂 = 𝑀𝑉𝐵/Var(𝑇) = (4𝜇²/𝑛)/((4𝜇²/𝑛) + (2/𝑛²)) < 1.

Example 3. By considering the trapezoidal distribution
𝑑𝐹(𝑥) = (2𝑥/(2𝜃 + 1)) 𝑑𝑥, 0 < 𝜃 ≤ 𝑥 ≤ 𝜃 + 1,
show that the intrinsic accuracy in a sample of size 𝑛 is 4𝑛²/(2𝜃 + 1)².
Solution.
𝑓(𝑥, 𝜃) = 2𝑥/(2𝜃 + 1)
log 𝑓(𝑥, 𝜃) = log 2𝑥 − log(2𝜃 + 1)
∂log 𝑓(𝑥, 𝜃)/∂𝜃 = −2/(2𝜃 + 1)
and
𝐸[∂log 𝑓(𝑥, 𝜃)/∂𝜃] = −∫_𝜃^{𝜃+1} (2/(2𝜃 + 1)) · (2𝑥/(2𝜃 + 1)) 𝑑𝑥 = −(4/(2𝜃 + 1)²) ∫_𝜃^{𝜃+1} 𝑥 𝑑𝑥 = −2/(2𝜃 + 1) ≠ 0.
(This is a non-regular case: the range depends on 𝜃, so the score does not have zero mean.)
Again, since (∂log 𝑓(𝑥, 𝜃)/∂𝜃)² = 4/(2𝜃 + 1)² is constant in 𝑥,
𝐸[(∂log 𝑓(𝑥, 𝜃)/∂𝜃)²] = 4/(2𝜃 + 1)².
Hence, for a sample of size 𝑛, the intrinsic accuracy is
𝐸[(∂log 𝐿/∂𝜃)²] = 𝑛𝐸[(∂log 𝑓(𝑥, 𝜃)/∂𝜃)²] + 𝑛(𝑛 − 1)[𝐸(∂log 𝑓(𝑥, 𝜃)/∂𝜃)]²
= 4𝑛/(2𝜃 + 1)² + 𝑛(𝑛 − 1) · 4/(2𝜃 + 1)² = 4𝑛²/(2𝜃 + 1)²
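The expectation used above can be verified by numerical integration (SciPy assumed available; the value of 𝜃 is hypothetical):

```python
from scipy.integrate import quad

theta = 1.3   # hypothetical parameter value

# Density f(x) = 2x/(2*theta + 1) on [theta, theta + 1];
# the squared score is the constant (2/(2*theta + 1))^2.
integrand = lambda x: (2 / (2 * theta + 1)) ** 2 * 2 * x / (2 * theta + 1)
val, _ = quad(integrand, theta, theta + 1)
print(val, 4 / (2 * theta + 1) ** 2)   # both equal 4/(2*theta+1)^2
```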
Example 4. Find whether an MVB estimator exists for 𝜃 in the Cauchy distribution
𝑓(𝑥, 𝜃) = (1/𝜋) · 1/(1 + (𝑥 − 𝜃)²), −∞ < 𝑥 < ∞.
Also show that the Cramer-Rao lower bound for the variance of an unbiased estimator of 𝜃 is 2/𝑛 (although it is not attained by the variance of any unbiased estimator of 𝜃).

Solution. Let 𝑋₁, 𝑋₂, …, 𝑋ₙ be a random sample of size 𝑛 from 𝑓(𝑥, 𝜃). Then the joint pdf of (𝑋₁, 𝑋₂, …, 𝑋ₙ) is given by
𝐿 = ∏ᵢ₌₁ⁿ 𝑓(𝑥ᵢ, 𝜃) = (1/𝜋)ⁿ ∏ᵢ₌₁ⁿ 1/(1 + (𝑥ᵢ − 𝜃)²)
∴ log 𝐿 = −𝑛 log 𝜋 − Σᵢ₌₁ⁿ log[1 + (𝑥ᵢ − 𝜃)²]
⇒ ∂log 𝐿/∂𝜃 = 2 Σᵢ₌₁ⁿ (𝑥ᵢ − 𝜃)/[1 + (𝑥ᵢ − 𝜃)²]
The R.H.S. cannot be expressed in the form [𝑇 − 𝛾(𝜃)]/𝜆(𝜃). Hence an MVB estimator does not exist for 𝜃 in the Cauchy distribution; that is, the Cramer-Rao lower bound is not attained by the variance of any unbiased estimator of 𝜃.
For the lower bound itself:
log 𝑓(𝑥, 𝜃) = −log 𝜋 − log[1 + (𝑥 − 𝜃)²]
∂log 𝑓(𝑥, 𝜃)/∂𝜃 = 2(𝑥 − 𝜃)/[1 + (𝑥 − 𝜃)²]
𝐸[(∂log 𝑓(𝑥, 𝜃)/∂𝜃)²] = ∫₋∞^∞ (4(𝑥 − 𝜃)²/[1 + (𝑥 − 𝜃)²]²) · (1/𝜋) · (1/[1 + (𝑥 − 𝜃)²]) 𝑑𝑥 = 1/2,
on putting 𝑥 − 𝜃 = tan 𝜙 and integrating.
Hence 𝐸[(∂log 𝐿/∂𝜃)²] = 𝑛/2, and the Cramer-Rao lower bound for the variance of an unbiased estimator of 𝜃 is 1/(𝑛/2) = 2/𝑛.

Example 5. For the distribution 𝑓(𝑥, 𝜃) = (1 + 𝜃)/(𝑥 + 𝜃)², 1 ≤ 𝑥 < ∞, 𝜃 > 0, obtain the Cramer-Rao lower bound (MVB) for the variance of an unbiased estimator of 𝜃.
Solution. Here log 𝐿 = 𝑛 log(1 + 𝜃) − 2 Σᵢ₌₁ⁿ log(𝑥ᵢ + 𝜃), so
∂log 𝐿/∂𝜃 = 𝑛/(1 + 𝜃) − 2 Σᵢ₌₁ⁿ 1/(𝑥ᵢ + 𝜃)
∂²log 𝐿/∂𝜃² = −𝑛/(1 + 𝜃)² + 2 Σᵢ₌₁ⁿ 1/(𝑥ᵢ + 𝜃)²
−𝐸(∂²log 𝐿/∂𝜃²) = 𝑛/(1 + 𝜃)² − 2𝑛𝐸{1/(𝑋 + 𝜃)²}
= 𝑛/(1 + 𝜃)² − 2𝑛 ∫₁^∞ (1/(𝑥 + 𝜃)²) · ((1 + 𝜃)/(𝑥 + 𝜃)²) 𝑑𝑥
= 𝑛/(1 + 𝜃)² − 2𝑛(1 + 𝜃) ∫₁^∞ 𝑑𝑥/(𝑥 + 𝜃)⁴
= 𝑛/(1 + 𝜃)² − 2𝑛(1 + 𝜃)[−1/(3(𝑥 + 𝜃)³)]₁^∞
= 𝑛/(1 + 𝜃)² − (2𝑛(1 + 𝜃)/3)(1/(1 + 𝜃)³)
= 𝑛/(1 + 𝜃)² − 2𝑛/(3(1 + 𝜃)²) = 𝑛/(3(1 + 𝜃)²)
The MVB given by the C-R inequality is therefore
𝑀𝑉𝐵 = [𝜏′(𝜃)]²/[−𝐸(∂²log 𝐿/∂𝜃²)] = 1/(𝑛/(3(1 + 𝜃)²)) = 3(1 + 𝜃)²/𝑛   {∵ 𝜏(𝜃) = 𝜃}.
SELF-ASSESSMENT (CONCEPTUAL QUESTIONS)
1. Comment on the following statements :
(i) In the case of Poisson distribution with parameter 𝜆, 𝑥‾ is sufficient for 𝜆.
(ii) If (𝑋1 , 𝑋2 , … 𝑋𝑛 ) be a sample of independent observation from the uniform
distribution on (𝜃, 𝜃 + 1), then the maximum likelihood estimator of 𝜃 is
unique.

(iii) A maximum likelihood estimator is always unbiased.


(iv) Unbiased estimator is necessarily consistent
(v) A consistent estimator is also unbiased.
(vi) An unbiased estimator whose variance tends to zero as the sample size increases
is consistent.
(vii) If 𝑡 is a sufficient statistic for 𝜃 then 𝑓(𝑡) is a sufficient statistic for 𝑓(𝜃).
(viii) If 𝑡1 and 𝑡2 are two independent estimators of 𝜃, then 𝑡1 + 𝑡2 is less efficient
than both 𝑡1 and 𝑡2 .
(ix) If 𝑇 is consistent estimator of a parameter 𝜃, then 𝑎𝑇 + 𝑏 is a consistent
estimator of 𝑎𝜃 + 𝑏, where 𝑎 and 𝑏 are constants.
(x) If 𝑥 is the number of successes in 𝑛 independent trials with a constant
probability 𝑝 of success in each trial, then 𝑥/𝑛 is a consistent estimator of 𝑝.
2. Fill in the blanks :
(i) In a random sample of size 𝑛 from a population with mean 𝜇, the sample mean
(𝑥‾) is .... estimate of ...
(ii) The sample median is ... estimate for the mean of normal population.
(iii) An estimator 𝜃ˆ of a parameter 𝜃 is said to be unbiased if...
(iv) The variance 𝑠 2 of a sample of size 𝑛 is a ... estimator of population variance
𝜎2.
(v) If a sufficient estimator exists, it is a function of the ... estimator.
(vi) ...estimate may not be unique.
3. (a) Give example of a statistic 𝑡 which is unbiased for a parameter 𝜃 but 𝑡 2 is not
unbiased for 𝜃 2 .
(b) Give example of an M.L. estimator which is not unbiased.
4. (i) If 𝑥̄ is an unbiased estimator for the population mean 𝜇, state which of the following are unbiased estimators for 𝜇²:
(a) 𝑥̄², (b) 𝑥̄² − 𝜎²/𝑛 (𝜎² is known/unknown)

(ii) If 𝑡 is the maximum likelihood estimator for 𝜃, state the condition under
which 𝑓(𝑡) will be the maximum likelihood estimator for 𝑓(𝜃).
(iii) Write down the condition for the Cramer-Rao lower bound for the variance of
the estimator to be attained.

(iv) Write down the general form of the distribution admitting sufficient statistic.
5. A random variable 𝑋 takes the values 1, 2, 3 and 4, each with probability 1/4. A random sample of three values of 𝑋 is taken; 𝑥̄ is the mean and 𝑚 is the median of this sample. Show that both 𝑥̄ and 𝑚 are unbiased estimators of the mean of the population, but 𝑥̄ is more efficient than 𝑚. Compare their efficiencies.
6. Give an example of estimates which are (i) Unbiased and efficient, (ii) Unbiased and
inefficient.
7. Mark the correct alternative:
(i) Let 𝑇ₙ be an estimator, based on a sample 𝑥₁, 𝑥₂, …, 𝑥ₙ, of the parameter 𝜃. Then 𝑇ₙ is a consistent estimator of 𝜃 if
(a) 𝑃(𝑇ₙ − 𝜃 > 𝜀) = 0 ∀ 𝜀 > 0,
(b) 𝑃(|𝑇ₙ − 𝜃| < 𝜀) = 0,
(c) limₙ→∞ 𝑃(|𝑇ₙ − 𝜃| > 𝜀) = 0 ∀ 𝜀 > 0,
(d) limₙ→∞ 𝑃(𝑇ₙ − 𝜃 > 𝜀) = 0 ∀ 𝜀 > 0
(ii) Let 𝐸(𝑇1 ) = 𝜃 = 𝐸(𝑇2 ), where 𝑇1 and 𝑇2 are the linear functions of the
sample observations. If 𝑉(𝑇1 ) ≤ 𝑉(𝑇2 ) then:
(a) 𝑇1 is an unbiased linear estimator.
(b) 𝑇₁ is the best linear unbiased estimator.
(c) 𝑇1 is a consistent linear unbiased estimator.
(d) 𝑇1 is a consistent best linear unbiased estimator.
(iii) Let 𝑋 be a random variable with 𝐸(𝑋) = 𝜇 and 𝑉(𝑋) = 𝜎 2 . Let 𝑋‾ be the
sample mean based on a random sample of size 𝑛, then 𝑋‾ is:
(a) the best linear unbiased estimator of 𝜇.
(b) an unbiased and consistent estimator of 𝜇.
(c) an unbiased and linear estimator of 𝜇.
(d) the best linear consistent estimator of 𝜇.
(iv) Let 𝜃 be an unknown parameter and 𝑇1 be an unbiased estimator of 𝜃. If
Var (𝑇1 ) ≤ Var (𝑇2 ), for 𝑇2 to be any other unbiased estimator, then 𝑇1 is
known as:
(a) minimum variance unbiased estimator.
(b) unbiased and efficient estimator.
(c) consistent and efficient estimator.
(d) unbiased, consistent and minimum variance estimator.
(v) Which one of the following statements is false?


(a) In sufficiency we reduce original random variables to a few statistics
for the purpose of drawing inference about parameters.
(b) 𝑇1 , 𝑇2 , … , 𝑇𝑘 will be called sufficient for 𝜃 if the conditional
distribution of 𝑇1 , 𝑇2 , … , 𝑇𝑘 given 𝑋1 , 𝑋2 , … , 𝑋𝑛 is independent of 𝜃.
(c) A statistics is said to be sufficient if and only if the joint probability
density function of random variables can be expressed as two factors,
where one factor depends on parameters(s) and observations through
the statistics, while the other is independent of parameter(s).

(vi) The most important of all the methods of estimation is the method of maximum likelihood. It generally yields good estimators as judged by various criteria. Which one of the following statements about maximum likelihood estimates is not true?
(a) Maximum likelihood estimates are consistent.
(b) If maximum likelihood estimate exists, it is most efficient in the
class of estimates.
(c) Maximum likelihood estimates are sufficient.
(d) Maximum likelihood estimates are unbiased.
(vii) The maximum likelihood estimates, which are obtained by maximizing the joint density of the random variables, are generally:
(a) unbiased and inconsistent,
(b) unbiased and consistent,
(c) consistent and invariant, and
(d) invariant and unbiased.
1. Write True or False :
(i) The variance of the MVUE is always given by the Cramer-Rao bound. (False)
(ii) The equation 𝐸[(∂log 𝐿/∂𝜃)²] = 𝑛𝐸[(∂log 𝑓(𝑥, 𝜃)/∂𝜃)²] is not satisfied in the case of 𝑓(𝑥, 𝜃) = 1/𝜃, 0 ≤ 𝑥 ≤ 𝜃. (True)
(iii) The Cramer-Rao inequality for the variance of an estimator provides a lower bound to the variance. (True)

2. Fill in the blanks:


(i) The value of 𝐸[(∂log 𝐿/∂𝜃)²] in the case of 𝑓(𝑥, 𝜃) = 1/𝜃, 0 ≤ 𝑥 ≤ 𝜃, is .......... [Ans. 𝑛²/𝜃²]

(ii) If the variance of an estimator attains the Cramer-Rao lower bound, the
estimator is most.......... [Ans. efficient]
3. In each of the following questions four alternative answers are given in which only
one is correct. Select the correct answer and write (a), (b), (c) or (d) accordingly :
(i) If 𝑋₁, 𝑋₂, …, 𝑋ₙ is a random sample of size 𝑛 from the Poisson distribution with mean 𝜆, the Cramer-Rao lower bound to the variance of any unbiased estimator of 𝜆 is given by
(a) 𝑛𝑒^{−𝜆}/𝜆
(b) 𝜆/𝑛
(c) √𝜆/𝑛
(d) 𝑒^{𝜆}/𝑛
Ans. (b)
(ii) The value of 𝑛𝐸[(∂log 𝑓(𝑥, 𝜃)/∂𝜃)²] in the case of 𝑓(𝑥, 𝜃) = 1/𝜃, 0 ≤ 𝑥 ≤ 𝜃, is
(a) 𝑛²/𝜃²
(b) 𝑛/𝜃²
(c) 𝑛²/𝜃
(d) none of these.
Ans. (b)
(iii) The necessary and sufficient condition for the Cramer-Rao lower bound to be attainable is
(a) ∂log 𝐿/∂𝜃 = 𝐴(𝜃)[𝑇 − 𝛾(𝜃)]
(b) ∂²log 𝐿/∂𝜃² = 𝐴(𝜃)[𝑇 − 𝛾(𝜃)]
(c) ∂log 𝐿/∂𝜃 = 𝐴(𝜃)[𝑇/𝛾(𝜃)]
(d) ∂²log 𝐿/∂𝜃² = 𝐴(𝜃)[𝑇/𝛾(𝜃)]
Ans. (a)
(iv) Suppose 𝐿(𝜃) is the likelihood function and 𝑡 is an unbiased estimator of 𝜃. If
𝑉₁ = 𝐸(𝑡 − 𝜃)², 𝑉₂ = 𝐸[(∂log 𝐿(𝜃)/∂𝜃)²],
which one of the following is true?
(a) 𝑉₁ ≤ 𝑉₂
(b) 𝑉₁ ≥ 𝑉₂
(c) 𝑉₁ = 1/𝑉₂
(d) 𝑉₁ ≥ 1/𝑉₂.
Ans. (d)

(v) The denominator in the Cramér-Rao inequality is known as


(a) information limit
(b) low bound of the variance
(c) upper bound of the variance
(d) all the above
Ans. (a)
(vi) Regularity conditions of the Cramér-Rao inequality are related to
(a) integrability of functions
(b) differentiability of functions
(c) both (a) and (b)
(d) neither (a) nor (b)
Ans. (b)
(vii) The Cramér-Rao inequality was given by Cramér and Rao
(a) jointly
(b) in different years
(c) in the same years
(d) none of the above
Ans. (b)

5.4 IN-TEXT QUESTIONS


MCQ’S PROBLEMS
Question 1.
Let the random variables 𝑋₁ and 𝑋₂ have the joint probability density function
𝑓(𝑥₁, 𝑥₂) = (𝑥₁/2)𝑒^{−𝑥₁𝑥₂} for 1 < 𝑥₁ < 3, 𝑥₂ > 0; 0 otherwise.
What is the value Var (𝑋2 ∣ 𝑋1 = 2) …(up to two decimal place)?
A) 0.27
B) 0.28
C) 0.25
D) 1.90
Question 2.
Let 𝑥1 = 1, 𝑥2 = 0, 𝑥3 = 0, 𝑥4 = 1, 𝑥5 = 0, 𝑥6 = 1 be the data on a random sample of size 6
from Bin (1, 𝜃) distribution, where 𝜃 ∈ (0,1). Then the uniformly minimum
variance unbiased estimate of 𝜃(1 + 𝜃) equals
A) 0.7
B) 0.77
C) 0.99
D) 0.12
Question 3.
Let 𝑋₁, …, 𝑋ₙ be a random sample from a population with density
𝑓(𝑥, 𝜃) = 𝑒^{𝜃−𝑥} if 𝑥 > 𝜃; 0 otherwise,
and let 𝑋₍₁₎ = min{𝑋₁, …, 𝑋ₙ}. Then
(𝑋₍₁₎ − (2/𝑛) logₑ 5, 𝑋₍₁₎) is a …% confidence interval for 𝜃.
A) 0.96
B) 0.97
C) 0.12
D) 0.56
Question 4.
Let 𝑓: [0, 3] → ℝ be defined by
𝑓(𝑥) = 0 if 0 ≤ 𝑥 < 1; 𝑒^{𝑥²} − 𝑒 if 1 ≤ 𝑥 < 2; 𝑒^{𝑥²} + 1 if 2 ≤ 𝑥 ≤ 3.
Now define 𝐹: [0, 3] → ℝ by 𝐹(0) = 0 and 𝐹(𝑥) = ∫₀ˣ 𝑓(𝑡) 𝑑𝑡 for 0 < 𝑥 ≤ 3.
Then
A. 𝐹 is differentiable at 𝑥 = 1 and 𝐹 ′ (1) = 0
B. 𝐹 is differentiable at 𝑥 = 1 and 𝐹 ′ (2) = 0
C. F is not differentiable at 𝑥 = 1
D. 𝐹 is differentiable at 𝑥 = 1 and 𝐹 ′ (2) = 1
Question 5.
Let 𝑓: ℝ² → ℝ be defined by
𝑓(𝑥, 𝑦) = sin(2(𝑥² + 𝑦²))/(𝑥² + 𝑦²) + 3𝑥 sin(4/(𝑥² + 𝑦²)) 𝑒^{−𝑦²} if (𝑥, 𝑦) ≠ (0, 0); 𝛼 if (𝑥, 𝑦) = (0, 0),
where 𝛼 is a real constant. If 𝑓 is continuous at (0, 0), then 𝛼 is equal to
A. 1
B. 2
C. 3
D. 4
Question 6
Which probability distribution is used to model the time elapsed between events?
A. Exponential
B. Poisson
C. Normal
D. Gamma

Question 7.
Let 𝑓: ℝ × ℝ → ℝ be define by
𝑓(𝑥, 𝑦) = 𝑥 2 + 𝑥𝑦 + 𝑦 2 − 𝑥 − 100

Where ℝ denotes the set of all real numbers. Then


A. 𝑓 has a local minimum at (2/3, −1/3)
B. 𝑓 has a local maximum at (2/3, −1/3)
C. 𝑓 has a saddle point at (2/3, −1/3)
D. 𝑓 is bounded
Question 8.
Which one is used to find the linear relationship between two variables?
A. Correlation
B. Variance
C. Joint Moment
D. ANOVA
Question 9.
The system of equations 𝑥 + 𝑦 + 2𝑧 = 2, 2𝑥 + 3𝑦 − 𝑧 = 5, 4𝑥 + 7𝑦 + 𝑐𝑧 = 6
does NOT have a solution. Then the value of 𝑐 must be equal to
A. 7
B. 5
C. -7
D. -5
Question 10.
Suppose 𝑟₁.₂₃ and 𝑟₁.₂₃₄ are the sample multiple correlation coefficients of 𝑋₁ on (𝑋₂, 𝑋₃) and on (𝑋₂, 𝑋₃, 𝑋₄) respectively. Which of the following is possible?
A. 𝑟1.23 = −0.3, 𝑟1.234 = 0.7
B. 𝑟1.23 = −0.5, 𝑟1.234 = −0.7
C. 𝑟1.23 = 0.3, 𝑟1.234 = 0.7
D. 𝑟1.23 = 0.7, 𝑟1.234 = −0.3
Question 11.
Let 𝑋1 , … , 𝑋𝑛 be a random sample from a 𝑁(2𝜃, 𝜃 2 ) population, 𝜃 > 0. A consistent
estimator for 𝜃 is

A. (1/𝑛) Σᵢ₌₁ⁿ 𝑋ᵢ
B. ((5/𝑛) Σᵢ₌₁ⁿ 𝑋ᵢ²)^{1/2}
C. (1/(5𝑛)) Σᵢ₌₁ⁿ 𝑋ᵢ²
D. ((1/(5𝑛)) Σᵢ₌₁ⁿ 𝑋ᵢ²)^{1/2}
Question 12.
Consider the trinomial distribution with the probability mass function
𝑃(𝑋 = 𝑥, 𝑌 = 𝑦) = (7!/(𝑥! 𝑦! (7 − 𝑥 − 𝑦)!)) (0.6)ˣ (0.2)ʸ (0.2)^{7−𝑥−𝑦}, 𝑥 ≥ 0, 𝑦 ≥ 0, and 𝑥 + 𝑦 ≤ 7.
Then 𝐸(𝑌 ∣ 𝑋 = 3) is equal to …
A) 2
B) 3
C) 4
D) 5
Question 13.
Let 𝑋₁, …, 𝑋ₙ be a random sample of size 𝑛 (≥ 2) from a uniform distribution with probability density function
𝑓(𝑥, 𝜃) = 1/𝜃 for 0 < 𝑥 < 𝜃; 0 otherwise,
where 𝜃 ∈ (0, ∞). If 𝑋₍₁₎ = min{𝑋₁, …, 𝑋ₙ} and 𝑋₍ₙ₎ = max{𝑋₁, …, 𝑋ₙ}, then the conditional expectation
𝐸[(1/𝜃)(𝑋₍ₙ₎ + 𝜃/(𝑛 + 1)) ∣ 𝑋₁ − 𝑋₂ = 5] = ……
A) 1
B) 2
C) 3
D) 34
Question 14.
If a constant value 100 is subtracted from each observation of a set, the mean of the set is:
A. increased by 50
B. decreased by 100
C. is not affected
D. zero
Question 15.
Extreme values in a distribution have no effect on:
A. average
B. median
C. geometric mean
D. harmonic mean
Question 16.
The lines of regression of Y on X and of X on Y cannot be
A. Parallel
B. Perpendicular
C. Coincide
D. None of these
Question 17.
The sum of squared deviations is minimum when the deviations are taken from:
A. Mean
B. Median
C. Mode
D. None of these
Question 18.
If all values in a sample are the same, then their mean is
A. Zero
B. One
C. Sum of Variance
D. None of these

Question 19.
Variance of a Random variable is always
A. Positive
B. Non-negative
C. Zero
D. Can’t Say
Question 20.
If X ∼ P(1.412), then the mean of the random variable X is
A. 1.412
B. 1.512
C. 1.96
D. None of these
Question 21.
The moment generating function of X, where X ∼ Bin(2, 1/2), is
A. ((1 + eᵗ)/2)²
B. ((1 + e²)/2)²
C. (1 − e²ᵗ)²
D. (1 + eᵗ)
Question 22.
X ∼ Bin(2, 1/3).
Find E(e^(2X)) = ⋯
A. ((2 + e²)/3)²
B. ((2 − e²)/3)²
C. ((2 − e)/3)

D. ((2 + e²)/2)

Question 23.
If X ∼ P(λ), then E(X²) will be
A. λ(λ + 1)
B. 𝜆(𝜆 − 1)
C. (𝜆 − 1)
D. (𝜆 + 1)
Question 24.
In a Poisson distribution, the mean is …… the variance.
A. Equal to
B. Greater than
C. Less than
D. None of these
Question 25.
If 𝑎: 𝑏: 𝑐 = 3: 4: 7, then the ratio (𝑎 + 𝑏 + 𝑐): 𝑐 is equal to
A. 2:1
B. 14:3
C. 7:2
D. 1:2
Question 26.
Quartile deviation or semi inter-quartile deviation is given by the formula:
A. Q.D. = (Q₃ + Q₁)/2

B. Q.D. = 𝑄3 − 𝑄1
C. Q.D. = (𝑄3 − 𝑄1 )/2
D. Q.D.= (𝑄3 − 𝑄1 )/4

Question: 27
The moment generating function of a random variable X is
given by
M_X(t) = 1/6 + (1/3)eᵗ + (1/3)e²ᵗ + (1/6)e³ᵗ, −∞ < t < ∞
Then P(X ≤ 2) equals

A. 1/3
B. 1/6
C. 1/2
D. 5/6

Question: 28
Let X₁, X₂, …, Xₙ be a random sample from the U(θ − 1/2, θ + 1/2) distribution, where θ ∈ ℝ. If X(1) = min{X₁, X₂, …, Xₙ} and X(n) = max{X₁, X₂, …, Xₙ}, define
T₁ = (1/2)(X(1) + X(n)), T₂ = (1/4)(3X(1) + X(n) + 1) and T₃ = (1/2)(3X(n) − X(1) − 2) as estimators
for 𝜃, then which of the following is/are TRUE?
A. 𝑇1 and 𝑇2 are MLE for 𝜃 but 𝑇3 is not MLE for 𝜃

B. 𝑇1 is MLE for 𝜃 but 𝑇2 and 𝑇3 are not MLE for 𝜃

C. 𝑇1 , 𝑇2 and 𝑇3 are MLE for 𝜃

D. 𝑇1 , 𝑇2 and 𝑇3 are not MLE for 𝜃


Question: 29
Let 𝑋 and 𝑌 be random variable having joint probability density function
f(x, y) = k / [(1 + x²)(1 + y²)]; −∞ < x, y < ∞

where k is a constant. Then which of the following is/are TRUE?

A. k = 1/π²
B. f(x) = (1/π) · 1/(1 + x²); −∞ < x < ∞
C. P(X = Y) = 0
D. all of the above
Question: 30
Let X₁, X₂, …, Xₙ be a sequence of independently and identically distributed random variables with the probability density function
f(x) = (1/2)x²e⁻ˣ if x > 0, and 0 otherwise, and let
𝑆𝑛 = 𝑋1 + 𝑋2 + ⋯ + 𝑋𝑛
Then which of the following statement is/are TRUE?
A. (Sₙ − 3n)/√(3n) ∼ N(0,1) for all n ≥ 1
B. For all ε > 0, P(|Sₙ/n − 3| > ε) → 0 as n → ∞
C. Sₙ/n → 1 with probability 1

D. Both A and B
Question: 31
Let X, Y be i.i.d. Binomial(n, p) random variables. Which of the following are true?

A. 𝑋 + 𝑌 ∼ Bin (2𝑛, 𝑝)

B. (X, Y) ∼ Multinomial (2n; p, p)

C. Var (X − Y) = E(X − Y)2

D. option A and C are correct.

5.5 SUMMARY
The main points covered in this lesson are: what an estimator is; the consistency, efficiency and sufficiency of an estimator; and how to obtain the best estimator.
5.6 GLOSSARY
Motivation: These problems are very useful in real life, and we can use them in data science, economics as well as social science.
Attention: Think about how the Cramér-Rao inequality is useful in real-world problems.
5.7 ANSWER TO IN-TEXT QUESTIONS
Answer 1 : C
Explanation :
The marginal pdf of X₁ is g(x₁) = ∫₀^∞ (x₁e^(−x₁x₂)/2) dx₂ = 1/2, 1 < x₁ < 3
h(x₂ | x₁) = f(x₁, x₂)/g(x₁) = x₁e^(−x₁x₂), x₂ > 0
X₂ | X₁ ∼ Exp(x₁) with mean 1/x₁
Therefore Var(X₂ | X₁ = 2) = 1/4 = 0.25
Hence 0.25 is correct answer.
Answer 2 : A
Explanation :
T/n + T(T−1)/[n(n−1)] is the UMVUE of θ(1 + θ):
E(T/n + T(T−1)/[n(n−1)]) = θ(1 + θ), where T = Σᵢ₌₁ⁿ Xᵢ
Therefore, T/n + T(T−1)/[n(n−1)] = 3/6 + 3(3−1)/[6(6−1)] = 21/30 = 0.70

Hence 𝐴 is the correct answer.


Answer 3 : A
Explanation :

f_{X(1)}(x) = ne^(n(θ−x)) for x > θ, and 0 otherwise
P(X(1) − (2/n) logₑ5 ≤ θ ≤ X(1)) = P(θ ≤ X(1) ≤ θ + (2/n) logₑ5) = 0.96
Hence 0.96 is the correct answer.
Answer 4 : A
Explanation :
f is discontinuous at x = 2. So we consider the interval [0,1.5].
Clearly, f is continuous on [0,1.5]. By the fundamental theorem of calculus,
we have that F is differentiable on [0,1.5], and so is at 𝑥 = 1
with 𝐹 ′ (1) = 𝑓(1) = 𝑒 − 𝑒 = 0
Hence option A is the correct choice.
Answer 5 : B
Explanation :
Given that 𝑓 is continuous at (0,0). Then, lim(𝑥,𝑦)→(0,0) 𝑓(𝑥, 𝑦) = 𝑓(0,0) = 𝛼
Putting 𝑥 = 0 and taking 𝑦 → 0, we get
α = lim_{y→0} f(0, y) = lim_{y→0} sin(2y²)/y² = 2 lim_{y→0} sin(2y²)/(2y²) = 2
Hence option B is the correct choice.
Answer 6 : A
Explanation :
The exponential distribution is used to model the time elapsed between events.
Answer 7 : A
Explanation :
The first order partial derivatives of 𝑓 are given by
∂f/∂x = 2x + y − 1 and ∂f/∂y = x + 2y
For critical points, we equate the first-order partial derivatives to zero, and on solving the resulting equations, we get x = 2/3 and y = −1/3.
The second-order partial derivatives of f evaluated at (2/3, −1/3) are given by

r = [∂²f/∂x²] = 2, s = [∂²f/∂x∂y] = 1 and t = [∂²f/∂y²] = 2, each evaluated at (2/3, −1/3).
Since rt − s² = 3 > 0 and r > 0, we conclude that f has a local minimum at (2/3, −1/3).
Since there is only one critical point, we do not have any point of local maximum.
Hence option A is correct.
Answer 8 : A
Explanation :
Correlation measures the linear relationship between two random variables X and Y.
Correlation coefficient r = cov(X, Y)/(σ_X σ_Y)

Answer 9 : C
Explanation :
Let A be the coefficient matrix with rows (1, 1, 2), (2, 3, −1) and (4, 7, c).
The condition for no solution is that rank(A : B) ≠ rank(A). This would be satisfied if c + 7 = 0, which implies that c = −7.
Hence option C is correct.
Answer 10 : C
Explanation :
Since a sample multiple correlation coefficient lies between 0 and 1, i.e. 0 ≤ r₁.₂₃…ₙ ≤ 1, only option C satisfies this condition.
Hence option C is correct.
Answer 11 : D
Explanation :
We have
E(Xᵢ²) = V(Xᵢ) + (E(Xᵢ))² = θ² + (2θ)² = 5θ²; i = 1, 2, …, n
Then X₁²/5, X₂²/5, … is a sequence of i.i.d. random variables with E(X₁²/5) = θ². Using the WLLN,

we get (1/n) Σᵢ₌₁ⁿ (Xᵢ²/5) = (1/(5n)) Σᵢ₌₁ⁿ Xᵢ², which converges in probability to θ² as n → ∞; this implies that ((1/(5n)) Σᵢ₌₁ⁿ Xᵢ²)^(1/2) converges in probability to θ as n → ∞.
Thus ((1/(5n)) Σᵢ₌₁ⁿ Xᵢ²)^(1/2) is a consistent estimator for θ.
Hence option D is the correct choice.
Answer 12 : A
Explanation :
The trinomial distribution of two r.v.'s 𝑋 and 𝑌 is given by
f_{X,Y}(x, y) = [n!/(x! y! (n−x−y)!)] pˣ qʸ (1 − p − q)^(n−x−y)
for x, y = 0, 1, 2, …, n and x + y ≤ n, where p + q ≤ 1. Then Y | X = x ∼ B(n − x, q/(1−p)). Here n = 7, p = 0.6 and q = 0.2, so
E(Y | X = 3) = (7 − 3) × 0.2/(1 − 0.6) = 4 × 0.2/0.4 = 2
Hence A is the correct answer.
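As a quick numerical check of this result (an illustrative sketch, not part of the original solution; the seed and the number of replications are arbitrary choices), one can simulate the trinomial model and average Y over the draws with X = 3:

# Monte Carlo check that E(Y | X = 3) = 2 in the trinomial model above.
import numpy as np

rng = np.random.default_rng(0)
draws = rng.multinomial(7, [0.6, 0.2, 0.2], size=200_000)  # columns: X, Y, 7 - X - Y
x, y = draws[:, 0], draws[:, 1]
print(y[x == 3].mean())  # close to 2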
Answer 13 : A
Explanation :
f(x, θ) = 1/θ for 0 < x < θ, and 0 otherwise
E(X(n)) = nθ/(n + 1)
E[(1/θ)(X(n) + θ/(n+1)) | X₁ − X₂ = 5] = E[(1/θ)(X(n) + θ/(n+1))] = (1/θ)(nθ/(n+1) + θ/(n+1)) = 1
Hence correct answer is A.

Answer 14 : B
Explanation :
Mean X̄ = Σfᵢxᵢ/N.
If xᵢ′ = xᵢ − 100,

then X̄_new = Σfᵢ(xᵢ − 100)/N = X̄ − 100.
Answer 15 : B
Explanation :
The median is unaffected by extreme values.
Answer 16 : A
Explanation -
The lines of regression of Y on X and of X on Y cannot be parallel because both lines pass through (X̄, Ȳ).
Answer 17 : A
Explanation -
∑𝑛i=1 𝑓𝑖 (𝑥𝑖 − 𝐴)2 is minimum when A =𝑥‾
Answer 18 : D
Explanation -
As all sample values are the same, the sample is x, x, …, x (n times).
x̄ = (1/n) Σᵢ₌₁ⁿ xᵢ = (1/n) Σᵢ₌₁ⁿ x = (1/n) · nx = x
Hence x̄ = x, the common value.
Answer 19 : B
Explanation -
As we know that V(X) ≥ 0 , so non-negative
Answer 20 : A
Explanation -

X ∼ 𝑃(𝜆) then Mean (X) = 𝐸(X) = 𝜆


so 𝐸(X) = 1.412
Answer 21 : A
Explanation -
If X ∼ Bin(n, p), then M_X(t) = (q + peᵗ)ⁿ.
Here X ∼ Bin(2, 1/2), so p = 1/2, q = 1/2, n = 2:
M(t) = (1/2 + (1/2)eᵗ)² = ((1 + eᵗ)/2)²
Answer 22 : A
Explanation -
X ∼ Bin(2, 1/3)
E(e^(tX)) = (2/3 + (1/3)eᵗ)²
So E(e^(2X)) = (2/3 + (1/3)e²)² = ((2 + e²)/3)²

Answer 23 : A
Explanation -
X ∼ P(λ). Then E(X) = λ and V(X) = λ.
Since V(X) = E(X²) − (E(X))²,
E(X²) = V(X) + (E(X))² = λ + λ² = λ(λ + 1)

Answer 24 : A
Explanation -

𝑥 ∼ 𝑃(𝜆)
Then 𝐸(𝑥) =𝜆
𝑉(𝑥) =λ
So mean = variance
Answer 25 : A
Explanation -
𝑎: 𝑏: 𝑐
3: 4: 7
3𝑥: 4𝑥: 7𝑥 ⇒ 14𝑥
∴ 𝑎 + 𝑏 + 𝑐 = 14𝑥
𝑐 = 7𝑥
∴ (𝑎 + 𝑏 + 𝑐): 𝑐
= 14𝑥: 7𝑥
= 2: 1
Answer 26 : C
Explanation :
Semi inter-quartile deviation = Q.D. = (𝑄3 − 𝑄1 )/2
Answer 27 : D
Explanation:
Let X be a random variable with M_X(t) = E(e^(tX)) = Σ e^(tx) P(X = x).
Then P(X = x) = 1/6 for x = 0; 1/3 for x = 1; 1/3 for x = 2; 1/6 for x = 3.
P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 1/6 + 1/3 + 1/3 = 5/6

Answer 28 : A
Explanation:
X₁, X₂, …, Xₙ is a random sample from U(θ − 1/2, θ + 1/2), so
f(x) = 1 for θ − 1/2 < xᵢ < θ + 1/2.
The likelihood is maximized by any θ̂ ∈ [X(n) − 1/2, X(1) + 1/2], so every
θ̂ = λ(X(n) − 1/2) + (1 − λ)(X(1) + 1/2), 0 ≤ λ ≤ 1,
is an MLE of θ. Taking λ = 1/2 gives T₁ = (1/2)(X(1) + X(n)), and taking λ = 1/4 gives T₂ = (1/4)(3X(1) + X(n) + 1), so T₁ and T₂ are MLEs. Writing T₃ = (3/2)X(n) − (1/2)X(1) − 1 in the same form would require λ = 3/2 ∉ [0, 1], so T₃ is not an MLE.
Hence option (a) is correct.


Answer 29 : D
Explanation:
Let X and Y be random variables having the joint probability density function f(x, y) = k/[(1 + x²)(1 + y²)]; −∞ < x, y < ∞.
∫₋∞^∞ ∫₋∞^∞ f(x, y) dx dy = 1 ⇒ k = 1/π²
Since X and Y are independent, X ∼ f(x) = (1/π) · 1/(1 + x²); −∞ < x < ∞.
P(X = Y) = 0, since the region corresponding to X = Y has zero area, so the probability assigned to it is zero.

Answer 30 : D
Explanation:
Clearly, 𝑋1 , 𝑋2 , … , 𝑋𝑛
are i.i.d 𝐺(3,1) random variables. Then, 𝐸(𝑋𝑖 ) = 3 and Var (𝑋𝑖 ) = 3, 𝑖 = 1,2, …
Let 𝑆𝑛 = 𝑋1 + 𝑋2 + ⋯ + 𝑋𝑛 , then E(𝑆𝑛 ) = 3𝑛 and Var (𝑆𝑛 ) = 3𝑛

Now for option (a):
Using the CLT, (Sₙ − 3n)/√(3n) ∼ N(0,1) for large n.

For option (b):
lim_{n→∞} E(Sₙ/n) = lim_{n→∞} 3n/n = 3; lim_{n→∞} V(Sₙ/n) = lim_{n→∞} 3n/n² = 0
By the convergence-in-probability (consistency) condition,
for all ε > 0, P(|Sₙ/n − 3| > ε) → 0 as n → ∞.
For option (c):
Sₙ/n → 3 with probability 1 (by the strong law of large numbers), so option (c), which claims the limit is 1, is false.

For option (d):
lim_{n→∞} P[(Sₙ − E(Sₙ))/√Var(Sₙ) ≥ (3(n − √n) − E(Sₙ))/√Var(Sₙ)] = P(Z ≥ −√3) = 1 − P(Z ≤ −√3)
= 1 − Φ(−√3) ≥ 1/2
Answer 31 : D
Explanation:
(A) The sum of independent binomial variates is also a binomial variate when the success probabilities are the same.
Then X + Y ∼ Bin(2n, p).

(B) When more than two categories are involved, the observations lead to a multinomial distribution; however, (X, Y) does not follow Multinomial(2n; p, p).
(C) Var(X − Y) = E(X − Y)² − {E(X − Y)}² = E(X − Y)², since E(X − Y) = 0.

(D) Cov(X + Y, X − Y) = V(X) − Cov(X, Y) + Cov(Y, X) − V(Y) = 0
{∵ X and Y are i.i.d., so Cov(X, Y) = Cov(Y, X) = 0 and V(X) = V(Y)}

Hence option D is correct.


5.8 REFERENCES
• Devore, J. (2012). Probability and statistics for engineers, 8th ed. Cengage Learning.
• John A. Rice (2007). Mathematical Statistics and Data Analysis, 3rd ed. Thomson
Brooks/Cole
• Larsen, R., Marx, M. (2011). An introduction to mathematical statistics and its
applications. Prentice Hall.
• Miller, I., Miller, M. (2017). J. Freund’s mathematical statistics with applications, 8th
ed. Pearson.
• Kantarelis, D. and Asadoorian, M. O. (2009). Essentials of Inferential Statistics, 5th ed. University Press of America.
• Hogg, R., Tanis, E., Zimmerman, D. (2021). Probability and Statistical Inference, 10th ed. Pearson.
5.9 SUGGESTED READINGS
• S.C. Gupta, V.K. Kapoor, Fundamentals of Mathematical Statistics, Sultan Chand Publication, 11th Edition.
• B.L. Agarwal, Programmed Statistics, New Age International Publishers, 2nd Edition.

LESSON 6
INTERVAL ESTIMATION

STRUCTURE
6.1 Learning Objectives
6.2 Introduction
6.3 Interval Estimation
6.3.1 One Sided and Two Sided Confidence interval
6.3.2 Central and Non-Central Confidence Interval
6.3.3 Pivotal Method to find the Confidence interval
6.4 In-Text Questions
6.5 Summary
6.6 Glossary
6.7 Answer to In-Text Questions
6.8 References
6.9 Suggested Readings
6.1 LEARNING OBJECTIVES
In statistics, interval estimation is the evaluation of a parameter (for example, the mean of a population) by computing an interval, or range of values, within which the parameter is most likely to be located. Intervals are commonly chosen so that the parameter falls within them with 95 or 99 percent probability, called the confidence coefficient. Hence, the intervals are called confidence intervals; the end points of such an interval are called the upper and lower confidence limits.
The interval containing a population parameter is established by calculating that statistic from
values measured on a random sample taken from the population and by applying the
knowledge (derived from probability theory) of the fidelity with which the properties of a
sample represent those of the entire population.
The probability tells what percentage of the time the assignment of the interval will be correct
but not what the chances are that it is true for any given sample. Of the intervals computed
from many samples, a certain percentage will contain the true value of the parameter being
sought.

In newspaper stories during election years, confidence intervals are expressed as proportions
or percentages. For instance, a survey for a specific presidential contender may indicate that
they are within three percentage points of 40% of the vote (if the sample is large enough). The
pollsters would be 95% certain that the actual percentage of voters who supported the candidate
would be between 37% and 43% because election polls are frequently computed with a 95%
of the confidence level.
Stock market investors are most interested in knowing the actual percentage of equities that
rise and fall each week. The percentage of American households with personal computers is
relevant to companies selling computers. Confidence intervals may be established for the
weekly percentage change in stock prices and the percentage of American homes with personal
computers.
In data analysis, calculating the confidence interval is a typical step that may be easily carried out for normally distributed data using the well-known x̄ ± ts/√n formula. The confidence interval, however, is not always easy to determine when working with data that is not normally distributed; references for such data are fewer and less easily available in the literature.
6.2 INTRODUCTION
Let 𝑇 be an estimator whose value 𝑡 is a point estimate of some unknown parameter 𝜃.
Even if the estimator 𝑇 satisfies the desirable properties of point estimators, it is clear that 𝑡
ordinarily will not be equal to the value 𝜃 because of sampling error. Thus, it becomes
necessary to indicate the general magnitude of this error. This allows us to estimate the
unknown parameter 𝜃 from the observed sample values, as follows.
𝜃 = 𝑡 ± error
Then our problem is to know, "What is the magnitude of this error?" and "How sure are we
that we are right?"
If sampling error is small the true value is likely to be covered by a small, estimated range of
values. On the other hand, if sampling error is large, the true value is likely to be covered by a
large estimated range of values. The answer to this problem is interval estimation based on
confidence intervals.
Method for Confidence Interval :
The method of interval estimation using the concept of confidence intervals can better be
explained as given below :
Let the random variable 𝑋 have a probability (mass/density) function 𝑓(𝑥, 𝜃) with the
parameter 𝜃 which we wish to estimate by means of a random sample of observations
𝑥1 , 𝑥2 , … , 𝑥𝑛 .

Let δ < 1 be an arbitrary positive number, and let T₁ = T₁(x₁, x₂, …, xₙ) and T₂ = T₂(x₁, x₂, …, xₙ) be two statistics (i.e., functions of the sample values) such that
𝑃[𝑇1 ≤ 𝜃 ≤ 𝑇2 ] = 𝛿
where 𝑃[𝑇1 ≤ 𝜃 ≤ 𝑇2 ] = 𝛿 denote that the two statistics 𝑇1 and 𝑇2 will contain the true value
of the parameter 𝜃 in its interior with a given probability 𝛿. Then the interval from 𝑇1 to 𝑇2 is
called a confidence interval for 𝜃 with the confidence coefficient 𝛿 where the values 𝑇1 and
𝑇2 are referred to as lower and upper confidence limits for 𝜃.
The expression 1 − δ, generally denoted by α, is called the level of significance, while δ itself is the confidence level:
1 − δ = α ⇒ δ = 1 − α
In brief, an estimated range 𝑇2 − 𝑇1 = 𝑆(𝑋), say, obtained from sample values containing the
true value of the parameter with a given probability 𝛿 is known as 100𝛿% or 100(1 − 𝛼)%
confidence interval for 𝜃 such that 𝑃[𝑆(𝑋) contain 𝜃] = 1 − 𝛼 = 𝛿.
In general, to estimate 𝑇(𝜃) a function of 𝜃, we obtain two quantities (based on sample values)
𝑇1 and 𝑇2 such that.
𝑃(𝑇1 ≤ 𝑇(𝜃) ≤ 𝑇2 ) = 𝛿 or 1 − 𝛼
Note 1. 𝑆(𝑋) covers the true parameter value 𝑇(𝜃) with probability ≥ (1 − 𝛼). The sign of
equality holds in case of continuous random variable.
Note 2. Since θ, the true value of the parameter, is not a random variable, we do not read P[T₁ ≤ θ ≤ T₂] = δ as "the probability that θ lies between T₁ and T₂ is δ". Rather, by P[T₁ ≤ θ ≤ T₂] we mean simply that the sample will be such that T₁ computed from that sample will be less than or equal to θ and T₂ will be greater than or equal to θ.
Note 3. In other words, 100𝛿% confidence interval may be explained as given below :
If samples are drawn repeatedly under identical conditions, and if 95% say, confidence interval
were computed for each of these samples, then in the long run 95% of these confidence
intervals would include the true value 𝜃.
6.3 INTERVAL ESTIMATION
We will discuss these topics in details
(i) One Sided and Two Sided Confidence Intervals
(ii) Central and Non-Central Confidence Intervals
(iii) Pivotal Method to find the Confidence Interval

6.3.1 One Sided and Two Sided Confidence Intervals :


Let 𝑥1 , 𝑥2 , … . . , 𝑥𝑛 be a random sample of size 𝑛 from the pmf/pdf, 𝑓(𝑥, 𝜃). Let 𝑇1 =
𝑡1 (𝑥1 , 𝑥2 , … . , 𝑥𝑛 ) be a statistic for which.
𝑃[𝑇1 < 𝜏(𝜃)] = 1 − 𝛼1
or 𝑃[𝑇1 ≤ 𝜏(𝜃) < ∞] = 1 − 𝛼1 .

Then T₁ is called a one-sided lower confidence limit for τ(θ), and the interval (T₁, ∞) is known as a one-sided confidence interval on the right-hand tail with confidence coefficient (1 − α₁), as shown in Fig. 6.1 [for X ∼ N(μ, σ²)].
Similarly, let T₂ = t₂(x₁, x₂, …, xₙ) be another statistic for which
𝑃[𝜏(𝜃) < 𝑇2 ] = 1 − 𝛼2
or 𝑃[−∞ < 𝜏(𝜃) ≤ 𝑇2 ] = 1 − 𝛼2 ,
then T₂ is called a one-sided upper confidence limit for τ(θ), and the interval (−∞, T₂] is known as a one-sided confidence interval on the left-hand tail with confidence coefficient 1 − α₂ [for X ∼ N(μ, σ²)].

In brief, when we delete the area on either left side or right side only, then the confidence
interval is said to be one sided.
Combining the two one sided confidence intervals we get two sided confidence interval (or
simply confidence interval) where 𝛼 = 𝛼1 + 𝛼2
Here [𝑇1 , 𝑇2 ] is 100(1 − 𝛼) percent confidence interval for 𝜏(𝜃) as shown
below (for 𝑋∼ 𝑁(𝜇, 𝜎 2 )] :
6.3.2: Central and Noncentral Confidence Intervals :
The interval 𝑇1 to 𝑇2 (in case of two sided confidence interval) can be obtained in various ways
:

We can construct different confidence intervals with the same confidence coefficient by
deleting areas equal to 𝛼2 at the right end and 𝛼1 at the left end of the curve such that 𝛼1 +
𝛼2 = 𝛼, say and 𝛿 = 1 − 𝛼.
When we take α₁ = α₂ = α/2, the confidence interval is often called a central confidence interval.
When α₁ ≠ α₂, (T₁, T₂) is a non-central confidence interval.
The length of the confidence Interval
The difference T₂ − T₁ is known as the length of the confidence interval, where P[T₁ ≤ τ(θ) ≤ T₂] = 1 − α.
We prefer that interval for which the length is shortest. The confidence interval with shortest
length is known as Shortest Confidence Interval.
6.3.3 Pivotal Method to find the Confidence interval:
Pivotal Quantity:
Let 𝑥1 , 𝑥2 , … . . , 𝑥𝑛 , be a random sample from pmf/pdf 𝑓(𝑥, 𝜃). Let 𝑄 = 𝑞(𝑥1 , 𝑥2 , … , 𝑥𝑛 , 𝜃) be
a function of 𝑥1 , 𝑥2 , … , 𝑥𝑛 and 𝜃. If the distribution of 𝑄 does not depend on 𝜃, then 𝑄 is
called a pivot or Pivotal Quantity.
For example,
Let X ∼ N(θ, σ² = 9). Then
(i) Q = x̄ − θ is a pivotal quantity, since the distribution of Q = x̄ − θ ∼ N(0, 9/n) is independent of θ.
(ii) Q = (x̄ − θ)/(3/√n) is a pivotal quantity, since the distribution of Q ∼ N(0, 1) is independent of θ.
(iii) Q = x̄/θ is not a pivotal quantity, because the distribution of Q = x̄/θ ∼ N(1, 9/(nθ²)) depends on θ.
Pivotal Method
Let a random variable which is a function of 𝑥1 , 𝑥2 … … , 𝑥𝑛 and 𝜃 denoted by 𝑄 or 𝜓(𝑇, 𝜃) and
whose distribution is independent of 𝜃 be taken as a pivot ; For each 𝜃, 𝜓(𝑇, 𝜃) is a statistic
and 𝑇 is a point estimate of 𝜃. For any fixed 0 < 𝛿 < 1, there will exist 𝑞1 and 𝑞2 depending
on 𝛿 such that.
𝑃[𝑞1 < 𝑄 < 𝑞2 ] = 𝛿
and for each possible sample values 𝑥1 , 𝑥2 , … . , 𝑥𝑛

𝑃[𝑞1 < 𝑄 < 𝑞2 ] = 𝛿


⇒ 𝑃[𝑡1 < 𝜃 < 𝑡2 ] = 𝛿

for functions 𝑡1 and 𝑡2 not depending on 𝜃, then [𝑡1 , 𝑡2 ] is a 100 𝛿% confidence interval of
𝜏(𝜃) (or in particular 𝜃 ).
When the lower and upper confidence limits in a 100𝛿% confidence interval for
𝜃, 𝑃(𝑡1 ≤ 𝜃 ≤ 𝑡2 ) = 𝛿 depend on the point estimate 𝑇 for 𝜃 and also on the sampling
distribution of the pivotal quantity (or the statistic 𝑇 itself), then the method of obtaining the
confidence interval is often called a General Method or Pivotal Method.
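Before turning to the worked examples, a short simulation may make the pivotal method concrete. The following sketch (an illustration added here, not part of the original text) uses the pivot Q = (x̄ − θ)/(σ/√n) for the N(θ, σ² = 9) example above and checks empirically that the resulting interval covers the true θ in about 95% of repeated samples; the values θ = 10, n = 25, the seed and the number of replications are arbitrary choices.

# Empirical coverage of the pivotal 95% confidence interval for the mean of N(theta, 9).
import numpy as np

rng = np.random.default_rng(42)
theta, sigma, n, reps = 10.0, 3.0, 25, 20_000   # hypothetical values
z = 1.96                                        # z_{alpha/2} for alpha = 0.05
half_width = z * sigma / np.sqrt(n)

samples = rng.normal(theta, sigma, size=(reps, n))
xbar = samples.mean(axis=1)
covered = (xbar - half_width <= theta) & (theta <= xbar + half_width)
print(covered.mean())   # close to 0.95

Note that θ is fixed throughout; it is the interval, recomputed from each sample, that varies, exactly as Note 3 of Section 6.2 describes.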
Example 1. Obtain an expression for the 100(1 − α)% confidence interval for the mean when the variance is known, in the case of N(μ, σ²).
Solution. Suppose a random sample x₁, x₂, …, xₙ is drawn from a normal population with mean μ and variance σ². Since the most efficient point estimator for the population mean μ is the sample mean X̄, we can establish a confidence interval for μ by considering the sampling distribution of X̄.
We know that if X ∼ N(μ, σ²), then X̄ ∼ N(μ, σ²/n), and the corresponding standard normal variate (i.e., pivotal quantity) is
Z = (X̄ − μ)/(σ/√n) ∼ N(0, 1)
i.e. f(z) = (1/√(2π)) e^(−z²/2), −∞ < z < ∞
Let z_{α/2} be the value of Z such that
P(Z ≥ z_{α/2}) = ∫_{z_{α/2}}^∞ f(z) dz = ∫_{z_{α/2}}^∞ (1/√(2π)) e^(−z²/2) dz = α/2,
and let z_{1−α/2} = −z_{α/2} be the value of Z such that
P(Z ≤ −z_{α/2}) = ∫_{−∞}^{−z_{α/2}} f(z) dz = ∫_{−∞}^{−z_{α/2}} (1/√(2π)) e^(−z²/2) dz = α/2.
Then, clearly, we have (see Fig. 6.4) P(−z_{α/2} ≤ Z ≤ z_{α/2}) = 1 − α = δ, say
⇒ P(−z_{α/2} ≤ (X̄ − μ)/(σ/√n) ≤ z_{α/2}) = 1 − α
⇒ P(−z_{α/2} σ/√n ≤ X̄ − μ ≤ z_{α/2} σ/√n) = 1 − α
[multiplying each term in the inequality by σ/√n]

or
P(X̄ − z_{α/2} σ/√n ≤ μ ≤ X̄ + z_{α/2} σ/√n) = 1 − α
[subtracting X̄ from each term in the inequality and multiplying by −1]
Thus, the (1 − α)100% confidence interval for μ in a normal population when σ² is known is
X̄ − z_{α/2} σ/√n ≤ μ ≤ X̄ + z_{α/2} σ/√n
where X̄ is the mean of a random sample of size n from a normal population with mean μ and known variance σ², and z_{α/2} is the value of the standard normal variate having an area of α/2 to its right.

Example 2. Obtain the 95% confidence interval for the mean of a normal distribution N(μ, σ²) where the variance σ² is known. What is the length of this confidence interval?

Solution. Given X ∼ N(μ, σ²).
Let X₁, X₂, …, Xₙ be a random sample with x̄ = (1/n) ΣXᵢ. Then
x̄ ∼ N(μ, σ²/n)
and Z = (x̄ − μ)/(σ/√n) ∼ N(0, 1), with p.d.f. φ(z), say.
Then the (1 − α)% central confidence interval for μ is obtained as follows:


P(−z_{α/2} < Z ≤ z_{α/2}) = ∫_{−z_{α/2}}^{z_{α/2}} φ(z) dz = 1 − α
⇒ P(−z_{α/2} ≤ (x̄ − μ)/(σ/√n) ≤ z_{α/2}) = 1 − α
⇒ P(x̄ − z_{α/2} σ/√n ≤ μ ≤ x̄ + z_{α/2} σ/√n) = 1 − α
Here 1 − α = 0.95, α = 0.05, α/2 = 0.05/2 = 0.025.
From the tables of the normal probability integral we find that
z_{α/2} = 1.96, i.e. ∫_{−1.96}^{1.96} φ(z) dz = 0.95.

Hence the required confidence interval is given by
P[x̄ − 1.96 σ/√n ≤ μ ≤ x̄ + 1.96 σ/√n] = 0.95,
i.e. (x̄ − 1.96 σ/√n, x̄ + 1.96 σ/√n) is a confidence interval for μ with confidence coefficient 0.95. The length of this confidence interval is
(x̄ + 1.96 σ/√n) − (x̄ − 1.96 σ/√n) = 3.92 σ/√n.
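The same computation can be carried out numerically. The sketch below (illustrative; the sample summary values are hypothetical) obtains the 95% interval and its length 3.92σ/√n using scipy.stats.norm:

# 95% CI for mu when sigma is known: xbar ± z_{0.025} * sigma / sqrt(n).
from math import sqrt
from scipy.stats import norm

xbar, sigma, n = 52.0, 4.0, 36           # hypothetical sample summary
z = norm.ppf(1 - 0.05 / 2)               # 1.9599..., i.e. the familiar 1.96
lo, hi = xbar - z * sigma / sqrt(n), xbar + z * sigma / sqrt(n)
print((lo, hi), hi - lo)                 # the length equals 3.92 * sigma / sqrt(n)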
Example 3. Construct a non-central 95% confidence interval for μ in N(μ, 1) with sample size n = 16. (You may take α₁ = 0.02 in the left tail and α₂ = 0.03 in the right tail.)
Solution. Given α₁ = 0.02, α₂ = 0.03.
Let α = 0.02 + 0.03 = 0.05, so 1 − α = 0.95 = δ.
From the normal probability tables,
P(Z ≤ −2.05) = 0.02, P(Z ≥ 1.88) = 0.03
∴ P[−2.05 ≤ Z ≤ 1.88] = 0.95
⇒ P[−2.05 ≤ (x̄ − μ)/(σ/√n) ≤ 1.88] = 0.95

⇒ P[−2.05 σ/√n ≤ x̄ − μ ≤ 1.88 σ/√n] = 0.95
⇒ P[−2.05/4 ≤ x̄ − μ ≤ 1.88/4] = 0.95  {∵ σ = 1, n = 16, √n = 4}
⇒ P[−0.5125 < x̄ − μ < 0.47] = 0.95
⇒ P[−x̄ − 0.5125 < −μ < −x̄ + 0.47] = 0.95
⇒ P[x̄ − 0.47 < μ < x̄ + 0.5125] = 0.95
Hence (x̄ − 0.47, x̄ + 0.5125) is the required non-central 95% confidence interval for μ.
The length of this confidence interval is
(x̄ + 0.5125) − (x̄ − 0.47) = 0.9825.
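For a non-central interval, the two cutoffs come from unequal tail areas, which the normal quantile function gives directly. A minimal sketch under the same α₁ = 0.02, α₂ = 0.03 split (assuming scipy is available):

# Cutoffs for a non-central 95% interval: 2% in the left tail, 3% in the right.
from scipy.stats import norm

z_lo = norm.ppf(0.02)        # about -2.05
z_hi = norm.ppf(1 - 0.03)    # about  1.88
print(z_lo, z_hi, norm.cdf(z_hi) - norm.cdf(z_lo))  # mass in between = 0.95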

Example 4. Obtain an expression for the confidence interval for the mean when the variance is not known, in the case of N(μ, σ²).
Solution. Case 1. Sample size n > 30, variance unknown. In this case we estimate the variance σ² by
S² = (1/n) Σ(xᵢ − x̄)²; then
Z = (x̄ − μ)/(S/√n) ∼ N(0, 1)
and the (1 − α)100% confidence interval for μ is given by
P[−z_{α/2} ≤ (x̄ − μ)/(S/√n) ≤ z_{α/2}] = 1 − α
⇒ P[x̄ − z_{α/2} S/√n ≤ μ ≤ x̄ + z_{α/2} S/√n] = 1 − α.
Case 2. Sample size is small, 𝑛 ≤ 30, variance is unknown.
In this case we estimate the variance σ² by
s² = (1/(n−1)) Σ(xᵢ − x̄)²; then the statistic (i.e., pivotal quantity)
t = (x̄ − μ)/(s/√n) ∼ Student's t with (n − 1) d.f.
and the (1 − α)100% confidence interval for μ is given by

P[−t_{α/2} ≤ t ≤ t_{α/2}] = ∫_{−t_{α/2}}^{t_{α/2}} f_T(t) dt = 1 − α
⇒ P[−t_{α/2} ≤ (x̄ − μ)/(s/√n) ≤ t_{α/2}] = 1 − α
⇒ P[x̄ − t_{α/2} · s/√n ≤ μ ≤ x̄ + t_{α/2} · s/√n] = 1 − α
Hence the (1 − α)100% confidence interval for μ is (x̄ − t_{α/2} s/√n, x̄ + t_{α/2} s/√n).

Example 5. A sample of 10 individuals has a mean of 53, and the sum of the squares of deviations from the mean is 81. Find the 90% confidence interval for the mean μ, assuming that the population is normal with unknown variance.
Solution. The (1 − α)100% confidence interval for μ is given by
P{x̄ − t_{α/2} s/√n ≤ μ ≤ x̄ + t_{α/2} s/√n} = 1 − α.
Here 1 − α = 0.90, α = 0.10, α/2 = 0.05, n = 10,
x̄ = 53, Σ(xᵢ − x̄)² = 81
s² = (1/(n−1)) Σᵢ₌₁ⁿ (xᵢ − x̄)² = 81/9 = 9, so s = 3.
The value of t_{α/2} (i.e., t₀.₀₅) for 9 d.f. is 1.833 (using the table of t values).
Hence the 90% confidence interval for μ is
53 − 1.833 × 3/√10 ≤ μ ≤ 53 + 1.833 × 3/√10
i.e. 51.26 ≤ μ ≤ 54.74, taking √10 = 3.162.
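A sketch reproducing this computation with scipy.stats.t (the numbers are exactly those of Example 5; scipy is assumed to be available):

# 90% t-interval for mu with n = 10, xbar = 53, sum of squared deviations = 81.
from math import sqrt
from scipy.stats import t

n, xbar, ss = 10, 53.0, 81.0
s = sqrt(ss / (n - 1))                    # s = 3
t_crit = t.ppf(1 - 0.10 / 2, df=n - 1)    # t_{0.05}(9) ≈ 1.833
half = t_crit * s / sqrt(n)
print(xbar - half, xbar + half)           # about (51.26, 54.74)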

Example 6. Construct a 90% confidence interval for σ² on the basis of a random sample of size 10 with standard deviation s = √[(1/(n−1)) Σ(xᵢ − x̄)²] = 3.2, in the case of N(μ, σ²), μ being unknown.
Solution. The (1 − 𝛼)100% confidence interval for 𝜎 2 based on 𝜒 2 -statistic is given by

P((n − 1)s²/χ²_{α/2} ≤ σ² ≤ (n − 1)s²/χ²_{1−α/2}) = 1 − α
Here n = 10, s² = (3.2)² = 10.24,
1 − α = 0.90, α = 0.10, α/2 = 0.05,
(n − 1)s² = (10 − 1) × 10.24 = 9 × 10.24 = 92.16,
χ²₀.₀₅(9) = 16.912 and χ²₀.₉₅(9) = 3.325, from the table of χ² values.

Hence the 90% confidence interval for σ² is given by
P(92.16/16.912 ≤ σ² ≤ 92.16/3.325) = 0.90
⇒ P(5.449 ≤ σ² ≤ 27.717) = 0.90.
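The corresponding computation with scipy.stats.chi2 is sketched below using Example 6's numbers; note that χ²₀.₀₅(9) in the text denotes the upper 5% point, i.e. chi2.ppf(0.95, 9):

# 90% CI for sigma^2: ((n-1)s^2 / chi2_upper, (n-1)s^2 / chi2_lower).
from scipy.stats import chi2

n, s = 10, 3.2
ss = (n - 1) * s**2                   # 92.16
upper = chi2.ppf(0.95, df=n - 1)      # ≈ 16.92 (upper 5% point)
lower = chi2.ppf(0.05, df=n - 1)      # ≈ 3.325
print(ss / upper, ss / lower)         # about (5.45, 27.72)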
Objective Type Questions
1 Write True or False:
(i) A 100% confidence interval for the mean μ in the case of N(μ, σ²), where σ² is known, is
X̄ − ∞ · σ/√n ≤ μ ≤ X̄ + ∞ · σ/√n, i.e. −∞ ≤ μ ≤ ∞,
and a 0% confidence interval is simply X̄. (True)
(ii) In the statement P[T₁ ≤ θ ≤ T₂] = δ, δ is defined as the confidence coefficient. (True)
(iii) If σ² is known in a normal population, the 99% confidence interval for the mean μ is x̄ ± 2.58 σ/√n. (True)

(iv) If a random sample of size n is drawn from N(μ, σ²) with unknown μ and σ², then the 100(1 − α)% confidence interval for σ² will be
ns²/χ²ₙ₋₁(α/2) ≤ σ² ≤ ns²/χ²ₙ₋₁(1 − α/2)
(True)

2 Fill in the blanks:


(i) Let X₁, X₂, …, Xₙ be a random sample of size n from a normal population with mean μ and known variance σ². In order to obtain the confidence interval for μ, the relevant statistic to be used is ......
[Ans. (x̄ − μ)/(σ/√n)]
(ii) The 95% confidence interval for λ, based on a large random sample from a Poisson distribution
f(x, λ) = e^(−λ) λˣ/x!, x = 0, 1, 2, ……
is (approximately) ......
[Ans. x̄ ± 1.96 √(x̄/n)]

(iii) In the confidence interval P(t₁ ≤ θ ≤ t₂) = 1 − α, (1 − α) is called ......
[Ans. confidence coefficient]
3. In each of the following questions four alternative answers are given in which only
one is correct. Select the correct answer and write (a), (b), (c) or (d) accordingly :
(i) If the confidence limits for μ, the mean of a normal population with variance σ², be X̄ ∓ 2.58 σ/√n, then the corresponding confidence coefficient is

(a) 0.01
(b) 0.001
(c) 0.99
(d) none of these
[Ans. (c)]
(ii) Assume that you take a random sample from 𝑁(𝜇, 𝜎 2 ) and calculate 𝑥‾ as 100 .
You then calculate the upper limit of a 90 percent confidence interval for 𝜇; its
value is 112 . The lower limit for the confidence interval will be :
(a) 88
(b) 92
(c) 100
(d) 124

150 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

[Ans. (a)]
(iii) A sample of 400 units has mean 4.6 and standard deviation 3.42. If the population is normal with unknown mean μ and unknown σ, the 99.73% confidence interval for μ is
(a) 4.6 ∓ 0.413
(b) 4.6 ∓ 1.71
(c) 2.3 ∓ 0.413
(d) 2.3 ∓ 1.71
[Ans. (a)]

(iv) In a town, a sample of 100 voters contained 64 persons who favoured a particular issue. We can be 95% confident that the proportion of voters in the community who favour the issue is contained between the limits of:
(a) 0.50 and 0.70
(b) 0.50 and 0.74
(c) 0.55 and 0.73
(d) 0.55 and 0.70
{Ans. (c)}
(v) A fair coin is tossed repeatedly. If 𝑛 tosses are required in order that the
probability will be at least 0.90 for the proportion of heads to lie between 0.4
and 0.6 then the value of 𝑛 is at least
(a) 200
(b) 220
(c) 250
(d) 280
[Ans. (c)]
(vi) The 95% confidence limits for 𝜇, when sample is drawn from the population
𝑁(𝜇, 𝜎 2 ), 𝜎 2 is known, are given by
(a) −1.96 ≤ (x̄ − μ)/(σ/√n) ≤ 1.96
(b) P[−z_{α/2} ≤ (x̄ − μ)/(σ/√n) ≤ z_{α/2}] = 0.95 = 1 − α
(c) x̄ ± 1.96 σ/√n
(d) all the above


[Ans. (d)]
(vii) For a fixed confidence coefficient, confidence interval for the parameter 𝜃 is
one
(a) with shortest width
(b) with largest width
(c) with an average width
(d) none of the above
[Ans. (a)]

6.4 IN-TEXT QUESTIONS


MCQ’s Problems
Question 1. The formula for directly calculating the mean X̄ of an individual series (assume the frequency for each data point is one) is
A. X̄ = ΣX/N
B. X̄ = ΣfX/N
C. X̄ = A + Σdx/n, where dx = X − A
D. X̄ = A + Σdx/N, where dx = X − A
Question 2. The formula for calculating the mean of an individual series by the short-cut method (assume the frequency for each data point is one) is:
A. X̄ = ΣX/N
B. X̄ = ΣfX/N
C. X̄ = A + Σdx/N
D. X̄ = A + Σf dx/N
Question 3. The variance of first 𝑛 natural numbers is:
A. (𝑛2 + 1)/12
B. (𝑛 + 1)2 /12
C. (𝑛2 − 1)/12
D. (2𝑛2 − 1)/12
Question 4. If a random variable 𝑋 has mean 3 and standard deviation 5, then the variance of
a variable 𝑌 = 2𝑋 − 5 is:
A. 45

B. 100
C. 15
D. 40
Question 5. For any two events 𝐴 and 𝐵 then 𝑃(𝐴 − 𝐵) is equal to:
A. 𝑃(𝐴) − 𝑃(𝐵)
B. 𝑃(𝐵) − 𝑃(𝐴)
C. 𝑃(𝐵) − 𝑃(𝐴𝐵)
D. 𝑃(𝐴) − 𝑃(𝐴𝐵)

Question 6. If an event 𝐵 has occurred and it is known that 𝑃(𝐵) = 1, the conditional
probability 𝑃(𝐴/𝐵) is equal to:
A. 𝑃(𝐴)
B. 𝑃(𝐵)
C. one
D. zero

Question 7. If two random variables X and Y are independent, then


A. 𝐸(𝑋𝑌) = 1
B. 𝐸(𝑋𝑌) = 0
C. 𝐸(𝑋𝑌) = 𝐸(𝑋)𝐸(𝑌)
D. 𝐸(𝑋𝑌) = any constant value

Question 8. If 𝑋 and 𝑌 are two non-negative random variables such that 𝑋 ≤ 𝑌


A. 𝐸(𝑋) ≤ 𝐸(𝑌)
B. 𝐸(𝑋) ≥ 𝐸(𝑌).
C. 𝐸(𝑋) = 𝐸(𝑌)
D. none of the above
Question 9. If X and Y are two independent variables and their expected values are X̄ and Ȳ respectively, then
A. 𝐸{(𝑋 − 𝑋‾)(𝑌 − 𝑌‾)} = 0
B. 𝐸{(𝑋 − 𝑋‾)(𝑌 − 𝑌‾)} = 1


C. 𝐸{(𝑋 − 𝑋‾)(𝑌 − 𝑌‾)} = 𝐶 (constant)
D. all the above

Question 10. If 𝑋 is a random variable which can take only non-negative values, then
A. 𝐸(𝑋 2 ) = [𝐸(𝑋)]2
B. 𝐸(𝑋 2 ) ≥ [𝐸(𝑋)]2
C. 𝐸(𝑋 2 ) ≤ [𝐸(𝑋)]2
D. none of the above

Question 11. If X is a random variable having p.d.f. f(x), then E(X) is called:
A. arithmetic mean
B. geometric mean
C. harmonic mean
D. first quartile
Question 12. If X is a random variable and f(x) is its p.d.f., E(1/X) is used to find:

A. arithmetic mean
B. harmonic mean
C. geometric mean
D. first central moment

Question 13. If X is a random variable and its p.d.f. is f(x), E(log x) is used to find:
A. arithmetic mean
B. geometric mean
C. harmonic mean
D. logarithmic mean
Question 14. If X ∼ b(3, 1/2) and Y ∼ b(5, 1/2), the probability P(X + Y = 3) is:

A. 7/16

B. 7/32
C. 11/16
D. none of the above

Question 15. If X and Y are two Poisson variates such that X ∼ P(1) and Y ∼ P(2), then X + Y follows
A. P(1)
B. P(2)
C. P(4)
D. P(3)

Question 16. If 𝑿 ∼ 𝒃(𝒏, 𝒑), the distribution of 𝒀 = (𝒏 − 𝑿) is:


A. 𝑏(𝑛, 1)
B. 𝑏(𝑛, 𝑥)
C. 𝑏(𝑛, 𝑝)
D. 𝑏(𝑛, 𝑞)

Question 17. Student’s 𝑡 -distribution was given by:


A. G.W. Snedecor
B. R.A. Fisher
C. W.S. Gosset
D. none of the above

Question 18. Student's t-distribution curve is symmetrical about the mean; this means that:
A. odd order moments are zero
B. even order moments are zero
C. both (a) and (b)
D. none of (a) and (b)
Question 19. If 𝑋 ∼ 𝑁(0,1) and 𝑌 ∼ 𝜒 2 /𝑛, the distribution of the variate 𝑋/√𝑌 follows:
A. Cauchy’s distribution
B. Fisher’s 𝑡 -distribution
C. student’s 𝑡 -distribution
D. none of the above

Question 20. The relation between the mean and variance of 𝜒 2 with 𝑛 d.f. is:
A. mean = 2 variance
B. 2 mean = variance
C. mean = variance
D. none of the above
Question 21. Chi-square distribution curve in respect of symmetry is:
A. negatively skew
B. symmetrical
C. positively skew
D. any of the above

Question 22. The chi-square distribution curve with regard to bulginess is:


A. mesokurtic
B. leptokurtic
C. platykurtic
D. not definite

Question 23. If X and Y are distributed as χ² with d.f. n₁ and n₂, respectively, the distribution of the variate X/Y is:
A. β₁(n₁/2, n₂/2)
B. β₂(n₁/2, n₂/2)
C. 𝜒 2 with d.f. (𝑛1 − 𝑛2 )
D. none of the above
Question 24. If X ∼ χ²(n₁) and Y ∼ χ²(n₂), the distribution of the variate (X + Y) is:
A. 𝜒 2 (𝑛1 − 𝑛2 )
B. 𝜒 2 (𝑛1 𝑛2 )

C. 𝜒 2 (𝑛1 + 𝑛2 )
D. all the above

Question 25. A normal random variable has mean = 2 and variance = 4. Its fourth central
moment 𝜇4 will be:
A. 16
B. 64
C. 80
D. 48

Question 26. If a random variable 𝑋 has mean 3 and standard deviation 5, then, the variance
of the variable 𝑌 = 2𝑋 − 5 is,
A. 25
B. 45
C. 100
D. 50
Question 27. A variable X with moment generating function M_X(t) = (2/3 + (1/3)eᵗ) is distributed with mean and variance as:
A. mean = 2/3, variance = 2/9
B. mean = 1/3, variance = 2/9
C. mean = 1/3, variance = 2/3
D. mean = 2/3, variance = 1/9

Question 28. If a distribution has moment generating function 𝑀𝑋 (𝑡) = (2 − 𝑒 𝑡 )−3 , then the
distribution is:
A. geometric distribution

B. hypergeometric distribution
C. binomial distribution
D. negative binomial distribution
Question 29. If X is a standard normal variate, then (1/2)X² is a gamma variate with parameters:
A. 1, 1/2
B. 1/2, 1
C. 1/2, 1/2
D. 1, 1
Question 30. Given the joint probability density function of X and Y as f(x, y) = 4xy; 0 ≤ x ≤ 1, 0 ≤ y ≤ 1; = 0, otherwise, P(0 < x < 1/2; 1/2 ≤ y ≤ 1) is equal to
A. 1/4
B. 5/16
C. 3/16
D. 3/8
Question 31. The types of estimates are
A. point estimate
B. interval estimates
C. estimation of confidence region
D. all the above

Question 32. If an estimator Tₙ of a population parameter θ converges in probability to θ as n tends to infinity, it is said to be:
A. sufficient
B. efficient
C. consistent
D. unbiased

Question 33. The estimator ∑𝑋/𝑛 of population mean is:


A. an unbiased estimator
B. a consistent estimator

C. both (a) and (b)


D. neither (a) nor (b)
Question 34. If 𝑋1 , 𝑋2 , … , 𝑋𝑛 is a random sample from a population 𝑁(0, 𝜎 2 ), the sufficient
statistic for 𝜎 2 is:
A. ∑𝑋𝑖
B. Σ𝑋𝑖2
C. (Σ𝑋𝑖 )2
D. none of the above

Question 35. The bias of an estimator can be:


A. positive
B. negative
C. either positive or negative
D. always zero

Question 36. If X₁, X₂, …, Xₙ is a random sample from an infinite population, where S² = (1/n) Σᵢ(Xᵢ − X̄)², the unbiased estimator for the population variance σ² is:
A. (1/(n−1)) S²
B. (1/n) S²
C. ((n−1)/n) S²
D. (n/(n−1)) S²

Question 37. If the variance of an estimator attains the Cramér-Rao lower bound, the estimator is:

A. most efficient
B. sufficient
C. consistent
D. admissible

Question 38. Degrees of freedom for statistic- 𝜒 2 in case of contingency table of order
(2 × 2) is
A. 3
B. 4
C. 2
D. 1

Question 39. The relation 𝑟 = √𝑏𝑌𝑋 ⋅ 𝑏𝑋𝑌 is known as:


A. mean property of regression coefficients
B. fundamental property of regression coefficients
C. signature property of 𝑟
D. none of the above

Question 40. Standard error of the sample correlation coefficient 𝑟 based on 𝑛 paired values
is:
A. (1 + r²)/√n
B. (1 − r²)/n
C. (1 − r²)/√n

D. none of the above

Question 41. If Y = mX + 4 and X = 4Y + 5 are the lines of regression of Y on X and of X on Y respectively, then m lies between the values:


A. 0 and 1
B. 0 and 0.5
C. 0 and 0.25
D. none of the above

Question 42. Given the following results,


𝜇𝑋 = 9.2, 𝜇𝑌 = 16.5, 𝜎𝑋 = 2.1
𝜎𝑌 = 1.6 and 𝜌𝑋𝑌 = 0.84

the regression line of 𝑌 on 𝑋 is:


A. 𝑌 = 𝑋 + 7.3

B. 𝑌 = 0.64𝑋 + 10.612
C. 𝑌 = 0.4𝑋 + 12.82
D. none of the above

Question 43. If the two lines of regression in a bivariate distribution are 𝑋 + 9𝑌 = 7 and
𝑌 + 4𝑋 = 16 then 𝜎𝑋 : 𝜎𝑌 is:
A. 3: 2
B. 2: 3
C. 9: 4
D. 4: 9

Question 44. If a constant 5 is added to each observation of a set, the mean is:
A. increased by 5
B. decreased by 5
C. 5 times the original mean
D. not affected

Question 45. Which of the following relations among the location parameters does not hold?
A. Q₂ = median
B. P₅₀ = median
C. D₅ = median
D. D₆ = median

Question 46. For random variables X and Y, we have Var(X)=1, Var(Y)=4, and Var(2X-3Y)
=34, then the correlation between X and Y is:

A. 1/2
B. 1/4
C. 1/3
D. None of the above
Question 47. Let X and Y be independent uniform (0, 1) random variables. Define A=X+Y
and B=X-Y. Then,
A. A and B are independent random variables
B. A and B are uncorrelated random variables
C. A and B are both uniform (0,1) random variables.
D. None of these
6.5 SUMMARY
The main points covered in this lesson are what interval estimation is and how to obtain the best interval estimator.
6.6 GLOSSARY
Motivation: These problems are very useful in real life, and we can use them in data science, economics as well as social science.
Attention: Think about how interval estimation is useful in real-world problems.
6.7 ANSWER TO IN-TEXT QUESTIONS
Answer 1 : A
Explanation-
As we know, the simple formula for the mean is X̄ = (1/N) Σxᵢ.
Answer 2 : C
Explanation-
As we know, the actual data is X and A is an assumed point.
Then dx = X − A, so
Σdx/N = ΣX/N − ΣA/N = X̄ − A
⇒ X̄ = A + Σdx/N
Answer 3 : C
Explanation-
V(x) = (1/n) Σxᵢ² − (x̄)²
We know x̄ = (1 + 2 + ⋯ + n)/n = n(n+1)/(2n) = (n+1)/2.
V(x) = (1/n)(1² + 2² + ⋯ + n²) − ((n+1)/2)²
= (1/n) · n(n+1)(2n+1)/6 − ((n+1)/2)²
= (n+1)(2n+1)/6 − (n+1)²/4
= (2n² + 3n + 1)/6 − (n² + 2n + 1)/4
= (4n² + 6n + 2 − 3n² − 6n − 3)/12
= (n² − 1)/12
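A one-line numerical check of this formula (an illustrative sketch; the choice n = 10 is arbitrary):

# Check that Var(1, 2, ..., n) equals (n^2 - 1)/12, e.g. for n = 10.
import numpy as np

n = 10
print(np.var(np.arange(1, n + 1)), (n**2 - 1) / 12)  # both equal 8.25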
Answer 4 : B
Explanation-
𝑋‾= 3, standard deviation = 5
V(y) = V (2 X - 5)
= 2² V(X)
= 2² * variance (x)
= 4 * 25
V(y)= 100

Answer 5 : D
Explanation-
𝑃(𝐴 − 𝐵) = 𝑃(𝐴) − 𝑃(𝐴 ∩ 𝐵)
= 𝑃(𝐴) − 𝑃(𝐴𝐵)

Answer 6 : A
Explanation-
P(A/B) = P(A ∩ B)/P(B) = P(A ∩ B)/1 = P(A ∩ B)
As we know that P(B) = 1, i.e., B is the sample space,
A ∩ B = A ⇒ P(A ∩ B) = P(A)
Answer 7 : C
Explanation-
we know that 𝑥and 𝑦 are independent, then
𝐸(𝑥𝑦) = 𝐸(𝑥) ⋅ E(𝑦)
But Converse is not true
Answer 8 : A
Explanation-
As we know that 𝑋 and 𝑌 are non negative Random Variable with
𝑋≤𝑌
then 𝑋 − 𝑌 ≤ 0
𝐸(𝑋 − 𝑌) ≤ 0
𝐸(X) ≤ 𝐸(Y)
Answer 9: A
Explanation- We know that E((x − x̄)(y − ȳ)) = cov(x, y) = 0.
Answer 10 : B
Explanation-
We know that
𝑉(𝑥) = 𝐸(𝑥 2 ) − (𝐸(𝑥))2 ⩾ 0

so 𝐸(𝑥 2 ) ⩾ (𝐸(𝑥))2

Answer 11 : A
Explanation-
We know that,
𝐸(𝑥) = ∑𝑛𝑖=1 𝑥𝑖 𝑝(𝑥𝑖 ) = 𝑥‾
𝐸(𝑥) is arithmetic mean.

Answer 12 : B
Explanation-
We know that
1/H = ∫ (1/x) f(x) dx
So 1/H = E(1/X), i.e., H = 1/E(1/X).

Answer 13 : B
Explanation-
We know that
log G = E(log x) = Σ (log x) p(x)
so E(log x) gives the log of the geometric mean.
Answer 14 : B
Explanation-
X ∼ b(3, 1/2), Y ∼ b(5, 1/2)
⇒ X + Y ∼ b(8, 1/2)
So P(X + Y = 3) = ⁸C₃ (1/2)³ (1/2)⁵ = ⁸C₃ (1/2)⁸ = [8!/(5! 3!)] × (1/2)⁸

= [(8 × 7 × 6)/(3 × 2 × 1)] × 1/(2³ × 2⁵) = 7/32
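This value can be checked directly with the binomial pmf (a sketch assuming scipy is available):

# P(X + Y = 3) where X + Y ~ b(8, 1/2); 7/32 = 0.21875.
from scipy.stats import binom

print(binom.pmf(3, 8, 0.5))  # 0.21875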
Answer 15 : D
Explanation-
𝑋 ∼ 𝑃(1), Y ∼ 𝑃(2)
then X + Y ∼ 𝑃(3)
Answer 16 : D
Explanation-
𝑋 ∼ Bin (𝑛, 𝑝)
Then Y = 𝑛 −X will follow Binomial (𝑛, 𝑞)

Answer 17 : C
Answer 18 : A
Explanation -
Symmetrical distribution means odd order moments will be zero.
Answer 19 : B
Explanation -
We know that Fisher's t-distribution is defined as
t = (standard normal variate)/√(χ² variate / its degrees of freedom)

Answer 20 : B

Explanation -
In Chi Square distribution
E(X) = n , where X~Chi-square(n)
V(X) = 2n
So, V(X) = 2E(X)
Answer 21 : C
Explanation -
Skewness = (mean − mode)/S.D. = [n − (n − 2)]/√(2n) = √(2/n) > 0

Therefore, the distribution is positively skewed.


Answer 22 : B
Explanation -
In chi-square distribution
β₂ = μ₄/μ₂² = (48n + 12n²)/(2n)² = 12/n + 3 > 3 ⇒ leptokurtic

Answer 23 : B
Explanation -
X ∼ χ²(n₁), Y ∼ χ²(n₂)
Then X/Y ∼ β₂(n₁/2, n₂/2)

Answer 24 : C
Explanation -
X~𝜒 2 (n₁)
Y~𝜒 2 (n₂)
Then, (𝑋 + 𝑌)~𝜒 2 (𝑛1 + 𝑛2 )
Answer 25 : D
Explanation -
X̄ = 2, σ² = 4
For a normal variable, μ₂ᵣ = 1·3·⋯·(2r − 1) σ²ʳ, so μ₄ = 3σ⁴ = 3 × 16 = 48.
Answer 26 : C
Explanation -
X̄ = 3, σ = 5, so σ² = 25
V(Y) = V(2X − 5) = 4V(X) = 4 × 25 = 100
Answer 27: B
Explanation -

M_X(t) = 2/3 + (1/3)eᵗ
⇒ X ∼ Bernoulli(1/3)
E(X) = 1/3, V(X) = (1/3)(2/3) = 2/9

Answer 28 : D
Explanation -
M_X(t) for the negative binomial is of the form (Q − Peᵗ)⁻ʳ.

Answer 29 : A
Explanation -
X ∼ N(0, 1)
Then (1/2)X² ∼ Gamma(1, 1/2)
Answer 30 : C
Explanation -
f(x, y) = 4xy; 0 ≤ x ≤ 1, 0 ≤ y ≤ 1
P(0 < x < 1/2; 1/2 ≤ y ≤ 1)
= ∫₀^(1/2) ∫_(1/2)^1 4xy dy dx
= ∫₀^(1/2) 4x [y²/2]_(1/2)^1 dx
= ∫₀^(1/2) 2x (1 − 1/4) dx
= (3/2) ∫₀^(1/2) x dx

= (3/2) [x²/2]₀^(1/2)
= (3/2) × (1/8) = 3/16
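The double integral can be verified numerically (an illustrative sketch):

# Numerical check: integral of 4xy over 0 < x < 1/2, 1/2 < y < 1 equals 3/16 = 0.1875.
from scipy.integrate import dblquad

# dblquad integrates f(y, x) over y in [gfun(x), hfun(x)] for x in [a, b].
val, err = dblquad(lambda y, x: 4 * x * y, 0, 0.5, lambda x: 0.5, lambda x: 1)
print(val)  # 0.1875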


Answer 31 : D
All of the given are estimates types.
Answer 32 : C
Explanation -
We know that Tₙ is a consistent estimator of θ if, for every ε > 0,
lim_{n→∞} P(|Tₙ − θ| < ε) = 1,
i.e., Tₙ converges to θ in probability as n → ∞.


Answer 33: C
Explanation - The sample mean is an unbiased estimator of the population mean, and it is also a consistent estimator.
Answer 34 : B
Explanation -
X ∼ N(0, σ²)
So L = ∏ᵢ₌₁ⁿ f_σ(xᵢ) = (1/(√(2π)σ))ⁿ exp(−(1/(2σ²)) Σ(xᵢ − 0)²) = (1/(√(2π)σ))ⁿ exp(−(1/(2σ²)) Σxᵢ²)
So we can say ΣXᵢ² is a sufficient statistic for σ² (by the factorization theorem).


Answer 35 : C
Explanation -
B(θ) = bias of the estimator = E(Tₙ) − γ(θ),
so the bias can be positive or negative.
Answer 36 : D
Explanation-

S² = (1/n) Σᵢ(Xᵢ − X̄)²
We know that (1/(n−1)) Σᵢ(Xᵢ − X̄)² is an unbiased estimator of σ².
So nS²/(n−1) = (1/(n−1)) Σᵢ(Xᵢ − X̄)² is an unbiased estimator of σ², i.e., (n/(n−1))S² is an unbiased estimator of σ².
Answer 37 : A
Explanation-
An estimator which attains the Cramér-Rao lower bound is called a minimum variance bound (MVB) estimator. So the most efficient estimator attains the Cramér-Rao lower bound on the variance.
Answer 38 : D
Explanation-
Degree of freedom in 𝜒 2 contingency table is (m-1)(n-1)
Here, m=2 , n=2
So degree of freedom = (2-1)(2-1) = 1
Answer 39 : B

Explanation-
We know that
𝑟 = √𝑏𝑌𝑋 ⋅ 𝑏𝑋𝑌
Correlation between X and Y is Geometric Mean of regression coefficients
This is called fundamental property of regression coefficient.
Answer 40 : C
Explanation-
Standard error of the sample correlation coefficient = (1 − r²)/√n

where r= correlation coefficient and n = sample size


Answer 41 : C

Explanation-
If Y = mX + 4 is the regression line of Y on X
and X = 4Y + 5 is the regression line of X on Y, then
b_yx = m, b_xy = 4
So r = √(b_xy · b_yx) = √(4m).
As we know that |r| < 1,
0 < √(4m) < 1 ⇒ 0 < 4m < 1 ⇒ 0 < m < 1/4 = 0.25.

Answer 42 : B
Explanation-
𝜇𝑋 = 9.2, 𝜇𝑌 = 16.5, 𝜎𝑋 = 2.1
𝜎𝑌 = 1.6 and 𝜌𝑋𝑌 = 0.84
the regression line of 𝑌 on 𝑋
(Y − Ȳ) = r (σ_Y/σ_X)(X − X̄)
Y = Ȳ + r (σ_Y/σ_X)(X − X̄)
= 16.5 + 0.84 × (1.6/2.1)(X − 9.2)
= 16.5 + 0.64X − 5.888
Y = 0.64X + 10.612

Answer 43 : A
Explanation-
X + 9Y = 7 and Y + 4X = 16.
If we take X = 7 − 9Y and Y = 16 − 4X, then b_xy = −9, b_yx = −4, and
r = √(b_xy · b_yx) = √36 = 6 > 1,

so this is not possible.
Instead take Y = 7/9 − (1/9)X and X = 4 − (1/4)Y, so that b_yx = −1/9, b_xy = −1/4, and
b_yx · b_xy = 1/36 ⇒ √(b_xy · b_yx) = 1/6 < 1.
So the line of regression of Y on X is Y = 7/9 − (1/9)X, and the line of regression of X on Y is X = 4 − (1/4)Y.
b_yx/b_xy = (1/9)/(1/4) = 4/9
Since b_yx = r σ_Y/σ_X and b_xy = r σ_X/σ_Y,
b_yx/b_xy = σ_Y²/σ_X² = 4/9 ⇒ σ_Y/σ_X = 2/3 ⇒ σ_X : σ_Y = 3 : 2.

Answer 44 : A
Explanation -
Let x₁, x₂, …, xₙ be a sample with frequencies f₁, f₂, …, fₙ. Then
x̄ = (1/N) Σ fᵢxᵢ
So x̄_new = (1/N) Σ fᵢ(xᵢ + 5) = (1/N) Σ fᵢxᵢ + (1/N) Σ fᵢ · 5

= x̄ + (5/N) Σ fᵢ = x̄ + (5/N) · N
x̄_new = x̄ + 5
Answer 45 : D
Explanation - As we know, the median is the point which divides the distribution into two equal parts, but D₆ is the 6th decile, so median ≠ D₆.
Answer 46 : B
Explanation-
Var(2X − 3Y) = 4Var(X) + 9Var(Y) − 12Cov(X, Y)
= 4(1) + 9(4) − 12Cov(X, Y) = 34
∴ Cov(X, Y) = 1/2, and the correlation is ρ = Cov(X, Y)/(σ_X σ_Y) = (1/2)/(1 × 2) = 1/4.
Answer 47 : B
Explanation-
Cov(X + Y, X − Y) = Cov(X, X) − Cov(X, Y) + Cov(Y, X) − Cov(Y, Y) = Var(X) − Var(Y) = 0, since Var(X) = Var(Y) = 1/12. Hence A and B are uncorrelated.
6.8 REFERENCES
• Devore, J. (2012). Probability and statistics for engineers, 8th ed. Cengage Learning.
• John A. Rice (2007). Mathematical Statistics and Data Analysis, 3rd ed. Thomson
Brooks/Cole
• Larsen, R., Marx, M. (2011). An introduction to mathematical statistics and its
applications. Prentice Hall.
6.9 SUGGESTED READINGS
• S.C. Gupta, V.K. Kapoor, Fundamentals of Mathematical Statistics, Sultan Chand Publication, 11th Edition.
• B.L. Agarwal, Programmed Statistics, New Age International Publishers, 2nd Edition.

LESSON 7
INTERVAL BASED DISTRIBUTION

STRUCTURE
7.1 Learning Objectives
7.2 Introduction
7.3 Interval Based Distribution
7.3.1 Student’s t- Distribution
7.3.2 F- Distribution
7.3.3 Z-Distribution
7.3.4 Chi-Square Distribution
7.4 In-Text Questions
7.5 Summary
7.6 Glossary
7.7 Answer to In-Text Questions
7.8 References
7.9 Suggested Readings
7.1 LEARNING OBJECTIVES
One of the main objectives of discussing these distributions (t, F, Z and chi-square) is to understand sampling distributions, i.e., the distributions of statistics computed from samples.
7.2 INTRODUCTION
The entire large sample theory was based on the application of the "normal test". However, if the sample size n is small, the distributions of the various statistics, e.g., Z = (x̄ − μ)/(σ/√n) or Z = (X − nP)/√(nPQ) etc., are far from normality, and as such the 'normal test' cannot be applied if n is small.
In such cases exact sample tests, pioneered by W.S. Gosset (1908) who wrote under the pen
name of Student, and later developed and extended by Prof. R.A. Fisher (1926), are used. In
the following sections we shall discuss: (i) 𝑡-test, (ii) 𝐹-test, and (iii) Fisher's 𝑧-transformation.
The exact sample tests can, however, be applied to large samples also though the converse is
not true. In all the exact sample tests, the basic assumption is that "the population(s) from
which sample(s) is (are) normal, i.e., the parent population(s) is (are) normally distributed."
7.3 INTERVAL BASED DISTRIBUTION
We will discuss these distributions in detail.
(i) Student’s t- Distribution


(ii) F- Distribution
(iii) Z- Distribution
(iv) Chi-square Distribution
7.3.1 Student’s t-Distribution :
Let 𝑥𝑖 (𝑖 = 1,2, … , 𝑛) be a random sample of size 𝑛 from a normal population with mean 𝜇
and variance 𝜎 2 . Then Student's 𝑡 is defined by the statistic:
t = (x̄ − μ)/(S/√n)
where x̄ = (1/n) Σᵢ₌₁ⁿ xᵢ is the sample mean and S² = (1/(n−1)) Σᵢ₌₁ⁿ (xᵢ − x̄)² is an unbiased estimate of the population variance σ². The statistic t follows Student's t-distribution with v = (n − 1) d.f. and probability density function
f(t) = [1/(√v B(1/2, v/2))] · 1/(1 + t²/v)^((v+1)/2); −∞ < t < ∞
Remarks 1. A statistic 𝑡 following Student's 𝑡-distribution with 𝑛 d.f. will be abbreviated as
𝑡 ∼ 𝑡𝑛
2. If we take v = 1, we get:
f(t) = [1/B(1/2, 1/2)] · 1/(1 + t²) = (1/π) · 1/(1 + t²); −∞ < t < ∞ [∵ Γ(1/2) = √π, so B(1/2, 1/2) = π]
which is the p.d.f. of standard Cauchy distribution. Hence, when 𝑣 = 1, Student's 𝑡
distribution reduces to Cauchy distribution.
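This identity is easy to confirm numerically; the sketch below (illustrative, assuming scipy is available) compares the two densities at a few points:

# The t-density with 1 d.f. coincides with the standard Cauchy density.
import numpy as np
from scipy.stats import t, cauchy

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(np.allclose(t.pdf(x, df=1), cauchy.pdf(x)))  # True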

Derivation of Student's t-Distribution. The expression for t above can be rewritten as:
t² = n(x̄ − μ)²/S² = n(x̄ − μ)²/[ns²/(n − 1)] ⇒ t²/(n − 1) = [(x̄ − μ)²/(σ²/n)] / (ns²/σ²)
Since xᵢ (i = 1, 2, …, n) is a random sample from the normal population with mean μ and variance σ², x̄ ∼ N(μ, σ²/n) ⇒ (x̄ − μ)/(σ/√n) ∼ N(0, 1).
Hence (x̄ − μ)²/(σ²/n), being the square of a standard normal variate, is a chi-square variate with 1 d.f.
Also ns²/σ² is a χ²-variate with (n − 1) d.f.
Further, since x̄ and s² are independently distributed (cf. Theorem 15.5), t²/(n − 1), being the ratio
of two independent χ²-variates with 1 and (n − 1) d.f. respectively, is a β₂(1/2, (n−1)/2) variate, and its distribution is given by:
dF(t) = [1/B(1/2, v/2)] · (t²/v)^(1/2 − 1) / (1 + t²/v)^((v+1)/2) d(t²/v), 0 ≤ t² < ∞ [where v = (n − 1)]
= [1/(√v B(1/2, v/2))] · 1/(1 + t²/v)^((v+1)/2) dt; −∞ < t < ∞,
the factor 2 disappearing since the integral from −∞ to ∞ must be unity. This is the required
probability density function as given in (16 ⋅ 2) of Student's t-distribution with 𝑣 = (𝑛 −
1)𝑑. 𝑓.
Remarks on Student's 't'.
1. Importance of Student's t-distribution in Statistics. W.S. Gosset, who wrote under the pseudonym (pen-name) 'Student', defined his t in a slightly different way, viz., t = (x̄ − μ)/s, and investigated its sampling distribution, somewhat empirically, in a paper entitled 'The Probable Error of the Mean', published in 1908. Prof. R.A. Fisher later defined his own 't' and gave a rigorous proof of its sampling distribution in 1926. The salient feature of 't' is that both the statistic and its sampling distribution are functionally independent of σ, the population standard deviation.
The discovery of 't' is regarded as a landmark in the history of statistical inference. Before Student gave his 't', it was customary to replace σ² in Z = (x̄ − μ)/(σ/√n) by its unbiased estimate S² to give t = (x̄ − μ)/(S/√n), and then the normal test was applied even for small samples. It has been found that although the distribution of t is asymptotically normal for large n, it is far from normal for small samples. Student's t ushered in an era of exact sample distributions (and tests), and since its discovery many important contributions have been made towards the development and extension of small (exact) sample theory.
2. Confidence or Fiducial Limits for μ. If t₀.₀₅ is the tabulated value of t for v = (n − 1) d.f. at 5% level of significance, i.e., P(|t| > t₀.₀₅) = 0.05 ⇒ P(|t| ≤ t₀.₀₅) = 0.95, the 95% confidence limits for μ are given by:

|t| ≤ t₀.₀₅, i.e., |(x̄ − μ)/(S/√n)| ≤ t₀.₀₅ ⇒ x̄ − t₀.₀₅ · S/√n ≤ μ ≤ x̄ + t₀.₀₅ · S/√n

Thus, 95% confidence limits for μ are: x̄ ± t₀.₀₅ · (S/√n).
Similarly, 99% confidence limits for μ are: x̄ ± t₀.₀₁ · (S/√n),
where t₀.₀₁ is the tabulated value of t for v = (n − 1) d.f. at 1% level of significance.
Fisher's 't' (Definition). It is the ratio of a standard normal variate to the square root of an independent chi-square variate divided by its degrees of freedom. If ξ is N(0, 1) and χ² is an independent chi-square variate with n d.f., then Fisher's t is given by:

t = ξ/√(χ²/n)

and it follows Student's 't' distribution with n degrees of freedom.
Distribution of Fisher's 't'. Since ξ and χ² are independent, their joint probability differential is given by:

dF(ξ, χ²) = [1/√(2π)] exp(−ξ²/2) · [1/(2^(n/2) Γ(n/2))] exp(−χ²/2) (χ²)^(n/2 − 1) dξ dχ²

Let us transform to new variates t and u by the substitution:

t = ξ/√(χ²/n) and u = χ² ⇒ ξ = t√(u/n) and χ² = u

The Jacobian of the transformation is given by:

J = ∂(ξ, χ²)/∂(t, u) = det[ √(u/n), t/(2√(un)); 0, 1 ] = √(u/n)

The joint p.d.f. g(t, u) of t and u becomes:

g(t, u) = [1/(√(2πn) 2^(n/2) Γ(n/2))] exp{−(u/2)(1 + t²/n)} u^((n+1)/2 − 1)

Since χ² ≥ 0 and −∞ < ξ < ∞, we have u ≥ 0 and −∞ < t < ∞. Integrating out u:

g₁(t) = [1/(√(2πn) 2^(n/2) Γ(n/2))] ∫₀^∞ exp{−(u/2)(1 + t²/n)} u^((n+1)/2 − 1) du

= [1/(√(2πn) 2^(n/2) Γ(n/2))] · Γ((n+1)/2)/[(1 + t²/n)/2]^((n+1)/2)

= [1/(√n B(1/2, n/2))] · (1 + t²/n)^(−(n+1)/2), −∞ < t < ∞,

which is the p.d.f. of Student's t-distribution with n d.f.

Constants of t-Distribution. Since f(t) is symmetrical about the line t = 0, all the moments of odd order about the origin vanish, i.e.,

μ′₂ᵣ₊₁ (about origin) = 0; r = 0, 1, 2, …

In particular, μ′₁ (about origin) = 0 = Mean. Hence the central moments coincide with the moments about the origin.

∴ μ₂ᵣ₊₁ = 0, (r = 0, 1, 2, …)

The moments of even order are given by:
μ₂ᵣ = μ′₂ᵣ (about origin) = ∫₋∞^∞ t^(2r) f(t) dt = 2 ∫₀^∞ t^(2r) f(t) dt = [2/(√n B(1/2, n/2))] ∫₀^∞ t^(2r) (1 + t²/n)^(−(n+1)/2) dt

This integral is absolutely convergent if 2r < n.
Put 1 + t²/n = 1/y ⇒ t² = n(1 − y)/y ⇒ 2t dt = −(n/y²) dy

When t = 0, y = 1 and when t = ∞, y = 0. Therefore,

μ₂ᵣ = [√n/B(1/2, n/2)] ∫₀¹ [n(1 − y)/y]^(r − 1/2) y^((n+1)/2 − 2) dy

= [nʳ/B(1/2, n/2)] ∫₀¹ y^(n/2 − r − 1) (1 − y)^(r − 1/2) dy = [nʳ/B(1/2, n/2)] · B(n/2 − r, r + 1/2), n > 2r

= nʳ · Γ(n/2 − r) Γ(r + 1/2)/[Γ(1/2) Γ(n/2)]   …[16.4(b)]

= nʳ · (2r − 1)(2r − 3) ⋯ 3·1 / [(n − 2)(n − 4) ⋯ (n − 2r)], n/2 > r

In particular,

μ₂ = n · 1/(n − 2) = n/(n − 2), (n > 2)

μ₄ = n² · 3·1/[(n − 2)(n − 4)] = 3n²/[(n − 2)(n − 4)], (n > 4)

Hence

β₁ = μ₃²/μ₂³ = 0 and β₂ = μ₄/μ₂² = 3(n − 2)/(n − 4); (n > 4).

Remarks 1. As n → ∞, β₁ = 0 and β₂ = limₙ→∞ 3(n − 2)/(n − 4) = 3 limₙ→∞ [1 − (2/n)]/[1 − (4/n)] = 3.

2. Changing r to (r − 1) in [16.4(b)], dividing and simplifying, we get the recurrence relation for the moments: μ₂ᵣ/μ₂ᵣ₋₂ = n(2r − 1)/(n − 2r); n/2 > r.

3. Moment Generating Function of the t-distribution. From [16.4(b)] we observe that if t ∼ t_n, then all the moments of order 2r < n exist, but the moments of order 2r ≥ n do not exist. Hence the m.g.f. of the t-distribution does not exist.

Example 1. Express the constants y₀, a and m of the distribution:

dF(x) = y₀ (1 − x²/a²)^m dx, −a ≤ x ≤ a

in terms of its μ₂ and β₂.

Show that if x is related to a variable t by the equation:

x = at/{2(m + 1) + t²}^(1/2),

then t has Student's distribution with 2(m + 1) degrees of freedom. Use the transformation to calculate the probability that t ≥ 2 when the degrees of freedom are 2 and also when 4.
Solution. First of all, we shall determine the constant y₀ from the consideration that the total probability is unity.

∴ y₀ ∫₋ₐᵃ (1 − x²/a²)^m dx = 1 ⇒ 2y₀ ∫₀ᵃ (1 − x²/a²)^m dx = 1

(∵ the integrand is an even function of x)

⇒ 2y₀ ∫₀^(π/2) cos^(2m) θ · a cos θ dθ = 1, (x = a sin θ)

⇒ 2a y₀ ∫₀^(π/2) cos^(2m+1) θ dθ = 1

But we have the Beta integral, 2 ∫₀^(π/2) sin^p θ cos^q θ dθ = B((p + 1)/2, (q + 1)/2)   …(1)
∴ a y₀ · 2 ∫₀^(π/2) cos^(2m+1) θ sin⁰ θ dθ = 1 ⇒ a y₀ B(m + 1, 1/2) = 1

⇒ y₀ = 1/[a B(m + 1, 1/2)]   …(2)

Since the given probability function is symmetrical about the line x = 0, we have, as before, μ₂ᵣ₊₁ = μ′₂ᵣ₊₁ = 0; r = 0, 1, 2, … [∵ Mean = Origin]. The moments of even order are given by:

μ₂ᵣ = μ′₂ᵣ (about origin) = ∫₋ₐᵃ x^(2r) f(x) dx = 2y₀ ∫₀ᵃ x^(2r) (1 − x²/a²)^m dx

= 2y₀ ∫₀^(π/2) (a sin θ)^(2r) cos^(2m) θ · a cos θ dθ, (x = a sin θ)

= y₀ a^(2r+1) · 2 ∫₀^(π/2) sin^(2r) θ cos^(2m+1) θ dθ = y₀ a^(2r+1) B(r + 1/2, m + 1)   [Using (1)]

= a^(2r) B(r + 1/2, m + 1)/B(m + 1, 1/2) = a^(2r) · Γ(r + 1/2) Γ(m + 3/2)/[Γ(m + r + 3/2) Γ(1/2)]

In particular, μ₂ = a² · Γ(3/2) Γ(m + 3/2)/[Γ(m + 5/2) Γ(1/2)] = a²/(2m + 3) ⇒ a² = (2m + 3)μ₂   …(3)

Also μ₄ = a⁴ · Γ(5/2) Γ(m + 3/2)/[Γ(m + 7/2) Γ(1/2)] = 3a⁴/[(2m + 5)(2m + 3)]   (on simplification)

∴ β₂ = μ₄/μ₂² = 3(2m + 3)/(2m + 5) ⇒ m = (9 − 5β₂)/[2(β₂ − 3)]   (on simplification)   …(4)

Equations (2), (3) and (4) express the constants 𝑦0 , 𝑎 and 𝑚 in terms of 𝜇2 and 𝛽2.
x = at/[2(m + 1) + t²]^(1/2) ⇒ x²/a² = t²/[2(m + 1) + t²]   …(∗∗)

i.e., 1 − x²/a² = 2(m + 1)/[2(m + 1) + t²] = (1 + t²/n)^(−1), (n = 2m + 2)

Also dx = a[(n + t²)^(−1/2) − t²(n + t²)^(−3/2)] dt

= an(n + t²)^(−3/2) dt = (a/√n) · [1 + (t²/n)]^(−3/2) dt

Hence the p.d.f. of X transforms to:

dF(t) = y₀ · (1 + t²/n)^(−m) · (a/√n) · (1 + t²/n)^(−3/2) dt

= [1/(a B(m + 1, 1/2))] · (a/√n) · (1 + t²/n)^(−(m + 3/2)) dt

= [1/(√n B(1/2, n/2))] · (1 + t²/n)^(−(n+1)/2) dt, −∞ < t < ∞,

which is the probability differential of Student's t-distribution with n = 2(m + 1) d.f.

For 2 d.f., i.e., n = 2, we get 2(m + 1) = 2 ⇒ m = 0. Hence from (∗∗), for m = 0:

x = at/(2 + t²)^(1/2) ⇒ x = √(2/3) a when t = 2.

∴ P(t ≥ 2) = P[X ≥ √(2/3) a] = ∫_(√(2/3)a)^a dF(x) = [1/(a B(1, 1/2))] ∫_(√(2/3)a)^a dx

= (1/2a)[a − √(2/3) a] = (√3 − √2)/(2√3)   [∵ B(1, 1/2) = Γ(1)Γ(1/2)/Γ(3/2) = Γ(1/2)/{(1/2)Γ(1/2)} = 2]

For 4 d.f., i.e., n = 4, we get m = 1. Proceeding exactly similarly, we shall obtain P(t ≥ 2) = 1/2 − 5√2/16.
Example 2. If the random variables X₁ and X₂ are independent and follow the chi-square distribution with n d.f., show that √n(X₁ − X₂)/(2√(X₁X₂)) is distributed as Student's t with n d.f., independently of X₁ + X₂.
Solution. Since 𝑋1 and 𝑋2 are independent chi-square variates each with 𝑛 d.f., their joint
p.d.f. is given by :

p(x₁, x₂) = p₁(x₁) × p₂(x₂) = [1/(2ⁿ {Γ(n/2)}²)] · e^(−(x₁+x₂)/2) (x₁x₂)^((n/2)−1); 0 ≤ x₁ < ∞, 0 ≤ x₂ < ∞
Put u = √n(x₁ − x₂)/(2√(x₁x₂)) and v = x₁ + x₂

⇒ x₁ = (v/2)[1 + u/√(u² + n)], x₂ = (v/2)[1 − u/√(u² + n)]

The Jacobian of the transformation is: J = ∂(x₁, x₂)/∂(u, v) = v/[2√n (1 + u²/n)^(3/2)]
The joint p.d.f. of U and V becomes:

g(u, v) = p(x₁, x₂)|J| = [1/(2^(2n−1) {Γ(n/2)}² √n)] · e^(−v/2) v^(n−1)/(1 + u²/n)^((n+1)/2); −∞ < u < ∞, 0 ≤ v < ∞

Using Legendre's duplication formula, viz.,

Γ(n) = 2^(n−1) Γ(n/2) Γ((n + 1)/2)/√π ⇒ Γ(n/2) = Γ(n)√π/[2^(n−1) Γ((n + 1)/2)], we get

2^(2n−1) {Γ(n/2)}² √n = [2^(2n−1) Γ(n/2) Γ(n) √π √n]/[2^(n−1) Γ((n + 1)/2)] = 2ⁿ Γ(n) √n B(1/2, n/2)   [∵ √π = Γ(1/2)]

∴ g(u, v) = [(1/(2ⁿ Γ(n))) e^(−v/2) v^(n−1)] · [1/(√n B(1/2, n/2)) · (1 + u²/n)^(−(n+1)/2)]; 0 < v < ∞, −∞ < u < ∞

⇒ g(u, v) = g₁(u) g₂(v),

where g₁(u) = [1/(√n B(1/2, n/2))] · (1 + u²/n)^(−(n+1)/2), −∞ < u < ∞

and g₂(v) = [1/(2ⁿ Γ(n))] e^(−v/2) v^(n−1), 0 < v < ∞

(i) ⇒ U = √n(X₁ − X₂)/(2√(X₁X₂)) and V = X₁ + X₂ are independently distributed.

(ii) ⇒ U = √n(X₁ − X₂)/(2√(X₁X₂)) ∼ t_n, and

(iii) ⇒ V = X₁ + X₂ ∼ γ(a = 1/2, n), i.e., a chi-square variate with 2n d.f.

Example 3. If Iₓ(p, q) represents the incomplete Beta function defined by:

Iₓ(p, q) = [1/B(p, q)] ∫₀ˣ t^(p−1) (1 − t)^(q−1) dt; p > 0, q > 0,

show that the distribution function F(·) of Student's t-distribution is given by:

F(t) = 1 − (1/2) Iₓ(n/2, 1/2), where x = (1 + t²/n)^(−1).
Solution. If f(·) is the p.d.f. of Student's t-distribution with n d.f., then

F(t) = ∫₋∞ᵗ f(u) du = 1 − ∫ₜ^∞ f(u) du = 1 − [1/(√n B(1/2, n/2))] ∫ₜ^∞ (1 + u²/n)^(−(n+1)/2) du   …(∗∗)

Put 1 + u²/n = 1/z ⇒ u = √(n(1 − z)/z)

Also 2u du/n = −dz/z² ⇒ du = −n dz/(2uz²) = −(√n/2) z^(−3/2) (1 − z)^(−1/2) dz

When u = t, z = x = (1 + t²/n)^(−1); when u = ∞, z = 0. Substituting in (∗∗), we get:

F(t) = 1 − [1/(√n B(1/2, n/2))] ∫ₓ⁰ z^((n+1)/2) · {−(√n/2) z^(−3/2) (1 − z)^(−1/2)} dz

= 1 − [1/(2 B(1/2, n/2))] ∫₀ˣ z^((n/2)−1) (1 − z)^((1/2)−1) dz   [where x = (1 + t²/n)^(−1)]

= 1 − (1/2) Iₓ(n/2, 1/2), [x = (1 + t²/n)^(−1)]

Example 4. Show that for the t-distribution with n d.f., the mean deviation about the mean is given by:
√n Γ[(n − 1)/2]/[√π Γ(n/2)]

Solution. E(t) = 0. M.D. (about mean) = ∫₋∞^∞ |t| f(t) dt = [1/(√n B(1/2, n/2))] ∫₋∞^∞ |t| (1 + t²/n)^(−(n+1)/2) dt

= [2/(√n B(1/2, n/2))] ∫₀^∞ t (1 + t²/n)^(−(n+1)/2) dt = [√n/B(1/2, n/2)] ∫₀^∞ (1 + y)^(−(n+1)/2) dy, (t²/n = y)

= [√n/B(1/2, n/2)] ∫₀^∞ y^(1−1)/(1 + y)^(1 + (n−1)/2) dy = [√n/B(1/2, n/2)] · B(1, (n − 1)/2) = √n Γ[(n − 1)/2]/[√π Γ(n/2)]
16.2.5. Limiting Form of t-distribution. As n → ∞, the p.d.f. of the t-distribution with n d.f., viz.,

f(t) = [1/(√n B(1/2, n/2))] (1 + t²/n)^(−(n+1)/2) → (1/√(2π)) exp(−t²/2), −∞ < t < ∞

Proof. limₙ→∞ 1/[√n B(1/2, n/2)] = limₙ→∞ Γ[(n + 1)/2]/[√n Γ(1/2) Γ(n/2)] = (1/√π) limₙ→∞ (n/2)^(1/2)/√n = 1/√(2π)

[∵ Γ(1/2) = √π and limₙ→∞ Γ(n + k)/{Γ(n) nᵏ} = 1 (c.f. Remark to §16.8)]

Also, limₙ→∞ (1 + t²/n)^(−(n+1)/2) = limₙ→∞ [(1 + t²/n)ⁿ]^(−1/2) · limₙ→∞ (1 + t²/n)^(−1/2) = exp(−t²/2)

∴ limₙ→∞ f(t) = (1/√(2π)) exp(−t²/2), −∞ < t < ∞
Hence for large d.f. the t-distribution tends to the standard normal distribution.

16.2.6. Graph of t-distribution. The p.d.f. of the t-distribution with n d.f. is:

f(t) = C · (1 + t²/n)^(−(n+1)/2), −∞ < t < ∞

Since f(−t) = f(t), the probability curve is symmetrical about the line t = 0. As t increases, f(t) decreases rapidly and tends to zero as t → ∞, so that the t-axis is an asymptote to the curve. We have shown that

μ₂ = n/(n − 2), n > 2; β₂ = 3(n − 2)/(n − 4), n > 4

Hence for n > 2, μ₂ > 1, i.e., the variance of the t-distribution is greater than that of the standard normal distribution, and for n > 4, β₂ > 3; thus the t-distribution is more flat on the top than the normal curve. In fact, for small n, we have

P(|t| ≥ t₀) ≥ P(|Z| ≥ t₀), Z ∼ N(0, 1)

i.e., the tails of the t-distribution have a greater probability (area) than the tails of the standard normal distribution. Moreover, we have also seen [§16.2.5] that for large n (d.f.), the t-distribution tends to the standard normal distribution.

Critical Values of t. The critical (or significant) values of t at level of significance α and d.f. v for a two-tailed test are given by the equation:

P[|t| > tᵥ(α)] = α ⇒ P[|t| ≤ tᵥ(α)] = 1 − α   …(16.5)
The values tᵥ(α) have been tabulated in Fisher and Yates' Tables for different values of α and v, and are given in Table I at the end of the chapter.

Since the t-distribution is symmetric about t = 0, we get from (16.5):

P[t > tᵥ(α)] + P[t < −tᵥ(α)] = α ⇒ 2P[t > tᵥ(α)] = α ⇒ P[t > tᵥ(α)] = α/2 ∴ P[t > tᵥ(2α)] = α

tᵥ(2α) (from the tables at the end of the chapter) gives the significant value of t for a single-tailed test [right-tail or left-tail, since the distribution is symmetrical] at level of significance α and v d.f.

Hence the significant values of t at level of significance α for a single-tailed test can be obtained from those of the two-tailed test by looking up the values at level of significance 2α.

For example,

t₈(0.05) for single-tail test = t₈(0.10) for two-tail test = 1.86
t₁₅(0.01) for single-tail test = t₁₅(0.02) for two-tail test = 2.60.

APPLICATIONS OF t-DISTRIBUTION
The 𝑡-distribution has a wide number of applications in Statistics, some of which are
enumerated below.
(i) To test if the sample mean (x̄) differs significantly from the hypothetical value μ of the population mean.
(ii) To test the significance of the difference between two sample means.
(iii) To test the significance of an observed sample correlation coefficient and sample regression coefficient.
(iv) To test the significance of an observed partial correlation coefficient.
In the following sections we will discuss these applications in detail, one by one.
t-Test for Single Mean. Suppose we want to test :
(i) if a random sample 𝑥𝑖 (𝑖 = 1,2, … , 𝑛) of size 𝑛 has been drawn from a normal
population with a specified mean, say 𝜇0 , or
(ii) if the sample mean differs significantly from the hypothetical value 𝜇0 of the
population mean.
Under the null hypothesis, 𝐻0 :
(i) The sample has been drawn from the population with mean 𝜇0 or

(ii) there is no significant difference between the sample mean x̄ and the population mean μ₀,

the statistic

t = (x̄ − μ₀)/(S/√n)   …(16.6)

where x̄ = (1/n) Σ x_i and S² = [1/(n − 1)] Σ (x_i − x̄)²   …(16.6a)

follows Student's t-distribution with (n − 1) d.f.
We now compare the calculated value of 𝑡 with the tabulated value at certain level of
significance. If calculated |𝑡| > tabulated 𝑡, null hypothesis is rejected and if calculated |𝑡| <
tabulated 𝑡, 𝐻0 may be accepted at the level of significance adopted.
Remarks 1. On computation of S² for numerical problems. If x̄ comes out in integers, the formula (16.6a) can be conveniently used for computing S². However, if x̄ comes out in fractions, then the formula (16.6a) for computing S² is very cumbersome and is not recommended. In that case, the step deviation method, given below, is quite useful.

If we take dᵢ = xᵢ − A, where A is any arbitrary number, then

S² = [1/(n − 1)] Σ(xᵢ − x̄)² = [1/(n − 1)] [Σxᵢ² − (Σxᵢ)²/n] = [1/(n − 1)] [Σdᵢ² − (Σdᵢ)²/n],

since the variance is independent of the change of origin. Also, in this case, x̄ = A + Σdᵢ/n.

2. We know the sample variance: s² = (1/n) Σᵢ (xᵢ − x̄)² ⇒ ns² = (n − 1)S² ∴ S²/n = s²/(n − 1)

Hence for numerical problems, the test statistic becomes

t = (x̄ − μ₀)/√(S²/n) = (x̄ − μ₀)/√(s²/(n − 1)) ∼ t_(n−1)
3. Assumption for Student's t-test. The following assumptions are made in the Student's t-
test :
(i) The parent population from which the sample is drawn is normal.
(ii) The sample observations are independent, i.e., the sample is random.
(iii) The population standard deviation 𝜎 is unknown.

Example 5. A machinist is making engine parts with axle diameters of 0.700 inch. A random sample of 10 parts shows a mean diameter of 0.742 inch with a standard deviation of 0.040
inch. Compute the statistic you would use to test whether the work is meeting the specifications. Also state how you would proceed further.
Solution. Here we are given:

μ = 0.700 inch, x̄ = 0.742 inch, s = 0.040 inch and n = 10

Null Hypothesis, H₀: μ = 0.700, i.e., the product is conforming to specifications.

Alternative Hypothesis, H₁: μ ≠ 0.700

Test Statistic. Under H₀, the test statistic is: t = (x̄ − μ)/√(S²/n) = (x̄ − μ)/√(s²/(n − 1)) ∼ t_(n−1)

∴ t = √9 (0.742 − 0.700)/0.040 = 3.15

How to proceed further. Here the test statistic ' 𝑡 ' follows Student's 𝑡-distribution with 10 −
1 = 9 d.f. We will now compare this calculated value with the tabulated value of 𝑡 for 9 d.f.
and at certain level of significance, say 5%. Let this tabulated value be denoted by 𝑡0 .
(i) If calculated ' 𝑡 ', viz., 3.15 > 𝑡0, we say that the value of 𝑡 is significant. This implies
that 𝑥‾ differs significantly from 𝜇 and 𝐻0 is rejected at this level of significance and we
conclude that the product is not meeting the specifications.
(ii) If calculated 𝑡 < 𝑡0, we say that the value of 𝑡 is not significant, i.e., there is no
significant difference between 𝑥‾ and 𝜇. In other words, the deviation (𝑥‾ − 𝜇) is just due
to fluctuations of sampling and null hypothesis 𝐻0 may be retained at 5% level of
significance, i.e., we may take the product conforming to specifications.
Example 6. The mean weekly sales of soap bars in departmental stores was 146.3 bars per
store. After an advertising campaign the mean weekly sales in 22 stores for a typical week
increased to 153.7 and showed a standard deviation of 17.2 . Was the advertising campaign
successful?
Solution. We are given: n = 22, x̄ = 153.7, s = 17.2.

Null Hypothesis. The advertising campaign is not successful, i.e., H₀: μ = 146.3.

Alternative Hypothesis, H₁: μ > 146.3 (right-tail).

Test Statistic. Under H₀, the test statistic is: t = (x̄ − μ)/√(s²/(n − 1)) ∼ t₂₂₋₁ = t₂₁

∴ t = (153.7 − 146.3)/√((17.2)²/21) = 7.4 × √21/17.2 = 1.97

Conclusion. The tabulated value of t for 21 d.f. at the 5% level of significance for a single-tailed test is 1.72. Since the calculated value exceeds the tabulated value, it is significant. Hence we reject the null hypothesis at the 5% level and conclude that the advertising campaign was successful in promoting sales.

t-Test for Difference of Means.

Suppose we want to test if two independent samples x_i (i = 1, 2, …, n₁) and y_j (j = 1, 2, …, n₂) of sizes n₁ and n₂ have been drawn from two normal populations with means μ_X and μ_Y respectively.

Under the null hypothesis (H₀) that the samples have been drawn from normal populations with means μ_X and μ_Y, and under the assumption that the population variances are equal, i.e., σ²_X = σ²_Y = σ² (say), the statistic

t = [(x̄ − ȳ) − (μ_X − μ_Y)]/[S √(1/n₁ + 1/n₂)]   …(16.7)

where x̄ = (1/n₁) Σᵢ xᵢ, ȳ = (1/n₂) Σⱼ yⱼ and

S² = [1/(n₁ + n₂ − 2)] [Σᵢ (xᵢ − x̄)² + Σⱼ (yⱼ − ȳ)²]   …(16.7a)

is an unbiased estimate of the common population variance σ², follows Student's t-distribution with (n₁ + n₂ − 2) d.f.
Proof. Distribution of t defined in (16.7).

ξ = [(x̄ − ȳ) − E(x̄ − ȳ)]/√V(x̄ − ȳ) ∼ N(0, 1)

But E(x̄ − ȳ) = E(x̄) − E(ȳ) = μ_X − μ_Y

V(x̄ − ȳ) = V(x̄) + V(ȳ) = σ²_X/n₁ + σ²_Y/n₂ = σ²(1/n₁ + 1/n₂)   (by assumption)

[The covariance term vanishes since the samples are independent.]

∴ ξ = [(x̄ − ȳ) − (μ_X − μ_Y)]/√(σ²(1/n₁ + 1/n₂)) ∼ N(0, 1)

Let χ² = (1/σ²) [Σᵢ (xᵢ − x̄)² + Σⱼ (yⱼ − ȳ)²]   …(∗∗)
= [Σᵢ (xᵢ − x̄)²/σ²] + [Σⱼ (yⱼ − ȳ)²/σ²] = n₁s_x²/σ² + n₂s_y²/σ²

Since n₁s_x²/σ² and n₂s_y²/σ² are independent χ²-variates with (n₁ − 1) and (n₂ − 1) d.f. respectively, by the additive property of the chi-square distribution, χ² defined in (∗∗) is a χ²-variate with (n₁ − 1) + (n₂ − 1), i.e., (n₁ + n₂ − 2) d.f. Further, since sample mean and sample variance are independently distributed, ξ and χ² are independent random variables. Hence Fisher's t statistic is given by

t = ξ/√[χ²/(n₁ + n₂ − 2)]

= [(x̄ − ȳ) − (μ_X − μ_Y)]/√(σ²(1/n₁ + 1/n₂)) × 1/[{Σᵢ (xᵢ − x̄)² + Σⱼ (yⱼ − ȳ)²}/{(n₁ + n₂ − 2)σ²}]^(1/2)

= [(x̄ − ȳ) − (μ_X − μ_Y)]/[S √(1/n₁ + 1/n₂)], where S² = [1/(n₁ + n₂ − 2)] [Σᵢ (xᵢ − x̄)² + Σⱼ (yⱼ − ȳ)²],

and it follows Student's t-distribution with (n₁ + n₂ − 2) d.f.

Remarks 1. S², defined in (16.7a), is an unbiased estimate of the common population variance σ².
7.3.2 F- Distribution
If X and Y are two independent chi-square variates with v₁ and v₂ d.f. respectively, then the F-statistic is defined by

F = (X/v₁)/(Y/v₂)

In other words, F is defined as the ratio of two independent chi-square variates divided by their respective degrees of freedom, and it follows Snedecor's F-distribution with (v₁, v₂) d.f., with probability function given by:

f(F) = [(v₁/v₂)^(v₁/2)/B(v₁/2, v₂/2)] · F^((v₁/2)−1)/(1 + v₁F/v₂)^((v₁+v₂)/2), 0 ≤ F < ∞
Remarks 1. The sampling distribution of the F-statistic does not involve any population parameters and depends only on the degrees of freedom v₁ and v₂.

2. A statistic F following Snedecor's F-distribution with (v₁, v₂) d.f. will be denoted by F ∼ F(v₁, v₂).

Derivation of Snedecor's F-Distribution.

Since X and Y are independent chi-square variates with v₁ and v₂ d.f. respectively, their joint probability density function is given by:

f(x, y) = {[1/(2^(v₁/2) Γ(v₁/2))] e^(−x/2) x^((v₁/2)−1)} × {[1/(2^(v₂/2) Γ(v₂/2))] e^(−y/2) y^((v₂/2)−1)}

= [1/(2^((v₁+v₂)/2) Γ(v₁/2) Γ(v₂/2))] exp{−(x + y)/2} x^((v₁/2)−1) y^((v₂/2)−1), 0 ≤ (x, y) < ∞

Let us make the following transformation of variables:

F = (x/v₁)/(y/v₂) and u = y, so that 0 ≤ F < ∞, 0 < u < ∞ ∴ x = (v₁/v₂)Fu and y = u

The Jacobian of the transformation J is given by:

J = ∂(x, y)/∂(F, u) = det[ (v₁/v₂)u, (v₁/v₂)F; 0, 1 ] = v₁u/v₂

Thus the joint p.d.f. of the transformed variables is:

g(F, u) = [1/(2^((v₁+v₂)/2) Γ(v₁/2) Γ(v₂/2))] exp{−(u/2)(1 + v₁F/v₂)} ((v₁/v₂)Fu)^((v₁/2)−1) u^((v₂/2)−1) · (v₁u/v₂)

= [(v₁/v₂)^(v₁/2)/(2^((v₁+v₂)/2) Γ(v₁/2) Γ(v₂/2))] exp{−(u/2)(1 + v₁F/v₂)} u^(((v₁+v₂)/2)−1) F^((v₁/2)−1); 0 < u < ∞, 0 ≤ F < ∞

Integrating w.r.t. u over the range 0 to ∞, the p.d.f. of F becomes:

g₁(F) = [(v₁/v₂)^(v₁/2) F^((v₁/2)−1)/(2^((v₁+v₂)/2) Γ(v₁/2) Γ(v₂/2))] × ∫₀^∞ exp{−(u/2)(1 + v₁F/v₂)} u^(((v₁+v₂)/2)−1) du

= [(v₁/v₂)^(v₁/2) F^((v₁/2)−1)/(2^((v₁+v₂)/2) Γ(v₁/2) Γ(v₂/2))] × Γ[(v₁ + v₂)/2]/[(1/2)(1 + v₁F/v₂)]^((v₁+v₂)/2)

∴ g₁(F) = [(v₁/v₂)^(v₁/2)/B(v₁/2, v₂/2)] · F^((v₁/2)−1)/(1 + v₁F/v₂)^((v₁+v₂)/2), 0 ≤ F < ∞

which is the required probability function of 𝐹-distribution with (𝑣1 , 𝑣2 ) d.f.


Aliter. F = (X/v₁)/(Y/v₂) ⇒ (v₁/v₂)F = X/Y, being the ratio of two independent chi-square variates with v₁ and v₂ d.f. respectively, is a β₂(v₁/2, v₂/2) variate. Hence the probability function of F is given by:

dP(F) = [1/B(v₁/2, v₂/2)] · (v₁F/v₂)^((v₁/2)−1)/(1 + v₁F/v₂)^((v₁+v₂)/2) d(v₁F/v₂)

⇒ f(F) = [(v₁/v₂)^(v₁/2)/B(v₁/2, v₂/2)] · F^((v₁/2)−1)/(1 + v₁F/v₂)^((v₁+v₂)/2), 0 ≤ F < ∞

Constants of F-Distribution.

μ′ᵣ (about origin) = E(Fʳ) = ∫₀^∞ Fʳ f(F) dF = [(v₁/v₂)^(v₁/2)/B(v₁/2, v₂/2)] ∫₀^∞ F^(r + (v₁/2) − 1)/(1 + v₁F/v₂)^((v₁+v₂)/2) dF

To evaluate the integral, put v₁F/v₂ = y, so that dF = (v₂/v₁) dy.

μ′ᵣ = [(v₁/v₂)^(v₁/2)/B(v₁/2, v₂/2)] ∫₀^∞ (v₂y/v₁)^(r + (v₁/2) − 1)/(1 + y)^((v₁+v₂)/2) · (v₂/v₁) dy

= (v₂/v₁)ʳ · [1/B(v₁/2, v₂/2)] ∫₀^∞ y^(r + (v₁/2) − 1)/(1 + y)^([(v₁/2) + r] + [(v₂/2) − r]) dy

= (v₂/v₁)ʳ · [1/B(v₁/2, v₂/2)] · B(r + v₁/2, v₂/2 − r), v₂ > 2r

[Aliter: the same result can also be obtained by substituting v₁F/v₂ = tan² θ and using the Beta integral 2∫₀^(π/2) sin^p θ cos^q θ dθ = B((p + 1)/2, (q + 1)/2).]

∴ μ′ᵣ = (v₂/v₁)ʳ · Γ[r + (v₁/2)] Γ[(v₂/2) − r]/[Γ(v₁/2) Γ(v₂/2)]; r < v₂/2 ⇒ v₂ > 2r
In particular,

μ′₁ = (v₂/v₁) · Γ[1 + (v₁/2)] Γ[(v₂/2) − 1]/[Γ(v₁/2) Γ(v₂/2)] = v₂/(v₂ − 2), v₂ > 2   [∵ Γ(r) = (r − 1)Γ(r − 1)]

Thus the mean of the F-distribution is independent of v₁.

μ′₂ = (v₂/v₁)² · Γ[(v₁/2) + 2] Γ[(v₂/2) − 2]/[Γ(v₁/2) Γ(v₂/2)] = (v₂/v₁)² · [(v₁/2) + 1](v₁/2)/{[(v₂/2) − 1][(v₂/2) − 2]} = v₂²(v₁ + 2)/[v₁(v₂ − 2)(v₂ − 4)], v₂ > 4

∴ μ₂ = μ′₂ − μ′₁² = v₂²(v₁ + 2)/[v₁(v₂ − 2)(v₂ − 4)] − v₂²/(v₂ − 2)² = 2v₂²(v₁ + v₂ − 2)/[v₁(v₂ − 2)²(v₂ − 4)], v₂ > 4

Similarly, on putting r = 3 and 4 in μ′ᵣ, we get μ′₃ and μ′₄ respectively, from which the central moments μ₃ and μ₄ can be obtained.

Remark. It has been proved that for large degrees of freedom v₁ and v₂, F tends to a N[1, 2{(1/v₁) + (1/v₂)}] variate.

Mode and Points of Inflexion of F-distribution. We have

log f(F) = C + {(v₁/2) − 1} log F − [(v₁ + v₂)/2] log{1 + (v₁/v₂)F},

where C is a constant independent of F.

∂/∂F [log f(F)] = [(v₁/2) − 1] · (1/F) − [(v₁ + v₂)/2] · (v₁/v₂)/(1 + v₁F/v₂)

f′(F) = f(F) · ∂/∂F [log f(F)] = 0 ⇒ (v₁ − 2)/(2F) − v₁(v₁ + v₂)/[2(v₂ + v₁F)] = 0

Hence F = v₂(v₁ − 2)/[v₁(v₂ + 2)]

It can be easily verified that at this point f″(F) < 0. Hence Mode = v₂(v₁ − 2)/[v₁(v₂ + 2)].

Remarks 1. Since F > 0, the mode exists if and only if v₁ > 2.

2. Mode = [v₂/(v₂ + 2)] · [(v₁ − 2)/v₁]

Hence the mode of the F-distribution is always less than unity.

3. The points of inflexion of the F-distribution exist for v₁ > 4 and are equidistant from the mode.
7.3.3 Z-Distribution
To test the significance of an observed sample correlation coefficient from an uncorrelated bivariate normal population, the t-test is used. But in a random sample of size n from a bivariate normal population in which ρ ≠ 0, Prof. R.A. Fisher proved that the distribution of 'r' is by no means normal, and in the neighbourhood of ρ = ±1 its probability curve is extremely skewed even for large n. If ρ ≠ 0, Fisher suggested the following transformation:

Z = (1/2) logₑ[(1 + r)/(1 − r)] = tanh⁻¹ r

and proved that even for small samples the distribution of Z is approximately normal with mean ξ = (1/2) logₑ[(1 + ρ)/(1 − ρ)] = tanh⁻¹ ρ and variance 1/(n − 3); for large values of n, say n > 50, the approximation is fairly good.
Applications of the Z-Transformation. The Z-transformation has the following applications in Statistics.

(1) To test if an observed value of 'r' differs significantly from a hypothetical value ρ of the population correlation coefficient.

H₀: There is no significant difference between r and ρ. In other words, the given sample has been drawn from a bivariate normal population with correlation coefficient ρ. If we take Z = (1/2) logₑ{(1 + r)/(1 − r)} and ξ = (1/2) logₑ{(1 + ρ)/(1 − ρ)}, then under H₀,

Z ∼ N(ξ, 1/(n − 3)) ⇒ (Z − ξ)/√(1/(n − 3)) ∼ N(0, 1)

Thus if |Z − ξ| √(n − 3) > 1.96, H₀ is rejected at the 5% level of significance, and if it is greater than 2.58, H₀ is rejected at the 1% level of significance.
Remark. The Z defined above should not be confused with the z used in Fisher's z-distribution.
Example 16.29. A correlation coefficient of 0.72 is obtained from a sample of 29 pairs of
observations.
(i) Can the sample be regarded as drawn from a bivariate normal population in which true
correlation coefficient is 0.8 ?
(ii) Obtain 95% confidence limits for 𝜌 in the light of the information provided by the
sample.
Solution. (i) H₀: There is no significant difference between r = 0.72 and ρ = 0.80, i.e., the sample can be regarded as drawn from a bivariate normal population with ρ = 0.8. Here

Z = (1/2) logₑ[(1 + r)/(1 − r)] = 1.1513 log₁₀[(1 + r)/(1 − r)] = 1.1513 log₁₀ 6.14 = 0.907

ξ = (1/2) logₑ[(1 + ρ)/(1 − ρ)] = 1.1513 log₁₀[(1 + 0.8)/(1 − 0.8)] = 1.1513 × 0.9541 = 1.1

S.E.(Z) = 1/√(n − 3) = 1/√26 = 0.196

Under H₀, the test statistic is: U = (Z − ξ)/(1/√(n − 3)) ∼ N(0, 1)

∴ U = (0.907 − 1.100)/0.196 = −0.985

Since |U| < 1.96, it is not significant at the 5% level of significance and H₀ may be accepted. Hence the sample may be regarded as coming from a bivariate normal population with ρ = 0.8.
(ii) The 95% confidence limits for ρ, in the light of the information provided by the sample, are given by:

|U| ≤ 1.96 ⇒ |Z − ξ| ≤ 1.96 × 1/√(n − 3) = 1.96 × 0.196 = 0.384

⇒ |0.907 − ξ| ≤ 0.384 or 0.907 − 0.384 ≤ ξ ≤ 0.907 + 0.384

⇒ 0.523 ≤ ξ ≤ 1.291 or 0.523 ≤ (1/2) logₑ[(1 + ρ)/(1 − ρ)] ≤ 1.291

⇒ 0.523 ≤ 1.1513 log₁₀[(1 + ρ)/(1 − ρ)] ≤ 1.291 or 0.523/1.1513 ≤ log₁₀[(1 + ρ)/(1 − ρ)] ≤ 1.291/1.1513

∴ 0.4543 ≤ log₁₀[(1 + ρ)/(1 − ρ)] ≤ 1.1213   …(∗)

Now log₁₀[(1 + ρ)/(1 − ρ)] = 0.4543 ⇒ (1 + ρ)/(1 − ρ) = antilog(0.4543) = 2.846 ⇒ ρ = (2.846 − 1)/(2.846 + 1) = 1.846/3.846 = 0.48

and log₁₀[(1 + ρ)/(1 − ρ)] = 1.1213 ⇒ (1 + ρ)/(1 − ρ) = antilog(1.1213) = 13.22 ⇒ ρ = (13.22 − 1)/(13.22 + 1) = 12.22/14.22 = 0.86

Hence, substituting in (∗), we get 0.48 ≤ ρ ≤ 0.86.
(2) To test the significance of the difference between two independent sample correlation coefficients. Let r₁ and r₂ be the sample correlation coefficients observed in two independent samples of sizes n₁ and n₂ respectively. Then

Z₁ = (1/2) logₑ[(1 + r₁)/(1 − r₁)] and Z₂ = (1/2) logₑ[(1 + r₂)/(1 − r₂)]

Under the null hypothesis H₀ that the sample correlation coefficients do not differ significantly, i.e., that the samples are drawn from the same bivariate normal population or from different populations with the same correlation coefficient ρ (say), the statistic

Z = [(Z₁ − Z₂) − E(Z₁ − Z₂)]/S.E.(Z₁ − Z₂) ∼ N(0, 1)

E(Z₁ − Z₂) = E(Z₁) − E(Z₂) = ξ₁ − ξ₂ = 0   [∵ ξ₁ = ξ₂ = (1/2) logₑ{(1 + ρ)/(1 − ρ)} under H₀]

and S.E.(Z₁ − Z₂) = √[V(Z₁) + V(Z₂)] = √[1/(n₁ − 3) + 1/(n₂ − 3)]

[The covariance term vanishes since the samples are independent.]

Under H₀, the test statistic is:

Z = (Z₁ − Z₂)/√[1/(n₁ − 3) + 1/(n₂ − 3)] ∼ N(0, 1)

By comparing this value with 1.96 or 2.58, 𝐻0 may be accepted or rejected at 5% and 1%
levels of significance respectively.
(3) To obtain a pooled estimate of ρ. Let r₁, r₂, …, r_k be the observed correlation coefficients in k independent samples of sizes n₁, n₂, …, n_k respectively from a bivariate normal population. The problem is to combine these estimates of ρ to get a pooled estimate of the parameter. If we take

Zᵢ = (1/2) logₑ[(1 + rᵢ)/(1 − rᵢ)]; i = 1, 2, …, k,

then Zᵢ; i = 1, 2, …, k are independent normal variates with variances 1/(nᵢ − 3); i = 1, 2, …, k, and common mean ξ = (1/2) logₑ[(1 + ρ)/(1 − ρ)].

The weighted mean (say Z̄) of these Zᵢ's is given by:

Z̄ = Σᵢ₌₁ᵏ wᵢZᵢ / Σᵢ₌₁ᵏ wᵢ, where wᵢ is the weight of Zᵢ.

Now Z̄ is also an unbiased estimate of ξ, since

E(Z̄) = (1/Σwᵢ) E(Σᵢ wᵢZᵢ) = (1/Σwᵢ) Σᵢ wᵢE(Zᵢ) = (1/Σwᵢ)(Σᵢ wᵢ)ξ = ξ

V(Z̄) = [1/(Σwᵢ)²] V(Σᵢ wᵢZᵢ) = [1/(Σwᵢ)²] Σᵢ wᵢ²V(Zᵢ)

The weights wᵢ (i = 1, 2, …, k) are so chosen that Z̄ has minimum variance. In order that V(Z̄) is minimum for variations in wᵢ, we should have

∂V(Z̄)/∂wᵢ = 0; i = 1, 2, …, k

⇒ [(Σwᵢ)² · 2wᵢV(Zᵢ) − {Σwᵢ²V(Zᵢ)} · 2(Σwᵢ)]/(Σwᵢ)⁴ = 0 or wᵢV(Zᵢ) = Σwᵢ²V(Zᵢ)/Σwᵢ, a constant

∴ wᵢ ∝ 1/V(Zᵢ) = (nᵢ − 3); i = 1, 2, …, k.

Hence the minimum variance estimate of ξ is given by: Z̄ = Σᵢ₌₁ᵏ (nᵢ − 3)Zᵢ / Σᵢ₌₁ᵏ (nᵢ − 3)

and the best estimate of ρ is then given by

Z̄ = (1/2) logₑ[(1 + ρ̂)/(1 − ρ̂)] ⇒ ρ̂ = (e^(2Z̄) − 1)/(e^(2Z̄) + 1) = tanh Z̄ = tanh[Σ(nᵢ − 3)Zᵢ/Σ(nᵢ − 3)]

Remark. The minimum variance of Z̄ is given by:

[V(Z̄)]_min = Σ{(nᵢ − 3)² · 1/(nᵢ − 3)}/[Σ(nᵢ − 3)]² = Σ(nᵢ − 3)/[Σ(nᵢ − 3)]² = 1/Σᵢ₌₁ᵏ (nᵢ − 3)
7.3.4 Chi-Square Distribution
The square of a standard normal variate is known as a chi-square variate (pronounced 'ki-square') with 1 degree of freedom (d.f.).

Thus if X ∼ N(μ, σ²), then Z = (X − μ)/σ ∼ N(0, 1) and Z² = [(X − μ)/σ]² is a chi-square variate with 1 d.f.

In general, if Xᵢ (i = 1, 2, …, n) are n independent normal variates with means μᵢ and variances σᵢ² (i = 1, 2, …, n), then

χ² = Σᵢ₌₁ⁿ [(Xᵢ − μᵢ)/σᵢ]² is a chi-square variate with n d.f.

DERIVATION OF THE CHI-SQUARE (χ²) DISTRIBUTION

First Method: Method of Moment Generating Function

If Xᵢ (i = 1, 2, …, n) are independent N(μᵢ, σᵢ²), we want the distribution of

χ² = Σᵢ₌₁ⁿ [(Xᵢ − μᵢ)/σᵢ]² = Σᵢ₌₁ⁿ Uᵢ², where Uᵢ = (Xᵢ − μᵢ)/σᵢ ∼ N(0, 1)

Since the Xᵢ's are independent, the Uᵢ's are also independent. Therefore,

M_{χ²}(t) = M_{ΣUᵢ²}(t) = Πᵢ₌₁ⁿ M_{Uᵢ²}(t) = [M_{U₁²}(t)]ⁿ   [∵ the Uᵢ's are i.i.d. N(0, 1)]   …(∗)

M_{Uᵢ²}(t) = E[exp(tUᵢ²)] = ∫₋∞^∞ exp(tuᵢ²) f(xᵢ) dxᵢ = ∫₋∞^∞ [1/(σ√(2π))] exp(tuᵢ²) exp{−(xᵢ − μ)²/2σ²} dxᵢ

= (1/√(2π)) ∫₋∞^∞ exp(tuᵢ²) exp(−uᵢ²/2) duᵢ   [uᵢ = (xᵢ − μ)/σ]

= (1/√(2π)) ∫₋∞^∞ exp{−[(1 − 2t)/2] uᵢ²} duᵢ = (1/√(2π)) · √π/[(1 − 2t)/2]^(1/2) = (1 − 2t)^(−1/2), |t| < 1/2

Hence, from (∗), M_{χ²}(t) = (1 − 2t)^(−n/2),
( 2 )

which is the m.g.f. of a Gamma variate with parameters 1/2 and n/2. Hence, by the uniqueness theorem of m.g.f.'s,

χ² = Σᵢⁿ [(Xᵢ − μᵢ)/σᵢ]² is a Gamma variate with parameters 1/2 and n/2.

∴ dP(χ²) = [(1/2)^(n/2)/Γ(n/2)] exp(−χ²/2) (χ²)^((n/2)−1) dχ² = [1/(2^(n/2) Γ(n/2))] exp(−χ²/2) (χ²)^((n/2)−1) dχ², 0 ≤ χ² < ∞   …(15.2)

which is the required p.d.f. of chi-square distribution with 𝑛 degrees of freedom.


Remarks 1. If a r.v. X has a chi-square distribution with n d.f., we write X ∼ χ²(n) and its p.d.f. is:

f(x) = [1/(2^(n/2) Γ(n/2))] e^(−x/2) x^((n/2)−1); 0 ≤ x < ∞

2. If X ∼ χ²(n), then (1/2)X ∼ γ(n/2).

Proof. The p.d.f. of Y = (1/2)X is given by:

g(y) = f(x) · |dx/dy| = [1/(2^(n/2) Γ(n/2))] e^(−y) (2y)^((n/2)−1) · 2 = [1/Γ(n/2)] e^(−y) y^((n/2)−1); 0 ≤ y < ∞

∴ Y = (1/2)X ∼ γ(n/2).

Second Method: Method of Induction

If Xᵢ ∼ N(0, 1), then (1/2)Xᵢ² is a γ(1/2) variate, so that Xᵢ² is a χ² variate with 1 d.f.

If X₁ and X₂ are independent standard normal variates, then X₁² + X₂² is a chi-square variate with 2 d.f., which may be proved as follows:

The joint probability differential of X₁ and X₂ is given by:

dP(x₁, x₂) = f(x₁, x₂) dx₁dx₂ = f₁(x₁)f₂(x₂) dx₁dx₂ = (1/2π) exp{−(x₁² + x₂²)/2} dx₁dx₂, −∞ < (x₁, x₂) < ∞

Let us transform to polar coordinates by the substitution x₁ = r cos θ, x₂ = r sin θ. The Jacobian of the transformation J is given by:

J = det[ ∂x₁/∂r, ∂x₂/∂r; ∂x₁/∂θ, ∂x₂/∂θ ] = det[ cos θ, sin θ; −r sin θ, r cos θ ] = r

Also, we have r² = x₁² + x₂² and tan θ = x₂/x₁. As x₁ and x₂ range from −∞ to +∞, r varies from 0 to ∞ and θ from 0 to 2π. The joint probability differential of r and θ now becomes:

dG(r, θ) = (1/2π) exp(−r²/2) r dr dθ; 0 ≤ r < ∞, 0 ≤ θ ≤ 2π

Integrating over θ, the marginal distribution of r is given by:

dG₁(r) = ∫₀^(2π) dG(r, θ) = r exp(−r²/2) dr

⇒ dG₁(r²) = (1/2) exp(−r²/2) dr² = [1/Γ(1)] exp(−r²/2) (r²/2)^(1−1) d(r²/2)

Thus r²/2 = (X₁² + X₂²)/2 is a γ(1) variate, and hence r² = X₁² + X₂² is a χ²-variate with 2 d.f.

For n variables Xᵢ (i = 1, 2, …, n), we transform (X₁, X₂, …, Xₙ) to (χ, θ₁, θ₂, …, θₙ₋₁) (a one-one transformation) by:

x₁ = χ cos θ₁ cos θ₂ … cos θₙ₋₁
x₂ = χ cos θ₁ cos θ₂ … cos θₙ₋₂ sin θₙ₋₁
x₃ = χ cos θ₁ cos θ₂ … cos θₙ₋₃ sin θₙ₋₂
…
xⱼ = χ cos θ₁ cos θ₂ … cos θₙ₋ⱼ sin θₙ₋ⱼ₊₁
…
xₙ = χ sin θ₁

where χ > 0, −π < θ₁ ≤ π and −π/2 < θᵢ ≤ π/2 for i = 2, 3, …, n − 1.

Then x₁² + x₂² + ⋯ + xₙ² = χ² and |J| = χ^(n−1) cos^(n−2) θ₁ cos^(n−3) θ₂ … cos θₙ₋₂ (c.f. Advanced Theory of Statistics, Vol. 1, by Kendall and Stuart). The joint distribution of X₁, X₂, …, Xₙ, viz.,

dF(x₁, x₂, …, xₙ) = (1/√(2π))ⁿ exp(−Σᵢ xᵢ²/2) Πᵢ₌₁ⁿ dxᵢ

transforms to

dG(χ, θ₁, θ₂, …, θₙ₋₁) = (1/√(2π))ⁿ exp(−χ²/2) χ^(n−1) cos^(n−2) θ₁ cos^(n−3) θ₂ … cos θₙ₋₂ dχ dθ₁ dθ₂ … dθₙ₋₁

Integrating over θ₁, θ₂, …, θₙ₋₁, we get the distribution of χ² as:

dP(χ²) = k exp(−χ²/2)(χ²)^((n/2)−1) dχ², 0 ≤ χ² < ∞

The constant 𝑘 is determined from the fact that total probability is unity, i.e.

∫₀^∞ dP(χ²) = 1 ⇒ k ∫₀^∞ exp(−χ²/2)(χ²)^((n/2)−1) dχ² = 1 ⇒ k = 1/[2^(n/2) Γ(n/2)]

∴ dP(χ²) = [1/(2^(n/2) Γ(n/2))] exp(−χ²/2)(χ²)^((n/2)−1) dχ², 0 ≤ χ² < ∞

Hence (1/2)χ² = (1/2) Σᵢ₌₁ⁿ Xᵢ² is a γ(n/2) variate ⇒ χ² = Σᵢ₌₁ⁿ Xᵢ² is a chi-square variate with n degrees of freedom (d.f.), and (15.2) gives the p.d.f. of the chi-square distribution with n d.f.

Remarks

1. If Xᵢ; i = 1, 2, …, n are n independent normal variates with means μᵢ and S.D.'s σᵢ, then Σᵢ₌₁ⁿ [(Xᵢ − μᵢ)/σᵢ]² is a χ²-variate with n d.f.

2. In random sampling from a normal population with mean μ and S.D. σ, x̄ is distributed normally about the mean μ with S.D. σ/√n.

∴ (x̄ − μ)/(σ/√n) ∼ N(0, 1) ⇒ [(x̄ − μ)/(σ/√n)]² is a χ²-variate with 1 d.f.

3. The normal distribution corresponds to the particular case n = 1 of the χ²-distribution, since for n = 1,

p(χ²) dχ² = [1/(√2 Γ(1/2))] exp(−χ²/2)(χ²)^((1/2)−1) dχ², 0 ≤ χ² < ∞ = (1/√(2π)) exp(−χ²/2) dχ, −∞ < χ < ∞

Thus χ is a standard normal variate.

4. For n = 2,

p(χ²) = (1/2) exp(−χ²/2), χ² ≥ 0 ⇒ p(x) = (1/2) exp(−x/2), x ≥ 0,

which is the p.d.f. of the exponential distribution with mean 2.

M.G.F. OF CHI-SQUARE DISTRIBUTION

Let X ∼ χ²(n). Then

M_X(t) = E(e^(tX)) = [1/(2^(n/2) Γ(n/2))] ∫₀^∞ e^(tx) e^(−x/2) x^((n/2)−1) dx = (1 − 2t)^(−n/2), |t| < 1/2,   …(15.4)

which is the required m.g.f. of a χ²-variate with n d.f.

Remarks 1. Using the Binomial expansion for a negative index, we get from (15.4):

M(t) = 1 + (n/2)(2t) + [(n/2)((n/2) + 1)/2!](2t)² + ⋯ + [(n/2)((n/2) + 1) ⋯ ((n/2) + r − 1)/r!](2t)ʳ + ⋯

μ′ᵣ = coefficient of tʳ/r! in the expansion of M(t) = 2ʳ (n/2)((n/2) + 1) ⋯ ((n/2) + r − 1) = n(n + 2)(n + 4) ⋯ (n + 2r − 2)

2. If n is even, so that n/2 is a positive integer, then μ′ᵣ = 2ʳ Γ[(n/2) + r]/Γ(n/2).

Cumulant Generating Function of χ²-Distribution. If X ∼ χ²(n), then

K_X(t) = log M_X(t) = −(n/2) log(1 − 2t) = (n/2)[2t + (2t)²/2 + (2t)³/3 + ⋯]

In general, κᵣ = coefficient of tʳ/r! in K_X(t) = n 2^(r−1) (r − 1)!

Hence the mean is κ₁ = n and the variance is κ₂ = 2n.
Limiting Form of the χ² Distribution for Large Degrees of Freedom.

If X ∼ χ²(n), then M_X(t) = (1 − 2t)^(−n/2), |t| < 1/2.

The m.g.f. of the standard χ² variate Z = (X − μ)/σ is: M_{(X−μ)/σ}(t) = e^(−μt/σ) M_X(t/σ)

⇒ M_Z(t) = e^(−nt/√(2n)) (1 − 2t/√(2n))^(−n/2)   (∵ μ = n, σ² = 2n)

∴ K_Z(t) = log M_Z(t) = −t√(n/2) − (n/2) log(1 − t√(2/n))

= −t√(n/2) + (n/2)[t√(2/n) + (t²/2)(2/n) + (t³/3)(2/n)^(3/2) + ⋯]

= −t√(n/2) + t√(n/2) + t²/2 + O(n^(−1/2)) = t²/2 + O(n^(−1/2)),

where O(n^(−1/2)) denotes terms containing n^(1/2) and higher powers of n in the denominator.

∴ limₙ→∞ K_Z(t) = t²/2 ⇒ M_Z(t) → e^(t²/2) as n → ∞,

which is the m.g.f. of a standard normal variate. Hence, by the uniqueness theorem of m.g.f.'s, Z is asymptotically normal. In other words, a standard χ² variate tends to a standard normal variate as n → ∞. Thus, the χ²-distribution tends to the normal distribution for large d.f.

In practice, for n ≥ 30, the normal approximation to the χ²-distribution is fairly good. So whenever n ≥ 30, we use the normal probability tables for testing the significance of the value of χ². That is why, in the tables, the significant values of χ² have been tabulated up to n = 30 only.
7.4 IN-TEXT QUESTIONS
MCQ’s Problems
Question: 1
Let X₁, X₂, …, X_N be identically distributed random variables with mean 2 and variance 1. Let N be a random variable following a Poisson distribution with mean 2, independent of the Xᵢ's. Let S_N = X₁ + X₂ + ⋯ + X_N; then Var(S_N) equals
A. 4
B. 10
C. 2
D. 1
Question: 2
Let A and B be independent random variables, each having the uniform distribution on [0, 1]. Let U = min{A, B} and V = max{A, B}; then Cov(U, V) equals
A. -1/36
B. 1/36
C. 1
D. 0
Question: 3
Let X₁, X₂, X₃ be a random sample from uniform (0, θ²), θ > 1; then the maximum likelihood estimator (MLE) of θ is

A. X₍₁₎²
B. √X₍₃₎
C. √X₍₁₎
D. αX₍₁₎ + (1 − α)X₍₃₎; 0 < α < 1

Question: 4
For the discrete variate with density: f(x) = (1/8) I₍₋₁₎(x) + (6/8) I₍₀₎(x) + (1/8) I₍₁₎(x),

which of the following is TRUE?

A. E(X) = 1/2
B. V(X) = 1/2
C. P{|X − μ_X| ≥ 2σ_X} ≤ 1/4
D. P{|X − μ_X| ≥ 2σ_X} ≥ 1/4
Question: 5
Let Xᵢ, Yᵢ (i = 1, 2) be an i.i.d. random sample of size 2 from a standard normal distribution. What is the distribution of W, given by

W = √2(X₁ + X₂)/√[(X₂ − X₁)² + (Y₂ − Y₁)²]

A. t-distribution with 1 d.f


B. t-distribution with 2 d.f
C. Chi-square distribution with 2 d.f
D. Does not determined
Question: 6
The moment generating function of a random variable X is given by

M_X(t) = 1/6 + (1/3)eᵗ + (1/3)e^(2t) + (1/6)e^(3t), −∞ < t < ∞

Then P(X ≤ 2) equals

A. 1/3
B. 1/6
C. 1/2
D. 5/6

Question: 7
Let X₁, X₂, …, Xₙ be a random sample from the U(θ − 1/2, θ + 1/2) distribution, where θ ∈ ℝ. Let X₍₁₎ = min{X₁, X₂, …, Xₙ} and X₍ₙ₎ = max{X₁, X₂, …, Xₙ}.

Define T₁ = (1/2)(X₍₁₎ + X₍ₙ₎), T₂ = (1/4)(3X₍₁₎ + X₍ₙ₎ + 1) and T₃ = (1/2)(3X₍ₙ₎ − X₍₁₎ − 2) as estimators for θ; then which of the following is/are TRUE?

A. T₁ and T₂ are MLEs for θ but T₃ is not an MLE for θ
B. T₁ is an MLE for θ but T₂ and T₃ are not MLEs for θ
C. T₁, T₂ and T₃ are MLEs for θ
D. T₁, T₂ and T₃ are not MLEs for θ
Question: 8
Let X and Y be random variables having the joint probability density function

f(x, y) = k/[(1 + x²)(1 + y²)]; −∞ < (x, y) < ∞,

where k is a constant; then which of the following is/are TRUE?

A. k = 1/π²
B. f(x) = (1/π) · 1/(1 + x²); −∞ < x < ∞
C. P(X = Y) = 0
D. All of the above
Question: 9
Let X₁, X₂, …, Xₙ be a sequence of independently and identically distributed random variables with the probability density function

f(x) = (1/2) x² e^(−x) if x > 0, and 0 otherwise,

and let Sₙ = X₁ + X₂ + ⋯ + Xₙ; then which of the following statements is/are TRUE?

A. (Sₙ − 3n)/√(3n) ∼ N(0, 1) for all n ≥ 1
B. For all ε > 0, P(|Sₙ/n − 3| > ε) → 0 as n → ∞
C. Sₙ/n → 1 with probability 1
D. Both A and B
Question: 10
Let X, Y be i.i.d. Binomial(n, p) random variables. Which of the following are true?

A. 𝑋 + 𝑌 ∼ Bin (2𝑛, 𝑝)

B. (X, Y) ∼ Multinomial (2n; p, p)

C. Var (X − Y) = E(X − Y)2

D. option A and C are correct.


7.5 SUMMARY
The main points covered in this lesson are the exact (small-sample) distributions — Student's t, Snedecor's F, Fisher's Z and the chi-square distribution — their derivations, constants and limiting forms, and their principal applications in tests of significance.
7.6 GLOSSARY
Motivation: These problems are very useful in real life, and we can use them in data science and economics as well as in social science.
Attention: Think about how these sampling distributions and tests are useful in real-world problems.
7.7 ANSWER TO IN-TEXT QUESTIONS
Answer 1: B
Explanation:
Let X₁, X₂, … be identically distributed random variables and let N be a random variable independent of them. Define S_N = X₁ + X₂ + ⋯ + X_N.
Then E(S_N) = E(Xᵢ) · E(N) = 2 × 2 = 4
V(S_N) = E(N) Var(Xᵢ) + [E(Xᵢ)]² Var(N) = 2 × 1 + 2² × 2 = 10

Answer 2: B
Explanation:
If A and B are independent random variables, each having the uniform distribution on [0, 1], let U = min{A, B} and V = max{A, B}.

Then E(U) = 1/3, E(V) = 2/3, and UV = AB, U + V = A + B.

Thus Cov(U, V) = E(UV) − E(U)E(V) = E(AB) − E(U)E(V) = E(A) · E(B) − E(U) · E(V) = 1/4 − 2/9 = 1/36
Answer 3: B
Explanation:

Xᵢ ∼ U(0, θ²), f(x) = 1/θ²; 0 < xᵢ < θ²

X₍₃₎ ≤ θ² ⇒ θ̂ ∈ [√X₍₃₎, ∞)

L(x, θ) = Πᵢ₌₁³ f(xᵢ, θ) = 1/θ⁶

⇒ ∂L/∂θ < 0, so the likelihood is decreasing in θ; therefore θ̂ = √X₍₃₎
Answer 4: C
Explanation:

x:      −1    0    1
P(x):   1/8  6/8  1/8

E(X) = −1 × (1/8) + 0 × (6/8) + 1 × (1/8) = 0

E(X²) = 1 × (1/8) + 0 × (6/8) + 1 × (1/8) = 1/4

V(X) = E(X²) − {E(X)}² = 1/4 ⇒ σ_X = 1/2

P{|X − μ_X| ≥ 2σ_X} = P{|X| ≥ 1} = 1 − P(|X| < 1) = 1 − P(−1 < X < 1) = 1 − P(X = 0) = 1/4

∴ P{|X − μ_X| ≥ 2σ_X} ≤ 1/4   [by Chebyshev's inequality]

Answer 5: B
Explanation:
Let Xᵢ, Yᵢ (i = 1, 2) be an i.i.d. random sample of size 2 from a standard normal distribution. Then

W = √2(X₁ + X₂)/√[(X₂ − X₁)² + (Y₂ − Y₁)²] ∼ t(2)

Hence option (b) is correct.
Answer 6: D
Explanation:
Let X be a random variable with M_X(t) = E(e^(tX)) = Σₓ e^(tx) P(X = x).

Then P(X = 0) = 1/6, P(X = 1) = 1/3, P(X = 2) = 1/3, P(X = 3) = 1/6

P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 1/6 + 1/3 + 1/3 = 5/6

Answer 7: A
Explanation:

X₁, X₂, …, Xₙ is a random sample from U(θ − 1/2, θ + 1/2):

f(x) = 1; θ − 1/2 < xᵢ < θ + 1/2

The likelihood is 1 (free of the parameter) whenever θ̂ ∈ [X₍ₙ₎ − 1/2, X₍₁₎ + 1/2], so every point of this interval is an MLE:

θ̂ = λ(X₍ₙ₎ − 1/2) + (1 − λ)(X₍₁₎ + 1/2); 0 < λ < 1

Taking λ = 1/2 and λ = 1/4 gives the MLEs (1/2)(X₍₁₎ + X₍ₙ₎) and (1/4)(3X₍₁₎ + X₍ₙ₎ + 1) respectively, i.e., T₁ and T₂ are MLEs, while T₃ does not belong to this family.

Hence option (a) is correct.


Answer 8: D
Explanation:

Let X and Y be random variables having the joint probability density function f(x, y) = k/[(1 + x²)(1 + y²)]; −∞ < (x, y) < ∞.

∫₋∞^∞ ∫₋∞^∞ f(x, y) dx dy = 1 ⇒ k = 1/π²

Since X and Y are independent, X ∼ f(x) = (1/π) · 1/(1 + x²); −∞ < x < ∞

P(X = Y) = 0 {the event X = Y corresponds to a region of zero area, so the probability of this region is zero}
Answer 9: D
Explanation:

Clearly, X₁, X₂, …, Xₙ are i.i.d. G(3, 1) random variables. Then E(Xᵢ) = 3 and Var(Xᵢ) = 3, i = 1, 2, …

Let Sₙ = X₁ + X₂ + ⋯ + Xₙ; then E(Sₙ) = 3n and Var(Sₙ) = 3n.

For option (a): using the CLT, (Sₙ − 3n)/√(3n) ∼ N(0, 1) for all n ≥ 1.

For option (b): limₙ→∞ E(Sₙ/n) = limₙ→∞ 3n/n = 3; limₙ→∞ V(Sₙ/n) = limₙ→∞ 3n/n² = 0.

By the convergence-in-probability condition (consistency property), for all ε > 0, P(|Sₙ/n − 3| > ε) → 0 as n → ∞.

For option (c): Sₙ/n → 3 (not 1) with probability 1, by the strong law of large numbers, so option (c) is false.

Hence both (a) and (b) hold, and option D is correct.
Answer 10: D
Explanation:

(A) The sum of independent binomial variates is also a binomial variate when the corresponding probabilities are the same; hence X + Y ∼ Bin(2n, p).

(B) A multinomial distribution arises from classifying each trial into more than two mutually exclusive categories; (X, Y) does not follow Multinomial(2n; p, p).

(C) Var(X − Y) = E(X − Y)² − {E(X − Y)}² = E(X − Y)²   [since E(X − Y) = 0]

Hence option D is correct.

7.8 REFERENCES
• Devore, J. (2012). Probability and statistics for engineers, 8th ed. Cengage Learning.
• John A. Rice (2007). Mathematical Statistics and Data Analysis, 3rd ed. Thomson
Brooks/Cole
• Larsen, R., Marx, M. (2011). An introduction to mathematical statistics and its
applications. Prentice Hall.
7.9 SUGGESTED READINGS
• S.C. Gupta, V.K. Kapoor, Fundamentals of Mathematical Statistics, Sultan Chand Publication, 11th Edition.
• B.L. Agarwal, Programmed Statistics, New Age International Publishers, 2nd Edition.

LESSON 8
STATISTICAL HYPOTHESIS

STRUCTURE
8.1 Learning Objectives
8.2 Introduction
8.3 Statistical Hypothesis
8.3.1 Simple and Composite Hypothesis
8.3.2 Critical Region
8.3.3 Type I and Type II Error
8.3.4 Most Powerful Test
8.3.5 Neymann Pearson Lemma
8.4 In-Text Questions
8.5 Summary
8.6 Glossary
8.7 Answer to In-Text Questions
8.8 References
8.9 Suggested Readings
8.1 LEARNING OBJECTIVES
One of the main objectives is to discuss the testing of hypotheses and how it can be used in real analysis.
8.2 INTRODUCTION
The main problems in statistical inference can be broadly classified into two areas:
(i) The area of estimation of population parameter(s) and the setting up of confidence intervals for them, i.e., the area of point and interval estimation, and
(ii) Tests of statistical hypothesis.
In Neyman-Pearson theory, we use statistical methods to arrive at decisions in certain
situations where there is lack of certainty on the basis of a sample whose size is fixed in
advance while in Wald's sequential theory the sample size is not fixed but is regarded as a
random variable.
8.3 STATISTICAL HYPOTHESIS


We will discuss the following topics in detail:
(i) Simple and Composite Hypothesis
(ii) Critical Region
(iii) Type I and Type II Error
(iv) Most Powerful Test
(v) Neymann Pearson Lemma
8.3.1 Simple and Composite Hypothesis
A statistical hypothesis is some statement or assertion about a population, or equivalently about the probability distribution characterizing a population, which we want to verify on the basis of information available from a sample. If the statistical hypothesis specifies the population completely, then it is termed a simple statistical hypothesis; otherwise it is called a composite statistical hypothesis.
For example, if 𝑋1 , 𝑋2 , … , 𝑋𝑛 is a random sample of size 𝑛 from a normal population with mean
𝜇 and variance 𝜎 2 , then the hypothesis 𝐻0 : 𝜇 = 𝜇0 , 𝜎 2 = 𝜎02 is a simple hypothesis, whereas
each of the following hypotheses is a composite hypothesis:
(i) 𝜇 = 𝜇0
(ii) 𝜎 2 = 𝜎02 ,
(iii) 𝜇 < 𝜇0 , 𝜎 2 = 𝜎02 ,
(iv) 𝜇 > 𝜇0 , 𝜎 2 = 𝜎02
(v) 𝜇 = 𝜇0 , 𝜎 2 < 𝜎02 ,
(vi) 𝜇 = 𝜇0 , 𝜎 2 > 𝜎02 ,
(vii) 𝜇 < 𝜇0 , 𝜎 2 > 𝜎02 .
A hypothesis which does not specify completely ' 𝑟 ' parameters of a population is termed as a
composite hypothesis with 𝑟 degrees of freedom.
Test of a Statistical Hypothesis.
A test of a statistical hypothesis is a two-action decision problem after the experimental sample
values have been obtained, the two actions being the acceptance or rejection of the hypothesis
under consideration.
Null Hypothesis.
In hypothesis testing, a statistician or decision-maker should not be motivated by prospects of
profit or loss resulting from the acceptance or rejection of the hypothesis. He should be
completely impartial and should have no brief for any party or company nor should he allow
his personal views to influence the decision. Much, therefore, depends upon how the
hypothesis is framed. For example, let us consider the 'light-bulbs' problem. Let us suppose
that the bulbs manufactured under some standard manufacturing process have an average life
of 𝜇 hours and it is proposed to test a new procedure for manufacturing light bulbs. Thus, we
have two populations of bulbs, those manufactured by standard process and those
manufactured by the new process. In this problem the following three hypotheses may be set
up:
(i) New process is better than standard process.
(ii) New process is inferior to standard process.
(iii) There is no difference between the two processes.
The first two statements appear to be biased since they reflect a preferential attitude to one or
the other of the two processes. Hence the best course is to adopt the hypothesis of no difference,
as stated in (iii). This suggests that the statistician should take up the neutral or null attitude
regarding the outcome of the test. His attitude should be on the null or zero line in which the
experimental data has the due importance and complete say in the matter. This neutral or non-
committal attitude of the statistician or decision-maker before the sample observations are
taken is the keynote of the null hypothesis.
Thus in the above example of light bulbs if 𝜇0 is the mean life (in hours) of the bulbs
manufactured by the new process then the null hypothesis which is usually denoted by H0 , can
be stated as follows: 𝐻0 : 𝜇 = 𝜇0 .
As another example, let us suppose that two different concerns manufacture drugs for inducing sleep, drug A manufactured by the first concern and drug B manufactured by the second concern. Each company claims that its drug is superior to that of the other, and it is desired to test which is the superior drug, A or B. To formulate the statistical hypothesis, let X be a random variable which denotes the additional hours of sleep gained by an individual when drug A is given, and let the random variable Y denote the additional hours of sleep gained when drug B is used. Let us suppose that X and Y follow probability distributions with means μ_X and μ_Y respectively. Here our null hypothesis would be that there is no difference between the effects of the two drugs. Symbolically, H₀: μ_X = μ_Y
Alternative Hypothesis.
It is desirable to state what is called an alternative hypothesis in respect of every statistical
hypothesis being tested because the acceptance or rejection of null hypothesis is meaningful
only when it is being tested against a rival hypothesis which should rather be explicitly
mentioned. Alternative hypothesis is usually denoted by 𝐻1 . For example, in the example of
light bulbs, alternative hypothesis could be 𝐻1 : 𝜇 > 𝜇0 or 𝜇 < 𝜇0 or 𝜇 ≠ 𝜇0 . In the example of
drugs, the alternative hypothesis could be 𝐻1 : 𝜇𝑋 > 𝜇𝑌 or 𝜇𝑋 < 𝜇𝑌 or 𝜇𝑋 ≠ 𝜇𝑌 .
In both the cases, the first two of the alternative hypotheses give rise to what are called 'one
tailed' test and the third alternative hypothesis results in 'two tailed' tests.
Important Remarks
1. In the formulation of a testing problem and devising a 'test of hypothesis' the roles of 𝐻0
and 𝐻1 are not at all symmetric. In order to decide which one of the two hypotheses should
be taken as null hypothesis 𝐻0 and which one as alternative hypothesis 𝐻1 , the intrinsic
difference between the roles and the implications of these two terms should be clearly
understood.
2. If a particular problem cannot be stated as a test between two simple hypotheses, i.e.,
simple null hypothesis against a simple alternative hypothesis, then the next best alternative
is to formulate the problem as the test of a simple null hypothesis against a composite
alternative hypothesis. In other words, one should try to structure the problem so that null
hypothesis is simple rather than composite.
3. Keeping in mind the potential losses due to wrong decisions (which may or may not be
measured in terms of money), the decision maker is somewhat conservative in holding the
null hypothesis as true unless there is a strong evidence from the experimental sample
observations that it is false. To him, the consequences of wrongly rejecting a null
hypothesis seem to be more severe than those of wrongly accepting it. In most of the cases,
the statistical hypothesis is in the form of a claim that a particular product or product
process is superior to some existing standard. The null hypothesis H0 in this case is that
there is no difference between the new product or production process and the existing
standard. In other words, null hypothesis nullifies this claim. The rejection of the null
hypothesis wrongly, which amounts to accepting the claim wrongly, involves a huge
amount of out-of-pocket expenditure towards a substantive overhaul of the existing set-up. The
resulting loss is comparatively regarded as more serious than the opportunity loss in
wrongly accepting H0 which amounts to wrongly rejecting the claim, i.e., in sticking to the
less efficient existing standard. In the light-bulbs problem discussed earlier, suppose the
research division of the concern, on the basis of the limited experimentation, claims that
its brand is more effective than that manufactured by standard process. If in fact, the brand
fails to be more effective the loss incurred by the concern due to an immediate obsolescence
of the product, decline of the concern's image, etc., will be quite serious. On the other hand,
the failure to bring out a superior brand in the market is an opportunity loss and is not a
consideration to be as serious as the other loss.
8.3.2 Critical Region
Let 𝑥1 , 𝑥2 , … , 𝑥𝑛 be the sample observations denoted by O. All the values of 𝑂 will be
aggregate of a sample and they constitute a space, called the sample space, which is denoted
by 𝑆.
Since the sample values 𝑥1 , 𝑥2 , … , 𝑥𝑛 can be taken as a point in 𝑛-dimensional space, we
specify some region of the 𝑛-dimensional space and see whether this point lies within this
region or outside this region. We divide the whole sample space 𝑆 into two disjoint parts 𝑊
and 𝑆 − 𝑊, written 𝑊̄ or 𝑊′. The null hypothesis 𝐻0 is rejected if the observed sample point
falls in 𝑊, and if it falls in 𝑊′ we reject 𝐻1 and accept 𝐻0. The region 𝑊, i.e., the region of
the outcome set such that 𝐻0 is rejected whenever the sample point falls in it, is called the
critical region. Evidently, the size of the critical region is 𝛼, the probability of committing a
Type I error (discussed below).
If the test is based on a sample of size 2, then the outcome set or sample space is the first
quadrant of a two-dimensional space, and a test criterion will enable us to separate the
outcome set into two complementary subsets 𝑊 and 𝑊̄. If the sample point falls in the subset
𝑊, 𝐻0 is rejected; otherwise 𝐻0 is accepted.
8.3.3 Type I and Type II Errors :
The decision to accept or reject the null hypothesis H0 is made on the basis of the information
supplied by the observed sample observations. The conclusion drawn on the basis of a
particular sample may not always be true in respect of the population. The four possible
situations that arise in any test procedure are given in the following table.
Decision \ True state        𝐻0 is true             𝐻0 is false (𝐻1 true)
Accept 𝐻0                    Correct decision       Type II error
Reject 𝐻0                    Type I error           Correct decision
From the above table it is obvious that in any testing problem we are liable to commit two
types of errors.
Errors of Type I and Type II. The error of rejecting 𝐻0 (accepting 𝐻1 ) when 𝐻0 is true is called
Type I error and the error of accepting 𝐻0 when 𝐻0 is false (𝐻1 is true) is called Type II error.
The probabilities of type I and type II errors are denoted by 𝛼 and 𝛽 respectively.
Thus
𝛼 = Probability of Type I error = Probability of rejecting 𝐻0 when 𝐻0 is true, and
𝛽 = Probability of Type II error = Probability of accepting 𝐻0 when 𝐻0 is false.
An ideal test would be one which properly keeps both types of errors under control. Since the
commission of an error of either type is a random event, an ideal test should, equivalently,
minimise the probabilities of both types of errors, viz., 𝛼 and 𝛽. Unfortunately, for a fixed
sample size 𝑛, 𝛼 and 𝛽 are so related (like producer's and consumer's risks in sampling
inspection plans) that a reduction in one results in an increase in the other. Consequently, the
simultaneous minimisation of both errors is not possible. Since a Type I error is deemed more
serious than a Type II error (cf. Remark 3 above), the usual practice is to control 𝛼 at a
predetermined low level and, subject to this constraint on the probability of Type I error,
choose a test which minimises 𝛽, i.e., maximises the power 1 − 𝛽.
Generally, we choose 𝛼 = 0.05 or 0.01.
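The trade-off between 𝛼 and 𝛽 is easy to see numerically. The following Python fragment is a minimal sketch (assuming the NumPy and SciPy libraries are available; the values 𝜇0 = 0, 𝜇1 = 1, 𝜎 = 1, 𝑛 = 25 and the cut-off points are illustrative assumptions, not part of the text): as the cut-off for 𝑥̄ is raised, 𝛼 falls while 𝛽 rises.

import numpy as np
from scipy.stats import norm

mu0, mu1, sigma, n = 0.0, 1.0, 1.0, 25        # hypothetical set-up
se = sigma / np.sqrt(n)                       # standard error of the sample mean
for cut in (0.2, 0.33, 0.5):                  # candidate cut-off points for x-bar
    alpha = norm.sf(cut, loc=mu0, scale=se)   # P(reject H0 | H0 true)
    beta = norm.cdf(cut, loc=mu1, scale=se)   # P(accept H0 | H1 true)
    print(f"cut={cut:.2f}  alpha={alpha:.4f}  beta={beta:.4f}")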
STEPS IN SOLVING A TESTING OF HYPOTHESIS PROBLEM
The major steps involved in the solution of a 'testing of hypothesis' problem may be outlined
as follows:
1) Explicit knowledge of the nature of the population distribution and the parameter(s) of
interest, i.e., the parameter(s) about which the hypotheses are set up.
2) Setting up of the null hypothesis 𝐻0 and the alternative hypothesis 𝐻1 in terms of the range
of the parameter values each one embodies.
3) The choice of a suitable statistic 𝑡 = 𝑡(𝑥1, 𝑥2, …, 𝑥𝑛), called the test statistic, which will
best discriminate between 𝐻0 and 𝐻1.
4) Partitioning the set of possible values of the test statistic 𝑡 into two disjoint sets 𝑊 (called
the rejection region or critical region) and 𝑊 ‾ (called the acceptance region) and framing
the following test :
(i) Reject 𝐻0 (i.e., accept 𝐻1 ) if the value of 𝑡 falls in 𝑊.
(ii) Accept 𝐻0 if the value of 𝑡 falls in 𝑊̄.
5) After framing the above test, obtain experimental sample observations, compute the
appropriate test statistic and take action accordingly.
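As a sketch of how these five steps look in practice, the fragment below (Python, assuming SciPy; the population 𝑁(𝜇, 𝜎² = 16), the value 𝜇0 = 50, the level 𝛼 = 0.05 and the sample are all hypothetical assumptions) tests 𝐻0 : 𝜇 = 𝜇0 against 𝐻1 : 𝜇 > 𝜇0 with 𝜎 known.

import numpy as np
from scipy.stats import norm

mu0, sigma, alpha = 50.0, 4.0, 0.05                 # steps 1-2: model, H0 and H1
x = np.array([52.1, 49.8, 53.0, 51.2, 50.7])        # step 5: sample observations
z = (x.mean() - mu0) / (sigma / np.sqrt(len(x)))    # step 3: test statistic
z_alpha = norm.ppf(1 - alpha)                       # step 4: boundary of the critical region W
print("reject H0" if z > z_alpha else "accept H0")  # step 5: decision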
OPTIMUM TEST UNDER DIFFERENT SITUATIONS
The above discussion enables us to obtain the so-called best test under different situations. In any
testing problem the first two steps, viz., the form of the population distribution, the parameter(s)
of interest and the framing of 𝐻0 and 𝐻1, should be obvious from the description of the problem.
The most crucial step is the choice of the best test, i.e., the best statistic 𝑡 and the critical
region 𝑊, where by best test we mean one which, in addition to controlling 𝛼 at any desired
low level, has the minimum Type II error 𝛽, or maximum power 1 − 𝛽, among all other
tests of the same level 𝛼. This leads to the following definition.
8.3.4 Most Powerful Test
Most Powerful Test (MP Test). Let us consider the problem of testing a simple hypothesis
𝐻0 : 𝜃 = 𝜃0 against a simple alternative hypothesis 𝐻1 : 𝜃 = 𝜃1.
Definition. The critical region 𝑊 is the most powerful (MP) critical region of size 𝛼 (and the
corresponding test a most powerful test of level 𝛼) for testing 𝐻0 : 𝜃 = 𝜃0 against 𝐻1 : 𝜃 = 𝜃1
if
𝑃(𝑥 ∈ 𝑊 ∣ 𝐻0) = ∫_𝑊 𝐿0 𝑑𝑥 = 𝛼
and 𝑃(𝑥 ∈ 𝑊 ∣ 𝐻1) ≥ 𝑃(𝑥 ∈ 𝑊1 ∣ 𝐻1)
for every other critical region 𝑊1 of size 𝛼.
Uniformly Most Powerful Test (UMP Test).
Let us now take up the case of testing a simple null hypothesis against a composite
alternative hypothesis, e.g., testing 𝐻0 : 𝜃 = 𝜃0 against the alternative 𝐻1 : 𝜃 ≠ 𝜃0.
In such a case, for a predetermined 𝛼, the best test for 𝐻0 is called the uniformly most
powerful test of level 𝛼.
Definition. The region 𝑊 is called the uniformly most powerful (UMP) critical region of size 𝛼
[and the corresponding test the uniformly most powerful (UMP) test of level 𝛼] for testing
𝐻0 : 𝜃 = 𝜃0 against 𝐻1 : 𝜃 ≠ 𝜃0, i.e., 𝐻1 : 𝜃 = 𝜃1 ≠ 𝜃0, if
𝑃(𝑥 ∈ 𝑊 ∣ 𝐻0) = ∫_𝑊 𝐿0 𝑑𝑥 = 𝛼
and 𝑃(𝑥 ∈ 𝑊 ∣ 𝐻1) ≥ 𝑃(𝑥 ∈ 𝑊1 ∣ 𝐻1) for all 𝜃 ≠ 𝜃0,
whatever the region 𝑊1 of size 𝛼 may be.

217 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

8.3.5 Neyman-Pearson Lemma
This Lemma provides the most powerful test of simple hypothesis against a simple alternative
hypothesis. The theorem, known as Neyman-Pearson Lemma, will be proved for density
function 𝑓(𝑥, 𝜃) of a single continuous variate and a single parameter. However, by regarding
𝑥 and 𝜃 as vectors, the proof can be easily generalised for any number of random variables
𝑥1 , 𝑥2 , … , 𝑥𝑛 and any number of parameters 𝜃1 , 𝜃2 … , 𝜃k . The variables 𝑥1 , 𝑥2 … . . 𝑥𝑛 occurring
in this theorem are understood to represent a random sample of size 𝑛 from the population
whose density function is 𝑓(𝑥, 𝜃). The lemma is concerned with a simple hypothesis 𝐻0 : 𝜃 =
𝜃0 and a simple alternative 𝐻1 : 𝜃 = 𝜃1 .
Neyman Pearson Lemma
Let 𝑘 > 0 be a constant and 𝑊 be a critical region of size 𝛼 such that
𝑊 = {𝑥 ∈ 𝑆 : 𝑓(𝑥, 𝜃1)/𝑓(𝑥, 𝜃0) > 𝑘}, i.e., 𝑊 = {𝑥 ∈ 𝑆 : 𝐿1/𝐿0 > 𝑘},
and 𝑊̄ = {𝑥 ∈ 𝑆 : 𝐿1/𝐿0 < 𝑘},
where 𝐿0 and 𝐿1 are the likelihood functions of the sample observations 𝑥 = (𝑥1 , 𝑥2 , … , 𝑥𝑛 )
under 𝐻0 and 𝐻1 respectively. Then 𝑊 is the most powerful critical region of the test
hypothesis 𝐻0 : 𝜃 = 𝜃0 against the alternative 𝐻1 : 𝜃 = 𝜃1 .
Proof. We are given
𝑃(𝑥 ∈ 𝑊 ∣ 𝐻0) = ∫_𝑊 𝐿0 𝑑𝑥 = 𝛼 … (i)
The power of the region is
𝑃(𝑥 ∈ 𝑊 ∣ 𝐻1) = ∫_𝑊 𝐿1 𝑑𝑥 = 1 − 𝛽, (say).
In order to establish the lemma, we have to prove that there exists no other critical region, of
size less than or equal to 𝛼, which is more powerful than 𝑊.
Let 𝑊1 be another critical region of size 𝛼1 ≤ 𝛼 and power 1 − 𝛽1 so that we have
𝑃(𝑥 ∈ 𝑊1 ∣ 𝐻0) = ∫_{𝑊1} 𝐿0 𝑑𝑥 = 𝛼1
and 𝑃(𝑥 ∈ 𝑊1 ∣ 𝐻1) = ∫_{𝑊1} 𝐿1 𝑑𝑥 = 1 − 𝛽1.
Now we have to prove that 1 − 𝛽 ≥ 1 − 𝛽1.
Write 𝑊 = 𝐴 ∪ 𝐶 and 𝑊1 = 𝐵 ∪ 𝐶, where 𝐴 = 𝑊 ∩ 𝑊̄1, 𝐵 = 𝑊1 ∩ 𝑊̄ and 𝐶 = 𝑊 ∩ 𝑊1
are disjoint sets.
Since 𝛼1 ≤ 𝛼, we have
∫_{𝑊1} 𝐿0 𝑑𝑥 ≤ ∫_𝑊 𝐿0 𝑑𝑥 ⇒ ∫_{𝐵∪𝐶} 𝐿0 𝑑𝑥 ≤ ∫_{𝐴∪𝐶} 𝐿0 𝑑𝑥 ⇒ ∫_𝐴 𝐿0 𝑑𝑥 ≥ ∫_𝐵 𝐿0 𝑑𝑥.
Since 𝐴 ⊂ 𝑊, where 𝐿1 > 𝑘𝐿0, we get
∫_𝐴 𝐿1 𝑑𝑥 > 𝑘 ∫_𝐴 𝐿0 𝑑𝑥 ≥ 𝑘 ∫_𝐵 𝐿0 𝑑𝑥.
Also, since 𝐿1/𝐿0 < 𝑘 for all 𝑥 ∈ 𝑊̄, it follows that
∫_{𝑊̄} 𝐿1 𝑑𝑥 ≤ 𝑘 ∫_{𝑊̄} 𝐿0 𝑑𝑥.
This result also holds for any subset of 𝑊̄, in particular for 𝑊̄ ∩ 𝑊1 = 𝐵. Hence
∫_𝐵 𝐿1 𝑑𝑥 ≤ 𝑘 ∫_𝐵 𝐿0 𝑑𝑥 ≤ 𝑘 ∫_𝐴 𝐿0 𝑑𝑥 < ∫_𝐴 𝐿1 𝑑𝑥.
Adding ∫_𝐶 𝐿1 𝑑𝑥 to both sides, we get
∫_{𝑊1} 𝐿1 𝑑𝑥 ≤ ∫_𝑊 𝐿1 𝑑𝑥 ⇒ 1 − 𝛽 ≥ 1 − 𝛽1.

Hence the Lemma.
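A small numerical check of the lemma (a sketch only, assuming the SciPy library; the model 𝑋 ∼ 𝑁(𝜃, 1) with 𝐻0 : 𝜃 = 0, 𝐻1 : 𝜃 = 1 and 𝛼 = 0.05 is an assumed illustration): since 𝐿1/𝐿0 is increasing in 𝑥 here, the Neyman-Pearson region is the right tail, and any rival region of the same size has smaller power.

from scipy.stats import norm

alpha = 0.05
c = norm.ppf(1 - alpha)          # NP region W = {x > c} has size alpha under theta = 0
power_np = norm.sf(c - 1)        # P(X > c | theta = 1)
c1 = norm.ppf(alpha)             # rival region W1 = {x < c1}, also of size alpha under H0
power_rival = norm.cdf(c1 - 1)   # P(X < c1 | theta = 1)
print(power_np, power_rival)     # about 0.26 versus 0.004: W dominates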
Remark. Let 𝑊, as defined in the above theorem, be the most powerful critical region
of size 𝛼 for testing 𝐻0 : 𝜃 = 𝜃0 against 𝐻1 : 𝜃 = 𝜃1, and let it be independent of 𝜃1 ∈ Θ1 =
Θ − Θ0, where Θ0 is the parameter space under 𝐻0. Then we say that the C.R. 𝑊 is the UMP C.R.
of size 𝛼 for testing 𝐻0 : 𝜃 = 𝜃0 against 𝐻1 : 𝜃 ∈ Θ1.
Example 1. Given the frequency function
𝑓(𝑥, 𝜃) = 1/𝜃 for 0 ≤ 𝑥 ≤ 𝜃, and 0 elsewhere,
and that you are testing the null hypothesis 𝐻0 : 𝜃 = 1 against 𝐻1 : 𝜃 = 2 by means of a single
observed value of 𝑥, what would be the sizes of the Type I and Type II errors if you choose
the interval (i) 0.5 ≤ 𝑥, (ii) 1 ≤ 𝑥 ≤ 1.5 as the critical region? Also obtain the power function
of the test.
Solution. Here we want to test 𝐻0 : 𝜃 = 1 against 𝐻1 : 𝜃 = 2.
(i) Here 𝑊 = {𝑥 : 𝑥 ≥ 0.5} and 𝑊̄ = {𝑥 : 𝑥 < 0.5}.
𝛼 = 𝑃{𝑥 ∈ 𝑊 ∣ 𝐻0} = 𝑃{𝑥 ≥ 0.5 ∣ 𝜃 = 1} = 𝑃{0.5 ≤ 𝑥 ≤ 1 ∣ 𝜃 = 1}
= ∫_{0.5}^{1} [𝑓(𝑥, 𝜃)]_{𝜃=1} 𝑑𝑥 = ∫_{0.5}^{1} 1 𝑑𝑥 = 0.5
Similarly,
𝛽 = 𝑃{𝑥 ∈ 𝑊̄ ∣ 𝐻1} = 𝑃{𝑥 < 0.5 ∣ 𝜃 = 2} = ∫_0^{0.5} [𝑓(𝑥, 𝜃)]_{𝜃=2} 𝑑𝑥 = ∫_0^{0.5} (1/2) 𝑑𝑥 = 0.25
Thus the sizes of the Type I and Type II errors are respectively 𝛼 = 0.5 and 𝛽 = 0.25,
and the power of the test = 1 − 𝛽 = 0.75.
(ii) 𝑊 = {𝑥 : 1 ≤ 𝑥 ≤ 1.5}
𝛼 = 𝑃{𝑥 ∈ 𝑊 ∣ 𝜃 = 1} = ∫_1^{1.5} [𝑓(𝑥, 𝜃)]_{𝜃=1} 𝑑𝑥 = 0,
since under 𝐻0 : 𝜃 = 1, 𝑓(𝑥, 𝜃) = 0 for 1 < 𝑥 ≤ 1.5.
𝛽 = 𝑃{𝑥 ∈ 𝑊̄ ∣ 𝜃 = 2} = 1 − 𝑃{𝑥 ∈ 𝑊 ∣ 𝜃 = 2}
= 1 − ∫_1^{1.5} [𝑓(𝑥, 𝜃)]_{𝜃=2} 𝑑𝑥 = 1 − [𝑥/2]_1^{1.5} = 0.75
∴ Power function = 1 − 𝛽 = 1 − 0.75 = 0.25
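Both critical regions can be checked numerically; the sketch below (Python, assuming SciPy) encodes 𝑈(0, 𝜃) for 𝜃 = 1 and 𝜃 = 2 and reproduces 𝛼 = 0.5, 𝛽 = 0.25 for region (i) and 𝛼 = 0, 𝛽 = 0.75 for region (ii).

from scipy.stats import uniform

f0 = uniform(loc=0, scale=1)              # X ~ U(0, 1) under H0: theta = 1
f1 = uniform(loc=0, scale=2)              # X ~ U(0, 2) under H1: theta = 2
alpha_i = f0.sf(0.5)                      # P(x >= 0.5 | H0) = 0.5
beta_i = f1.cdf(0.5)                      # P(x < 0.5  | H1) = 0.25
alpha_ii = f0.cdf(1.5) - f0.cdf(1.0)      # P(1 <= x <= 1.5 | H0) = 0
beta_ii = 1 - (f1.cdf(1.5) - f1.cdf(1.0)) # 1 - P(1 <= x <= 1.5 | H1) = 0.75
print(alpha_i, beta_i, alpha_ii, beta_ii)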
Example 2. If 𝑥 ≥ 1 is the critical region for testing 𝐻0 : 𝜃 = 2 against the alternative 𝐻1 : 𝜃 = 1,
on the basis of a single observation from the population 𝑓(𝑥, 𝜃) = 𝜃 exp(−𝜃𝑥), 0 ≤ 𝑥 < ∞,
obtain the values of the Type I and Type II errors.
Solution. Here 𝑊 = {𝑥 : 𝑥 ≥ 1}, 𝑊̄ = {𝑥 : 𝑥 < 1} and 𝐻0 : 𝜃 = 2, 𝐻1 : 𝜃 = 1.
𝛼 = size of Type I error = 𝑃[𝑥 ∈ 𝑊 ∣ 𝐻0] = 𝑃[𝑥 ≥ 1 ∣ 𝜃 = 2]
= ∫_1^∞ [𝑓(𝑥, 𝜃)]_{𝜃=2} 𝑑𝑥 = 2 ∫_1^∞ 𝑒^{−2𝑥} 𝑑𝑥 = [−𝑒^{−2𝑥}]_1^∞ = 1/𝑒²
𝛽 = size of Type II error = 𝑃[𝑥 ∈ 𝑊̄ ∣ 𝐻1] = 𝑃{𝑥 < 1 ∣ 𝜃 = 1}
= ∫_0^1 𝑒^{−𝑥} 𝑑𝑥 = [−𝑒^{−𝑥}]_0^1 = 1 − 𝑒^{−1} = (𝑒 − 1)/𝑒.
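These two integrals can be verified with the exponential distribution in SciPy (a minimal sketch; note that scipy.stats.expon is parameterised by scale = 1/𝜃):

from scipy.stats import expon

alpha = expon(scale=1/2).sf(1)   # P(x >= 1 | theta = 2) = e^{-2} ~= 0.1353
beta = expon(scale=1).cdf(1)     # P(x < 1  | theta = 1) = 1 - e^{-1} ~= 0.6321
print(alpha, beta)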
Example 3. Let 𝑝 be the probability that a coin falls head in a single toss. In order to test
𝐻0 : 𝑝 = 1/2 against 𝐻1 : 𝑝 = 3/4, the coin is tossed 5 times and 𝐻0 is rejected if more than 3 heads
are obtained. Find the probability of Type I error and the power of the test.
Solution. Here 𝐻0 : 𝑝 = 1/2 and 𝐻1 : 𝑝 = 3/4.
If the r.v. 𝑋 denotes the number of heads in 𝑛 tosses of the coin, then 𝑋 ∼ 𝐵(𝑛, 𝑝), so that
𝑃(𝑋 = 𝑥) = ⁿ𝐶ₓ 𝑝ˣ(1 − 𝑝)ⁿ⁻ˣ = ⁵𝐶ₓ 𝑝ˣ(1 − 𝑝)⁵⁻ˣ ------------(*)
The critical region is given by 𝑊 = {𝑥 : 𝑥 ≥ 4} ⇒ 𝑊̄ = {𝑥 : 𝑥 ≤ 3}
𝛼 = Probability of Type I error = 𝑃[𝑋 ≥ 4 ∣ 𝐻0]
= 𝑃(𝑋 = 4 ∣ 𝑝 = 1/2) + 𝑃(𝑋 = 5 ∣ 𝑝 = 1/2) = ⁵𝐶₄ (1/2)⁴(1/2) + ⁵𝐶₅ (1/2)⁵ [From (*)]
= 5(1/2)⁵ + (1/2)⁵ = 6(1/2)⁵ = 3/16
𝛽 = Probability of Type II error = 𝑃(𝑥 ∈ 𝑊̄ ∣ 𝐻1) = 1 − 𝑃(𝑥 ∈ 𝑊 ∣ 𝐻1)
= 1 − [𝑃(𝑋 = 4 ∣ 𝑝 = 3/4) + 𝑃(𝑋 = 5 ∣ 𝑝 = 3/4)] = 1 − [⁵𝐶₄ (3/4)⁴(1/4) + ⁵𝐶₅ (3/4)⁵]
= 1 − (3/4)⁴ [5/4 + 3/4] = 1 − 81/128 = 47/128
∴ Power of the test = 1 − 𝛽 = 81/128.
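The binomial tail probabilities above can be confirmed directly (a sketch assuming SciPy; sf(3) gives 𝑃(𝑋 > 3) = 𝑃(𝑋 ≥ 4)):

from scipy.stats import binom

alpha = binom(5, 0.5).sf(3)     # P(X >= 4 | p = 1/2) = 3/16 = 0.1875
power = binom(5, 0.75).sf(3)    # P(X >= 4 | p = 3/4) = 81/128 ~= 0.6328
print(alpha, power, 1 - power)  # beta = 47/128 ~= 0.3672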
Example 4. Let 𝑋 ∼ 𝑁(𝜇, 4), 𝜇 unknown. To test 𝐻0 : 𝜇 = −1 against 𝐻1 : 𝜇 = 1, based on a
sample of size 10 from this population, we use the critical region 𝑥1 + 2𝑥2 + ⋯ + 10𝑥10 ≥ 0.
What is its size? What is the power of the test?
Solution. Critical region 𝑊 = {𝑥 : 𝑥1 + 2𝑥2 + ⋯ + 10𝑥10 ≥ 0}.
Let 𝑈 = 𝑥1 + 2𝑥2 + ⋯ + 10𝑥10. Since the 𝑥ᵢ are i.i.d. 𝑁(𝜇, 4),
𝑈 ∼ 𝑁[(1 + 2 + ⋯ + 10)𝜇, (1² + 2² + ⋯ + 10²)𝜎²] = 𝑁(55𝜇, 385𝜎²)
⇒ 𝑈 ∼ 𝑁(55𝜇, 385 × 4) = 𝑁(55𝜇, 1540) ----(*)
The size 𝛼 of the critical region is 𝛼 = 𝑃(𝑥 ∈ 𝑊 ∣ 𝐻0) = 𝑃(𝑈 ≥ 0 ∣ 𝐻0) ----(**)
Under 𝐻0 : 𝜇 = −1, 𝑈 ∼ 𝑁(−55, 1540) [From (*)] ⇒ 𝑍 = (𝑈 − 𝐸(𝑈))/𝜎_𝑈 = (𝑈 + 55)/√1540
∴ Under 𝐻0, when 𝑈 = 0, 𝑍 = 55/√1540 = 55/39.2428 = 1.4015
∴ 𝛼 = 𝑃(𝑍 ≥ 1.4015) = 0.5 − 𝑃(0 ≤ 𝑍 ≤ 1.4015) = 0.5 − 0.4192 = 0.0808
[From (**); from Normal Probability Tables]
Alternatively, 𝛼 = 1 − 𝑃(𝑍 ≤ 1.4015) = 1 − Φ(1.4015),
where Φ(·) is the distribution function of the standard normal variate.
Power of the test: 1 − 𝛽 = 𝑃(𝑥 ∈ 𝑊 ∣ 𝐻1) = 𝑃(𝑈 ≥ 0 ∣ 𝐻1)
Under 𝐻1 : 𝜇 = 1, 𝑈 ∼ 𝑁(55, 1540)
⇒ 𝑍 = (𝑈 − 𝐸(𝑈))/𝜎_𝑈 = −55/√1540 = −1.40 (when 𝑈 = 0)
∴ 1 − 𝛽 = 𝑃(𝑍 ≥ −1.40) = 𝑃(−1.4 ≤ 𝑍 ≤ 0) + 0.5
= 𝑃(0 ≤ 𝑍 ≤ 1.4) + 0.5 (by symmetry)
= 0.4192 + 0.5 = 0.9192
Alternatively, 1 − 𝛽 = 1 − 𝑃(𝑍 ≤ −1.40) = 1 − Φ(−1.40),
where Φ(·) is the distribution function of the standard normal variate.
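The same size and power follow from the exact normal distribution of 𝑈 (a sketch assuming NumPy and SciPy; the tiny difference from the table value 0.9192 is rounding):

import numpy as np
from scipy.stats import norm

w = np.arange(1, 11)                     # weights 1, 2, ..., 10
sd_u = np.sqrt((w**2).sum() * 4)         # sqrt(385 * 4) = sqrt(1540)
alpha = norm.sf(0, loc=-55, scale=sd_u)  # P(U >= 0 | mu = -1) ~= 0.081
power = norm.sf(0, loc=55, scale=sd_u)   # P(U >= 0 | mu = +1) ~= 0.919
print(alpha, power)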
Example 5. Let 𝑋 have a p.d.f. of the form
𝑓(𝑥, 𝜃) = (1/𝜃)𝑒^{−𝑥/𝜃} for 0 < 𝑥 < ∞, 𝜃 > 0; and 0 elsewhere.
To test 𝐻0 : 𝜃 = 2 against 𝐻1 : 𝜃 = 1, use the random sample 𝑥1, 𝑥2 of size 2 and define the
critical region 𝑊 = {(𝑥1, 𝑥2) : 9.5 ≤ 𝑥1 + 𝑥2}.
Find: (i) the power of the test; (ii) the significance level of the test.
Solution. We are given the critical region
𝑊 = {(𝑥1, 𝑥2) : 9.5 ≤ 𝑥1 + 𝑥2} = {(𝑥1, 𝑥2) : 𝑥1 + 𝑥2 ≥ 9.5}
The size of the critical region, i.e., the significance level of the test, is given by
𝛼 = 𝑃(𝑥 ∈ 𝑊 ∣ 𝐻0) = 𝑃[𝑥1 + 𝑥2 ≥ 9.5 ∣ 𝐻0]
In sampling from the given exponential distribution,
(2/𝜃) ∑_{𝑖=1}^{𝑛} 𝑥ᵢ ∼ 𝜒²(2𝑛) ⇒ 𝑈 = (2/𝜃)(𝑥1 + 𝑥2) ∼ 𝜒²(4), (𝑛 = 2) ----(*)
∴ 𝛼 = 𝑃[(2/𝜃)(𝑥1 + 𝑥2) ≥ (2/𝜃) × 9.5 ∣ 𝐻0] [From (*)]
= 𝑃[𝜒²(4) ≥ 9.5] = 0.05 (from probability tables of the 𝜒²-distribution, since 𝜃 = 2 under 𝐻0)
The power of the test is given by
1 − 𝛽 = 𝑃(𝑥 ∈ 𝑊 ∣ 𝐻1) = 𝑃(𝑥1 + 𝑥2 ≥ 9.5 ∣ 𝐻1)
= 𝑃[(2/𝜃)(𝑥1 + 𝑥2) ≥ (2/𝜃) × 9.5 ∣ 𝐻1]
= 𝑃[𝜒²(4) ≥ 19]
(∵ under 𝐻1, 𝜃 = 1)
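Both probabilities are routine chi-square tail areas (a sketch assuming SciPy):

from scipy.stats import chi2

alpha = chi2(4).sf(9.5)   # P(chi^2_4 >= 9.5) ~= 0.0498, i.e. about 0.05
power = chi2(4).sf(19.0)  # P(chi^2_4 >= 19)  ~= 0.0008
print(alpha, power)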
Example 6. Use the Neyman-Pearson Lemma to obtain the region for testing 𝜃 = 𝜃0 against
𝜃 = 𝜃1 > 𝜃0 and 𝜃 = 𝜃1 < 𝜃0 , in the case of a normal population 𝑁(𝜃, 𝜎 2 ), where 𝜎 2 is
known. Hence find the power of the test.
Solution.
The likelihood function is
𝐿 = ∏_{𝑖=1}^{𝑛} 𝑓(𝑥ᵢ, 𝜃) = (1/(𝜎√(2𝜋)))ⁿ exp{−(1/2𝜎²) ∑_{𝑖=1}^{𝑛} (𝑥ᵢ − 𝜃)²}
Using the Neyman-Pearson Lemma, the best critical region (B.C.R.) is given by (for 𝑘 > 0)
𝐿1/𝐿0 = exp{−(1/2𝜎²) ∑ (𝑥ᵢ − 𝜃1)²} / exp{−(1/2𝜎²) ∑ (𝑥ᵢ − 𝜃0)²} ≥ 𝑘
⇒ exp[−(1/2𝜎²){∑ (𝑥ᵢ − 𝜃1)² − ∑ (𝑥ᵢ − 𝜃0)²}] ≥ 𝑘
⇒ exp[−(𝑛/2𝜎²)(𝜃1² − 𝜃0²) + (1/𝜎²)(𝜃1 − 𝜃0) ∑ 𝑥ᵢ] ≥ 𝑘
⇒ −(𝑛/2𝜎²)(𝜃1² − 𝜃0²) + (1/𝜎²)(𝜃1 − 𝜃0) ∑ 𝑥ᵢ ≥ log 𝑘
(since log 𝑥 is an increasing function of 𝑥)
⇒ 𝑥̄(𝜃1 − 𝜃0) ≥ (𝜎²/𝑛) log 𝑘 + (𝜃1² − 𝜃0²)/2
Case (i) If 𝜃1 > 𝜃0, the B.C.R. is determined by the relation (right-tailed test):
𝑥̄ > (𝜎²/𝑛) · log 𝑘/(𝜃1 − 𝜃0) + (𝜃1 + 𝜃0)/2 ⇒ 𝑥̄ > 𝜆1 (say).
∴ B.C.R. is 𝑊 = {𝑥 : 𝑥̄ > 𝜆1}
Case (ii) If 𝜃1 < 𝜃0, the B.C.R. is given by the relation (left-tailed test):
𝑥̄ < (𝜎²/𝑛) · log 𝑘/(𝜃1 − 𝜃0) + (𝜃1 + 𝜃0)/2 = 𝜆2 (say).
Hence the B.C.R. is 𝑊1 = {𝑥 : 𝑥̄ ≤ 𝜆2}
The constants 𝜆1 and 𝜆2 are so chosen as to make the probability of each of the above relations
equal to 𝛼 when the hypothesis 𝐻0 is true. The sampling distribution of 𝑥̄, when
𝐻ᵢ is true, is 𝑁(𝜃ᵢ, 𝜎²/𝑛), (𝑖 = 0, 1). Therefore, the constants 𝜆1 and 𝜆2 are determined from the
relations:
𝑃[𝑥̄ > 𝜆1 ∣ 𝐻0] = 𝛼 and 𝑃[𝑥̄ < 𝜆2 ∣ 𝐻0] = 𝛼
∴ 𝑃(𝑥̄ > 𝜆1 ∣ 𝐻0) = 𝑃[𝑍 > (𝜆1 − 𝜃0)/(𝜎/√𝑛)] = 𝛼; 𝑍 ∼ 𝑁(0, 1)
⇒ (𝜆1 − 𝜃0)/(𝜎/√𝑛) = 𝑧_𝛼 ⇒ 𝜆1 = 𝜃0 + 𝑧_𝛼 𝜎/√𝑛
where 𝑧_𝛼 is the upper 𝛼-point of the standard normal variate given by 𝑃(𝑍 > 𝑧_𝛼) = 𝛼.
Also 𝑃(𝑥̄ < 𝜆2 ∣ 𝐻0) = 𝛼 ⇒ 𝑃(𝑥̄ ≥ 𝜆2 ∣ 𝐻0) = 1 − 𝛼
⇒ 𝑃(𝑍 ≥ (𝜆2 − 𝜃0)/(𝜎/√𝑛)) = 1 − 𝛼 ⇒ (𝜆2 − 𝜃0)/(𝜎/√𝑛) = 𝑧_{1−𝛼}
⇒ 𝜆2 = 𝜃0 + 𝑧_{1−𝛼} 𝜎/√𝑛
Note. By symmetry of the normal distribution, we have 𝑧_{1−𝛼} = −𝑧_𝛼.
Power of the test. By definition, the power of the test in case (i) is:
1 − 𝛽 = 𝑃[𝑥 ∈ 𝑊 ∣ 𝐻1] = 𝑃[𝑥̄ ≥ 𝜆1 ∣ 𝐻1]
= 𝑃(𝑍 ≥ (𝜆1 − 𝜃1)/(𝜎/√𝑛)) [∵ under 𝐻1, 𝑍 = (𝑥̄ − 𝜃1)/(𝜎/√𝑛) ∼ 𝑁(0, 1)]
= 𝑃(𝑍 ≥ (𝜃0 + 𝑧_𝛼 𝜎/√𝑛 − 𝜃1)/(𝜎/√𝑛))
= 𝑃(𝑍 ≥ 𝑧_𝛼 − (𝜃1 − 𝜃0)/(𝜎/√𝑛))
= 1 − 𝑃(𝑍 ≤ 𝜆3) = 1 − Φ(𝜆3), where 𝜆3 = 𝑧_𝛼 − √𝑛(𝜃1 − 𝜃0)/𝜎
and Φ(·) is the distribution function of the standard normal variate.
Similarly, in case (ii) (𝜃1 < 𝜃0), the power of the test is
1 − 𝛽 = 𝑃(𝑥̄ < 𝜆2 ∣ 𝐻1) = 𝑃(𝑍 < (𝜆2 − 𝜃1)/(𝜎/√𝑛))
= 𝑃(𝑍 < (𝜃0 + 𝑧_{1−𝛼} 𝜎/√𝑛 − 𝜃1)/(𝜎/√𝑛))
= 𝑃(𝑍 < 𝑧_{1−𝛼} + (𝜃0 − 𝜃1)/(𝜎/√𝑛)) = Φ(𝜆4), (∵ 𝜃0 > 𝜃1)
where 𝜆4 = 𝑧_{1−𝛼} + √𝑛(𝜃0 − 𝜃1)/𝜎 = √𝑛(𝜃0 − 𝜃1)/𝜎 − 𝑧_𝛼
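For concrete numbers, the sketch below (assuming NumPy and SciPy; 𝜃0 = 0, 𝜃1 = 0.5, 𝜎 = 1, 𝑛 = 25 and 𝛼 = 0.05 are assumed values) computes 𝜆1 and the power 1 − Φ(𝜆3) for case (i):

import numpy as np
from scipy.stats import norm

theta0, theta1, sigma, n, alpha = 0.0, 0.5, 1.0, 25, 0.05  # assumed values
z_a = norm.ppf(1 - alpha)                                  # upper alpha-point, 1.645
lam1 = theta0 + z_a * sigma / np.sqrt(n)                   # cut-off for x-bar
lam3 = z_a - np.sqrt(n) * (theta1 - theta0) / sigma
power = 1 - norm.cdf(lam3)                                 # 1 - Phi(lambda_3) ~= 0.80
print(lam1, power)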
8.4 IN-TEXT QUESTIONS
MCQ’s Problems
Question 1.
Let 𝑋1 , … , 𝑋𝑛 be a random sample of size n(≥ 2) from a uniform distribution with
probability density function 𝑓(𝑥, 𝜃) = 1/𝜃 for 0 < 𝑥 < 𝜃, and 0 otherwise,
where 𝜃 ∈ (0, ∞). If 𝑋(1) = min{𝑋1 , … , 𝑋𝑛 } and 𝑋(𝑛) = max{𝑋1 , … , 𝑋𝑛 }.
then, as 𝑛 → ∞, (1/𝜃)(𝑋(𝑛) + 𝜃/(𝑛 + 1)) converges in probability to
A. 1
B. 0
C. 2
D. 3
Question 2.
Which measure is used to determine the convexity of the distribution curve?
A. skewness
B. kurtosis
C. variance
D. standard deviation
Question 3.
Consider the simple linear regression model 𝑦ᵢ = 𝛼 + 𝛽𝑥ᵢ + 𝜖ᵢ, 𝑖 = 1, 2, …, 𝑛,
where the 𝜖ᵢ are i.i.d. random variables with mean 0 and variance 𝜎² ∈ (0, ∞).
Suppose that we have a data set (𝑥1, 𝑦1), …, (𝑥𝑛, 𝑦𝑛) with 𝑛 = 10,
∑_{𝑖=1}^{𝑛} 𝑥ᵢ = 50, ∑_{𝑖=1}^{𝑛} 𝑦ᵢ = 40, ∑_{𝑖=1}^{𝑛} 𝑥ᵢ² = 500, ∑_{𝑖=1}^{𝑛} 𝑦ᵢ² = 400 and ∑_{𝑖=1}^{𝑛} 𝑥ᵢ𝑦ᵢ = 400.
An unbiased estimate of 𝜎 2 is
A. 5
B. 1/5
C. 10
D. 1/10
Question 4.
If 𝑋1 , 𝑋2 , … , 𝑋𝑛 is a random sample from a population with density
𝑓(𝑥, 𝜃) = (1/𝜃)𝑒^{−𝑥/𝜃} if 0 < 𝑥 < ∞, and 0 otherwise,
where 𝜃 > 0 is an unknown parameter, what is a 100(1 − 𝛼)% confidence interval for 𝜃?
A. [2∑_{𝑖=1}^{𝑛} ln 𝑋ᵢ / 𝜒²_{1−𝛼/2}(2𝑛), 2∑_{𝑖=1}^{𝑛} ln 𝑋ᵢ / 𝜒²_{𝛼/2}(2𝑛)]
B. [2∑_{𝑖=1}^{𝑛} 𝑋ᵢ / 𝜒²_{1−𝛼/2}(2𝑛), 2∑_{𝑖=1}^{𝑛} 𝑋ᵢ / 𝜒²_{𝛼/2}(2𝑛)]
C. [2∑_{𝑖=1}^{𝑛} 𝑋ᵢ / 𝜒²_{𝛼/2}(2𝑛), 2∑_{𝑖=1}^{𝑛} 𝑋ᵢ / 𝜒²_{1−𝛼/2}(2𝑛)]
D. [2∑_{𝑖=1}^{𝑛} ln 𝑋ᵢ / 𝜒²_{𝛼/2}(2𝑛), 2∑_{𝑖=1}^{𝑛} ln 𝑋ᵢ / 𝜒²_{1−𝛼/2}(2𝑛)]
Question 5.
Suppose that 𝑋 has uniform distribution on the interval [0,100]. Let 𝑌 denote the greatest
integer smaller than or equal to X. Which of the following is true?
A. 𝑃(𝑌 ≤ 25) = 1/4
B. 𝑃(𝑌 ≤ 25) = 26/100
C. 𝐸(𝑌) = 50
D. 𝐸(𝑌) = 101/2
Question 6.
Let 𝑥1 = 3, 𝑥2 = 4, 𝑥3 = 3.5, 𝑥4 = 2.5 be the observed values of a random sample from the
probability density function
𝑓(𝑥 ∣ 𝜃) = (1/3)[(1/𝜃)𝑒^{−𝑥/𝜃} + (1/𝜃²)𝑒^{−𝑥/𝜃²} + 𝑒^{−𝑥}], 𝑥 > 0, 𝜃 ∈ {1, 2, 3, 4}.
Then the method of moments estimate (MME) of 𝜃 is
A. 1
B. 2
C. 3
D. 4
Question 7.
Let the random variables 𝑋 and 𝑌 have the joint probability mass function
𝑃(𝑋 = 𝑥, 𝑌 = 𝑦) = 𝑒^{−2} (ˣ𝐶_𝑦)(3/4)^𝑦(1/4)^{𝑥−𝑦} 2^𝑥/𝑥!, 𝑦 = 0, 1, 2, …, 𝑥; 𝑥 = 0, 1, 2, …
Then 𝑉(𝑌) is equal to
A. 1
B. 1/2
C. 2
D. 3/2
Question 8.
Let the discrete random variables 𝑋 and 𝑌 have the joint probability mass function
𝑃(𝑋 = 𝑚, 𝑌 = 𝑛) = 𝑒^{−1}/((𝑛 − 𝑚)! 𝑚! 2^𝑛) for 𝑚 = 0, 1, 2, …, 𝑛; 𝑛 = 0, 1, 2, …; and 0 otherwise.
Which of the following statements is(are) TRUE?
A. The marginal distribution of 𝑋 is Poisson with mean 1/2
B. The random variable 𝑋 and 𝑌 are independent
C. The conditional distribution of 𝑋 given 𝑌 = 5 is Bin(6, 1/2)
D. 𝑃(𝑌 = 𝑛) = (𝑛 + 1)𝑃(𝑌 = 𝑛 + 2) for 𝑛 = 0,1,2, …
Question 9.
Consider the trinomial distribution with the probability mass function
𝑃(𝑋 = 𝑥, 𝑌 = 𝑦) = [2!/(𝑥! 𝑦! (2 − 𝑥 − 𝑦)!)] (1/6)^𝑥 (2/6)^𝑦 (3/6)^{2−𝑥−𝑦}
, 𝑥 ≥ 0, 𝑦 ≥ 0, and 0 < 𝑥 + 𝑦 ≤ 2. Then Corr (𝑋, 𝑌) is equal to…
(correct up to two decimal places)
A) -0.31
B) 0.31
C) 0.35
D) 0.78
Question 10.
Let 𝑥1 = 1.1, 𝑥2 = 0.5, 𝑥3 = 1.4, 𝑥4 = 1.2 be the observed values of a random sample of size
four from a distribution with the probability density function
𝑓(𝑥 ∣ 𝜃) = 𝑒^{𝜃−𝑥} if 𝑥 ≥ 𝜃, 𝜃 ∈ (−∞, ∞); and 0 otherwise.
Then the maximum likelihood estimate of 𝜃 2 + 𝜃 + 1 is equal (up to decimal place).
A) 1.75
B) 1.89
C) 1.74
D) 0.87
Question 11.
Let 𝑈 ∼ 𝐹5,8 and 𝑉 ∼ 𝐹8,5. If 𝑃[𝑈 > 3.69] = 0.05, then the value of C such that
𝑃[𝑉 > 𝑐] = 0.95 equals… (round off two decimal places)
A) 0.27
B) 1.27
C) 2.27
D) 2.29
Question 12.
Let P be a probability function that assigns the same weight to each of the points of the sample
space Ω = {1,2,3,4}. Consider the events E = {1,2}, F = {1,3} and G = {3,4}. Then which of
the following statement(s) is (are) TRUE?
1. E and F are independent
2. E and G are independent
3. E, F and G are independent
Select the correct answer using code given below:
A. 1 only
B. 2 only
C. 1 and 2 only
D. 1,2 and 3
Question 13.
Let 𝑋1, 𝑋2, …, 𝑋4 and 𝑌1, 𝑌2, …, 𝑌5 be two random samples of sizes 4 and 5 respectively,
from a standard normal population. Define the statistic
𝑇 = (5/4)(𝑋1² + 𝑋2² + 𝑋3² + 𝑋4²)/(𝑌1² + 𝑌2² + 𝑌3² + 𝑌4² + 𝑌5²),
then which of the following is TRUE?
A. Expectation of 𝑇 is 0.6
B. Variance of T is 8.97
C. T has F-distribution with degree of freedom 5 and 4
D. T has F-distribution with degree of freedom 4 and 5
Question 14.
Let 𝑋, 𝑌 and 𝑍 be independent random variables with respective moment generating function
𝑀_𝑋(𝑡) = 1/(1 − 𝑡), 𝑡 < 1; 𝑀_𝑌(𝑡) = 𝑒^{𝑡²/2} = 𝑀_𝑍(𝑡), 𝑡 ∈ ℝ. Let 𝑊 = 2𝑋 + 𝑌² + 𝑍²; then 𝑃(𝑊 > 2)
is equal to
A. 2𝑒 −1
B. 2𝑒 −2
C. 𝑒 −1
D. 𝑒 −2
Question 15
Let 𝑥1 = 3, 𝑥2 = 4, 𝑥3 = 3.5, 𝑥4 = 2.5 be the observed values of a random sample from the
probability density function 𝑓(𝑥 ∣ 𝜃) = (1/3)[(1/𝜃)𝑒^{−𝑥/𝜃} + (1/𝜃²)𝑒^{−𝑥/𝜃²} + 𝑒^{−𝑥}], 𝑥 > 0, 𝜃 ∈ (0, ∞)
Then the method of moment estimate (MME) of 𝜃 is
A. 1.5
B. 2.5
C. 3.5
D. 4.5
Question 16.
Let 𝑋 and 𝑌 be random variables with joint probability mass function
𝑃(𝑋 = 𝑛, 𝑌 = 𝑘) = (1/2)^{𝑛+2𝑘+1}; 𝑛 = −𝑘, −𝑘 + 1, …; 𝑘 = 1, 2, …
Then E(Y) equals
A. 1
B. 2
C. 3
D. 4
Question 17.
Let 𝑋 be a random variable with the cumulative distribution function
𝐹(𝑥) = 0 for 𝑥 < 0; (1 + 𝑥²)/10 for 0 ≤ 𝑥 < 1; (3 + 𝑥²)/10 for 1 ≤ 𝑥 < 2; and 1 for 𝑥 ≥ 2.
Which of the following statements is (are) TRUE?
A. 𝑃(1 < 𝑋 < 2) = 3/10
B. 𝑃(1 < 𝑋 ≤ 2) = 3/5
C. 𝑃(1 ≤ 𝑋 < 2) = 1/2
D. 𝑃(1 ≤ 𝑋 ≤ 2) = 4/5
Question 18.
Let the random variables 𝑋1 and 𝑋2 have joint probability density function
𝑓(𝑥1, 𝑥2) = (𝑥1/2)𝑒^{−𝑥1𝑥2} for 1 < 𝑥1 < 3, 𝑥2 > 0; and 0 otherwise.
What is the value Var (𝑋2 ∣ 𝑋1 = 2) …(up to two decimal place)?
A) 0.27
B) 0.28
C) 0.25
D) 1.90
Question 19
Let 𝑥1 = 1, 𝑥2 = 0, 𝑥3 = 0, 𝑥4 = 1, 𝑥5 = 0, 𝑥6 = 1 be the data on a random sample of size 6
from Bin (1, 𝜃) distribution, where 𝜃 ∈ (0,1). Then the uniformly minimum variance unbiased
estimate of 𝜃(1 + 𝜃) is equal to…
8.5 SUMMARY
The main points covered in this lesson are the framing of null and alternative hypotheses,
critical regions, Type I and Type II errors, the power of a test, most powerful and uniformly
most powerful tests, and the Neyman-Pearson Lemma for constructing best critical regions.
8.6 GLOSSARY
• Motivation: These problems are very useful in real life; we can apply them in data
science and economics as well as in social science.
• Attention: Think about how a best (most powerful) test is useful in real-world problems.
8.7 ANSWER TO IN-TEXT QUESTIONS
Answer 1: A
Explanation :
𝑓(𝑥, 𝜃) = 1/𝜃 for 0 < 𝑥 < 𝜃, and 0 otherwise
𝐸(𝑋(𝑛)) = 𝑛𝜃/(𝑛 + 1)
Let 𝑌 = (1/𝜃)(𝑋(𝑛) + 𝜃/(𝑛 + 1)).
so 𝐸(𝑌) = 𝐸(𝑋(𝑛)/𝜃) + 1/(𝑛 + 1) = 𝑛/(𝑛 + 1) + 1/(𝑛 + 1) = 1, and
lim_{𝑛→∞} 𝐸[(1/𝜃)(𝑋(𝑛) + 𝜃/(𝑛 + 1))] = 1;
Hence option A is correct.

Answer 2: B
Explanation:
Convexity (peakedness) is decided by kurtosis.
Answer 3: C
Explanation :
𝑦ᵢ = 𝛼 + 𝛽𝑥ᵢ + 𝜖ᵢ, 𝑖 = 1, 2, …, 𝑛
𝛽̂ = (∑𝑥ᵢ𝑦ᵢ − 𝑛𝑥̄𝑦̄)/(∑𝑥ᵢ² − 𝑛𝑥̄²) = (400 − 10 × 5 × 4)/(500 − 10 × 5²) = 4/5;
𝛼̂ = 𝑦̄ − 𝑥̄𝛽̂ = 4 − 5 × 4/5 = 0
An unbiased estimate of 𝜎² is
𝜎̂² = (1/(𝑛 − 2)) ∑(𝑦ᵢ − 𝛼̂ − 𝛽̂𝑥ᵢ)² = (1/8) ∑(𝑦ᵢ − (4/5)𝑥ᵢ)²
= (1/8)(∑𝑦ᵢ² − 2 × (4/5) ∑𝑥ᵢ𝑦ᵢ + (4/5)² ∑𝑥ᵢ²) = (1/8)(400 − (8/5) × 400 + (16/25) × 500)
= 10
Hence option C is correct.
Answer 4: B
Explanation :
We use the random variable 𝑄 = (2/𝜃) ∑_{𝑖=1}^{𝑛} 𝑋ᵢ ∼ 𝜒²(2𝑛)
as the pivotal quantity. The 100(1 − 𝛼)% confidence interval for 𝜃 can be constructed from
1 − 𝛼 = 𝑃(𝜒²_{𝛼/2}(2𝑛) ≤ 𝑄 ≤ 𝜒²_{1−𝛼/2}(2𝑛)) = 𝑃[2∑𝑋ᵢ/𝜒²_{1−𝛼/2}(2𝑛) ≤ 𝜃 ≤ 2∑𝑋ᵢ/𝜒²_{𝛼/2}(2𝑛)]
Thus, the 100(1 − 𝛼)% confidence interval for 𝜃 is [2∑𝑋ᵢ/𝜒²_{1−𝛼/2}(2𝑛), 2∑𝑋ᵢ/𝜒²_{𝛼/2}(2𝑛)].
Hence option B is correct.
Answer 5: B
Explanation :
Let 𝑌 = [𝑋], where [𝑋] denotes the greatest integer smaller than or equal to 𝑋.
𝑃(𝑌 ≤ 25) = 𝑃([𝑋] ≤ 25) = 𝑃(𝑋 ∈ (0, 26)) = ∫_0^{26} (1/100) 𝑑𝑥 = 26/100
Hence B is the correct option.
Answer 6: C
Explanation :
𝑥̄ = (1/4)(3 + 4 + 3.5 + 2.5) = 3.25
𝐸(𝑋) = (1/3)[𝜃 + 𝜃² + 1]Γ(2) = (1/3)[𝜃 + 𝜃² + 1] = 3.25
⇒ 𝜃² + 𝜃 − 8.75 = 0, so 𝜃 = 2.5 or −3.5.
Since 𝜃 ∈ {1, 2, 3, 4}, we take 𝜃 = 3.
Hence option C is correct.
Answer 7: D
Explanation :
The marginal pmf of 𝑌 is given by
𝑃(𝑌 = 𝑦) = ∑_{𝑥=𝑦}^{∞} 𝑃(𝑋 = 𝑥, 𝑌 = 𝑦) = ∑_{𝑥=𝑦}^{∞} 𝑒^{−2} (ˣ𝐶_𝑦)(3/4)^𝑦(1/4)^{𝑥−𝑦} 2^𝑥/𝑥!
Putting 𝑢 = 𝑥 − 𝑦, so that (ˣ𝐶_𝑦)/𝑥! = 1/(𝑦! 𝑢!), each term equals (3/2)^𝑦(1/2)^𝑢/(𝑦! 𝑢!), and hence
𝑃(𝑌 = 𝑦) = 𝑒^{−2} (3/2)^𝑦/𝑦! ∑_{𝑢=0}^{∞} (1/2)^𝑢/𝑢! = 𝑒^{−2} 𝑒^{1/2} (3/2)^𝑦/𝑦! = 𝑒^{−3/2}(3/2)^𝑦/𝑦!, 𝑦 = 0, 1, …
which is the pmf of a Poisson random variable with parameter 3/2, so 𝐸(𝑌) = 3/2 and
𝑉(𝑌) = 3/2.
Answer 8: A
Explanation :
The marginal probability mass function of 𝑋 is given by
𝑃(𝑋 = 𝑚) = ∑_{𝑛=𝑚}^{∞} 𝑃(𝑋 = 𝑚, 𝑌 = 𝑛) = 𝑒^{−1/2}(1/2)^𝑚/𝑚!, 𝑚 = 0, 1, 2, …
Thus the marginal distribution of 𝑋 is Poisson with mean 1/2.
The marginal probability mass function of 𝑌 is given by
𝑃(𝑌 = 𝑛) = ∑_{𝑚=0}^{𝑛} 𝑃(𝑋 = 𝑚, 𝑌 = 𝑛) = 𝑒^{−1}/𝑛!, 𝑛 = 0, 1, 2, …
Thus the marginal distribution of 𝑌 is Poisson with mean 1.
Since 𝑃(𝑋 = 𝑚, 𝑌 = 𝑛) ≠ 𝑃(𝑋 = 𝑚)𝑃(𝑌 = 𝑛), the random variables 𝑋 and 𝑌 are not independent.
𝑃(𝑋 = 𝑚 ∣ 𝑌 = 5) = 𝑃(𝑋 = 𝑚, 𝑌 = 5)/𝑃(𝑌 = 5) = [5!/(𝑚!(5 − 𝑚)!)](1/2)⁵, 𝑚 = 0, 1, 2, …, 5
Thus the conditional distribution of 𝑋 given 𝑌 = 5 is Bin(5, 1/2), not Bin(6, 1/2).
Finally, 𝑃(𝑌 = 𝑛)/𝑃(𝑌 = 𝑛 + 1) = (𝑛 + 1) for 𝑛 = 0, 1, 2, …, so statement D as given (with 𝑛 + 2) is false.
Answer 9: 𝑨
Explanation :
The trinomial distribution of two r.v.'s 𝑋 and 𝑌 is given by
𝑓_{𝑋,𝑌}(𝑥, 𝑦) = [𝑛!/(𝑥! 𝑦! (𝑛 − 𝑥 − 𝑦)!)] 𝑝^𝑥 𝑞^𝑦 (1 − 𝑝 − 𝑞)^{𝑛−𝑥−𝑦}
for 𝑥, 𝑦 = 0, 1, 2, …, 𝑛 and 𝑥 + 𝑦 ≤ 𝑛, where 𝑝 + 𝑞 ≤ 1.
Here 𝑛 = 2, 𝑝 = 1/6 and 𝑞 = 2/6.
Var(𝑋) = 𝑛𝑝(1 − 𝑝) = 2 × (1/6) × (5/6) = 10/36; Var(𝑌) = 𝑛𝑞(1 − 𝑞)
= 2 × (2/6) × (4/6) = 16/36
Cov(𝑋, 𝑌) = −𝑛𝑝𝑞 = −2 × (1/6) × (2/6) = −4/36
Corr(𝑋, 𝑌) = Cov(𝑋, 𝑌)/(√Var(𝑋) √Var(𝑌)) = −4/(4√10) ≈ −0.31
Hence −0.31 is the correct answer.
Answer 10: 𝐀
Explanation :
Let 𝑥1 = 1.1, 𝑥2 = 0.5, 𝑥3 = 1.4, 𝑥4 = 1.2 and
𝑓(𝑥 ∣ 𝜃) = 𝑒^{𝜃−𝑥} if 𝑥 ≥ 𝜃, 𝜃 ∈ (−∞, ∞); and 0 otherwise.
The likelihood is positive only for 𝜃 ∈ (−∞, 𝑋(1)], and on that set it is strictly increasing in 𝜃.
So 𝜃̂ = 𝑋(1) = 0.5, and therefore, by the invariance property, the MLE of
𝜃² + 𝜃 + 1 = (0.5)² + 0.5 + 1 = 1.75.
Hence MLE for 𝜃 2 + 𝜃 + 1 is 1.75.
Answer 11: 𝐀
Explanation :
If 𝑋 ∼ 𝐹(𝑚, 𝑛), then 1/𝑋 ∼ 𝐹(𝑛, 𝑚).
𝑃[𝑈 > 3.69] = 0.05 ⇒ 𝑃[𝑈 < 3.69] = 0.95 ⇒ 𝑃[1/𝑈 > 1/3.69] = 0.95
Since 𝑉 = 1/𝑈 ∼ 𝐹(8, 5), we get 𝑐 = 1/3.69 = 0.27.
Hence c = 0.27 is the correct answer.
Answer 12: A
Explanation :
Clearly, 𝑃({𝜔}) = 1/4 for every 𝜔 ∈ Ω = {1, 2, 3, 4}. We have E = {1, 2}, F = {1, 3} and G = {3, 4}.
Then P(E) = P(F) = P(G) = 2/4 = 1/2.
Since 𝑃(E ∩ F) = 𝑃({1}) = 1/4 = 𝑃(E)𝑃(F), the events E and F are independent. However,
E ∩ G = ∅, so 𝑃(E ∩ G) = 0 ≠ 𝑃(E)𝑃(G) = 1/4; hence E and G are not independent, and
statement 3 fails as well.
Hence option A is correct.
Answer 13: D
Explanation :
𝑇 = (5/4)(𝑋1² + 𝑋2² + 𝑋3² + 𝑋4²)/(𝑌1² + 𝑌2² + 𝑌3² + 𝑌4² + 𝑌5²) ∼ 𝐹(4, 5);
𝐸(𝑇) = 𝑛/(𝑛 − 2) = 5/3
Var(𝑇) = 2(5)²(7)/[4(3)²(1)] = 350/36 = 9.72
Hence option D is correct.
Answer 14: A
Explanation :
Since 𝑊 = 2𝑋 + 𝑌² + 𝑍² ∼ 𝜒²(4),
𝑓_𝑊(𝑤) = (1/4)𝑤𝑒^{−𝑤/2} for 𝑤 > 0, and 0 otherwise
𝑃(𝑊 > 2) = ∫_2^∞ (1/4)𝑤𝑒^{−𝑤/2} 𝑑𝑤 = 2𝑒^{−1}
Hence option A is correct.
Answer 15: B
Explanation :
𝑥̄ = (1/4)(3 + 4 + 3.5 + 2.5) = 3.25
𝐸(𝑋) = (1/3)[𝜃 + 𝜃² + 1]Γ(2) = (1/3)[𝜃 + 𝜃² + 1] = 3.25
⇒ 𝜃² + 𝜃 − 8.75 = 0, so 𝜃 = 2.5 or −3.5.
Since 𝜃 ∈ (0, ∞), 𝜃 = 2.5.
Hence option B is correct.
Answer 16: B
Explanation :
𝑃(𝑌 = 𝑘) = ∑_{𝑛=−𝑘}^{∞} 𝑃(𝑋 = 𝑛, 𝑌 = 𝑘) {put 𝑚 = 𝑛 + 𝑘}
= ∑_{𝑚=0}^{∞} (1/2)^{𝑚+𝑘+1} = (1/2)(1/2)^{𝑘−1}, 𝑘 = 1, 2, …,
which is the pmf of the geometric distribution with parameter 1/2.
𝐸(𝑌) = ∑_{𝑘=1}^{∞} 𝑘 (1/2)(1/2)^{𝑘−1} = 2
Hence option B is correct.
Answer 17: A
Explanation :
𝑃(1 < 𝑋 < 2) = 𝐹(2) − 𝐹(1) − 𝑃(𝑋 = 2) = 3/10
𝑃(1 < 𝑋 ≤ 2) = 𝐹(2) − 𝐹(1) = 3/5
𝑃(1 ≤ 𝑋 < 2) = 𝐹(2) − 𝐹(1) − 𝑃(𝑋 = 2) + 𝑃(𝑋 = 1) = 1/2
𝑃(1 ≤ 𝑋 ≤ 2) = 𝐹(2) − 𝐹(1) + 𝑃(𝑋 = 1) = 4/5
Answer 18: C
Explanation :
The marginal pdf of 𝑋1 is 𝑔(𝑥1) = ∫_0^∞ (𝑥1 𝑒^{−𝑥1𝑥2}/2) 𝑑𝑥2 = 1/2, 1 < 𝑥1 < 3
ℎ(𝑥2 ∣ 𝑥1) = 𝑓(𝑥1, 𝑥2)/𝑔(𝑥1) = 𝑥1 𝑒^{−𝑥1𝑥2}, 𝑥2 > 0
𝑋2 ∣ 𝑋1 ∼ Exp(𝑥1) with mean 1/𝑥1
Therefore Var(𝑋2 ∣ 𝑋1 = 2) = 1/4 = 0.25
Hence 0.25 is correct answer.
Answer 19: 0.7
Explanation :
𝑇/𝑛 + 𝑇(𝑇 − 1)/(𝑛(𝑛 − 1)) is the UMVUE of 𝜃(1 + 𝜃), since
𝐸(𝑇/𝑛 + 𝑇(𝑇 − 1)/(𝑛(𝑛 − 1))) = 𝜃(1 + 𝜃), where 𝑇 = ∑_{𝑖=1}^{𝑛} 𝑋ᵢ
Therefore, 𝑇/𝑛 + 𝑇(𝑇 − 1)/(𝑛(𝑛 − 1)) = 3/6 + 3(3 − 1)/(6(6 − 1)) = 21/30 = 0.70

8.8 REFERENCES
• Devore, J. (2012). Probability and statistics for engineers, 8th ed. Cengage Learning.
• John A. Rice (2007). Mathematical Statistics and Data Analysis, 3rd ed. Thomson
Brooks/Cole
• Larsen, R., Marx, M. (2011). An introduction to mathematical statistics and its
applications. Prentice Hall.
• Miller, I., Miller, M. (2017). J. Freund’s mathematical statistics with applications, 8th
ed. Pearson.
• Kantarelis, D., Asadoorian, M. O. (2009). Essentials of Inferential Statistics, 5th ed.
University Press of America.
• Hogg, R., Tanis, E., Zimmerman, D. (2021). Probability and Statistical Inference, 10th
ed. Pearson.
8.9 SUGGESTED READINGS
• Gupta, S. C., Kapoor, V. K. Fundamentals of Mathematical Statistics, 11th ed. Sultan Chand.
• Agarwal, B. L. Programmed Statistics, 2nd ed. New Age International Publishers.
LESSON 9
ERROR IN HYPOTHESIS TESTING AND
POWER OF TEST

STRUCTURE
9.1 Learning Objectives
9.2 Introduction
9.3 Error in Hypothesis Testing and Power of Test
9.3.1 Type I and Type II Error
9.3.2 Unbiased Test and Unbiased Critical Region
9.3.3 UMP (Uniformly Most Powerful) Critical Region
9.3.4 Likelihood Ratio Test
9.4 In-Text Questions
9.5 Summary
9.6 Glossary
9.7 Answer to In-Text Questions
9.8 References
9.9 Suggested Readings
9.1 LEARNING OBJECTIVES
The main objective of this lesson is to discuss the testing of hypotheses and how it can be
used in real analysis.
9.2 INTRODUCTION
The main problems in statistical inference can be broadly classified into two areas:
(i) The area of estimation of population parameter(s) and setting up of confidence intervals
for them, i.e., the area of point and interval estimation, and
(ii) Tests of statistical hypothesis.
In Neyman-Pearson theory, we use statistical methods to arrive at decisions in certain
situations where there is lack of certainty on the basis of a sample whose size is fixed in
advance while in Wald's sequential theory the sample size is not fixed but is regarded as a
random variable.
9.3 ERROR IN HYPOTHESIS TESTING AND POWER
We will discuss the following topics in detail:
(i) Type I and Type II Errors
(ii) Unbiased Test and Unbiased Critical Region
(iii) UMP (Uniformly Most Powerful) Critical Region
(iv) Likelihood Ratio Test
9.3.1 Type I and Type II Error :
• Type I error, also known as a "false positive": the error of rejecting a null hypothesis when
it is true. In other words, this is the error of accepting an alternative hypothesis (the real
hypothesis of interest) when the results can be attributed to chance. Plainly speaking, it
occurs when we are observing a difference when in truth there is none (or more specifically
- no statistically significant difference). So the probability of making a type I error in a test
with rejection region R is 𝑃(𝑅 ∣ 𝐻0 is true ).
• Type II error, also known as a "false negative": the error of not rejecting a null hypothesis
when the alternative hypothesis is the true state of nature. In other words, this is the error
of failing to accept an alternative hypothesis when you don't have adequate power. Plainly
speaking, it occurs when we are failing to observe a difference when in truth there is one.
So the probability of making a type II error in a test with rejection region R is 1 − 𝑃(𝑅 ∣ 𝐻𝑎
is true). The power of the test can be 𝑃(𝑅 ∣ 𝐻𝑎 is true ).
Hypothesis testing is the art of testing if variation between two sample distributions can just
be explained through random chance or not. If we have to conclude that two distributions vary
in a meaningful way, we must take enough precaution to see that the differences are not just
through random chance. At the heart of Type I error is that we don't want to make an
unwarranted hypothesis so we exercise a lot of care by minimizing the chance of its occurrence.
Traditionally we try to set Type I error as .05 or .01 - as in there is only a 5 or 1 in 100 chance
that the variation that we are seeing is due to chance. This is called the 'level of significance'.
Again, there is no guarantee that 5 in 100 is rare enough so significance levels need to be
chosen carefully. For example, a factory where a six sigma quality control system has been
implemented requires that errors never add up to more than the probability of being six
standard deviations away from the mean (an incredibly rare event). Type I error is generally
reported as the p-value.
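The operational meaning of the level of significance can be checked by simulation. The sketch below (Python, assuming NumPy; the normal model and the sample size are assumed for illustration) repeats a 5%-level two-sided z-test on data generated under 𝐻0 and shows that the rejection rate, i.e., the observed Type I error rate, settles near 0.05.

import numpy as np

rng = np.random.default_rng(0)
n, reps, mu0, sigma = 30, 100_000, 0.0, 1.0
x = rng.normal(mu0, sigma, size=(reps, n))         # data generated with H0 true
z = (x.mean(axis=1) - mu0) / (sigma / np.sqrt(n))  # z-statistic for each sample
print((np.abs(z) > 1.96).mean())                   # observed Type I error rate ~= 0.05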
9.3.2 Unbiased Test and Unbiased Critical Region
Let us consider the testing of 𝐻0 : 𝜃 = 𝜃0 against 𝐻1 : 𝜃 = 𝜃1 : The critical region 𝑊 and
consequently the test based on it is said to be unbiased if the power of the test exceeds the size
of the critical region, i.e., if
Power of the test ≥ Size of the C.R.
⇒ 1 − 𝛽 ≥ 𝛼
⇒ 𝑃_{𝜃1}(𝑊) ≥ 𝑃_{𝜃0}(𝑊)
⇒ 𝑃[𝑥 : 𝑥 ∈ 𝑊 ∣ 𝐻1] ≥ 𝑃[𝑥 : 𝑥 ∈ 𝑊 ∣ 𝐻0]
In other words, the critical region 𝑊 is said to be unbiased if
𝑃𝜃 (𝑊) ≥ 𝑃𝜃0 (𝑊), ∀𝜃(≠ 𝜃0 ) ∈ Θ
Theorem. Every most powerful (MP) or uniformly most powerful (UMP) critical region
(CR) is necessarily unbiased.
(a) If 𝑊 be an MPCR of size 𝛼 for testing 𝐻0 : 𝜃 = 𝜃0 against 𝐻1 : 𝜃 = 𝜃1 , then it is
necessarily unbiased.
(b) Similarly if 𝑊 be UMPCR of size 𝛼 for testing 𝐻0 : 𝜃 = 𝜃0 against 𝐻1 : 𝜃 ∈ Θ1 , then it is
also unbiased.
Proof. Since 𝑊 is the MPCR of size 𝛼 for testing 𝐻0 : 𝜃 = 𝜃0 against 𝐻1 : 𝜃 = 𝜃1, by the
Neyman-Pearson Lemma we have, for some 𝑘 > 0,
𝑊 = {𝑥 : 𝐿(𝑥, 𝜃1) ≥ 𝑘𝐿(𝑥, 𝜃0)} = {𝑥 : 𝐿1 ≥ 𝑘𝐿0}
and 𝑊′ = {𝑥 : 𝐿(𝑥, 𝜃1) < 𝑘𝐿(𝑥, 𝜃0)} = {𝑥 : 𝐿1 < 𝑘𝐿0},
where 𝑘 is determined so that the size of 𝑊 is 𝛼:
𝑃_{𝜃0}(𝑊) = 𝑃[𝑥 ∈ 𝑊 ∣ 𝐻0] = ∫_𝑊 𝐿0 𝑑𝑥 = 𝛼 … (i)
To prove that 𝑊 is unbiased, we have to show that
Power of 𝑊 ≥ 𝛼, i.e., 𝑃_{𝜃1}(𝑊) ≥ 𝛼.
We have 𝑃_{𝜃1}(𝑊) = ∫_𝑊 𝐿1 𝑑𝑥 ≥ 𝑘 ∫_𝑊 𝐿0 𝑑𝑥 = 𝑘𝛼 [∵ on 𝑊, 𝐿1 ≥ 𝑘𝐿0, and using (i)]
i.e., 𝑃_{𝜃1}(𝑊) ≥ 𝑘𝛼, ∀𝑘 > 0 … (ii)
Also
1 − 𝑃_{𝜃1}(𝑊) = 1 − 𝑃(𝑥 ∈ 𝑊 ∣ 𝐻1) = 𝑃(𝑥 ∈ 𝑊′ ∣ 𝐻1) = ∫_{𝑊′} 𝐿1 𝑑𝑥
< 𝑘 ∫_{𝑊′} 𝐿0 𝑑𝑥 = 𝑘𝑃(𝑥 ∈ 𝑊′ ∣ 𝐻0) [∵ on 𝑊′, 𝐿1 < 𝑘𝐿0]
= 𝑘[1 − 𝑃(𝑥 ∈ 𝑊 ∣ 𝐻0)] = 𝑘(1 − 𝛼) [using (i)]
i.e., 1 − 𝑃_{𝜃1}(𝑊) ≤ 𝑘(1 − 𝛼) … (iii)
Case (i) 𝑘 ≥ 1. From (ii), 𝑃_{𝜃1}(𝑊) ≥ 𝑘𝛼 ≥ 𝛼.
Case (ii) 𝑘 < 1. From (iii), 1 − 𝑃_{𝜃1}(𝑊) ≤ 𝑘(1 − 𝛼) < 1 − 𝛼, i.e., 𝑃_{𝜃1}(𝑊) > 𝛼.
In either case 𝑃_{𝜃1}(𝑊) ≥ 𝛼
⇒ 𝑊 is unbiased CR.
Remark. If 𝑇 = 𝑡(𝑥) is a sufficient statistic for 𝜃, then by the factorization theorem
𝐿(𝑥, 𝜃) = 𝑔_𝜃(𝑡(𝑥)) ⋅ ℎ(𝑥), and the MPCR takes the form
𝑊 = {𝑥 : 𝑔_{𝜃1}(𝑡(𝑥)) ⋅ ℎ(𝑥) ≥ 𝑘 ⋅ 𝑔_{𝜃0}(𝑡(𝑥)) ⋅ ℎ(𝑥)} = {𝑥 : 𝑔_{𝜃1}(𝑡(𝑥)) ≥ 𝑘 ⋅ 𝑔_{𝜃0}(𝑡(𝑥))}, ∀𝑘 > 0.
Hence if 𝑇 = 𝑡(𝑥) is sufficient for 𝜃, the MPCR for the test may be defined in terms of the
marginal distribution of 𝑇 = 𝑡(𝑥), rather than the joint distribution of 𝑥1, 𝑥2, …, 𝑥𝑛.
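The defining inequality 𝑃_𝜃(𝑊) ≥ 𝑃_{𝜃0}(𝑊) of an unbiased test is easy to see numerically for the one-sided normal-mean test (a sketch assuming NumPy and SciPy; 𝜃0 = 0, 𝜎 = 1, 𝑛 = 20 and 𝛼 = 0.05 are assumed values): the power exceeds the size at every alternative 𝜃1 > 𝜃0.

import numpy as np
from scipy.stats import norm

alpha, n, sigma, theta0 = 0.05, 20, 1.0, 0.0
cut = theta0 + norm.ppf(1 - alpha) * sigma / np.sqrt(n)  # size-alpha boundary for x-bar
for theta1 in (0.1, 0.3, 0.6):
    power = norm.sf(cut, loc=theta1, scale=sigma / np.sqrt(n))
    print(theta1, power)  # each value exceeds alpha = 0.05, as unbiasedness requires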
9.3.3 UMP (Uniformly Most Powerful) Critical Region
The Neyman-Pearson construction provides the best critical region for testing 𝐻0 : 𝜃 = 𝜃0 against
𝐻1 : 𝜃 = 𝜃1 provided 𝜃1 > 𝜃0, while it defines a different best critical region for testing
𝐻0 : 𝜃 = 𝜃0 against 𝐻1 : 𝜃 = 𝜃1 provided 𝜃1 < 𝜃0. Thus, the best critical region for testing the
simple hypothesis 𝐻0 : 𝜃 = 𝜃0 against the simple alternative 𝜃 = 𝜃0 + 𝑐, 𝑐 > 0, will not serve as
the best critical region for testing 𝐻0 : 𝜃 = 𝜃0 against the simple alternative 𝐻1 : 𝜃 = 𝜃0 − 𝑐, 𝑐 > 0.
Hence in this problem no uniformly most powerful test exists for testing the simple hypothesis
𝐻0 : 𝜃 = 𝜃0 against the composite alternative hypothesis 𝐻1 : 𝜃 ≠ 𝜃0.
However, for each one-sided alternative, 𝐻1 : 𝜃 = 𝜃1 > 𝜃0 or 𝐻1 : 𝜃 = 𝜃1 < 𝜃0, a UMP test
exists.
Remark. In particular, if we take 𝑛 = 2, then the B.C.R. for testing 𝐻0 : 𝜃 = 𝜃0 against
𝐻1 : 𝜃 = 𝜃1 (> 𝜃0) is given by:
𝑊 = {𝑥 : (𝑥1 + 𝑥2)/2 ≥ 𝜃0 + 𝜎𝑧_𝛼/√2} = {𝑥 : 𝑥1 + 𝑥2 ≥ 2𝜃0 + √2𝜎𝑧_𝛼} = {𝑥 : 𝑥1 + 𝑥2 ≥ 𝐶}, (say)
where 𝐶 = 2𝜃0 + √2𝜎𝑧_𝛼 = 2𝜃0 + √2𝜎 × 1.645 if 𝛼 = 0.05.
Similarly, the B.C.R. for testing 𝐻0 : 𝜃 = 𝜃0 against 𝐻1 : 𝜃 = 𝜃1 (< 𝜃0) with 𝑛 = 2 and
𝛼 = 0.05 is given by
𝑊1 = {𝑥 : (𝑥1 + 𝑥2)/2 ≤ 𝜃0 − 𝜎𝑧_𝛼/√2} = {𝑥 : 𝑥1 + 𝑥2 ≤ 2𝜃0 − √2𝜎 × 1.645} = {𝑥 : 𝑥1 + 𝑥2 ≤ 𝐶1}, (say)
where 𝐶1 = 2𝜃0 − √2𝜎𝑧_𝛼 = 2𝜃0 − √2𝜎 × 1.645 if 𝛼 = 0.05.
The B.C.R. for testing 𝐻0 : 𝜃 = 𝜃0 against the two-tailed alternative 𝐻1 : 𝜃 = 𝜃1 (≠ 𝜃0) is
given by 𝑊2 = {𝑥 : (𝑥1 + 𝑥2 ≥ 𝐶) ∪ (𝑥1 + 𝑥2 ≤ 𝐶1)}.
In the (𝑥1, 𝑥2)-plane these three regions are, respectively, the half-plane on or above the line
𝑥1 + 𝑥2 = 𝐶, the half-plane on or below the line 𝑥1 + 𝑥2 = 𝐶1, and their union. (Figures omitted.)
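The boundaries 𝐶 and 𝐶1 are immediate to compute (a sketch assuming NumPy and SciPy; 𝜃0 = 10, 𝜎 = 2 and 𝛼 = 0.05 are assumed values):

import numpy as np
from scipy.stats import norm

theta0, sigma, alpha = 10.0, 2.0, 0.05
z_a = norm.ppf(1 - alpha)                   # 1.645 for alpha = 0.05
C = 2 * theta0 + np.sqrt(2) * sigma * z_a   # reject for x1 + x2 >= C  (theta1 > theta0)
C1 = 2 * theta0 - np.sqrt(2) * sigma * z_a  # reject for x1 + x2 <= C1 (theta1 < theta0)
print(C, C1)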
Example 1. Show that for the normal distribution with zero mean and variance 𝜎², the best
critical region for 𝐻0 : 𝜎 = 𝜎0 against the alternative 𝐻1 : 𝜎 = 𝜎1 is of the form
∑_{𝑖=1}^{𝑛} 𝑥ᵢ² ≤ 𝑎_𝛼 for 𝜎0 > 𝜎1, and ∑_{𝑖=1}^{𝑛} 𝑥ᵢ² ≥ 𝑏_𝛼 for 𝜎0 < 𝜎1.
Show that the power of the best critical region when 𝜎0 > 𝜎1 is 𝐹((𝜎0²/𝜎1²) 𝜒²_{𝛼,𝑛}), where
𝜒²_{𝛼,𝑛} is the lower 100𝛼-per cent point and 𝐹 is the distribution function of the 𝜒²-distribution
with 𝑛 degrees of freedom.
and since it is independent of 𝜃1, 𝑊1 is also the UMP C.R. for 𝐻0 : 𝜃 = 𝜃0 against
𝐻1 : 𝜃 = 𝜃1 (< 𝜃0).
However, since the two critical regions 𝑊0 and 𝑊1 are different, there exists no critical region
of size 𝛼 which is UMP for 𝐻0 : 𝜃 = 𝜃0 against the two-tailed alternative 𝐻1 : 𝜃 ≠ 𝜃0.
Power of the test. The power of the test for testing 𝐻0 : 𝜃 = 𝜃0 against 𝐻1 : 𝜃 = 𝜃1 (> 𝜃0) is
given by
1 − 𝛽 = 𝑃[𝑥 ∈ 𝑊0 ∣ 𝐻1] = 𝑃(∑_{𝑖=1}^{𝑛} 𝑥ᵢ ≤ (1/2𝜃0) 𝜒²_{1−𝛼,2𝑛} ∣ 𝐻1)
= 𝑃(2𝜃1 ∑_{𝑖=1}^{𝑛} 𝑥ᵢ ≤ (𝜃1/𝜃0) 𝜒²_{1−𝛼,2𝑛} ∣ 𝐻1)
= 𝑃{𝜒²(2𝑛) ≤ (𝜃1/𝜃0) 𝜒²_{1−𝛼,2𝑛}},
since under 𝐻1, 2𝜃1 ∑_{𝑖=1}^{𝑛} 𝑥ᵢ ∼ 𝜒²(2𝑛).
Similarly, the power of the test for testing 𝐻0 : 𝜃 = 𝜃0 against 𝐻1 : 𝜃 = 𝜃1 (< 𝜃0) is given by:
1 − 𝛽 = 𝑃(𝑥 ∈ 𝑊1 ∣ 𝐻1) = 𝑃(∑_{𝑖=1}^{𝑛} 𝑥ᵢ ≥ (1/2𝜃0) 𝜒²_{𝛼,2𝑛} ∣ 𝐻1)
= 𝑃(2𝜃1 ∑_{𝑖=1}^{𝑛} 𝑥ᵢ ≥ (𝜃1/𝜃0) 𝜒²_{𝛼,2𝑛} ∣ 𝐻1)
= 𝑃{𝜒²(2𝑛) ≥ (𝜃1/𝜃0) 𝜒²_{𝛼,2𝑛}}
Remark. The graphic representation of the B.C.R. for 𝐻0 : 𝜃 = 𝜃0 against the different alternatives
𝐻1 : 𝜃 = 𝜃1 (> 𝜃0), 𝐻1 : 𝜃 = 𝜃1 (< 𝜃0) and 𝐻1 : 𝜃 = 𝜃1 (≠ 𝜃0) for 𝑛 = 2 can be done similarly
as in the earlier example for the mean of the normal distribution.
Example 2. For the distribution 𝑑𝐹 = 𝛽 exp{−𝛽(𝑥 − 𝛾)} 𝑑𝑥 for 𝑥 ≥ 𝛾, and 0 for 𝑥 < 𝛾,
show that for a hypothesis 𝐻0 that 𝛽 = 𝛽0, 𝛾 = 𝛾0 and an alternative 𝐻1 that 𝛽 = 𝛽1, 𝛾 = 𝛾1,
the best critical region is given by
𝑥̄ ≤ (1/(𝛽1 − 𝛽0)){𝛾1𝛽1 − 𝛾0𝛽0 − (1/𝑛) log 𝑘 + log(𝛽1/𝛽0)}
provided that the admissible hypothesis is restricted by the condition 𝛾1 ≤ 𝛾0, 𝛽1 ≥ 𝛽0.
Solution. 𝑓(𝑥; 𝛽, 𝛾) = 𝛽 exp{−𝛽(𝑥 − 𝛾)} for 𝑥 ≥ 𝛾, and 0 otherwise.
∴ ∏_{𝑖=1}^{𝑛} 𝑓(𝑥ᵢ; 𝛽, 𝛾) = 𝛽ⁿ exp{−𝛽 ∑_{𝑖=1}^{𝑛} (𝑥ᵢ − 𝛾)} for 𝑥1, 𝑥2, …, 𝑥𝑛 ≥ 𝛾, and 0 otherwise.
Using the Neyman-Pearson Lemma, the B.C.R. for 𝑘 > 0 is given by
𝛽1ⁿ exp{−𝛽1 ∑_{𝑖=1}^{𝑛} (𝑥ᵢ − 𝛾1)} / [𝛽0ⁿ exp{−𝛽0 ∑_{𝑖=1}^{𝑛} (𝑥ᵢ − 𝛾0)}] ≥ 𝑘
⇒ (𝛽1/𝛽0)ⁿ exp{−𝛽1 ∑ (𝑥ᵢ − 𝛾1) + 𝛽0 ∑ (𝑥ᵢ − 𝛾0)} ≥ 𝑘
⇒ (𝛽1/𝛽0)ⁿ exp[−𝛽1𝑛(𝑥̄ − 𝛾1) + 𝛽0𝑛(𝑥̄ − 𝛾0)] ≥ 𝑘
⇒ 𝑛 log(𝛽1/𝛽0) − 𝑛𝑥̄(𝛽1 − 𝛽0) + 𝑛𝛽1𝛾1 − 𝑛𝛽0𝛾0 ≥ log 𝑘
(since log 𝑥 is an increasing function of 𝑥)
⇒ 𝑥̄(𝛽1 − 𝛽0) ≤ 𝛾1𝛽1 − 𝛾0𝛽0 − (1/𝑛) log 𝑘 + log(𝛽1/𝛽0)
∴ 𝑥̄ ≤ (1/(𝛽1 − 𝛽0)){𝛾1𝛽1 − 𝛾0𝛽0 − (1/𝑛) log 𝑘 + log(𝛽1/𝛽0)}, provided 𝛽1 > 𝛽0.

Example 3. Examine whether a best critical region exists for testing the null hypothesis
𝐻0 : 𝜃 = 𝜃0 against the alternative hypothesis 𝐻1 : 𝜃 > 𝜃0 for the parameter 𝜃 of the
distribution:
𝑓(𝑥, 𝜃) = (1 + 𝜃)/(𝑥 + 𝜃)², 1 ≤ 𝑥 < ∞
Solution. ∏_{𝑖=1}^{𝑛} 𝑓(𝑥ᵢ, 𝜃) = (1 + 𝜃)ⁿ ∏_{𝑖=1}^{𝑛} 1/(𝑥ᵢ + 𝜃)²
By the Neyman-Pearson Lemma, the B.C.R. for 𝑘 > 0 is given by
(1 + 𝜃1)ⁿ ∏_{𝑖=1}^{𝑛} 1/(𝑥ᵢ + 𝜃1)² ≥ 𝑘 (1 + 𝜃0)ⁿ ∏_{𝑖=1}^{𝑛} 1/(𝑥ᵢ + 𝜃0)²
⇒ 𝑛 log(1 + 𝜃1) − 2 ∑_{𝑖=1}^{𝑛} log(𝑥ᵢ + 𝜃1) ≥ log 𝑘 + 𝑛 log(1 + 𝜃0) − 2 ∑_{𝑖=1}^{𝑛} log(𝑥ᵢ + 𝜃0)
⇒ 2 ∑_{𝑖=1}^{𝑛} log((𝑥ᵢ + 𝜃0)/(𝑥ᵢ + 𝜃1)) ≥ log 𝑘 + 𝑛 log((1 + 𝜃0)/(1 + 𝜃1))
Thus the test criterion is ∑_{𝑖=1}^{𝑛} log((𝑥ᵢ + 𝜃0)/(𝑥ᵢ + 𝜃1)), which cannot be put in the form of a
function of the sample observations alone, free of the alternative value 𝜃1. Hence no common
B.C.R. exists in this case.
9.3.4 Likelihood Ratio Test:
Neyman-Pearson Lemma based on the magnitude of the ratio of two probability density
functions provides best test for testing simple hypothesis against simple alternative hypothesis.
The best test in any given situation depends on the nature of the population distribution and
the form of the alternative hypothesis being considered. In this section we shall discuss a
general method of test construction called the Likelihood Ratio (L.R.) Test introduced by
Neyman and Pearson for testing a hypothesis, simple or composite, against a simple or
composite alternative hypothesis. This test is related to the maximum likelihood estimates.
Before defining the test, we give below some notations and terminology.

Parameter Space. Let us consider a random variable 𝑋 with p.d.f. 𝑓(𝑥, 𝜃). In most common
applications, though not always, the functional form of the population distribution is assumed
to be known except for the value of some unknown parameter(s) 𝜃 which may take any value
on a set Θ. This is expressed by writing the p.d.f. in the form 𝑓(𝑥, 𝜃), 𝜃 ∈ Θ. The set Θ, which
is the set of all possible values of 𝜃, is called the parameter space. Such a situation gives rise
not to one probability distribution but to a family of probability distributions, which we write
as {𝑓(𝑥, 𝜃) : 𝜃 ∈ Θ}. For example, if 𝑋 ∼ 𝑁(𝜇, 𝜎²), then the parameter space is:
Θ = {(𝜇, 𝜎 2 ): −∞ < 𝜇 < ∞, 0 < 𝜎 < ∞}
In particular, for 𝜎 2 = 1, the family of probability distributions is given by
{𝑁(𝜇, 1); 𝜇 ∈ Θ}, where Θ = {𝜇: −∞ < 𝜇 < ∞}
In the following discussion we shall consider a general family of distributions:
{𝑓(𝑥: 𝜃1 , 𝜃2 , … , 𝜃𝑘 ): 𝜃𝑖 ∈ Θ, 𝑖 = 1,2, … , 𝑘}
The null hypothesis 𝐻0 will state that the parameters belong to some subspace Θ0 of the
parameter space Θ.
Let 𝑥1 , 𝑥2 , … , 𝑥𝑛 be a random sample of size 𝑛 > 1 from a population with p.d.f. 𝑓(𝑥,
𝜃1 , 𝜃2 , … , 𝜃𝑘 ), where Θ, the parameter space is the totality of all points that (𝜃1 , 𝜃2 , …, 𝜃𝑘 ) can
assume. We want to test the null hypothesis :
𝐻0 : (𝜃1 , 𝜃2 , … , 𝜃𝑘 ) ∈ Θ0
against all alternative hypotheses of the type :
𝐻1 : (𝜃1 , 𝜃2 , … , 𝜃𝑘 ) ∈ Θ − Θ0
The likelihood function of the sample observations is given by
𝑛

𝐿 = ∏ 𝑓(𝑥𝑖 ; 𝜃1 , 𝜃2 , … , 𝜃𝑘 )
𝑖=1

According to the principle of maximum likelihood, the likelihood equation for estimating any
parameter 𝜃𝑖 is given by

247 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

∂𝐿/∂𝜃ᵢ = 0, (𝑖 = 1, 2, …, 𝑘)
Using the likelihood equations, we can obtain the maximum likelihood estimates of the parameters
(𝜃1, 𝜃2, …, 𝜃𝑘) as they are allowed to vary over the parameter space Θ and the subspace Θ0.
Substituting these estimates in the likelihood function, we obtain the maximum values of the
likelihood function for variation of the parameters in Θ and Θ0 respectively. Then the criterion
for the likelihood ratio test is defined as the quotient of these two maxima and is given by
𝜆 = 𝜆(𝑥1, 𝑥2, …, 𝑥𝑛) = 𝐿(Θ̂0)/𝐿(Θ̂) = Sup_{𝜃∈Θ0} 𝐿(𝑥, 𝜃) / Sup_{𝜃∈Θ} 𝐿(𝑥, 𝜃),
where 𝐿(Θ̂0) and 𝐿(Θ̂) are the maxima of the likelihood function with respect to the parameters
in the regions Θ0 and Θ respectively.
The quantity 𝜆 is a function of the sample observations only and does not involve the
parameters. Thus 𝜆, being a function of the random variables, is itself a random variable.
Obviously 𝜆 ≥ 0; further, Θ0 ⊂ Θ ⇒ 𝐿(Θ̂0) ≤ 𝐿(Θ̂) ⇒ 𝜆 ≤ 1.
Hence we get 0 ≤ 𝜆 ≤ 1.
The critical region for testing 𝐻0 (against 𝐻1 ) is an interval
0 < 𝜆 < 𝜆0
where 𝜆0 is some number (< 1) determined by the distribution of 𝜆 and the desired probability
of Type I error, i.e., 𝜆0 is given by the equation:
𝑃(𝜆 < 𝜆0 ∣ 𝐻0 ) = 𝛼
For example, if 𝑔(·) is the p.d.f. of 𝜆, then 𝜆0 is determined from the equation:
∫_0^{𝜆0} 𝑔(𝜆 ∣ 𝐻0) 𝑑𝜆 = 𝛼

A test that has the critical region defined above is a likelihood ratio test for testing 𝐻0.
Remark 1. To define the critical region for testing the hypothesis 𝐻0 by the likelihood ratio
test, we need the distribution of 𝜆. Suppose that the distribution of 𝜆 is not known but the
distribution of some function of 𝜆 is known; then this knowledge can be utilized as given in
the following theorem.
Theorem . If 𝜆 is the likelihood ratio for testing a simple hypothesis 𝐻0 and if 𝑈 = 𝜙(𝜆) is a
monotonic increasing (decreasing) function of 𝜆 then the test based on 𝑈 is equivalent to the
likelihood ratio test. The critical region for the test based on 𝑈 is :
𝜙(0) < 𝑈 < 𝜙(𝜆0) [respectively, 𝜙(𝜆0) < 𝑈 < 𝜙(0)].
Proof. The critical region for the likelihood ratio test is given by 0 < 𝜆 < 𝜆0 , where 𝜆0 is
determined by
∫_0^{𝜆0} 𝑔(𝜆 ∣ 𝐻0) 𝑑𝜆 = 𝛼 … (*)
Let 𝑈 = 𝜙(𝜆) be a monotonically increasing function of 𝜆. Then (*) gives
𝛼 = ∫_0^{𝜆0} 𝑔(𝜆 ∣ 𝐻0) 𝑑𝜆 = ∫_{𝜙(0)}^{𝜙(𝜆0)} ℎ(𝑢 ∣ 𝐻0) 𝑑𝑢

where ℎ(𝑢 ∣ 𝐻0 ) is the p.d.f. of 𝑈 when 𝐻0 is true. Here the critical region 0 < 𝜆 < 𝜆0
transforms to 𝜙(0) < 𝑈 < 𝜙(𝜆0 ). However if 𝑈 = 𝜙(𝜆) is a monotonic decreasing function
of 𝜆, then the inequalities are reversed and we get the critical region as 𝜙(𝜆0 ) < 𝑈 < 𝜙(0).
Remark 2. If we are testing a simple null hypothesis 𝐻0, then there is a unique distribution
determined for 𝜆. But if 𝐻0 is composite, then the distribution of 𝜆 may or may not be unique.
In such a case the distribution of 𝜆 may possibly be different for different parameter points in
Θ0, and then 𝜆0 is to be chosen such that
∫_0^{𝜆0} 𝑔(𝜆 ∣ 𝐻0) 𝑑𝜆 ≤ 𝛼

for all values of the parameters in Θ0 .


However, if we are dealing with large samples, a fairly satisfactory solution to this testing of
hypothesis problem exists, as stated (without proof) in the following theorem.
Theorem. Let 𝑥1 , 𝑥2 , … , 𝑥𝑛 be a random sample from a population with p.d.f. 𝑓(𝑥;
𝜃1 , 𝜃2 , … , 𝜃𝑘 ) where the parameter space Θ is 𝑘-dimensional. Suppose we want to test the
composite hypothesis
𝐻0 : 𝜃1 = 𝜃1′ ; 𝜃2 = 𝜃2′ ; … , 𝜃𝑟 = 𝜃𝑟′ ; 𝑟 < 𝑘
where 𝜃1′ , 𝜃2′ , … , 𝜃𝑟′ are specified numbers. When 𝐻0 is true, −2log 𝑒 𝜆 is asymptotically
distributed as chi-square with 𝑟 degrees of freedom, i.e., under 𝐻0 ,
−2log 𝜆 ∼ 𝜒(𝑟)2 , if 𝑛 is large.
Since 0 ≤ 𝜆 ≤ 1, −2 logₑ 𝜆 is a non-negative, decreasing function of 𝜆 which approaches
infinity as 𝜆 → 0, so the critical region 0 < 𝜆 < 𝜆0 corresponds to the right-hand tail of the
chi-square distribution of −2 log 𝜆. Thus at the level of significance 𝛼, the test may be given
as follows:
Reject 𝐻0 if −2 logₑ 𝜆 > 𝜒²ᵣ(𝛼),
where 𝜒²ᵣ(𝛼) is the upper 𝛼-point of the chi-square distribution with 𝑟 d.f., given by
𝑃[𝜒² > 𝜒²ᵣ(𝛼)] = 𝛼;
otherwise 𝐻0 may be accepted.
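For the normal mean with known 𝜎, the statistic −2 logₑ 𝜆 reduces to 𝑛(𝑥̄ − 𝜇0)²/𝜎², and the asymptotic test can be coded in a few lines (a sketch assuming NumPy and SciPy; the data are simulated for illustration, with 𝑟 = 1 restricted parameter):

import numpy as np
from scipy.stats import chi2

mu0, sigma = 0.0, 1.0                                   # H0: mu = mu0, sigma known
x = np.random.default_rng(1).normal(0.3, sigma, 50)     # hypothetical sample
minus2loglam = len(x) * (x.mean() - mu0)**2 / sigma**2  # -2 log(lambda) for this model
crit = chi2(1).ppf(0.95)                                # upper 5% point of chi^2 with 1 d.f.
print("reject H0" if minus2loglam > crit else "accept H0")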
Properties of the Likelihood Ratio Test. The likelihood ratio (L.R.) test principle is an intuitive
one. If we are testing a simple hypothesis 𝐻0 against a simple alternative hypothesis 𝐻1, then the
LR principle leads to the same test as given by the Neyman-Pearson Lemma. This suggests that the
LR test has some desirable properties, especially large-sample properties.
In the LR test, the probability of Type I error is controlled by suitably choosing the cut-off point
𝜆0. The LR test is generally UMP if a UMP test exists at all. We state below two asymptotic
properties of LR tests:
1. Under certain conditions, −2 logₑ 𝜆 has an asymptotic chi-square distribution.
2. Under certain assumptions, the LR test is consistent.
Now we shall illustrate how the likelihood ratio criterion can be used to obtain various
standard tests of significance in Statistics.

9.4 IN-TEXT QUESTIONS
MCQ’s Problems
Question 1.
Let 𝑋1 , … , 𝑋𝑛 be a random sample of size n(≥ 2) from a uniform distribution with
probability density function 𝑓(𝑥, 𝜃) = 1/𝜃 for 0 < 𝑥 < 𝜃, and 0 otherwise,
where 𝜃 ∈ (0, ∞). If 𝑋(1) = min{𝑋1 , … , 𝑋𝑛 } and 𝑋(𝑛) = max{𝑋1 , … , 𝑋𝑛 }.
Then, as 𝑛 → ∞, (1/𝜃)(𝑋(𝑛) + 𝜃/(𝑛 + 1)) converges in probability to
A. 1
B. 0
C. 2
D. 3
Question 2.
Which measure is used to determine the convexity of the distribution curve?
A. skewness
B. kurtosis
C. variance
D. standard deviation
Question 3.
Consider the simple linear regression model 𝑦ᵢ = 𝛼 + 𝛽𝑥ᵢ + 𝜖ᵢ, 𝑖 = 1, 2, …, 𝑛, where the 𝜖ᵢ
are i.i.d. random variables with mean 0 and variance 𝜎² ∈ (0, ∞). Suppose that we have a data
set (𝑥1, 𝑦1), …, (𝑥𝑛, 𝑦𝑛) with 𝑛 = 10, ∑_{𝑖=1}^{𝑛} 𝑥ᵢ = 50, ∑_{𝑖=1}^{𝑛} 𝑦ᵢ = 40, ∑_{𝑖=1}^{𝑛} 𝑥ᵢ² = 500,
∑_{𝑖=1}^{𝑛} 𝑦ᵢ² = 400 and ∑_{𝑖=1}^{𝑛} 𝑥ᵢ𝑦ᵢ = 400. An unbiased estimate of 𝜎² is:
A. 5
B. 1/5
C. 10
D. 1/10
Question 4.
If 𝑋1 , 𝑋2 , … , 𝑋𝑛 is a random sample from a population with density
𝑓(𝑥, 𝜃) = (1/𝜃)𝑒^{−𝑥/𝜃} if 0 < 𝑥 < ∞, and 0 otherwise,
where 𝜃 > 0 is an unknown parameter, what is a 100(1 − 𝛼)% confidence interval for 𝜃?
A. [2∑_{𝑖=1}^{𝑛} ln 𝑋ᵢ / 𝜒²_{1−𝛼/2}(2𝑛), 2∑_{𝑖=1}^{𝑛} ln 𝑋ᵢ / 𝜒²_{𝛼/2}(2𝑛)]
B. [2∑_{𝑖=1}^{𝑛} 𝑋ᵢ / 𝜒²_{1−𝛼/2}(2𝑛), 2∑_{𝑖=1}^{𝑛} 𝑋ᵢ / 𝜒²_{𝛼/2}(2𝑛)]
C. [2∑_{𝑖=1}^{𝑛} 𝑋ᵢ / 𝜒²_{𝛼/2}(2𝑛), 2∑_{𝑖=1}^{𝑛} 𝑋ᵢ / 𝜒²_{1−𝛼/2}(2𝑛)]
D. [2∑_{𝑖=1}^{𝑛} ln 𝑋ᵢ / 𝜒²_{𝛼/2}(2𝑛), 2∑_{𝑖=1}^{𝑛} ln 𝑋ᵢ / 𝜒²_{1−𝛼/2}(2𝑛)]

Question 5.
Suppose that 𝑋 has uniform distribution on the interval [0,100]. Let 𝑌 denote the greatest
integer smaller than or equal to X. Which of the following is true?
A. 𝑃(𝑌 ≤ 25) = 1/4
B. 𝑃(𝑌 ≤ 25) = 26/100
C. 𝐸(𝑌) = 50
D. 𝐸(𝑌) = 101/2

Question 6.
Let 𝑥1 = 3, 𝑥2 = 4, 𝑥3 = 3.5, 𝑥4 = 2.5 be the observed values of a random sample from the
probability density function
𝑓(𝑥 ∣ 𝜃) = (1/3)[(1/𝜃)𝑒^{−𝑥/𝜃} + (1/𝜃²)𝑒^{−𝑥/𝜃²} + 𝑒^{−𝑥}], 𝑥 > 0, 𝜃 ∈ {1, 2, 3, 4}.
Then the method of moments estimate (MME) of 𝜃 is
A. 1
B. 2
C. 3
D. 4
Question 7.
Let the random variable 𝑋 and 𝑌 have the joint probability mass function

−2
𝑥 3 𝑦 1 𝑥−𝑦 2𝑥
𝑃(𝑋 = 𝑥, 𝑌 = 𝑦) = 𝑒 (𝑦) ( ) ( ) , 𝑦 = 0,1,2, … , 𝑥; 𝑥 = 0,1,2, …
4 4 𝑥!
Then 𝑉(𝑌) is equal to
A. 1
B. 1/2
C. 2
D. 3/2
Question 8.
Let the discrete random variables 𝑋 and 𝑌 have the joint probability mass function
𝑒 −1
; 𝑚 = 0,1,2, … , 𝑛; 𝑛 = 0,1,2, …
𝑃(𝑋 = 𝑚, 𝑌 = 𝑛) = {(𝑛 − 𝑚)! 𝑚! 2𝑛
0, otherwise
Which of the following statements is(are) TRUE?
A. The marginal distribution of 𝑋 is Poisson with mean 1/2
B. The random variable 𝑋 and 𝑌 are independent
1
C. The conditional distribution of X given Y = 5 is Bin (6, 2)
252 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

D. 𝑃(𝑌 = 𝑛) = (𝑛 + 1)𝑃(𝑌 = 𝑛 + 2) for 𝑛 = 0,1,2, …


Question 9.
Consider the trinomial distribution with the probability mass function
2! 1 𝑥 2 𝑦 3 2−𝑥−𝑦
𝑃(𝑋 = 𝑥, 𝑌 = 𝑦) = ( ) ( ) ( )
𝑥! 𝑦! (2 − 𝑥 − 𝑦)! 6 6 6
, 𝑥 ≥ 0, 𝑦 ≥ 0, and 0 < 𝑥 + 𝑦 ≤ 2. Then Corr (𝑋, 𝑌) is equal to…
(correct up to two decimal places)
A) -0.31
B) 0.31
C) 0.35
D) 0.78
Question 10.
Let 𝑥1 = 1.1, 𝑥2 = 0.5, 𝑥3 = 1.4, 𝑥4 = 1.2 be the observed values of a random sample of size
four from a distribution with the probability density function
𝑒 𝜃−𝑥 , if 𝑥 ≥ 𝜃, 𝜃 ∈ (−∞, ∞)
𝑓(𝑥 ∣ 𝜃) = {
0, otherwise
Then the maximum likelihood estimate of 𝜃 2 + 𝜃 + 1 is equal (up to decimal place).
A) 1.75
B) 1.89
C) 1.74
D) 0.87
Question 11.
Let 𝑈 ∼ 𝐹5,8 and 𝑉 ∼ 𝐹8,5. If 𝑃[𝑈 > 3.69] = 0.05, then the value of C such that
𝑃[𝑉 > 𝑐] = 0.95 equals… (round off two decimal places)
A) 0.27
B) 1.27
C) 2.27
D) 2.29

253 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

Question 12.
Let P be a probability function that assigns the same weight to each of the points of the
sample space Ω = {1,2,3,4}. Consider the events E = {1,2}, F = {1,3} and G = {3,4}. Then
which of the following statement(s) is (are) TRUE?
1. E and F are independent
2. E and G are independent
3. E, F and G are independent
Select the correct answer using code given below:
A. 1 only
B. 2 only
C. 1 and 2 only
D. 1,2 and 3
Question 13.
Let 𝑋1 , 𝑋2 , … , 𝑋4 and 𝑌1 , 𝑌2 , … , 𝑌5 be two random samples of size 4 and 5 respectively, from a
51 𝑋 2 +𝑋 2 +𝑋 2 +𝑋 2
2 3 4
standard normal population. Define the statistic T = (4) 𝑌 2+𝑌 2 +𝑌 2 +𝑌 2 +𝑌 2 , then which of the
1 2 3 4 5
following is TRUE?
A. Expectation of 𝑇 is 0.6
B. Variance of T is 8.97
C. T has F-distribution with degree of freedom 5 and 4
D. T has F-distribution with degree of freedom 4 and 5
Question 14.
Let 𝑋, 𝑌 and 𝑍 be independent random variables with respective moment generating function
1 2
𝑀𝑋 (𝑡) = 1−𝑡 , 𝑡 < 1; 𝑀𝑌 (𝑡) = 𝑒 𝑡 /2 = 𝑀𝑍 (𝑡) 𝑡 ∈ ℝ. Let 𝑊 = 2𝑋 + 𝑌 2 + 𝑍 2 then P(W > 2)
is equals to
A. 2𝑒 −1
B. 2𝑒 −2
C. 𝑒 −1
D. 𝑒 −2

254 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Question 15.
Let 𝑥1 = 3, 𝑥2 = 4, 𝑥3 = 3, 𝑥4 = 2.5 be the observed values of a random sample from the
probability density function
1 1 𝑥 1 −𝑥
𝑓(𝑥 ∣ 𝜃) = [ 𝑒 − 𝜃 + 2 𝑒 𝜃2 + 𝑒 −𝑥 ] , 𝑥 > 0, 𝜃 ∈ (0, ∞)
3 𝜃 𝜃
Then the method of moment estimate (MME) of 𝜃 is
A. 1.5
B. 2.5
C. 3.5
D. 4.5
Question 16.
Let 𝑋 be a random variable with cumulative distribution function
1 𝑛+2𝑘+1
𝑃(𝑋 = ℎ, 𝑌 = 𝑘) = ( ) ; 𝑛 = −𝑘, −𝑘 + 1, … , ; 𝑘 = 1,2, …
2
Then E(Y) equals
A. 1
B. 2
C. 3
D. 4
Question 17.
Let 𝑋 be a random variable with the cumulative distribution function
0, 𝑥<0
1 + 𝑥2
, 0≤𝑥<1
𝐹(𝑥) = 10
3 + 𝑥2
, 1≤𝑥<2
10
{ 1, 𝑥≥2
Which of the following statements is (are) TRUE?
3
A. 𝑃(1 < 𝑋 < 2) = 10
31
B. 𝑃(1 < 𝑋 ≤ 2) = 5
255 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

11
C. 𝑃(1 ≤ 𝑋 < 2) = 2
41
D. 𝑃(1 ≤ 𝑋 ≤ 2) = 5

Question 18.
Let the random variables 𝑋1 and 𝑋2 have joint probability density function
𝑥1 𝑒 −𝑥1𝑥2
𝑓(𝑥1 , 𝑥2 ) = { , 1 < 𝑥1 < 3, 𝑥2 > 0
2
0, otherwise.
What is the value Var (𝑋2 ∣ 𝑋1 = 2) …(up to two decimal place)?
A) 0.27
B) 0.28
C) 0.25
D) 1.90
Question 19.
Let 𝑥1 = 1, 𝑥2 = 0, 𝑥3 = 0, 𝑥4 = 1, 𝑥5 = 0, 𝑥6 = 1 be the data on a random sample of size 6
from Bin (1, 𝜃) distribution, where 𝜃 ∈ (0,1). Then the uniformly minimum variance unbiased
estimate of 𝜃(1 + 𝜃) equal to
Question: 20
Let 𝑋1 , 𝑋2 , … , 𝑋𝑁 be identically distributed random variable with mean 2 and variance 1. Let
N be a random variable follows Poisson distribution with mean 2 and independent of 𝑋i′ S. Let
𝑆𝑁 = 𝑋1 + 𝑋2 + ⋯ + 𝑋𝑁 , then Var (SN ) is equals:
A. 4
B. 10
C. 2
D. 1
Question: 21
Let 𝐴 and 𝐵 be independent Random Variables each having the uniform distribution on
[0,1]. Let 𝑈 = min{𝐴, 𝐵} and 𝑉 = max{𝐴, 𝐵}, then Cov (𝑈, 𝑉) is equals
A. -1/36
B. 1/36
C. 1

256 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

D. 0
Question 22.
Let 𝑋1 , 𝑋2 , 𝑋3 be random sample from uniform (0, 𝜃 2 ), 𝜃 > 1, then maximum likelihood
estimation (mle) of 𝜃
2
A. 𝑋(1)
B. √X(3)
C. √X(1)
D. 𝛼𝑋(1) + (1 − 𝛼)𝑋(3) ; 0 < 𝛼 < 1
Question 23.
For the discrete variate with density:
1 6 1
𝑓(𝑥) = 𝐼(−1) (𝑥) + 𝐼(0) (𝑥) + 𝐼(1) (𝑥).
8 8 8
Which of the following is TRUE?
1
A. 𝐸(𝑋) = 2
1
B. 𝑉(𝑋) = 2
1
C. 𝑃{|𝑋 − 𝜇𝑥 | ≥ 2𝜎𝑥 } ≤ 4
1
D. 𝑃{|𝑋 − 𝜇𝑥 | ≥ 2𝜎𝑥 } ≥ 4
Question: 24
Lęt 𝑋𝑖 , 𝑌𝑖 ; (𝑖 = 1,2)
be a i.i.d random sample of size 2 from a standard normal distribution. What is the
distribution W is given by
√2(𝑋1 + 𝑋2 )
𝑊=
√(𝑋2 − 𝑋1 )2 + (𝑌2 − 𝑌1 )2
A. t-distribution with 1 d.f
B. t-distribution with 2 d.f
C. Chi-square distribution with 2 d.f
D. Does not determined

257 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

Question: 25
The moment generating function of a random variable X is given by
1 1 1 1
𝑀𝑋 (𝑡) = 6 + 3 𝑒 𝑡 + 3 𝑒 2𝑡 + 6 𝑒 3𝑡 , −∞ < 𝑡 < ∞, then P(X ≤ 2) equals
1
A. 3
1
B. 6
1
C. 2
5
D. 6

Question: 26
1 1
Let 𝑋1 , 𝑋2 , … , 𝑋𝑛 be a random sample from 𝐔 (𝜃 − 2 , 𝜃 + 2) distribution, where 𝜃 ∈ ℝ. If
𝑋(1) = min{𝑋1 , 𝑋2 , … , 𝑋𝑛 } and 𝑋(𝑛) = max{𝑋1 , 𝑋2 , … , 𝑋𝑛 }.
1 1 1
Define 𝑇1 = 2 (𝑋(1) + 𝑋(𝑛) ), 𝑇2 = 4 (3𝑋(1) + 𝑋(𝑛) + 1) and 𝑇3 = 2 (3𝑋(𝑛) − 𝑋(1) − 2) an
estimator for 𝜃, then which of the following is/are TRUE?
A. 𝑇1 and 𝑇2 are MLE for 𝜃 but 𝑇3 is not MLE for 𝜃
B. 𝑇1 is MLE for 𝜃 but 𝑇2 and 𝑇3 are not MLE for 𝜃
C. 𝑇1 , 𝑇2 and 𝑇3 are MLE for 𝜃
D. 𝑇1 , 𝑇2 and 𝑇3 are not MLE for 𝜃
Question: 27
Let 𝑋 and 𝑌 be random variable having joint probability density function
𝑘
𝑓(𝑥, 𝑦) = ; −∞ < (𝑥, 𝑦) < ∞
(1 + 𝑥 2 )(1 + 𝑦2)
Where k is constant, then which of the following is/are TRUE?
1
A. k = 𝜋2
1 1
B. 𝑓(𝑥) = 𝜋 1+𝑥 2 ; −∞ < 𝑥 < ∞
C. P(X = Y) = 0
D. All of the above

258 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Question: 28
Lę 𝑋1 , 𝑋2 , … , 𝑋𝑛 be sequence of independently and identically distributed random variables
with the probability density function
1 2 −𝑥
𝑓(𝑥) = {2 𝑥 𝑒 , if 𝑥 > 0 and let
0, otherwise
𝑆𝑛 = 𝑋1 + 𝑋2 + ⋯ + 𝑋𝑛 , then which of the following statement is/are TRUE?
𝑆𝑛 −3𝑛
A. ∼ 𝑁(0,1) for all 𝑛 ≥ 1
√3𝑛
𝑆
B. For all 𝜀 > 0, 𝑃 (| 𝑛𝑛 − 3| > 𝜀) → 0 as n → ∞
𝑆𝑛
C. → 1 with probability 1
𝑛

D. Both A and B
Question: 29
Let 𝑋, 𝑌 are i.i.d Binomial (𝑛, 𝑝) random variables. Which of the following are true?
A. 𝑋 + 𝑌 ∼ Bin (2𝑛, 𝑝)

B. (X, Y) ∼ Multinomial (2n; p, p)

C. Var (X − Y) = E(X − Y)2

D. option A and C are correct.


Question: 30
Let 𝑋 and 𝑌 be continuous random variables with the joint probability density function
1 2 2
1
𝑓(𝑥, 𝑦) = 2𝜋 𝑒 −2(𝑥 +𝑦 ) ; (𝑥, 𝑦) ∈ ℝ2
Which of the following statement is/are TRUE?
1
A. 𝑃(𝑋 > 0) = 2
1
B. P(X > 0 ∣ Y < 0) = 2
1
C. P(X > 0, Y < 0) = 4
D. All of the above

259 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

Question: 31
Let X and 𝑌 are random variable with 𝐸[𝑋] = 𝐸[𝑌], then which of the following is NOT
TRUE?
A. E{E[X ∣ Y]} = E[Y]
B. V(𝑋 − 𝑌) = 𝐸(𝑋 − 𝑌)2
C. 𝐸[𝑉(𝑋 ∣ 𝑌)] + 𝑉[𝐸(𝑋 ∣ 𝑌)] = 𝑉(𝑋)
D. X and Y have same distribution
Question: 32
Let 𝑋1 , 𝑋2 , … , 𝑋𝑛 be a random sample from Exp (𝜃 ) distribution, where 𝜃 ∈ (0, ∞).
1
If 𝑋‾ = 𝑛 ∑𝑛𝑖=1 𝑋𝑖 , then a 95% confidence interval for 𝜃 is
2
𝜒2𝑛,0.95
A. (0, ]
𝑛𝑋‾
2
𝜒2𝑛,0.95
B. [ , ∞)
𝑛𝑋‾
2
𝜒2𝑛,0.95
C. (0, ]
2𝑛𝑋‾
2
𝜒2𝑛,0.95
D. [ , ∞)
2𝑛𝑋‾

Question: 33
𝑋𝑖 , 𝑖 = 1,2, … be independent random variables all distributed according to the PDF 𝑓𝑥 (𝑥) =
1,0 ≤ 𝑥 ≤ 1. Define 𝑌𝑛 = 𝑋1 𝑋2 𝑋3 … 𝑋𝑛 , for some integer n. Then Var (𝑌𝑛 ) is equal to
𝑛
A. 12
1 1
B. − 22𝑛
3𝑛
1
C. 12𝑛
1
D. 12

Question: 34
Let 𝑋1 , 𝑋2 , … , 𝑋4 be i.i.d random variables having continuous distribution.
Then 𝑃(𝑋3 < 𝑋2 < max(𝑋1 , 𝑋4 )) equal
A. 1/2
B. 1/3
C. 1/4
D. 1/6
260 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Question: 35
1 1
Let 𝑋1 , 𝑋2 , … , 𝑋𝑛 be a random sample from U (𝜃 − 0 , 𝜃 + 2) distribution, where 𝜃 ∈ ℝ. If
𝑋(1) = min{𝑋1 , 𝑋2 , … , 𝑋𝑛 } and 𝑋(𝑛) = max{𝑋1 , 𝑋2 , … , 𝑋𝑛 }
Consider the following statement on above:
1
1. 𝑇1 = 2 (𝑋(1) + 𝑋(𝑛) ) is consistent for 𝜃
1
2. 𝑇2 = (3𝑋(1) + 𝑋(𝑛) + 1)
4
is unbiased consistent for 𝜃
Select the correct answer using code given below:
A. 1 only
B. 2 only
C. Both 1 and 2
D. Neither 1 nor 2
9.5 SUMMARY
The main points which we have covered in this lessons are what is estimator and what is
consistency, efficiency and sufficiency of the estimator and how to get best estimator.
9.6 GLOSSARY
• Motivation: These Problems are very useful in real life and we can use it in data
science , economics as well as social sciemce.
• Attention: Think how the best estimator are useful in real world problems.
9.7 ANSWER TO IN-TEXT QUESTIONS
Answer 1: A
Explanation :
1
𝑓(𝑥, 𝜃) = {𝜃 ; 0 < 𝑥 < 𝜃
0, otherwisse
𝑛𝜃
E(𝑋(𝑛) ) =
𝑛+1
1 𝜃
Let Y = 𝜃 (𝑋(𝑛) + 𝑛+1)
𝑋(𝑛) 1
so E(Y) = E( +𝑛+1) = 1
𝜃

261 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

1 1
lim𝑛→∞ 𝐸 [ (𝑋(𝑛) + )] = 1;
𝜃 𝑛+1
Hence option A is correct.
Answer 2: B
Explanation:
Convexity (peakedness) is decided by kurtosis.
Answer 3: C
Explanation :
𝑦𝑖 = 𝛼 + 𝛽𝑥𝑖 + 𝜖𝑖 , 𝑖 = 1,2, … , 𝑛
∑𝑛𝑖=1 𝑥𝑖 𝑦𝑖 − 𝑛𝑥‾𝑦‾ 400 − 10 × 5 × 4 4 4
𝛽ˆ = 𝑛 2 2
= 2
= ; 𝛼ˆ = 𝑦‾ − 𝑥‾𝛽ˆ = 4 − 5 × = 0
∑𝑖=1 𝑥𝑖 − 𝑛𝑥‾ 500 − 10 × 5 5 5
An unbiased estimate of 𝜎 2 is
1 2 1 4 2
𝜎ˆ 2 = 𝑛−2 ∑𝑛𝑖=1 (𝑦𝑖 − 𝛼ˆ − 𝛽ˆ 𝑥𝑖 ) = 10−2 ∑𝑛𝑖=1 (𝑦𝑖 − 5 𝑥𝑖 )
1 4 4 2 1 8 16
= 8 (∑𝑛𝑖=1 𝑦𝑖2 − 2 × 5 ∑𝑛𝑖=1 𝑥𝑖 𝑦𝑖 + (5) ∑𝑛𝑖=1 𝑥𝑖2 ) = 8 (400 − 5 × 400 + 25 × 500)

= 10
Hence option C is correct.
Answer 4: B
Explanation :
2
We use the random variable 𝑄 = 𝜃 ∑𝑛𝑖=1 𝑋𝑖 ∼ 𝜒(2𝑛)
2

As the pivotal quantity. The 100(1 − 𝛼)%


confidence interval for 𝜃 can be constructed from

2∑𝑛
𝑖=1 𝑋𝑖 2∑𝑛
𝑖=1 𝑋𝑖
1 − 𝛼 = 𝑃 (𝜒α2 (2𝑛) ≤ 𝑄 ≤ 𝜒1−
2
α (2𝑛)) = 𝑃 [ 2 ≤𝜃≤ 2 (2𝑛) ]
2 2 𝜒 α (2𝑛) 𝜒α
1−
2 2

2∑𝑛
𝑖=1 𝑋𝑖 2∑𝑛
𝑖=1 𝑋𝑖
Thus, 100(1 − 𝛼)% confidence interval for 𝜃 is given by [𝜒2 ≤𝜃≤ 2 (2𝑛) ]
𝛼 (2𝑛) 𝜒𝛼
1−
2 2
Hence option B is correct.
Answer 5: B
Explanation :
262 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Let 𝑌 = [X]; where 𝑌 denote the greatest integer smaller than or equal to 𝑋.
26 1 26
𝑃(𝑌 ≤ 25) = 𝑃([𝑋] ≤ 25) = 𝑃(𝑋 ∈ (0,26)) = ∫0 𝑑𝑥 =
100 100
Hence B is the correct option.
Answer 6: C
Explanation :
1
𝑥‾ = (3 + 4 + 3.5 + 2.5) = 3.25
4
1 1
𝐸(𝑋) = [𝜃 + 𝜃 2 + 1]Γ2 = [𝜃 + 𝜃 2 + 1] = 3.25
3 3
2
𝜃 + 𝜃 − 8.75 = 0 then 𝜃 = 2.5 or −3.5
Since 𝜃 ∈ {1,2,3,4} then 𝜃 = 3
Hence option C is correct.
Answer 7: D
Explanation :
The marginal pmf of 𝑌 is given by
∞ ∞
𝑥 3 𝑦 1 𝑥−𝑦 2𝑥
𝑃(𝑌 = 𝑦) = ∑ 𝑃(𝑋 = 𝑥), 𝑌 = 𝑦) = ∑ 𝑒 −2 (𝑦) ( ) ( )
4 4 𝑥!
𝑥=𝑦 𝑥=𝑦
𝑦 ∞ 𝑢
3 𝑦+𝑢 1 2𝑦+𝑢
= 𝑒 −2 ( ) ∑ ( 𝑦 ) ( ) (Assume 𝑢 = 𝑥 − 𝑦)
4 4 (𝑦 + 𝑢)!
𝑢=0
𝑦 ∞ ∞
−2
3 (𝑦 + 𝑢)! 1 𝑢 2𝑦+𝑢 3 𝑦 2𝑦 1! 1 𝑢
=𝑒 ( ) ∑ ( ) == 𝑒 −2 ( ) ∑ ( )
4 𝑦! 𝑢! 4 (𝑦 + 𝑢)! 4 𝑦! 𝑢! 2
𝑢 0
3 3 𝑦
− ( )
−2
3 𝑦 2𝑦 1/2 𝑒 22
=𝑒 ( ) 𝑒 = , 𝑦 = 0,1, …
4 𝑦! 𝑦!
Which is the pmf of Poisson random variable with parameter 3/2, so 𝐸(𝑋) = 3/2 and
𝑉(𝑋) = 3/2.
Answer 8: A
Explanation :
The marginal probability mass function of X is given by
263 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

𝑃(𝑋 = 𝑚) = ∑∞
𝑛=𝑚 𝑃(𝑋 = 𝑚, 𝑌 = 𝑛) ( for 𝑚 = 0,1,2, … )
1 1 𝑚
𝑒 − 2(2)
= , 𝑚 = 0,1,2, …
𝑚!
Thus the marginal distribution of X is Poisson with mean 1/2.
The marginal probability mass function of 𝑌 is given by
𝑃(𝑌 = 𝑛) = ∑∞ 𝑚=0 𝑃(𝑋 = 𝑚, 𝑌 = 𝑛) ( for 𝑛 = 0,1,2, … )
𝑒 −1
= , 𝑛 = 0,1,2, …
𝑛!
Thus the marginal distribution of 𝑌 is Poisson with mean 1 .
𝑃(𝑋 = 𝑚, 𝑌 = 𝑛) ≠ 𝑃(𝑋 = 𝑚)𝑃(𝑌 = 𝑛)
Therefore 𝑋 and 𝑌 are not independent.
𝑃(𝑋 = 𝑚, 𝑌 = 5) 5! 1 5
𝑃(𝑋 = 𝑚 ∣ 𝑌 = 5) = = ( ) , 𝑚 = 0,1,2, … ,5
𝑃(𝑌 = 5) 𝑚! (5 − 𝑚)! 2
1
Thus the conditional distribution of 𝑋 given 𝑌 = 5 is B in (5, 2)
𝑃(𝑌=𝑛)
Since 𝑃(𝑌=𝑛+1) = (𝑛 + 1) for 𝑛 = 0,1,2, …

Answer 9: 𝑨
Explanation :
The trinomial distribution of two r.v.'s 𝑋 and 𝑌 is given by
𝑛!
𝑓𝑋,𝑌 (𝑥, 𝑦) = 𝑝 𝑥 𝑞 𝑦 (1 − 𝑝 − 𝑞)(𝑛−𝑥−𝑦)
𝑥! 𝑦! (𝑛 − 𝑥 − 𝑦)!
for 𝑥, 𝑦 = 0,1,2, … , 𝑛 and 𝑥 + 𝑦 ≤ 𝑛, where p + q ≤ 1.
n = 2, p = 1/6 and q = 2/6
1 1 10
Var (X) = 𝑛𝑝1 (1 − 𝑝1 ) = 2 × (1 − ) = ; Var (Y) = 𝑛𝑝2 (1 − 𝑝2 )
6 6 36
2 2
= 2 × 6̅ (1 − 6) = 16/36

1 2 4
Cov (𝑋, 𝑌) = −𝑛𝑝1 𝑝2 = −2 × × =−
6 6 36
Cov (𝑋, 𝑌) 4
Corr (𝑋, 𝑌) = =− = −0.31
√Var (𝑋)√Var (𝑌) 4√10
Hence −0.31 is the correct answer.
264 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Answer 10: 𝐀
Explanation :
Let 𝑥1 = 1.1, 𝑥2 = 0.5, 𝑥3 = 1.4, 𝑥4 = 1.2
𝑒 𝜃−𝑥 , if 𝑥 ≥ 𝜃, 𝜃 ∈ (−∞, ∞)
𝑓(𝑥 ∣ 𝜃) = {
0, otherwise
𝜃 ∈ (∞, 𝑋(1) ]
𝑑
Since 𝑑𝜃 𝑓(𝑥 ∣ 𝜃) > 0 ∀𝜃 ∈ (∞, 𝑋(1) ], then

𝑓(𝑥 ∣ 𝜃) is strictly increasing function. So 𝜃ˆ = 𝑋(1) = 0.5, therefore by invariance property


the MLE of 𝜃 2 + 𝜃 + 1 = (0.5)2 + 0.5 + 1 = 1.75.
Hence MLE for 𝜃 2 + 𝜃 + 1 is 1.75.
Answer 11: 𝐀
Explanation :
1
X ∼ 𝐹(𝑚, 𝑛) then x ∼ 𝐹(𝑛, 𝑚)
𝑃[𝑈 > 3.69] = 0.05 ⇒ 1 − 𝑃[𝑈 > 3.69] = 1 − 0.05
⇒ 𝑃[𝑈 < 3.69] = 0.95
1 1 1
⇒ 𝑃 [𝑈 > 3.69] = 0.95 ⇒ 𝑉 = 𝑈 and
1
𝑐= = 0.27
3.69
Hence c = 0.27 is the correct answer.
Answer 12: C
Explanation :
Clearly, P({𝜔}) = 1/4 ∀𝜔 ∈ Ω = {1,2,3,4}. We have E = {1,2}, F = {1,3} and G = {3,4}
Then P(E) = P(F) = P(G) = 2/4 = 1/2.
Using this result, we see that E and F are independent and also E and G are independent.
Hence option C is correct.
Answer 13: D
Explanation :

265 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

5 𝑋12 + 𝑋22 + 𝑋32 + 𝑋42 𝑛 5


𝑇=( ) 2 ∼ 𝐹(4,5); 𝐸(𝑊) = =
4 𝑌1 + 𝑌22 + 𝑌32 + 𝑌42 + 𝑌52 𝑛−2 3
2(5)2 (7) 350
Var (𝑇) = = = 9.72
4(3)2 (1) 36
Hence option D is correct.
Answer 14: A
Explanation :
2
Since 𝑊 = 2𝑋 + 𝑌 2 + 𝑍 2 ∼ 𝜒(4)
1 −𝑤/2
𝑓𝑊 (𝑤) = {4 𝑤𝑒 , if 𝑤 > 0
0, otherwise
∞1
𝑃(𝑊 > 2) = ∫2 𝑤𝑒 −𝑤/2 𝑑𝑤 = 2𝑒 −1
4
Hence option A is correct.
Answer 15: C
Explanation :
1
𝑥‾ = (3 + 4 + 3.5 + 2.5) = 3.25
4
1 1
𝐸(𝑋) = [𝜃 + 𝜃 2 + 1]Γ2 = [𝜃 + 𝜃 2 + 1] = 3.25
3 3
2
𝜃 + 𝜃 − 8.75 = 0 then 𝜃 = 2.5 or −3.5
Since 𝜃 ∈ (0, ∞) then 𝜃 = 2.5
Hence option C is correct.
Answer 16: B
Explanation :
𝑃(𝑌 = 𝑘) = ∑∞
𝑛=−𝑘 𝑃(𝑋 = 𝑛, 𝑌 = 𝑘): { put m = n + k}

1 1 𝑘−1
= ( ) {𝑘 = 1,2, …
2 2
which is the pmf of geometric distribution with parameter 1/2}
1 1 𝑘−1
𝐸(𝑌) = ∑∞
𝑘=0 𝑘 ( ) =2
2 2
266 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Hence option B is correct.


Answer 17: A
Explanation :
3
𝑃(1 < 𝑋 < 2) = 𝐹(2) − 𝐹(1) − 𝑃(𝑋 = 2) =
10
3
𝑃(1 < 𝑋 ≤ 2) = 𝐹(2) − 𝐹(1) =
5
1
𝑃(1 ≤ 𝑋 < 2) = 𝐹(2) − 𝐹(1) − 𝑃(𝑋 = 2) + 𝑃(𝑋 = 1) =
2
4
𝑃(1 ≤ 𝑋 ≤ 2) = 𝐹(2) − 𝐹(1) + 𝑃(𝑋 = 1) =
5
Answer 18: C
Explanation :
∞ 𝑥1 𝑒 −𝑥1 𝑥2 1
The marginal pdf of 𝑋1 is 𝑔(𝑥1 ) = ∫0 𝑑𝑥2 = 2 , 1 < 𝑥1 < 3
2
𝑓(𝑥1 , 𝑥2 )
ℎ(𝑥2 ∣ 𝑥1 ) = = 𝑥1 𝑒 −𝑥1𝑥2 , 𝑥2 > 0
𝑔(𝑥1 )
1
𝑋2 ∣ 𝑋1 ∼ Exp (𝑥1 ) with mean 𝑥
1
1
Therefore Var (𝑋2 ∣ 𝑋1 = 2) = 4 = 0.25
Hence 0.25 is correct answer.
Answer 19: 0.7
Explanation :
𝑇 𝑇(𝑇−1)
+ 𝑛(𝑛−1) is UMVUE of 𝜃(1 + 𝜃)
𝑛
𝑇 𝑇(𝑇−1)
𝑬 (𝑛 + 𝑛(𝑛−1)) = 𝜃(1 + 𝜃); where

𝑇 = ∑𝑛𝑖=1 𝑋𝑖
𝑇 𝑇(𝑇−1) 3 3(3−1) 21
Therefore, 𝑛 + 𝑛(𝑛−1) = 6 + 6(6−1) = 30 = 0.70

Answer 20: B
Explanation:
Let 𝑋1 , 𝑋2 , … be identically distributed random variable and let N be a random variable.

267 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

Define 𝑆𝑁 = 𝑋1 + 𝑋2 + ⋯ + 𝑋𝑁
Then E(SN ) = E(Xi ) ⋅ E(N) = 4

𝑉(𝑆𝑁 ) = 𝐸(𝑁)Var (𝑋𝑖 ) + [𝐸(𝑋𝑖 )]2 Var (𝑁) = 10


Answer 21: B
Explanation:
If 𝐴 and 𝐵 be independent 𝑅𝑎𝑛𝑑𝑜𝑚 𝑉𝑎𝑟𝑖𝑎𝑏𝑙𝑒 each having the uniform distribution on [0,1].
Let 𝑈 = min{𝐴, 𝐵} and 𝑉 = max{𝐴, 𝐵},
then
𝐸(𝑈) = 1/3, 𝐸(𝑉) = 2/3 and 𝑈𝑉 = 𝐴𝐵 and 𝑈 + 𝑉 = 𝐴 + 𝐵
Thus Cov (𝑈, 𝑉) = 𝐸(𝑈𝑉) − 𝐸(𝑈)
𝐸(𝑉) = 𝐸(𝐴𝐵) − 𝐸(𝑈)
1 2 1
E(V) = E(A) ⋅ E(B) − E(U) ⋅ E(V) = − =
4 9 36
Answer 22: B
Explanation:
1
𝑋𝑖 ∼ 𝑈(0, 𝜃 2 ) 𝑓(𝑥) = ; 0 < 𝑥𝑖 < 𝜃 2
𝜃2

𝑋(3) ≤ 𝜃 2 ⇒ 𝜃ˆ ∈ [√𝑋(3) , ∞)
3
1
𝐿(𝑋, 𝜃) = ∏ 𝑓(𝑥𝑖 , 𝜃) =
𝜃6
𝑖=1
∂𝐿
⇒ ∂𝜃 < 0 there fore given function is decreasing then 𝜃ˆ = √𝑋(3)
Answer 23: C
Explanation:

X −1 0 1

P(x) 1/8 6/8 1/8


1 6 1
E(X) = −1 × + 0 × + 1 × = 0
8 8 8
1 6 1 1
E(𝑋 2 ) = 1 × + 0 × + 1 × =
8 8 8 4

268 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

1 1
𝑉(𝑋) = 𝐸(𝑋 2 ) − {𝐸(𝑋)}2 = ⇒ 𝜎𝑋 =
4 2
𝑃{|𝑋 − 𝜇𝑥 | ≥ 2𝜎𝑥 } = 𝑃{|𝑋| ≥ 1} = 1 − 𝑃(|𝑋| < 1)
= 1 − 𝑃(−< 𝑋 < 1) = 1 − 𝑃(𝑋 = 0) = 1/4
1
𝑃{|𝑋 − 𝜇𝑥 | ≥ 2𝜎𝑥 } ≤ 4 [By Chebychev’s inequality]
Answer 24: B
Explanation:
Let 𝑋𝑖 , 𝑌𝑖 ; (𝑖 = 1,2) be a i.i.d random sample of size 2 from a standard normal distribution.
√2(X1 +X2 )
Then W = ∼ 𝑡(2)
√(X2 −X1 )2 +(Y2 −Y1 )2
Hence option (b) is correct.
Answer 25: D
Explanation:
Let 𝑋 be Random Variable with 𝑀𝑋 (𝑡) = 𝐸(𝑒 𝑡𝑋 ) = ∑etx P(X = x)
1
; 𝑥=0
6
1
; 𝑥=1
3
Then 𝑃(𝑋 = 𝑥) = 1
; 𝑥=2
3
1
{6 ; 𝑥 = 3
𝑃(𝑋 ≤ 2) = 𝑃(𝑋 = 0) + 𝑃(𝑋 = 1) + 𝑃(𝑋 = 2)
1 1 1 5
= + + =
6 3 3 6
Answer 26: A
Explanation:
1 1
𝑋1 , 𝑋2 , … , 𝑋𝑛 be a random sample from U (𝜃 − 2 , 𝜃 + 2)
1 1
𝑓(𝑥) = 1; 𝜃 − < 𝑥𝑖 < 𝜃 +
2 2
1 1
𝜃ˆ ∈ [𝑋(𝑛) − , 𝑋(1) + ]
2 2
distribution of 𝑋 free from parameter,
1 1
then 𝜃ˆ = 𝜆 (𝑋(𝑛) − 2) + (1 − 𝜆) (𝑋(1) + 2) ; 0 < 𝜆 < 1
1 1 3
Take 𝜆 = 2 , 4 and 4 then we obtained mle of 𝜃 are
269 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

1 1 1
(𝑋(1) + 𝑋(𝑛) ); 4 (3𝑋(1) + 𝑋(𝑛) + 1); 4 (3𝑋(1) + 𝑋(𝑛) + 1) respectively.
2

Hence option (a) is correct.


Answer 27: D
Explanation:
Let 𝑋 and 𝑌 be random variable having joint probability
𝑘
density function 𝑓(𝑥, 𝑦) = (1+𝑥 2)(1+𝑦 2) ; −∞ < (𝑥, 𝑦) < ∞
∞ ∞ 1
∫−∞ ∫−∞ 𝑓(𝑥, 𝑦)𝑑𝑥𝑑𝑦 = 1 ⇒ 𝑘 = 2
𝜋
1 1
Since 𝑋 and 𝑌 are independent, then 𝑋 ∼ 𝑓(𝑥) = 𝜋 1+𝑥 2 ; −∞ < 𝑥 < ∞
P(X = Y) = 0{ There is no region occur corresponding to X = Y, then probability
corresponding to this region will be zero}

Answer 28: D
Explanation:
Clearly, 𝑋1 , 𝑋2 , … , 𝑋𝑛
are i.i.d 𝐺(3,1) random variables. Then, 𝐸(𝑋𝑖 ) = 3 and Var (𝑋𝑖 ) = 3, 𝑖 = 1,2, …
Let 𝑆𝑛 = 𝑋1 + 𝑋2 + ⋯ + 𝑋𝑛 , then E(𝑆𝑛 ) = 3𝑛 and Var (𝑆𝑛 ) = 3𝑛

Now For option (a)


𝑆𝑛 −3𝑛
Using CLT ∼ 𝑁(0,1) for all 𝑛 ≥ 1
√3𝑛

For option (b)


𝑆 3𝑛 𝑆 3𝑛
lim𝑛→∞ 𝐸 ( 𝑛𝑛) = lim𝑛→∞ 𝑛 = 3; lim𝑛→∞ 𝑉 ( 𝑛𝑛) = lim𝑛→∞ 𝑛2 = 0
By Using Convergence in probability condition
(Consistency Properties)
𝑆
For all 𝜀 > 0, 𝑃 (| 𝑛𝑛 − 3| > 𝜀) → 0 as n → ∞
For option (c)
𝑆𝑛
→3
𝑛
with probability 1 (By using convergent in probability condition)

For option (d)

270 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

𝑠𝑛 −𝐸(𝑠𝑛 ) 3(𝑛−√𝑛)−𝐸(𝑆𝑛 )
lim𝑛→∞ 𝑃 ( ≥ ) = 𝑃(𝑍 ≥ −√3) = 1 −
√Var (𝑆2 ) √Var (𝑆𝑤 )
𝑃(𝑍 ≤ −√3)
1
= 1 − Φ(−√3) ≥
2
Answer 29: D
Explanation:
(A) Sum of independent binomial variate is also a binomial variate if corresponding
probability will be same then 𝑋 + 𝑌 ∼ Bin(2𝑛, 𝑝)
(B) When there are more than two variables include, the observation lead to multinomial
distribution.
(𝑋, 𝑌) not follows Multinomial (2𝑛; 𝑝, 𝑝)
(C) Var (X − Y) = E(X − Y)2 − {E(X − Y)}2 = E(X − 𝑌)2
(D) Cov (𝑋 + 𝑌, 𝑋 − 𝑌) = 𝑉(𝑋) − Cov (𝑋, 𝑌) + Cov (𝑌, 𝑋) − 𝑉(𝑌) = 0
{∴ X and Y are independent Cov (X1 Y) = Cov (Y, X) = 0}
Hence option D is correct.
Answer 30: D
Explanation:
The joint pdf of 𝑋 and 𝑌 is
1 2 +𝑦 2 ) 1 2 1 2
1 1 1
𝑓(𝑥, 𝑦) = 2𝜋 𝑒 −2(𝑥 = 𝑒 −2(𝑥 ) × 𝑒 −2(𝑦 ) ; (𝑥, 𝑦) ∈ ℝ2
√2𝜋 √2𝜋
It is easy to see that 𝑋 and 𝑌 are i.i.d 𝑁(0,1) random variables, and therefore,
1
𝑃(𝑋 > 0) = 2
1 1 1
𝑃(𝑋 > 0)𝑃(𝑌 < 0) = × =
2 2 4

𝑃(𝑋 > 0, 𝑌 < 0) 1


𝑃(𝑋 > 0 ∣ 𝑌 < 0) = =
𝑃(𝑌 < 0) 2
Answer 31: D
Explanation:
E{E[ X ∣ Y ]} = E[X] = E[Y] {Given that E[X] = E[Y]}

271 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

V(𝑋 − 𝑌) = 𝐸(𝑋 − 𝑌)2 − {𝐸(𝑋 − 𝑌)}2 = 𝐸(𝑋 − 𝑌)2


{ Since 𝐸[𝑋 − 𝑌] = 0}
𝐸[ V(X ∣ Y)] + V[E(X ∣ Y)] = V(X)
𝑋 and 𝑌 may or may not be same distribution.
Hence option (D) is correct.
Answer 32: C
Explanation:
𝑋1 , 𝑋2 , … , 𝑋𝑛 be a random sample from Exp (𝜃)
distribution, where 𝜃 ∈ (0, ∞)
Then 2𝜃∑𝑛𝑖=1 𝑋𝑖 ∼ 𝜒2𝑛 2
⇒ P(0 < 2𝜃∑𝑛𝑖=1 𝑋𝑖 ≤ 𝜒2𝑛,0.95
2
)=
0.95
2
𝜒2𝑛,0.95
0 < 2𝜃∑𝑛𝑖=1 𝑋𝑖 ≤ 𝜒2𝑛,0.95
2
⇒ 𝜃 ∈ (0, ]
2𝑛𝑥‾
Hence option C is correct.
Answer 33: B
Explanation:
𝑋1 , 𝑋2 , … , 𝑋𝑛 are independent, we have that E(𝑌𝑛 ) = E(𝑋1 ) × … × 𝐸(𝑋2 ). Similarly,
𝐸(𝑌𝑛2 ) = E(𝑋12 ) × … × 𝐸(𝑌𝑛2 ). Since
E(𝑋𝑖 ) = 1/2 and E(𝑌𝑖2 ) = 1/3 for i = 1,2, … , n
it follows that
1 1
Var (𝑌𝑛 ) = 𝐸(𝑌𝑛2 ) − [E(𝑌𝑛 )]2 = 𝑛 − 2𝑛
3 2
Hence option (B) is correct.
Answer 34: C
Explanation:
Note that 𝑃(𝑋1 < 𝑋2 ) + 𝑃(𝑋2 < 𝑋1 ) + 𝑃(𝑋1 = 𝑋2 ) = 1
since the corresponding events are disjoint and exhaust
all the probabilities. But 𝑃(𝑋1 < 𝑋2 ) = 𝑃(𝑋2 < 𝑋1 )
by symmetry. Furthermore, 𝑃(𝑋1 = 𝑋2 ) = 0
1
since the random variables are continuous. Therefore, 𝑃(𝑋1 < 𝑋2 ) = 2. From above results
1
𝑃(𝑋3 < 𝑋2 < max(𝑋1, 𝑋4 )) =
4
Hence option C is correct.

272 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Answer 35: A
Explanation:
1 1
𝑋1 , 𝑋2 , … , 𝑋𝑛 be a random sample from U (𝜃 − 2 , 𝜃 + 2)
1 1
𝑓(𝑥) = 1; 𝜃 − < 𝑥𝑖 < 𝜃 +
2 2
1 1
ˆ
𝜃 ∈ [𝑋(𝑛) − 2 , 𝑋(1) + 2] ; distribution of 𝑋
1 1
free from parameter, then 𝜃ˆ = 𝜆 (𝑋(𝑛) − ) + (1 − 𝜆) (𝑋(1) + ) ; 0 < 𝜆 < 1
2 2
1 1
Take 𝜆 = 2 , 4 we get
1
𝑇1 = 2 (𝑋(1) + 𝑋(𝑛) ) is MLE as well as consistent for 𝜃
1
𝑇2 = (3𝑋(1) + 𝑋(𝑛) + 1)
4
is MLE as well as consistent for 𝜃 but not unbiased…
Hence option (A) is correct.

9.8 REFERENCES
• Devore, J. (2012). Probability and statistics for engineers, 8th ed. Cengage Learning.
• John A. Rice (2007). Mathematical Statistics and Data Analysis, 3rd ed. Thomson
Brooks/Cole
• Larsen, R., Marx, M. (2011). An introduction to mathematical statistics and its
applications. Prentice Hall.
• Miller, I., Miller, M. (2017). J. Freund’s mathematical statistics with applications, 8th
ed. Pearson.
• Demetri Kantarelis, D. and Malcolm O. Asadoorian, M. O. (2009). Essentials of
Inferential Statistics, 5th edition, University Press of America.
• Hogg, R., Tanis, E., Zimmerman, D. (2021) Probability and Statistical inference,
10TH Edition, Pearson
9.9 SUGGESTED READINGS
• S. C Gupta, V.K Kapoor, Fundamentals of Mathematical Statistics,Sultan Chand
Publication, 11th Edition.
• B.L Agarwal, Programmed Statistics ,New Age International Publishers, 2nd Edition.

273 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

LESSON 10
TESTING OF EQUALITY OF MEAN AND VARIANCE

STRUCTURE
10.1 Learning Objectives
10.2 Introduction
10.3 Testing of Equality of Mean and Variance
10.3.1 Test for the Mean of a Normal Population
10.3.2 Test for the Mean of Several Normal Population
10.3.3 Test for the Varinace of Normal Population
10.3.4 Test for the Variance of Several Normal Population
10.4 In-Text Questions
10.5 Summary
10.6 Glossary
10.7 Answer To In-Text Questions
10.8 References
10.9 Suggested Readings
10.1 LEARNING OBJECTIVES
In this chapter our main aim to understand how to test equality of mean and variance of two
normal as well as several normal population.
10.2 INTRODUCTION
The main problems in statistical inference can be broadly classified into two areas:
(i) The area of estimation of population parameter(s) and setting up of confidence intervals
for them, i.e, the area of point and intertal estimation and
(ii) Tests of statistical hypothesis.
In Neyman-Pearson theory, we use statistical methods to arrive at decisions in certain
situations where there is lack of certainty on the basis of a sample whose size is fixed in
advance while in Wald's sequential theory the sample size is not fixed but is regarded as a
random variable.

274 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

10.3 TESTING OF EQUALITY OF MEAN AND VARIANCE


In this Section we will discuss these topics in details.
(i) Test for the Mean of a Normal Population
(ii) Test for the Mean of Several Normal Population
(iii) Test for the Varinace of Normal Population
(iv) Test for the Variance of Several Normal Population
10.3.1 Test for the Mean of a Normal Population
Let us take the problem of testing if the mean of a normal population has a specified value. Let
(𝑥1 , 𝑥2 , … , 𝑥𝑛 ) be a random sample of size 𝑛 from the normal population with mean 𝜇 and
variance 𝜎 2 , where 𝜇 and 𝜎 2 are unknown. Suppose we want to test the (composite) null
hypothesis: 𝐻0 : 𝜇 = 𝜇0 (specified), 0 < 𝜎 2 < ∞ against the composite alternative hypothesis
: 𝐻1 : 𝜇 ≠ 𝜇0 ; 0 < 𝜎 2 < ∞
In this case the parameter space Θ is given by
Θ = {(𝜇, 𝜎 2 ): −∞ < 𝜇 < ∞, 0 < 𝜎 2 < ∞}
and the subspace Θ0 determined by the null hypothesis 𝐻0 is given by
Θ0 = {(𝜇, 𝜎 2 ): 𝜇 = 𝜇0 , 0 < 𝜎 2 < ∞}

The likelihood function of the sample observations 𝑥1 , 𝑥2 , … , 𝑥𝑛 is given by


𝑛
1 𝑛/2 1
𝐿=( 2
) ⋅ exp {− 2 ∑ (𝑥𝑖 − 𝜇)2 }
2𝜋𝜎 2𝜎
𝑖=1
2
The maximum likelihood estimates of 𝜇 and 𝜎 are given by :
𝑛 𝑛
1 1
𝜇ˆ = ∑ 𝑥𝑖 = 𝑥‾, 𝜎ˆ 2 = ∑ (𝑥𝑖 − 𝑥‾)2 = 𝑠 2
𝑛 𝑛
𝑖=1 𝑖=1

Hence, substituting in (18.25), the maximum of 𝐿 in the parameter space Θ is given by


1 𝑛/2
𝐿(Θ̂) = ( ) ⋅ exp (−𝑛/2)
2𝜋𝑠 2
In Θ0 , the only variate parameter is 𝜎 2 and 𝑀𝐿𝐸 of 𝜎 2 for given 𝜇 = 𝜇0 is given by

275 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

1
𝜎ˆ 2 = ∑ (𝑥𝑖 − 𝜇0 )2 = 𝑠02 , ( say )
𝑛
1
= ∑ (𝑥𝑖 − 𝑥‾ + 𝑥‾ − 𝜇0 )2
𝑛
1
= ∑ (𝑥𝑖 − 𝑥‾)2 + (𝑥‾ − 𝜇0 )2 ,
𝑛
the product term vanishes, since ∑(𝑥𝑖 − 𝑥‾)(𝑥‾ − 𝜇0 ) = (𝑥‾ − 𝜇0 )Σ(𝑥𝑖 − 𝑥‾) = 0
𝜎ˆ 2 = 𝑠 2 + (𝑥‾ − 𝜇0 )2 = 𝑠02 , ( say ).
Hence, substituting then,
𝑛/2
1
𝐿(Θ̂0 ) = ( ) exp (−𝑛/2)
2𝜋𝑠02
The ratio of gives the likelihood ratio criterion
𝑛/2
𝐿(Θ̂0 )𝑠2
𝜆 = = ( 2)
𝐿(Θ̂) 𝑠0
𝑛/2 𝑛/2
𝑠2 1
={ 2 } ={ }
𝑠 + (𝑥‾ − 𝜇0 )2 1 + [(𝑥‾ − 𝜇0 )2 /𝑠 2 ]

we have proved earlier that under 𝐻0 , the statistic


𝑥‾ − 𝜇0 1 2 2
𝑛𝑠 2
𝑡= , where 𝑆 = Σ(𝑥𝑖 − 𝑥‾) =
𝑆/√𝑛 𝑛−1 𝑛−1

follows Student's 𝑡-distribution with (𝑛 − 1) d.f.


𝑥‾−𝜇 𝑥‾−𝜇0
Thus, 𝑡 = 𝑆/ 𝑛0 = 𝑠/√𝑛−1 ∼ 𝑡𝑛−1

after Substituting , we get
1
𝜆= 𝑛/2
= 𝜙(𝑡 2 ), ( say )
𝑡2
(1 + )
𝑛−1
The likelihood ratio test for testing 𝐻0 against 𝐻1 consists in finding a critical region of the
type 0 < 𝜆 < 𝜆0 , where 𝜆0 is given by, which requires the distribution of 𝜆 under H0 . In this
case, it is not necessary to obtain the distribution of 𝜆 since 𝜆 = 𝜙(𝑡) is a monotonic function
of 𝑡 2 and the test can well be carried on with 𝑡 2 as a criterion as with 𝜆 . Now 𝑡 2 = 0 when
𝜆 = 1 and 𝑡 2 becomes infinite when 𝜆 = 0. The critical region of the 𝐿𝑅 test viz., 0 < 𝜆 < 𝜆0 ,
now it is equivalent to

276 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

−𝑛/2 𝑛/2
𝑡2 𝑡2
(1 + ) ≤ 𝜆0 ⇒ (1 + ) ≥ 𝜆−1
0
𝑛−1 𝑛−1
𝑡2
⇒ ≥ (𝜆0 )−2/𝑛 − 1 ⇒ 𝑡 2 ≥ (𝑛 − 1)[𝜆0 − 2/𝑛 − 1] = 𝐴2 , (say).
𝑛−1

Thus the critical region may 𝑢.ll be defined by


√𝑛(𝑥‾ − 𝜇0 )
|𝑡| = | |≥𝐴
𝑆
where the constant 𝐴 is determined such that
𝑃[|𝑡| ≥ 𝐴 ∣ 𝐻0 ] = 𝛼
Since under 𝐻0 , the statistic 𝑡 follows Student's 𝑡 distribution with (𝑛 − 1)𝑑. 𝑓. , 𝐴 =
𝑡𝑛−1 (𝛼/2) where the symbol 𝑡𝑛 (𝛼) stands for the right tail 100𝛼% point of the 𝑡-distribution
with 𝑛 d.f. given by :

𝑃{𝑡 > 𝑡𝑛 (𝛼)} = ∫ 𝑓(𝑡)𝑑𝑡 = 𝛼,
𝑡𝑛 (𝛼)

where 𝑓(⋅) is the p.d.f. of Student's 𝑡 with 𝑛 d.f. The critical region is shown in the following
diagram.

Thus for testing 𝐻0 : 𝜇 = 𝜇0 against 𝜇 ≠ 𝜇0 ( 𝜎 2 -unknown), we have the two-tailed t-test


defined as follows :
√𝑛(𝑥‾−𝜇0 )
If |𝑡| = | | > 𝑡𝑛−1 (𝛼/2), reject H0 and if |𝑡| < 𝑡𝑛−1 (𝛼/2), 𝐻0 may be accepted.
𝑆

Important Remarks 1. Let us now consider the problem of testing the hypothesis :
𝐻0 : 𝜇 = 𝜇0 , 0 < 𝜎 2 < ∞.
against the alternative hypothesis
𝐻1 : 𝜇 > 𝜇0 , 0 < 𝜎 2 < ∞

277 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

Here
and
Θ = {(𝜇, 𝜎 2 ): −∞ < 𝜇 < ∞, 0 < 𝜎 2 < ∞}
Θ0 = {(𝜇, 𝜎 2 ): 𝜇 = 𝜇0 , 0 < 𝜎 2 < ∞}
The maximum likelihood estimates of 𝜇 and 𝜎 2 belonging to Θ are given by
𝑥‾, if 𝑥‾ ≥ 𝜇0
𝜇ˆ = {
𝜇0 , if 𝑥‾ < 𝜇0
𝑠 2 , if 𝑥‾ ≥ 𝜇0
𝜎ˆ 2 = { 2
𝑠0 , if 𝑥‾ < 𝜇0
𝑛
1
𝑠02 = ∑ (𝑥𝑖 − 𝜇0 )2
𝑛
𝑖=1

Thus
1 𝑛/2
( ) ⋅ exp (−𝑛/2), if 𝑥‾ ≥ 𝜇0
2𝜋𝑠 2
𝐿(Θ̂) = 𝑛/2
1
( 2) ⋅ exp (−𝑛/2), if 𝑥‾ < 𝜇0
{ 2𝜋𝑠0
In Θ0 , the only unknown parameter is 𝜎 2 whose 𝑀𝐿𝐸 is given by 𝜎ˆ 2 = 𝑠02 . Thus
𝑛/2
1
𝐿(Θ̂0 ) =( ) ⋅ exp (−𝑛/2)
2𝜋𝑠02
𝐿(Θ̂0 ) (𝑠 2 /𝑠02 )𝑛/2 , if 𝑥‾ ≥ 𝜇0
∴ 𝜆 = ={
𝐿(Θ̂) 1, if 𝑥‾ < 𝜇0
Thus the sample observations (𝑥1 , 𝑥2 , … , 𝑥𝑛 ) for which 𝑥‾ < 𝜇0 are to be included in the
acceptance region. Hence for the sample observations for which 𝑥‾ ≥ 𝜇0 , the likelihood ratio
criterion becomes
𝜆 = (𝑠 2 /𝑠02 )𝑛/2 , 𝑥‾ ≥ 𝜇0
which is the same as the expression obtained in (18.29). Proceeding similarly as in the above
problem, the critical region of the form 0 < 𝜆 < 𝜆0 will be equivalently given by 𝑡 2 =
𝑛(𝑥‾−𝜇0 )2 √𝑛(𝑥‾−𝜇 )
≥ 𝐴2 or by 𝑡 = 0
≥ 𝐴 where 𝑡 follows Student's t distribution with (𝑛 − 1)
𝑆2 𝑆
d.f. The constant 𝐴 is to be determined so that 𝑃(𝑡 > 𝐴) = 𝛼 ⇒ 𝐴 = 𝑡𝑛−1 (𝛼)
Hence for testing 𝐻0 : 𝜇 = 𝜇0 against 𝐻1 : 𝜇 > 𝜇0 , we have the right tailed-t-test defined as
follows :

278 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

√𝑛(𝑥‾−𝜇0 )
1. Reject 𝐻0 if 𝑡 = > 𝑡𝑛−1 (𝛼) and if 𝑡 < 𝑡𝑛−1 (𝛼), 𝐻0 may be accepted.
𝑆

2. If we want to test 𝐻0 : 𝜇 = 𝜇0 , 0 < 𝜎 2 < ∞ against the alternative hypothesis : 𝐻1 : 𝜇 <


𝜇0 , 0 < 𝜎 2 < ∞, then proceeding exactly similarly as in Remark 1 above, we shall get
the critical region given by : 𝑡 < −𝑡𝑛−1 (𝛼)
In this case we have the left tailed t-test defined as follows :
√𝑛(𝑥‾−𝜇0 )
If 𝑡 = < −𝑡𝑛−1 (𝛼), reject 𝐻0 otherwise 𝐻0 may be accepted.
𝑆

3. We summarise below in a tabular form the test criterion, along with the confidence
interval for the parameter for testing the hypothesis 𝐻0 : 𝜇 = 𝜇0 against various
alternatives for the normal population when 𝜎 2 is not known.
[Here 𝑡𝑛 (𝛼) is upper 𝛼-point of the 𝑡-distrbution with 𝑛 d.f. as defined in (18.33a)].
NORMAL POPULATION 𝑵(𝝁, 𝝈𝟐 ); 𝝈𝟐 UNKNOWN

Test For the Equality of Means of Two Normal Populations.


Let us consider two independent random variables 𝑋1 and 𝑋2 following normal distributions
𝑁(𝜇1 , 𝜎1 2 ) and 𝑁(𝜇2 , 𝜎2 2 ) respectively where the means 𝜇1 , 𝜇2 and the variances 𝜎1 2 , 𝜎2 2
are unspecified. Suppose we want to test the hypothesis :
𝐻0 : 𝜇1 = 𝜇2 = 𝜇, (say), (unspecified); 0 < 𝜎12 < ∞, 0 < 𝜎22 < ∞,
against the alternative hypothesis
𝐻1 : 𝜇1 ≠ 𝜇2 , 𝜎12 > 0, 𝜎22 > 0

279 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

Case I. Population variances are unequal.


Θ = {(𝜇1 , 𝜇2 , 𝜎12 , 𝜎2 2 ): −∞ < 𝜇𝑖 < ∞, 𝜎𝑖2 > 0, 𝑖 = 1,2}
and
Θ0 = {(𝜇, 𝜎1 2 , 𝜎2 2 ): −∞ < 𝜇 < ∞, 𝜎𝑖 2 > 0, 𝑖 = 1,2}
Let 𝑥1𝑖 (𝑖 = 1,2, … , 𝑚) and 𝑥2𝑗 (𝑗 = 1,2, … , 𝑛) be two independent random samples of sizes
𝑚 and 𝑛 from the populations 𝑁(𝜇1 , 𝜎1 2 ) and 𝑁(𝜇2 , 𝜎2 2 ) respectively. Then the likelihood
function is given by
𝑚/2 𝑚
1 1
𝐿=( ) ⋅ exp { − 2 ∑ (𝑥1𝑖 − 𝜇1 )2 }
2𝜋𝜎12 2𝜎1
𝑖=1
𝑛/2 𝑛
1 1 2
×( ) ⋅ exp {− 2 ∑ (𝑥2𝑗 − 𝜇2 ) }
2𝜋𝜎22 2𝜎2
𝑗=1

The maximum likelihood estimates for 𝜇1 , 𝜇2 , 𝜎12 and 𝜎22 are given by the equations :
𝑚
∂ 1
log 𝐿 = 0 ⇒ 𝜇ˆ1 = ∑ 𝑥1𝑖 = 𝑥‾1
∂𝜇1 𝑚
𝑖=1
𝑛
∂ 1
log 𝐿 = 0 ⇒ 𝜇2 = ∑ 𝑥2𝑗 = 𝑥‾2
∂𝜇2 𝑛
𝑗=1
𝑚
∂ 1
ˆ12 = ∑ (𝑥1𝑖 − 𝑥‾1 )2 = 𝑠12 , (say).
2 log 𝐿 = 0 ⇒ 𝜎
∂𝜎1 𝑚
𝑖=1
𝑛
∂ 2
1 2
and 2 log 𝐿 = 0 ⇒ 𝜎
ˆ 2 = ∑ (𝑥2𝑗 − 𝑥‾2 ) = 𝑠22 , (say).
∂𝜎2 𝑛
𝑗=1

Substituting in (16.41), we get


𝑚/2 𝑛/2
1 1
𝐿(Θ) = ( ) ⋅( ) ⋅ 𝑒 −(𝑚+𝑛)/2
2𝜋𝑠12 2𝜋𝑠22
In Θ0 , we have 𝜇1 = 𝜇2 = 𝜇 and the likelihood function is given by :

280 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

𝑚/2 𝑚 𝑛/2
1 1 2
1
𝐿(Θ0 ) = ( ) ⋅ exp {− ∑ (𝑥1𝑖 − 𝜇) } × ( )
2𝜋𝜎1 2 2𝜎1 2 2𝜋𝜎2 2
𝑖=1
𝑛
1 2
⋅ exp {− 2
∑ (𝑥2𝑗 − 𝜇) }
2𝜎2
𝑗=1

To obtain the maximum value of 𝐿(Θ0 ) for variations in 𝜇, 𝜎12 and 𝜎22 , it will be seen that
estimate of 𝜇 is obtained as the root of a cubic equation
𝑚2 (𝑥‾1 − 𝜇) 𝑛2 (𝑥‾2 − 𝜇)
+
∑𝑚
𝑖=1 (𝑥1𝑖 − 𝜇ˆ)
2 ∑𝑛𝑖=1 (𝑥2𝑖 − 𝜇ˆ)2
and is thus a complicated function of the sample observations. Consequently the likelihood
ratio criterion 𝜆 will be a complex function of the observations and its distribution is quite
tedious since it involves the ratio of two variances. Consequently, it is impossible to obtain the
critical region 0 < 𝜆 < 𝜆0 for given 𝛼 since the distribution of the population variances is
ordinarily unknown. However, in any given instance the cubic equation (16.43) can be solved
for 𝜇 by numerical analysis technique and thus 𝜆 can be computed. Finally, as an approximate
test, −2log 𝑒 𝜆 can be regarded as a 𝜒 2 -variate with 1 d.f. (c.f. Theorem 18.2).

Case 2. Population Variances are equal, i.e., 𝜎12 = 𝜎22 = 𝜎 2 , (say). In this case
Θ = {(𝜇1 , 𝜇2 , 𝜎 2 ): −∞ < 𝜇𝑖 < ∞, 𝜎 2 > 0, (𝑖 = 1,2)}
Θ0 = {(𝜇, 𝜎 2 ): −∞ < 𝜇 < ∞, 𝜎 2 > 0}
The likelihood function is then given by
𝑚 𝑛
1 (𝑚+𝑛)/2 1 2
𝐿=( 2
) ⋅ exp [− 2 {∑ (𝑥1𝑖 − 𝜇1 )2 + ∑ (𝑥2𝑗 − 𝜇2 ) )]
2𝜋𝜎 2𝜎
𝑖=1 𝑗=1
2
For 𝜇1 , 𝜇2 , 𝜎 ∈ Θ, the maximum

For 𝜇1 , 𝜇2 , 𝜎 2 ∈ Θ, the maximum likelihood equations are given by

∂ ∂
log 𝐿 = 0 ⇒ 𝜇ˆ1 = 𝑥‾1 and log 𝐿 = 0 ⇒ 𝜇ˆ2 = 𝑥‾2
∂𝜇1 ∂𝜇2
∂ 2
1 2
2
log 𝐿 = 0 ⇒ 𝜎
ˆ = {Σ(𝑥1𝑖 − 𝜇ˆ1 )2 + Σ(𝑥2𝑗 − 𝜇ˆ2 ) }
∂𝜎 𝑚+𝑛

281 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

1 2 1
⇒ 𝜎ˆ 2 = {Σ(𝑥1𝑖 − 𝑥‾1 )2 + Σ(𝑥2𝑗 − 𝑥‾2 ) } = (𝑚𝑠12 + 𝑛𝑠22 )
𝑚+𝑛 𝑚+𝑛
… (18.45𝑎) 33
Substituting the values from (18.45) and (18.45a) in (18.44), we get
(𝑚+𝑛)/2
(𝑚 + 𝑛) 1
𝐿(Θ̂) = { 2 2
} ⋅ exp {− (𝑚 + 𝑛)}
2𝜋(𝑚𝑠1 + 𝑛𝑠2 ) 2
In Θ0 , 𝜇1 = 𝜇2 = 𝜇 (say) and we get

𝑚 𝑛
1 (𝑚+𝑛)/2 1 2
𝐿(Θ0 ) =( 2
) ⋅ exp [− 2 {∑ (𝑥1𝑖 − 𝜇)2 + ∑ (𝑥2𝑗 − 𝜇) }]
2𝜋𝜎 2𝜎
𝑖=1 𝑗=1

𝑚+𝑛 1 2
⇒ log 𝐿(Θ0 ) =𝐶− log 𝜎 2 − 2 {∑ (𝑥1𝑖 − 𝜇)2 + ∑ (𝑥2𝑗 − 𝜇) } ,
2 2𝜎
𝑖 𝑗

where 𝐶 is a constant independent of 𝜇 and 𝜎 2 .

The likelihood equation for estimating 𝜇 gives


𝑚 𝑛
∂ 1
log 𝐿 = − 2 {∑ (𝑥1𝑖 − 𝜇) + ∑ (𝑥2𝑗 − 𝜇)} = 0 ⇒ (𝑚𝑥‾1 + 𝑛𝑥‾2 ) − (𝑚 + 𝑛)𝜇 = 0
∂𝜇 𝜎
𝑖=1 𝑗=1
1
⇒ 𝜇ˆ = [𝑚𝑥‾1 + 𝑛𝑥‾2 ]
𝑚+𝑛
∂ (𝑚 + 𝑛) 1 2
Also 2
log 𝐿 = 0 ⇒ − 2
+ 4
[∑ (𝑥1𝑖 − 𝜇)2 + ∑ (𝑥2𝑗 − 𝜇) ] = 0
∂𝜎 2𝜎 2𝜎
1
⇒ 𝜎ˆ 2 = {Σ(𝑥1𝑖 − 𝜇ˆ)2 + Σ(𝑥2𝑖 − 𝜇ˆ)2 }
𝑚+𝑛
= Σ(𝑥1𝑖 − 𝑥‾1 )2 + 𝑚(𝑥‾1 − 𝜇ˆ)2 ,

the product term vanishes since ∑𝑖 (𝑥1𝑖 − 𝑥‾1 ) = 0.


𝑚
𝑚𝑥‾1 + 𝑛𝑥‾2 2
∴ ∑ (𝑥1𝑖 − 𝜇ˆ) 2
= 𝑚𝑠12 + 𝑚 (𝑥‾ − )
𝑚+𝑛
𝑖=1
𝑚𝑚2 (𝑥‾1 − 𝑥‾2 )2
= 𝑚𝑠12 +
(𝑚 + 𝑛)2
Similarly, we shall get :

282 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

𝑛
2 𝑛𝑚2 (𝑥‾2 − 𝑥‾1 )2
∑ (𝑥2𝑗 − 𝜇ˆ) = 𝑛𝑠22 +
(𝑚 + 𝑛)2
𝑗=1

After Substituting, we get


1 𝑚𝑛
𝜎ˆ 2 = {𝑚𝑠12 + 𝑛𝑠22 + (𝑥‾ − 𝑥‾2 )2 }
𝑚+𝑛 𝑚+𝑛 1
Substituting from (18.48) and (18.49𝑎) in (18.47), we get
(𝑚+𝑛)/2
(𝑚 + 𝑛) 𝑚+𝑛
𝐿(Θ̂0 ) = { 𝑚𝑚 } × exp (− )
2𝜋 (𝑚𝑠12 + 𝑛𝑠22 + 𝑚 + 𝑛 (𝑥‾1 − 𝑥‾2 )2 ) 2

(𝑚+𝑛)/2
2 2
𝐿(Θ̂0 ) 𝑚𝑠1 + 𝑛𝑠2
∴ 𝜆 = ={ 𝑚𝑛 }
𝐿(Θ̂2 𝑚𝑠1 2 + 𝑛𝑠2 2 + 𝑚 + 𝑛 (𝑥‾1 − 𝑥‾2 )2
(𝑚+𝑛)/2

1
=[ ]
𝑚𝑛(𝑥‾1 − 𝑥‾2 )2
{1 + }
(𝑚 + 𝑛)(𝑚𝑠1 2 + 𝑛𝑠2 2 )

We know that (c.f. $16 ⋅ 3.3 ), under the null hypothesis 𝐻0 : 𝜇1 = 𝜇2 , the statistic :
ar
where
𝑥‾1 − 𝑥‾2
𝑡 = ,
1 1
𝑠√𝑚 + 𝑛
1
𝑆2 = (𝑚𝑠1 2 + 𝑛𝑠2 2 )
𝑚+𝑛−2
follows Student's 𝑡-distribution with (𝑚 + 𝑛 − 2) d.f. Thus in terms of 𝑡, we get
−(𝑚+𝑛)/2
𝑡2
𝜆 = (1 + )
𝑚+𝑛−2

the test can as well be carried with 𝑡 rather than with 𝜆. The critical region 0 < 𝜆 < 𝜆0
transforms to the critical region of the type

283 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

1
𝑡 2 > (𝑚 + 𝑛 − 2) [ 2 − 1] = 𝐴2 , ( say )
𝜆0 /(𝑚 + 𝑛)
i.e., by
where 𝐴 is determined so that: 𝑃[|𝑡| > 𝐴 ∣ 𝐻0 ] = 𝛼
Since under 𝐻0 , the statistic 𝑡 follows Student's 𝑡-distribution with (𝑚 + 𝑛 − 2) d.f.,
𝛼
we get : 𝐴 = 𝑡𝑚+𝑛−2 ( 2 )
where, 𝑡𝑛 (𝛼) is the right 100𝛼% point of the 𝑡-distribution with 𝑛𝑑. 𝑓.
Thus for testing the null hypothesis against the alternative :
𝐻0 : 𝜇1 = 𝜇2 ; 𝜎12 = 𝜎22 = 𝜎 2 > 0
𝐻1 : 𝜇1 ≠ 𝜇2 , 𝜎12 = 𝜎22 = 𝜎 2 > 0

𝜒 we have the two-tailed 𝑡-test defined as follows;


𝑥‾1 −𝑥‾2
If |𝑡| = | 1 1
| > 𝑡𝑚+𝑛−2 (𝛼/2), reject 𝐻0 , otherwise 𝐻0 may be accepted.
𝑠√ +
𝑚 𝑛

Remarks 1. Proceeding similarly as in Remarks to §18 ⋅ 7 ⋅ 1, we can obtain the critical


regions for testing
𝐻0 : 𝜇1 = 𝜇2 ; 𝜎12 = 𝜎2 2 = 𝜎 2 > 0
against the alternative hypothesis
or
𝐻1 : 𝜇1 > 𝜇2 ; 𝜎1 2 = 𝜎2 2 = 𝜎 2 > 0
𝐻1 : 𝜇1 < 𝜇2 ; 𝜎1 2 = 𝜎2 2 = 𝜎 2 > 0
We give below, in a tabular form the critical region, the test statistic and the confidence
interval for testing the hypothesis
𝐻0 : 𝛿 = 𝜇1 − 𝜇2 = 𝛿0 , (say),
against various alternatives, viz., 𝛿 > 𝛿0 , 𝛿 < 𝛿0 or 𝛿 ≠ 𝛿0 .
2 For testing 𝐻0 : 𝛿 = 𝛿0 against the alternative 𝐻1 : 𝛿 < 𝛿0 , the roles of 𝑥1 and 𝑥2 are
nterchanged and the case 1 of the table is applied.
3 If 𝛿0 = 0, the above test reduces to testing 𝐻0 : 𝜇1 = 𝜇2 , i.c., the equality of two
population
4 If the two population variances are not equal, then for testing 𝐻0 : 𝛿 = 𝛿0 , we use
Fisher-Behrens' d-test.

284 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

10.3.2 Test for the Equality of Mean of Several Normal Populations:


Let 𝑋𝑖𝑗 𝑗 = 1,2, … , 𝑛𝑖 ; 𝑖 = 1,2, … 𝑘 ) be 𝑘 independent random samples from 𝑘 normal
opulations with means 𝜇1 , 𝜇2 , … , 𝜇𝑘 respectively and unknown but common variance S 2 . In
other words, the 𝑘 normal populations are supposed to be homoscedastic. We vant to test the
null hypothesis
𝐻0 : 𝜇1 = 𝜇2 = ⋯ = 𝜇𝑘 = 𝜇 (say), (unspecified)
𝜎12 = 𝜎22 = ⋯ = 𝜎𝑘2 = 𝜎 2 (say), (unspecified)
against the alternative hypothesis
𝐻1 : 𝜇𝑖′ 's are not all equal, and
𝜎12 = 𝜎22 = ⋯ = 𝜎𝑘2 = 𝜎 2 , (unspecified)
Thus we have, and Θ0 = {(𝜇1 , 𝜇2 , … , 𝜇𝑘 , 𝜎 2 ) − ∞ < 𝜇𝑖 = 𝜇 < ∞, (𝑖 = 1,2, … , 𝑘)𝜎 2 > 0}
The likelihood function of the sample observations is given by
1 𝑛/2 1 2
𝑛
pI 𝐿(Θ) = (2𝜋𝜎2 ) ⋅ exp {− 2𝜎2 ∑𝑘𝑖=1 ∑𝑗=1 𝑖
(𝑥𝑖𝑗 − 𝜇𝑖 ) } where 𝑛 = ∑𝑘𝑖=1 𝑛𝑖 .
𝑁 ( For variations of 𝜇𝑖 , (𝑖 = 1,2, … , 𝑘) and 𝜎 2 in Θ, the maximum likelihood estimates
given by
𝑛𝑖
∂ 1
agai log 𝐿(Θ) = 0 ⇒ ∑ (𝑥𝑖𝑗 − 𝜇𝑖 ) = 0 ⇒ 𝜇ˆ𝑖 = ∑ 𝑥𝑖𝑗 = 𝑥‾𝑖
∂𝜇𝑖 𝑛𝑖
𝑗 𝑗=1
∂ 1 2
2
log 𝐿(Θ) = 0 ⇒ 𝜎ˆ 2 = ∑ ∑ (𝑥𝑖𝑗 − 𝜇ˆ𝑖 )
∂𝜎 𝑛
𝑖 𝑗
1 2 𝑆𝑊
⇒ 𝜎ˆ 2 = ∑ ∑ (𝑥𝑖𝑗 − 𝑥‾𝑖 ) = , (say),
𝑛 𝑛
𝑖 𝑗

285 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

where in ANOVA (Analysis of Variance) terminology, 𝑆𝑊 is called within sample sum of


squares (W.S.S.).
In Θ0 , the only variable parameters are 𝜇 and 𝜎 2 and we have
1 𝑛/2 1 2
𝐿(Θ0 ) = ( 2
) ⋅ exp {− 2 ∑ ∑ (𝑥𝑖𝑗 − 𝜇) }
2𝜋𝜎 2𝜎
𝑖 𝑖
2
The MLE's of 𝜇 and 𝜎 are given by
∂ 1
log 𝐿(Θ0 ) = 0 ⇒ ∑ (𝑥𝑖𝑗 − 𝜇) = 0 ⇒ 𝜇 = ∑ ∑ 𝑥𝑖𝑗 = 𝑥‾
∂𝜇 𝑛
𝑖𝑗

and
1 2 𝑆𝑇
⇒ 𝜎ˆ 2 =
∑ ∑ (𝑥𝑖𝑗 − 𝑥‾) = (say),
𝑛 𝑛
where in ANOVA terminology, 𝑆𝑇 , is called total sum of squares (T.S.S.)
after Substituting , we get respectively
𝑛 𝑛/2 𝑛
𝐿(Θ̂) =( ) ⋅ exp (− )
2𝜋𝑆𝑊 2
𝑛 𝑛/2 𝑛
𝐿(Θ̂0 ) =( ) ⋅ exp (− )
2𝜋𝑆𝑇 2
𝑛/2
𝐿(Θ̂0 ) 𝑆𝑊
𝜆 = =( )
𝐿(Θ̂ 𝑆𝑇
And We have
2 2
𝑆𝑇 = ∑ ∑ (𝑥𝑖𝑗 − 𝑥‾) = ∑ (𝑥𝑖𝑗 − 𝑥‾𝑖 + 𝑥‾𝑖 − 𝑥‾)
𝑖𝑗 𝑗 𝑖𝑗

2
= ∑ ∑ (𝑥𝑖𝑗 − 𝑥‾𝑖 ) + ∑ (𝑥‾𝑖 − 𝑥‾)2 + 2 ∑ [(𝑥‾𝑖 − 𝑥‾) ∑ (𝑥𝑖𝑗 − 𝑥‾𝑖 )]
𝑖𝑗 𝑗 𝑖𝑗 𝑖 𝑗
𝑛 𝑖
But ∑𝑗=1 (𝑥𝑖𝑗 − 𝑥‾𝑖 ) = 0, being the algebraic sum of the deviations of the observations of the
𝑖 th sample from its mean.
2
∴ 𝑆𝑇 = ∑ ∑ (𝑥𝑖𝑗 − 𝑥‾𝑖 ) + ∑ 𝑛𝑖 (𝑥‾𝑖 − 𝑥‾)2 = 𝑆𝑊 + 𝑆𝐵 , (say),
𝑖𝑗 𝑗 𝑖
2
where 𝑆𝐵 = ∑𝑖 𝑛𝑖 (𝑥‾𝑖 − 𝑥‾) , in ANOVA terminology is called between samples sum squares
(B.S.S.).
after Substituting, we get
286 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

𝑛/2
𝑆𝑊 1
𝜆=( ) =
𝑆𝑊 + 𝑆𝐵 𝑆𝐵 𝑛/2
(1 + 𝑆 )
𝑊

We know that under H0 , the statistic :


𝑆𝐵 /(𝑘 − 1)
𝐹=
𝑆𝑊 /(𝑛 − 𝑘)
follows 𝐹-distribution with (𝑘 − 1, 𝑛 − 𝑘) d.f.
Substituting in (18.64), the likelihood ratio criterion 𝜆 in terms of 𝐹 is given by
𝑘 − 1 −𝑛/2
𝜆 = (1 + 𝐹)
𝑛−𝑘
Since 𝜆 is a monotonic function of 𝐹, the test can well be carried on with 𝐹 as test statistic
rather than with 𝜆. The critical region for testing H0 against H1 , viz., 0 < 𝜆 < 𝜆0 is
equivalently given by
𝑘 − 1 𝑛/2
(1 + 𝐹) > 𝜆−1
0
𝑛−𝑘
𝑛−𝑘
⇒ 𝐹> {(𝜆0 )−2/𝑛 − 1} = 𝐴, (say),
𝑘−1
where 𝐴 is determined from the equation: 𝑃[𝐹 > 𝐴 ∣ 𝐻0 ] = 𝛼
Since F follows F-distribution with (𝑘 − 1, 𝑛 − 𝑘)𝑑. 𝑓. , 𝐴 = 𝐹𝑘−1,𝑛−𝑘 (𝛼) where
𝐹𝑘−1,𝑛−𝑘 (𝛼) denotes the upper 𝛼-point of the 𝐹-distribution with (𝑘 − 1, 𝑛 − 𝑘)𝑑. 𝑓
Hence the test for testing
𝐻0 : 𝜇1 = 𝜇2 = ⋯ = 𝜇𝑘 = 𝜇, 𝜎12 = 𝜎22 = ⋯ = 𝜎𝑘2 = 𝜎 2 > 0
against the alternative inpothesis
𝐻1 : 𝜇𝑖′ s are not all equal,,𝜎12 = 𝜎22 = ⋯ = 𝜎𝑘2 = 𝜎 2 > 0
is defined as follows :
Reject 𝐻0 if 𝐹 > 𝐹𝑘−1,𝑛−𝑘 (𝛼), otherwise 𝐻0 may be accepted, where 𝐹 is defined in (18.65).

Remark. In ANOVA terminology, 𝑆𝐵 /(𝑘 − 1) is called Between Samples Mean Sum of


Squares (M.S.S.) while 𝑆𝑊 /(𝑛 − 𝑘) is called Within Samples (or Error) Mean Sum of Squares
and thus 𝐹 is defined as
Between Samples M.S.S.
𝐹=
Within Samples M.S.S.

287 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

Test For the Variance of a Normal Population. Let us now consider the problem of testing if
the variance of a normal population has a specified value 𝜎02 on the basis of a random sample
𝑥1 , 𝑥2 , … , 𝑥𝑛 of size 𝑛 from normal population 𝑁(𝜇, 𝜎 2 )
We want to test the hypothesis: 𝐻0 : 𝜎 2 = 𝜎02 , (specified),
against the alternative hypothesis: 𝐻1 : 𝜎 2 ≠ 𝜎02
Here we have and
Θ = {(𝜇, 𝜎 2 ): −∞ < 𝜇 < ∞, 𝜎 2 > 0}
Θ0 = {(𝜇, 𝜎 2 ): −∞ < 𝜇 < ∞, 𝜎 2 = 𝜎02 }
The likelihood function of the sample observations is given by
𝑛
1 𝑛/2 1
𝐿=( 2
) , exp {− 2 ∑ (𝑥𝑖 − 𝜇)2 }
2𝜋𝜎 2𝜎
𝑖=1

we shall get
1 𝑛/2 𝑛
𝐿(Θ̂) = ( 2
) exp (− )
2𝜋𝑠 2
In Θ0 , we have only one variable parameter, viz., 𝜇 and
𝑛
1 𝑛/2 1
)
𝐿(Θ0 = ( 2
) exp {− 2 ∑ (𝑥𝑖 − 𝜇)2 }
2𝜋𝜎 2𝜎0
𝑖=1

The MLE for 𝜇 is given by : log 𝐿 = 0 ⇒ 𝜇ˆ = 𝑥‾
∂𝜇
𝑛/2 𝑛
1 1
∴ 𝐿(Θ̂0 ) =( ) exp {− 2 ∑ (𝑥𝑖 − 𝑥‾)2 }
2𝜋𝜎02 2𝜎0
𝑖=1
𝑛/2
1
=( ) exp (−𝑛𝑠 2 /2𝜎02 )
2𝜋𝜎02
The likelihood ratio criterion is given by
𝑛/2
𝐿(Θ̂0 ) 𝑠2 1 𝑛𝑠 2
𝜆= = ( 2) exp {− ( 2 − 𝑛)}
𝐿(Θ̂) 𝜎0 2 𝜎0
𝑛𝑠2
We know that under 𝐻0 , the statistic : 𝜒 2 = follows chi-square distribution with (𝑛 − 1)
𝜎02
d.f. In terms of 𝜒 2 , we have
𝑛/2
𝜒2 1
𝜆 = [𝑛] ⋅ exp [− 2 (𝜒 2 − 𝑛)]

288 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Since 𝜆 is a monotonic function of 𝜒 2 , the test may be done using 𝜒 2 as a criterion. The critical
region 0 < 𝜆 < 𝜆0 is now equivalent to
1
(𝜒 2 /𝑛)𝑛/2 exp [− (𝜒 2 − 𝑛)] < 𝜆0
2
1 2
or exp (− 𝜒 ) (𝜒 2 )𝑛/2 < 𝜆0 ⋅ (𝑛𝑒 −1 )𝑛/2 = 𝐵, (say).
2
Since 𝜒 2 has chi-square distribution with (𝑛 − 1) d.f., the critical region is determined by a
pair of intervals 0 < 𝜒 2 < 𝜒22 and 𝜒12 < 𝜒 2 < ∞, where 𝜒12 and 𝜒2 2 are to be determined such
that the ordinates of (18.73) are equal, i.e.,
1 1
(𝜒1 2 )𝑛/2 exp (− 𝜒12 ) = (𝜒2 2 )𝑛/2 exp (− 𝜒2 2 )
2 2

Critical region is shown as shaded region in the above diagram.


In other words, 𝜒1 2 and 𝜒2 2 are defined by the equations
𝛼 𝛼
𝑃(𝜒 2 > 𝜒12 ) = and 𝑃(𝜒 2 > 𝜒22 ) = 1 −
2 2
In other words, 𝜒12 = 𝜒 2 𝑛 − 1(𝛼/2) and 𝜒22 = 𝜒𝑛−1 2
(1 − 𝛼/2), where 𝜒 2 (𝑛 − 1)(𝛼) is the
upper 𝛼-point of the chi-square distribution with (𝑛 − 1) d.f. Thus the critical region for
testing 𝐻0 : 𝜎 2 = 𝜎02 against 𝐻1 : 𝜎 2 ≠ 𝜎02 , is a two-tailed region given by: 𝜒 2 > 𝜒𝑛−1
2
(𝛼/2)
2 2
and 𝜒 < 𝜒 𝑛 − 1(1 − 𝛼/2) ...(18.76)
Thus, in this case we have a two-tailed test.
Remarks. 1. If we want to test 𝐻0 : 𝜎 2 = 𝜎02 against the alternative hypothesis 𝐻1 : 𝜎 2 < 𝜎0 2 ,
we get a one-tailed (left-tail) test with critical region 𝜒 2 < 𝜒 2 (𝑛 − 1)(𝑎) while for testing 𝐻0
against 𝐻1 : 𝜎 2 > 𝜎02 , we have a right tail test with critical region 𝜒 2 > 𝜒 2 (𝑛 − 1) (𝜔) .
We give below in a tabular form, the test statistic, the test criterion and the confidence interval
for the parameter 𝜎 2 for testing 𝐻0 : 𝜎 2 = 𝜎02 , 𝜇 (unknown) against various alternative
hypotheses.

289 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

10.3.3 Test for Equality of Variance of Two Normal Populations


Consider two normal populations 𝑁(𝜇1 , 𝜎1 2 ) and 𝑁(𝜇2 , 𝜎2 2 ) where the means 𝜇1 and 𝜇2 and
variances 𝜎12 , 𝜎22 are unspecified. We want to test the hypothesis : 𝐻0 : 𝜎1 2 = 𝜎2 2 = 𝜎 2
(unspecified) with 𝜇1 and 𝜇2 (unspecified) against the alternative hypothesis : 𝐻1 : 𝜎12 ≠
𝜎2 2 ; 𝜇1 and 𝜇2 (unspecified).
If 𝑥1𝑖 (𝑖 = 1,2, … , 𝑚) and 𝑥2𝑗 (𝑗 = 1,2, … , 𝑛) be independent random samples of sizes 𝑚 and
𝑛 from 𝑁(𝜇1 , 𝜎1 2 ) and 𝑁(𝜇2 , 𝜎22 ) respectively then
𝑚/2 𝑚 𝑛/2 𝑛
1 1 1 1 2
𝐿=( ) exp {− 2 ∑ (𝑥1𝑖 − 𝜇1 )2 } × ( ) exp {− 2 ∑ (𝑥2𝑗 − 𝜇2 ) }
2𝜋𝜎12 2𝜎1 2𝜋𝜎22 2𝜎2
𝑖=1 𝑗=1

In this case Θ = {𝜇1 , 𝜇2 , 𝜎1 2 , 𝜎2 2 ): −∞ < 𝜇𝑖 < ∞; 𝜎𝑖 2 > 0, (𝑖 = 1,2)}


and Θ0 = {(𝜇1 , 𝜇2 , 𝜎 2 ): −∞ < 𝜇𝑖 < ∞; (𝑖 = 1,2), 𝜎 2 > 0}
𝑚/2 𝑛/2
1 1 1
𝐿(Θ̂) = ( ) ⋅( ) ⋅ exp {− (𝑚 + 𝑛)}
2𝜋𝑠12 2𝜋𝑠22 2
where 𝑠1 2 and 𝑠2 2 are as defined above
In Θ0 , the likelihood function (18.77) is given by

1 (𝑚+𝑛)/2 1 2
𝐿(Θ0 ) = ( 2
) ⋅ exp [− 2 {∑ (𝑥1𝑖 − 𝜇1 )2 + ∑ (𝑥2𝑗 − 𝜇2 ) }]
2𝜋𝜎 2𝜎
𝑖 𝑗

and the MLE's for 𝜇1 , 𝜇2 and 𝜎 2 are now given by 𝜇ˆ1 = 𝑥‾1 , 𝜇ˆ2 = 𝑥‾2
and

1 2
𝜎ˆ 2 = {∑ (𝑥1𝑖 − 𝜇ˆ1 )2 + ∑ (𝑥2𝑗 − 𝜇ˆ2 ) }
(𝑚 + 𝑛)
𝑖 𝑗

1 2 𝑚𝑠12 + 𝑛𝑠22
= {∑ (𝑥1𝑖 − 𝑥‾1 )2 + ∑ (𝑥2𝑗 − 𝑥‾2 ) } =
𝑚+𝑛 𝑚+𝑛
𝑖 𝑗

After Substituting , we get

290 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

(𝑚+𝑛)/2
𝑚+𝑛 1
𝐿(Θ̂0 ) ={ 2 2
} ⋅ exp {− (𝑚 + 𝑛)}
2𝜋(𝑚𝑠1 + 𝑛𝑠2 ) 2
𝐿(Θ̂)0 (𝑚+𝑛)/2
(𝑠1 ) (𝑠2 2 )𝑛/2
2 𝑚/2
∴ 𝜆 = = (𝑚 + 𝑛) { }
𝐿(Θ̂) [𝑚𝑠1 2 + 𝑛𝑠2 2 ](𝑚+𝑛)/2
(𝑚 + 𝑛)(𝑚+𝑛)/2 (𝑚𝑠1 2 )𝑚/2 (𝑛𝑠2 2 )𝑛/2
= { }
𝑚𝑚/2 ⋅ 𝑛𝑛/2 [𝑚𝑠12 + 𝑛𝑠2 2 ](𝑚+𝑛)/2

∑(𝑥1𝑖 −𝑥‾1 )2 /(𝑚−1) 𝑆2


We know that under 𝐻0 , the statistic 𝐹 = 2 = 𝑆12
∑(𝑥2𝑗 −𝑥‾2 ) /(𝑛−1) 2

follows 𝐹-distribution with (𝑚 − 1, 𝑛 − 1) d.f. also implies

𝑚(𝑛 − 1)𝑠12 𝑚−1 𝑚𝑠12


𝐹= ⇒ ( ) 𝐹 =
𝑛(𝑚 − 1)𝑠22 𝑛−1 𝑛𝑠2 2
After Substituting in and simplifying, we get
𝑚 − 1 𝑚/2
(𝑚 + 𝑛)(𝑚+𝑛)/2 ( 𝑛 − 1 𝐹)
𝜆= { }
𝑚𝑚/2 𝑛𝑛/2 𝑚 − 1 (𝑚+𝑛)/2
(1 + 𝑛 − 1 𝐹)
Thus 𝜆 is a monotonic function of 𝐹 and hence the test can be carried on with 𝐹, defined as
test statistic. The critical region 0 < 𝜆 < 𝜆0 can be equivalently seen to be given by pair of
intervals 𝐹 ≤ 𝐹1 and 𝐹 ≥ 𝐹2 , where 𝐹1 and 𝐹2 are determined so that under H0
𝑃(𝐹 ≥ 𝐹2 ) = 𝛼/2 and 𝑃(𝐹 ≥ 𝐹1 ) = 1 − 𝛼/2
Since, under 𝐻0 , 𝐹 follows Snedecor's 𝐹-distribution with (𝑚 − 1, 𝑛 − 1) d.f., we have
𝐹2 = 𝐹𝑚−1,𝑛−1 (𝛼/2) and 𝐹1 = 𝐹𝑚−1,𝑛−1 (1 − 𝛼/2)
where 𝐹𝑚,𝑛 (𝛼) is the upper 𝛼-point of 𝐹-distribution with (𝑚, 𝑛) d.f. Consequently, for
testing 𝐻0 : 𝜎12 = 𝜎22 against the alternative hypothesis 𝐻1 : 𝜎12 ≠ 𝜎22 , we have a twotailed F-
test, the critical region being given by
𝐹 > 𝐹𝑚−1,𝑛−1 (𝛼/2) and 𝐹 < 𝐹𝑚−1,𝑛−1 (1 − 𝛼/2)
𝜎2
Remark. Let us suppose that we want to test the hypothesis 𝐻0 : 𝜎12 = 𝛿02 Without loss of
2
generality, we can assume that 𝑆1 2 > 𝑆2 2 , where 𝑆1 2 and 𝑆2 2 are unbiased estimates of 𝜎12
and 𝜎2 2 respectively. We know that the statistic

291 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

𝑆1 2 /𝜎12 𝑆1 2 1
𝐹= 2 = 2 ⋅ 2 (under 𝐻0 ),
𝑆2 /𝜎2 2 𝑆2 𝛿1
follows 𝐹-distribution with (𝑚 − 1, 𝑛 − 1) d.f. The test-statistic, the test criterion and (1 − 𝛼)
confidence interval for the parameter for various alternative hypotheses are given in the
following table.
If 𝛿0 = 1, the above test reduces to testing the equality of population variances.
𝜎2
NORMAL POPULATION; 𝐻0 : 𝜎12 = 𝛿02
2

10.3.4 Test for Equality of Variance of Several Normal Populations :


Let 𝑋𝑖𝑗, (𝑗 = 1,2, … , 𝑛𝑖 ) be a random sample of size 𝑛𝑖 from the normal population 𝑁(𝜇𝑖 , 𝜎𝑖2 ),
𝑖 = 1,2, … , 𝑘. We want to test the null hypothesis :
𝐻0 : 𝜎12 = 𝜎22 = ⋯ = 𝜎𝑘2 = 𝜎 2 (unspecified), with 𝜇1 , 𝜇2 , … , 𝜇𝑘 (unspecified), against the
alternative hypothesis :
𝐻1 : 𝜎𝑖 2 (𝑖, 2, … , 𝑘), are not all equal; 𝜇1 , 𝜇2 , … , 𝜇𝑘 (unspecified).
Here we have
Θ = {𝜇1 , 𝜇2 , … , 𝜇𝑘 ; 𝜎1 2 , 𝜎2 2 , … , 𝜎𝑘2 ): −∞ < 𝜇𝑖 < ∞, 𝜎𝑖2 > 0(𝑖 = 1,2, … , 𝑘)}
and
Θ0 = {𝜇1 , 𝜇2 , … , 𝜇𝑘 ; 𝜎12 , 𝜎22 , … , 𝜎𝑘2 ): −∞ < 𝜇𝑖 < ∞, 𝜎𝑖2 = 𝜎 2 > 0, (𝑖 = 1,2, … , 𝑘)}
The likelihood function of the sample observations 𝑥𝑖𝑗 (𝑗 = 1,2, … , 𝑛𝑖 : 𝑖 = 1,2, … , 𝑘) is given
by

292 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

𝑘 𝑛𝑖 /2 𝑛𝑖
1 1
𝐿 = ∏ [( ) ⋅ exp {− 2 ∑ (𝑥𝑖𝑗 − 𝜇𝑖 )}]
2𝜋𝜎𝑖2 2𝜎𝑖
𝑖=1 𝑗=1

It can be easily seen that in Θ the MLE's of 𝜇𝑖′ 's and 𝜎𝑖′ 's are given by
𝑛𝑖
1 2
𝜇ˆ𝑖 = and 𝜎ˆ𝑖2 = ∑ (𝑥𝑖𝑗 − 𝑥‾𝑖 ) = 𝑠𝑖 2
𝑛𝑖
𝑗=1
𝑘 𝑛𝑖 /2
1 𝑛𝑖
16.4. ∴ 𝐿(Θ̂) = ∏ {( ) ⋅ exp (− )}
2𝜋𝑠𝑖2 2
𝑖=1
𝑘 𝑛𝑖 /2
𝑛 1
= exp (− ) ⋅ ∏ {( ) } , where 𝑛 = ∑ 𝑛𝑖
2 2𝜋𝑠𝑖2
𝑖=1

In Θ0 , 𝜎12 = 𝜎22 =⋯= 𝜎𝑘2 2


= 𝜎 and therefore

1 𝑛/2 1 2
𝐿(Θ0 ) = ( 2
) ⋅ exp {− 2 ∑ ∑ (𝑥𝑖𝑗 − 𝜇𝑖 ) }
2𝜋𝜎 2𝜎
𝑗 𝑖

The 𝑀𝐿𝐸 's of 𝜇𝑖′ s and 𝜎 2 are given by


1 2 1
𝜇ˆ𝑖 = 𝑥‾𝑖 and 𝜎ˆ 2 = ∑ ∑ (𝑥𝑖𝑗 − 𝑥‾𝑖 ) = ∑ 𝑛𝑖 𝑠𝑇 2
𝑛 𝑛
𝑖 𝑖

After Substituting , we get


𝑛/2
𝑛 𝑛
𝐿(Θ̂0 ) =( ) ⋅ exp (− )
2𝜋 ∑ 𝑛𝑖 𝑠𝑖2 2
𝑛/2 ∏1 2 𝑛𝑖 /2
𝐿(Θ̂0 ) 𝑛 𝑖=1 [(𝑠𝑖 ) ]
𝜆 = =
𝐿(Θ̂2 ) {∑𝑘𝑖=1 𝑛𝑠𝑖2 }𝑛/2
∏1𝑖=1 {(𝑠𝑖2 )𝑛𝑖/2 } 1
= 2 2𝑛 /2
, where 𝑠 2
= ∑ 𝑛𝑖 𝑠𝑖2
(𝑠 ) 𝑖 𝑛
𝑖
𝑘 𝑛 /2
𝑠𝑖2 𝑖
=∏ {( 2 ) }
𝑠
𝑖=1

𝜆 is thus a complicated function of sample observations and it is not easy to obtain its
distribution. However, if 𝑛𝑖′ 's are large (𝑖 = 1,2, … , 𝑘), it provides an approximate test defined
as follows :
293 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

For large 𝑛𝑖′ s, the quantity −2log 𝜆 𝜆 is approximately distributed as a chi-square variate with
2𝑘 − (𝑘 + 1) = 𝑘 − 1 d.f.
The test can, however, be made even if 𝑛𝑖 's are not large. It has been investigated and found
that the distribution of −2log 𝑐 𝜆 is approximately a 𝜒 2 -distribution with (𝑘 − 1) d.f. even for
small 𝑛𝑖 's. However, a better approximation is provided by the Bartlett's test statistic:
−2log 𝜆′
𝜒2 =
1 1 1
1+ {∑ ( ) − }
3(𝑘 − 1) 𝑖 𝑛𝑖 ∑𝑖 𝑛𝑖
where 𝜆′ is obtained from 𝜆 on replacing 𝑛𝑖 by (𝑛𝑖 − 1) , which follows 𝜒 2 -distribution with
(𝑘 − 1) d.f. Thus the test statistic, under 𝐻0 is given by
𝑠2
∑𝑘𝑖=1 {(𝑛𝑖 − 1)log 𝑐 ( 2 )}
𝑠𝑖 2
𝜒2 = ∼ 𝜒𝑘−1
1 1 1
1+ {∑ ( ) − }
3(𝑘 − 1) 𝑖 𝑛𝑖 ∑ 𝑛𝑖
The critical region for test is, of course, the right-tail of the 𝜒 2 -distribution given by: 𝜒 2 >
𝜒 2 (𝑘 − 1)(𝛼),

Example 1. A sample of 400 students is found to have a mean height of 67.47 inches. Can it
be reasonably regarded as a sample from a large (or normal) population with mean heigh 67.39
inches and standard deviation 1.3 inches?
Solution. 𝐻0 : 𝜇 = 67.39 i.e., the sample is taken from a population against, with mean height
𝑥‾−𝜇
𝜇 = 67.39 inches, Test statistic: 𝑍 = 𝜎/ 𝑛 − 𝑁(0,1) under 𝐻0 . the value of the test statistic 𝑍

in given by
67.47 − 67.39
|𝑍| = = 1.23
(1.3)/√400
For 𝛼 = .05, the two-sided critical regions are given by |𝑍| > 1.96.
Conclusion:
Since the calculated value of |Z| under 𝐻0 is less than 1.96, the null hypothesis H0 is accepted.
Hence, we conclude that the sample could be regarded as drawn from a population with mean
67.39 inches and standard deviation 1.3 inches at 5% level of significance.
Alternative Method for Conclusion:
Since the difference 𝑥‾ ∼ 𝜇0 is less than 1.96 times the standard error of 𝑥‾, hence 𝐻1 is accepted.
Example 2 A random sample of 900 members is found to have a mean of 3.4 cm. Could it
come from a large population with mean 𝜇 = 3.25cms and 𝜎 = 2.16cms ?
294 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Solution. 𝐻0 : 𝜇 = 𝜇0 = 3 ⋅ 25, 𝐻1 : 𝜇 ≠ 3.25


Test statistic:
𝑥‾ − 𝜇
𝑍= ∼ 𝑁(0,1)
𝜎/√𝑛
under 𝐻0 , the value of test statistic 𝑧 is
𝑥‾ − 𝜇0
|𝑧| =
𝜎/√𝑛
3.4 − 3.25
=
2.61/√900
0.15 × 30
= = 1.72
2.61
Conclusion:
Since |𝑍. |cal is less than 𝛼 = 27%, C.R = 𝑃(|𝑍| > 3)
For have been drawn from a 3, we conclude that the given sample might deviation.
Example 3. A large corporation uses thousands of light bulbs ever year. The brand that has
been used in the past has an average life of 1000 hours with a standard deviation of 100 hours.
A new brand is offered to the corporation at a price far lower than the one they are paying for
the old brand. It is decided that they will switch to the new brand unless the new brand is
proved to have a smaller average life at the 𝛼 = 0.05 level of significance. Consequently, a
sample of 100 new brand bulls is tested, yielding an average life of 985 hours. Should the
company switch to the new brand
Solution. Here 𝐻0 : 𝜇 = 1000 hours, i.e., the average life of the new brand is equal to 1000
hours, 𝐻1 : 𝜇 < 1000 hours 𝑥‾ = 985 hours, 𝑛 = 100, the population standard deviation.
𝜎 = 100 hours.
Test Statistic :
x‾ − μ
Z = ∼ N(0,1)
σ √n
985 − 1000
= , under H0 : μ = 100
100/√100
−15
= = −1.5
10
Critical Region : For α = .05 and using the left tail of the standard normal curve as critical
region, H0 will be rejected if Zcal Zα i.e., if Zcal < −1.645.

295 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
B.A.(Hons.) Economics

Conclusion : Since Zcal does not fall in the critical region, H0 is accepted.
It implies that the company should switch to the new brand.

10.4 IN-TEXT QUESTIONS


MCQ’s Problems
Question: 1
Let 𝑋 and 𝑌 be two independent 𝑁(0,1) random variables. Then 𝑃(0 < 𝑋 2 + 𝑌 2 < 4) equals
A. 1 − 𝑒 −2
B. 1 − 𝑒 −4
C. 1 − 𝑒 −1
D. 𝑒 −2
Question: 2
Let 𝑋1 , 𝑋2 , … , 𝑋𝑛 denote a random sample from a normal distribution with variance 𝜎 2 > 0. If
(𝑋 −𝑋‾)2
the first percentile of the statistic 𝑊 = ∑𝑛𝑖=1 𝑖𝜎2 is 1.24 and 𝑃(𝜒72 ≤ 1.24) = 0.01 and
𝑃(𝜒72 > 1.24) = 0.99, where 𝑋‾ denotes the sample mean, what is the sample size n ?
A. 7
B. 8
C. 6
D.5
Question: 3
Consider the sample linear regression model 𝑦𝑖 = 𝛼 + 𝛽𝑥𝑖 + 𝜖𝑖 , 𝑖 = 1,2, … , 𝑛 Where 𝜖𝑖′ 𝑠 are
i.i. d random variables with mean 0 and variance 𝜎 2 ∈ (0, ∞)
Suppose that we have a data set (𝑥1 , 𝑦2 ), … , (𝑥𝑛 , 𝑦𝑛 ) with n = 20,
∑𝑛𝑖=1 𝑥𝑖 = 100, ∑𝑛𝑖=1 𝑦𝑖 = 50, ∑𝑛𝑖=1 𝑥𝑖2 = 600, ∑𝑛𝑖=1 𝑦𝑖2 = 500 and ∑𝑛𝑖=1 𝑥𝑖 𝑦𝑖 = 400
Then the least square estimates of 𝛼 and 𝛽 are respectively,
A. 5 and 3/2
B. −5 and 3/2
C. 5 and −3/2
D.-5 and −3/2

296 | P a g e

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Intermediate Statistics for Economics

Question: 4
If X₁, X₂, …, Xₙ is a random sample from a population with density
f(x, θ) = θx^(θ−1) for 0 < x < 1, and 0 otherwise,
where θ > 0 is an unknown parameter, what is the 100(1 − α)% confidence interval for θ?
A. [χ²_(α/2)(2n)/(2∑ᵢ₌₁ⁿ ln Xᵢ), χ²_(1−α/2)(2n)/(2∑ᵢ₌₁ⁿ ln Xᵢ)]
B. [χ²_(α/2)(n)/(−2∑ᵢ₌₁ⁿ ln Xᵢ), χ²_(1−α/2)(n)/(−2∑ᵢ₌₁ⁿ ln Xᵢ)]
C. [χ²_(α/2)(2n)/(−2∑ᵢ₌₁ⁿ ln Xᵢ), χ²_(1−α/2)(2n)/(−2∑ᵢ₌₁ⁿ ln Xᵢ)]
D. [χ²_(α/2)(n)/(2∑ᵢ₌₁ⁿ ln Xᵢ), χ²_(1−α/2)(n)/(2∑ᵢ₌₁ⁿ ln Xᵢ)]

Question: 5
Let 𝑋1 , … , 𝑋𝑛 be a random sample from a 𝑁(2𝜃, 𝜃 2 ) population, 𝜃 > 0. A consistent
estimator for 𝜃 is
A. (1/n) ∑ᵢ₌₁ⁿ Xᵢ
B. ((5/n) ∑ᵢ₌₁ⁿ Xᵢ²)^(1/2)
C. (1/(5n)) ∑ᵢ₌₁ⁿ Xᵢ²
D. ((1/(5n)) ∑ᵢ₌₁ⁿ Xᵢ²)^(1/2)

Question: 6
What is the arithmetic mean of the data set: 4, 5, 0, 10, 8, and 3?

A. 4
B. 5
C. 6
D. 7
Question: 7
Which of the following cannot be the probability of an event?
A. 0.0
B. 0.3
C. 0.9
D. 1.2
Question: 8
If a random variable X has a normal distribution, then e^X has a _____ distribution.
A. lognormal
B. exponential
C. poisson
D. binomial
Question: 9
What is the geometric mean of: 1, 2, 8, and 16?
A. 4
B. 5
C. 6
D. 7
Question: 10
Which test is applied to Analysis of Variance (ANOVA)?
A. t test
B. z test
C. F test
D. χ2 test
Question: 11
The arithmetic mean of all possible outcomes is known as
A. expected value
B. critical value
C. variance
D. standard deviation

Question: 12
Which of the following cannot be the value of a correlation coefficient?
A. –1
B. –0.75
C. 0
D. 1.2
Question: 13
Var (X) = ?
A. E[X²]
B. E[X²] – E[X]
C. E[X²] + (E[X])²
D. E[X²] – (E[X])²
Question: 14
Var (X + Y) = ?
A. E[X/Y] + E[Y]
B. E[Y/X] + E[X]
C. Var(X) + Var(Y) + 2 Cov(X, Y)
D. Var(X) + Var(Y) – 2 Cov(X, Y)
Question: 15
What is variance of the data set: 2, 10, 1, 9, and 3?
A. 15.5
B. 17.5
C. 5.5
D. 7.5
Question: 16
In a module, quiz contributes 10%, assignment 30%, and final exam contributes 60%
towards the final result. A student obtained 80% marks in quiz, 65% in assignment, and 75%
in the final exam. What are the average marks?
A. 64.5%
B. 68.5%
C. 72.5%
D. 76.5%

Question: 17
In a university, average height of students is 165 cm. Now, consider the following Table,
Height 160-162 162-164 164-166 166-168 168-170
Students 16 20 24 20 16
What type of distribution is this?
A. Normal
B. Uniform
C. Poisson
D. Binomial
Question: 18
What is the average of 3%, 7%, 10%, and 16% ?
A. 8%
B. 9%
C. 10%
D. 11%
Question: 19
The error of rejecting the null hypothesis when it is true is known as
A. Type-I error
B. Type-II error
C. Type-III error
D. Type-IV error
Question: 20
The mean and variance of a Poisson distribution with parameter λ are both
A. 0
B. 1
C. λ
D. 1/λ
Question 21.
Let the discrete random variables X and Y have the joint probability mass function
P(X = m, Y = n) = e⁻¹/((n − m)! m! 2ⁿ) for m = 0, 1, 2, …, n; n = 0, 1, 2, …,
and 0 otherwise.
Which of the following statements is(are) TRUE?

A. The marginal distribution of 𝑋 is Poisson with mean 1/2


B. The random variable 𝑋 and 𝑌 are independent
C. The conditional distribution of X given Y = 5 is Bin(6, 1/2)
D. 𝑃(𝑌 = 𝑛) = (𝑛 + 1)𝑃(𝑌 = 𝑛 + 2) for 𝑛 = 0,1,2, …
Question 22.
Consider the trinomial distribution with the probability mass function
P(X = x, Y = y) = [2!/(x! y! (2 − x − y)!)] (1/6)ˣ (2/6)ʸ (3/6)^(2−x−y),
x ≥ 0, y ≥ 0 and 0 ≤ x + y ≤ 2. Then Corr(X, Y) is equal to…
(correct up to two decimal places)
A) -0.31
B) 0.31
C) 0.35
D) 0.78
Question 23.
Let 𝑥1 = 1.1, 𝑥2 = 0.5, 𝑥3 = 1.4, 𝑥4 = 1.2 be the observed values of a random sample of size
four from a distribution with the probability density function
f(x ∣ θ) = e^(θ−x) if x ≥ θ, θ ∈ (−∞, ∞), and 0 otherwise.
Then the maximum likelihood estimate of θ² + θ + 1 equals (up to two decimal places):
A) 1.75
B) 1.89
C) 1.74
D) 0.87
Question 24.
Let U ∼ F(5, 8) and V ∼ F(8, 5). If P[U > 3.69] = 0.05, then the value of c such that
P[V > c] = 0.95 equals… (round off to two decimal places)
A) 0.27
B) 1.27
C) 2.27
D) 2.29
Question 25.
Let P be a probability function that assigns the same weight to each of the points of the sample
space Ω = {1,2,3,4}. Consider the events E = {1,2}, F = {1,3} and G = {3,4}. Then which of
the following statement(s) is (are) TRUE?
1. E and F are independent
2. E and G are independent
3. E, F and G are independent
Select the correct answer using code given below:
A. 1 only
B. 2 only
C. 1 and 2 only
D. 1,2 and 3
Question 26
Let X₁, X₂, X₃, X₄ and Y₁, Y₂, …, Y₅ be two random samples of sizes 4 and 5 respectively, from a
standard normal population. Define the statistic
T = (5/4) · (X₁² + X₂² + X₃² + X₄²)/(Y₁² + Y₂² + Y₃² + Y₄² + Y₅²).
Then which of the following is TRUE?
A. Expectation of 𝑇 is 0.6
B. Variance of T is 8.97
C. T has F-distribution with degree of freedom 5 and 4
D. T has F-distribution with degree of freedom 4 and 5
Question 27.
Let X, Y and Z be independent random variables with respective moment generating functions
M_X(t) = 1/(1 − t), t < 1; M_Y(t) = M_Z(t) = e^(t²/2), t ∈ ℝ. Let W = 2X + Y² + Z². Then P(W > 2)
equals
A. 2𝑒 −1
B. 2𝑒 −2
C. 𝑒 −1
D. 𝑒 −2
Question 28.
Let x₁ = 3, x₂ = 4, x₃ = 3.5, x₄ = 2.5 be the observed values of a random sample from the
probability density function
f(x ∣ θ) = (1/3)[(1/θ)e^(−x/θ) + (1/θ²)e^(−x/θ²) + e^(−x)], x > 0, θ ∈ (0, ∞).
Then the method of moments estimate (MME) of θ is
A. 1.5
B. 2.5
C. 3.5
D. 4.5
Question 29.
Let X and Y be discrete random variables with the joint probability mass function
P(X = n, Y = k) = (1/2)^(n+2k+1); n = −k, −k + 1, …; k = 1, 2, …
Then E(Y) equals
A. 1
B. 2
C. 3
D. 4
Question 30.
Let 𝑋 be a random variable with the cumulative distribution function
F(x) = 0 for x < 0,
F(x) = (1 + x²)/10 for 0 ≤ x < 1,
F(x) = (3 + x²)/10 for 1 ≤ x < 2,
F(x) = 1 for x ≥ 2.
Which of the following statements is (are) TRUE?
A. P(1 < X < 2) = 3/10
B. P(1 < X ≤ 2) = 3/5
C. P(1 ≤ X < 2) = 1/2
D. P(1 ≤ X ≤ 2) = 4/5

True/ False Questions (Objective Type Questions)


State True or False:
(i) The number of equations 𝜇𝑟′ = 𝑚𝑟′ in the method of moments is taken equal to the
number of unknown parameters. (True)
(ii) The method of moments estimates is biased. (False)
(iii) The method of moments estimates is not efficient, in general. (True)
(iv) The method of moments estimates and maximum likelihood estimates often
coincide. (True)
(v) The method of moments estimation fails in the case of some distributions, as the
moments may not exist. (True)
(vi) The method of maximum likelihood for estimation of parameters was introduced by
Prof. R.A. Fisher. (True)
(vii) Maximum likelihood estimators are consistent. (True)
(viii) Maximum likelihood estimators are not necessarily unbiased. (True)
(ix) Maximum likelihood estimators are asymptotically normal. (True)
(x) Minimum variance unbiased estimators are unique under certain general conditions.
(True)
(xi) The minimum variance unbiased estimator for 𝜃 does not exist in the
Cauchy distribution
f(x, θ) = (1/π) · 1/(1 + (x − θ)²), −∞ < x < ∞
(True)
Fill in the blanks:
(i) The method of moments estimates are asymptotically .......... distributed. [Ans. normally]
(ii) If a sufficient estimator 𝑇 for 𝜃 exists, then any solution of the likelihood equation
will be a .......... of 𝑇. [Ans. function]
(iii) A random sample of size 3 is taken from
f(x, θ) = 1/θ, 0 ≤ x ≤ θ, θ > 0.
The sample values are x₁ = 13, x₂ = 6, x₃ = 22.
The maximum likelihood estimate of θ is .......... [Ans. x₍₃₎ = 22]
(iv) The necessary and sufficient condition for the existence of the MVUE T of θ is
∂log L/∂θ = (T − θ)/λ. Then the variance of T is .......... [Ans. λ]
(v) The importance of the method of minimum variance over other methods is that it also
gives the .......... of T. [Ans. variance]
In each of the following questions, four alternative answers are given in which only one is
correct. Select the correct answer and write the letter (a), (b) (c) or (d):
(i) The method of moments for determining point estimators of the population
parameters was discovered by
(a) Karl Pearson
(b) R.A. Fisher
(c) Cramer-Rao
(d) Rao-Blackwell
Ans. (a)
(ii) Let 𝑥1 , 𝑥2 , … , 𝑥𝑛 be a random sample from
f(x, β) = (2/β²)(β − x), 0 ≤ x ≤ β.
The estimate of 𝛽 obtained by the method of moments is
(a) 𝑋‾
(b) 2𝑋‾
(c) 3𝑋‾
(d) 4𝑋‾
Ans. (c)
(iii) If f(x, θ) = (1/2)e^(−|x−θ|), then the m.l.e. of θ is:
(a) sample mean
(b) sample mode
(c) sample median
(d) none of these
Ans. (c)
(iv) Let X₁, X₂, …, Xₙ be a random sample from the p.d.f. f(x, θ) = 1/θ, 0 < x < θ. The m.l.e.
for θ is
(a) Min (Xᵢ)
(b) Max (Xᵢ)
(c) (1/n) ΣXᵢ
(d) ΣXᵢ
Ans. (b)
(v) If the likelihood function of the sample values x₁, x₂, …, xₙ is denoted by L and θ̂ is
the maximum likelihood estimator, then θ̂ is the solution of
(a) ∂L/∂θ = 0 and ∂²L/∂θ² > 0
(b) ∂L/∂θ = 0 and ∂²L/∂θ² < 0
(c) ∂L/∂θ ≠ 0 and ∂²L/∂θ² = 0
(d) none of these
Ans. (b)
(vi) If l = [∂²log L/∂θ²] evaluated at θ = T, then T is the maximum likelihood estimate of θ for
(a) l < 0
(b) l > 0
(c) l = 0
(d) l ≠ 0
Ans. (a)
(vii) The necessary and sufficient condition for the existence of the minimum variance
unbiased estimator T of ψ(θ) is
∂log L/∂θ = K(θ, n)[T − ψ(θ)].
Then Var (T) is
(a) K(θ, n)
(b) 1/K(θ, n)
(c) ψ′(θ)/K(θ, n)
(d) ψ′(θ)·K(θ, n)
Ans. (c)
(viii) If T₁ and T₂ are unbiased estimators of θ and θ² respectively (0 < θ < 1) and T is a
sufficient statistic, then E[T₁ ∣ T] − E[T₂ ∣ T] is
(a) the minimum variance unbiased estimator of θ
(b) always an unbiased estimator of θ(1 − θ)
(c) the maximum likelihood estimator for θ + θ²
(d) not an unbiased estimator of θ(1 − θ).
Ans. (b)
(ix) The statement "If T₁ is an unbiased estimator of θ and T₂ is sufficient for θ, then
Var [E(T₁ ∣ T₂)] ≤ Var (T₁)" belongs to

(a) Cramer–Rao Inequality
(b) Rao–Blackwell Theorem
(c) Maximum Likelihood Estimators
(d) None of these.
Ans. (b)
(x) Let X₁, X₂, …, Xₙ be a random sample from a uniform distribution with probability
density function
f(x) = 1/θ for 0 < x < θ, and 0 otherwise.
The minimum variance unbiased estimator for θ is
(a) Max (X₁, X₂, ……, Xₙ)
(b) (X₁ + X₂ + ⋯ + Xₙ)/n
(c) Min (X₁, X₂, ……, Xₙ)
(d) ((n + 1)/n) Max (X₁, X₂, …, Xₙ)
Ans. (d)

(xi) The maximum likelihood estimators are necessarily


(a) unbiased
(b) sufficient
(c) most efficient
(d) unique
(xii) If a sufficient estimator exists, it is a function of
(a) MLE
(b) Unbiased estimator
(c) consistent estimator
(d) All of these
Ans. (a)
(xiii) If T = r(X₁, X₂, …, Xₙ) is a sufficient statistic for a parameter θ and a unique MLE
θ̂ of θ exists, then
(a) θ̂ = f(X₁, X₂, ……, Xₙ)
(b) θ̂ is a function of T
(c) θ̂ is independent of T
(d) None of the above
Ans. (b)

(xiv) A minimum variance unbiased estimator Tₙ is said to be unique if, for any other
minimum variance unbiased estimator Tₙ′,
(a) Var (Tₙ) = Var (Tₙ′)
(b) Var (Tₙ) ≤ Var (Tₙ′)
(c) both (a) and (b)
(d) neither (a) nor (b)
Ans. (a)
10.5 SUMMARY
The main points covered in this lesson are: what an estimator is; the consistency, efficiency,
and sufficiency of an estimator; and how to obtain the best estimator.
10.6 GLOSSARY
• Motivation: These problems are very useful in real life and can be applied in data science,
economics, as well as social science.
• Attention: Think about how the methods of estimation are useful in real-world problems.
10.7 ANSWERS TO IN-TEXT QUESTIONS
Answer 1: A
Explanation:
Since X² + Y² ∼ χ²(2), and the χ²(2) distribution is the same as that of an exponential
random variable with mean 2, we have
P(0 < X² + Y² < 4) = ∫₀⁴ (1/2)e^(−t/2) dt = 1 − e⁻²
Hence option A is the correct choice.
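This probability can also be confirmed numerically, for example with the following sketch (assuming scipy is available):

    # X^2 + Y^2 ~ chi-square(2); check P(0 < X^2 + Y^2 < 4) = 1 - e^(-2).
    from math import exp
    from scipy.stats import chi2

    p = chi2.cdf(4, df=2) - chi2.cdf(0, df=2)
    print(p, 1 - exp(-2))   # both ~ 0.8647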
Answer 2: B
Explanation:
0.01 = P(W ≤ 1.24) = P(∑ᵢ₌₁ⁿ (Xᵢ − X̄)²/σ² ≤ 1.24) = P(χ²(n − 1) ≤ 1.24)
Thus, from the given value P(χ²(7) ≤ 1.24) = 0.01,
we get n − 1 = 7,
and hence the sample size n is 8.
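The quantile can be checked directly (a sketch assuming scipy):

    # The 1st percentile of a chi-square(7) variate is about 1.24.
    from scipy.stats import chi2
    print(round(chi2.ppf(0.01, df=7), 2))   # 1.24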
Answer 3: B
Explanation:

yᵢ = α + βxᵢ + εᵢ, i = 1, 2, …, n; x̄ = 100/20 = 5, ȳ = 50/20 = 5/2
β̂ = (∑ᵢ₌₁ⁿ xᵢyᵢ − n x̄ ȳ)/(∑ᵢ₌₁ⁿ xᵢ² − n x̄²) = (400 − 20 × 5 × 5/2)/(600 − 20 × 5²) = 150/100 = 3/2
α̂ = ȳ − β̂ x̄ = 5/2 − (3/2) × 5 = −5
Hence option B is correct.
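The same estimates follow directly from the summary statistics; a minimal Python sketch (standard library only; the variable names are ours):

    # Least-squares estimates from the summary statistics of Question 3.
    n, sum_x, sum_y = 20, 100, 50
    sum_x2, sum_xy = 600, 400

    xbar, ybar = sum_x / n, sum_y / n
    beta_hat = (sum_xy - n * xbar * ybar) / (sum_x2 - n * xbar ** 2)
    alpha_hat = ybar - beta_hat * xbar
    print(alpha_hat, beta_hat)   # -5.0, 1.5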
Answer 4: C
Explanation:
We use the random variable Q = −2θ∑ᵢ₌₁ⁿ ln Xᵢ ∼ χ²(2n)
as the pivotal quantity. The 100(1 − α)% confidence interval for θ can be constructed from
1 − α = P(χ²_(α/2)(2n) ≤ Q ≤ χ²_(1−α/2)(2n))
= P[χ²_(α/2)(2n)/(−2∑ᵢ₌₁ⁿ ln Xᵢ) ≤ θ ≤ χ²_(1−α/2)(2n)/(−2∑ᵢ₌₁ⁿ ln Xᵢ)]
Thus, the 100(1 − α)% confidence interval for θ is given by
[χ²_(α/2)(2n)/(−2∑ᵢ₌₁ⁿ ln Xᵢ), χ²_(1−α/2)(2n)/(−2∑ᵢ₌₁ⁿ ln Xᵢ)]
Hence option C is correct.


Answer 5: D
Explanation:
We have E(Xᵢ²) = V(Xᵢ) + (E(Xᵢ))² = θ² + (2θ)² = 5θ², i = 1, 2, …, n.
Then X₁²/5, X₂²/5, … is a sequence of i.i.d. random variables with E(X₁²/5) = θ². Using the
WLLN, we get that (1/n)∑ᵢ₌₁ⁿ Xᵢ²/5 = (1/(5n))∑ᵢ₌₁ⁿ Xᵢ² converges in probability to θ²
as n → ∞, which implies that ((1/(5n))∑ᵢ₌₁ⁿ Xᵢ²)^(1/2) converges in probability to θ as n → ∞.
Thus ((1/(5n))∑ᵢ₌₁ⁿ Xᵢ²)^(1/2) is a consistent estimator for θ.
Hence option D is the correct choice.
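Consistency can also be seen in simulation; the sketch below (assuming numpy; θ = 2 is an arbitrary illustrative value) shows the estimates settling near θ as n grows:

    # Simulation: ((1/(5n)) * sum(X_i^2))^(1/2) approaches theta as n grows.
    import numpy as np

    rng = np.random.default_rng(0)
    theta = 2.0
    for n in (100, 10_000, 1_000_000):
        x = rng.normal(loc=2 * theta, scale=theta, size=n)   # N(2*theta, theta^2)
        est = np.sqrt(np.mean(x ** 2) / 5)
        print(n, round(est, 3))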
Answer 6: B
Explanation :

Here the total count of numbers is 6.
So, AM = (4 + 5 + 0 + 10 + 8 + 3)/6 = 30/6 = 5

Answer 7: D
Explanation :
The probability of an event is always between 0 and 1 (including 0 and 1 ). So, 1.2 cannot be
the probability of an event.
Answer 8: A
Explanation :
A lognormal distribution is the probability distribution of a random variable whose logarithm is
normally distributed. So, if X is lognormal then Y = ln(X) is normal; conversely, if Y is
normal then X = e^Y is lognormal.
Answer 9: A
Explanation :
There are 4 numbers in total. So, by using the formula for calculating geometric mean, we
have
G.M. = (1 × 2 × 8 × 16)^(1/4) = (256)^(1/4) = (4⁴)^(1/4) = 4 (∵ 4⁴ = 256)

Answer 10: C
In ANOVA we use the F test because the test statistic is the ratio of two independent
chi-square variates, each divided by its degrees of freedom.
Answer 11: A
Explanation :
Expectation (or expected value) is the arithmetic mean of all possible outcomes of a random
variable.
Answer 12: D
Explanation :
The value of a correlation coefficient is always between −1 and 1, including −1 and 1.

Answer 13: D
Explanation :
By definition, Var (X) = E[X²] − (E[X])²
Answer 14: C
Explanation :
By definition, Var (𝑋 + 𝑌) = Var (𝑋) + Var (𝑌) + 2Cov (𝑋, 𝑌)
Answer 15: B
Explanation :
First calculate the mean:
Mean = (2 + 10 + 1 + 9 + 3)/5 = 25/5 = 5
Now calculate the sample variance, dividing by n − 1 = 4:
Variance = [(2 − 5)² + (10 − 5)² + (1 − 5)² + (9 − 5)² + (3 − 5)²]/4 = 70/4
= 17.5
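A one-line check with the Python standard library (statistics module):

    # Sample variance divides by n - 1; population variance divides by n.
    import statistics
    data = [2, 10, 1, 9, 3]
    print(statistics.variance(data))    # 17.5
    print(statistics.pvariance(data))   # 14.0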
Answer 16: C
Explanation :
By using formula for calculating weighted average, we have
Weighted Average = 𝑤1 𝑥1 + 𝑤2 𝑥2 + 𝑤3 𝑥3
= 0.1 × 0.8 + 0.3 × 0.65 + 0.6 × 0.75
= 0.725 = 72.5%
Answer 17: A
Explanation :
The class frequencies 16, 20, 24, 20, 16 rise to a single peak at the middle class (164–166 cm)
and fall away symmetrically on both sides, giving the bell shape characteristic of a normal
distribution centred at the mean height of 165 cm.
Answer 18: B
Explanation :
3%, 7%, 10%, 16%
Average = (3 + 7 + 10 + 16)/4 % = 36/4 % = 9%
Answer 19: A
Explanation :
Type I error = P(reject H₀ ∣ H₀ is true)
Answer 20: C
Explanation :
We know that if a random variable X follows a Poisson distribution with parameter λ, then
E(X) = V(X) = λ.
Answer 21: A
Explanation :
The marginal probability mass function of X is given by
P(X = m) = ∑_{n≥m} P(X = m, Y = n) = e^(−1/2)(1/2)ᵐ/m!, m = 0, 1, 2, …
Thus the marginal distribution of X is Poisson with mean 1/2, so A is true.
The marginal probability mass function of Y is given by
P(Y = n) = ∑ₘ₌₀ⁿ P(X = m, Y = n) = e⁻¹/n!, n = 0, 1, 2, …
Thus the marginal distribution of Y is Poisson with mean 1.
Since P(X = m, Y = n) ≠ P(X = m)P(Y = n), the random variables X and Y are not
independent, so B is false.
P(X = m ∣ Y = 5) = P(X = m, Y = 5)/P(Y = 5) = [5!/(m!(5 − m)!)](1/2)⁵, m = 0, 1, 2, …, 5
Thus the conditional distribution of X given Y = 5 is Bin(5, 1/2), not Bin(6, 1/2), so C is false.
Finally, P(Y = n)/P(Y = n + 1) = n + 1, whereas P(Y = n)/P(Y = n + 2) = (n + 1)(n + 2), so
D is false.
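The two marginals can be verified numerically by truncating the sums (a plain-Python sketch; the truncation point N is arbitrary):

    # Check the marginals of Answer 21 by truncated summation.
    from math import exp, factorial

    def joint(m, n):
        if 0 <= m <= n:
            return exp(-1) / (factorial(n - m) * factorial(m) * 2 ** n)
        return 0.0

    N = 60
    print(sum(joint(0, n) for n in range(N)), exp(-0.5))   # P(X=0) = e^(-1/2)
    print(sum(joint(m, 3) for m in range(4)), exp(-1) / 6) # P(Y=3) = e^(-1)/3!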

Answer 22: A
Explanation :
The trinomial distribution of two r.v.'s X and Y is given by
f_{X,Y}(x, y) = [n!/(x! y! (n − x − y)!)] pˣ qʸ (1 − p − q)^(n−x−y)
for x, y = 0, 1, 2, …, n and x + y ≤ n, where p + q ≤ 1. Here n = 2, p = 1/6 and q = 2/6.
Var (X) = np(1 − p) = 2 × (1/6)(1 − 1/6) = 10/36; Var (Y) = nq(1 − q) = 2 × (2/6)(1 − 2/6) = 16/36
Cov (X, Y) = −npq = −2 × (1/6) × (2/6) = −4/36
Corr (X, Y) = Cov (X, Y)/[√Var (X) √Var (Y)] = −4/(4√10) = −0.31
Hence −0.31 is the correct answer.
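A Monte Carlo check of this correlation (assuming numpy; the sample size is arbitrary):

    # Simulate the trinomial with n = 2, p = 1/6, q = 2/6 and estimate Corr(X, Y).
    import numpy as np

    rng = np.random.default_rng(1)
    draws = rng.multinomial(2, [1/6, 2/6, 3/6], size=200_000)
    x, y = draws[:, 0], draws[:, 1]
    print(round(np.corrcoef(x, y)[0, 1], 3))   # ~ -0.316 = -1/sqrt(10)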
Answer 23: A
Explanation :
Let x₁ = 1.1, x₂ = 0.5, x₃ = 1.4, x₄ = 1.2 and
f(x ∣ θ) = e^(θ−x) if x ≥ θ, and 0 otherwise, θ ∈ (−∞, ∞).
The likelihood is positive only for θ ∈ (−∞, x₍₁₎] and is strictly increasing in θ there, so
θ̂ = x₍₁₎ = 0.5. Therefore, by the invariance property, the MLE of
θ² + θ + 1 is (0.5)² + 0.5 + 1 = 1.75.
Hence MLE for θ² + θ + 1 is 1.75.
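A two-line check of the final arithmetic (plain Python):

    # MLE of theta is the sample minimum; then apply the invariance property.
    xs = [1.1, 0.5, 1.4, 1.2]
    theta_hat = min(xs)
    print(theta_hat ** 2 + theta_hat + 1)   # 1.75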

Answer 24: A
Explanation :
If X ∼ F(m, n), then 1/X ∼ F(n, m).
P[U > 3.69] = 0.05 ⇒ P[U ≤ 3.69] = 0.95 ⇒ P[1/U ≥ 1/3.69] = 0.95.
Since 1/U ∼ F(8, 5) has the same distribution as V, we get
c = 1/3.69 = 0.27.
Hence c = 0.27 is the correct answer.
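The reciprocal relation can be checked with scipy's F-distribution quantiles (a sketch assuming scipy):

    # 5th percentile of F(8, 5) equals 1 / (95th percentile of F(5, 8)).
    from scipy.stats import f
    print(round(f.ppf(0.05, dfn=8, dfd=5), 2))       # 0.27
    print(round(1 / f.ppf(0.95, dfn=5, dfd=8), 2))   # 0.27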
Answer 25: C
Explanation :
Clearly, P({𝜔}) = 1/4 ∀𝜔 ∈ Ω = {1,2,3,4}. We have E = {1,2}, F = {1,3} and G = {3,4}
Then P(E) = P(F) = P(G) = 2/4 = 1/2.
Using this result, we see that E and F are independent and also E and G are independent.
Hence option C is correct.
Answer 26: D
Explanation :
T = (5/4) · (X₁² + X₂² + X₃² + X₄²)/(Y₁² + Y₂² + Y₃² + Y₄² + Y₅²) ∼ F(4, 5)
E(T) = n/(n − 2) = 5/3 ≈ 1.67, so A is false.
Var (T) = 2n²(m + n − 2)/[m(n − 2)²(n − 4)] = 2(5)²(7)/[4(3)²(1)] = 350/36 = 9.72, so B is false.
Hence option D is correct.
Answer 27: A
Explanation :
Since M_X(t) = 1/(1 − t), X is exponential with mean 1, so 2X ∼ χ²(2); Y and Z are standard
normal, so Y² + Z² ∼ χ²(2). Hence W = 2X + Y² + Z² ∼ χ²(4), with density
f_W(w) = (1/4)w e^(−w/2) for w > 0, and 0 otherwise.
P(W > 2) = ∫₂^∞ (1/4)w e^(−w/2) dw = 2e⁻¹
Hence option A is correct.
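A numerical confirmation (assuming scipy):

    # W ~ chi-square(4); P(W > 2) should equal 2/e.
    from math import exp
    from scipy.stats import chi2
    print(chi2.sf(2, df=4), 2 * exp(-1))   # both ~ 0.7358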
Answer 28: C
Explanation :
x̄ = (1/4)(3 + 4 + 3.5 + 2.5) = 3.25
E(X) = (1/3)[θ + θ² + 1]Γ(2) = (1/3)[θ + θ² + 1] = 3.25
θ² + θ − 8.75 = 0, so θ = 2.5 or −3.5.
Since θ ∈ (0, ∞), θ = 2.5.
Hence option C is correct.
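The positive root can be checked with the quadratic formula (plain Python):

    # Solve theta^2 + theta - 8.75 = 0 for the positive root.
    a, b, c = 1, 1, -8.75
    print((-b + (b * b - 4 * a * c) ** 0.5) / (2 * a))   # 2.5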
Answer 29: B
Explanation :
P(Y = k) = ∑_{n≥−k} P(X = n, Y = k) = (1/2)(1/2)^(k−1), k = 1, 2, …
(substituting m = n + k), which is the pmf of the geometric distribution with parameter 1/2.
E(Y) = ∑ₖ₌₁^∞ k(1/2)(1/2)^(k−1) = 2
Hence option B is correct.
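The mean of the geometric pmf can be checked by a truncated sum (plain Python):

    # E(Y) for the geometric pmf (1/2)(1/2)^(k-1).
    print(sum(k * 0.5 * 0.5 ** (k - 1) for k in range(1, 200)))   # ~ 2.0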
Answer 30: A, B, C and D
Explanation :
Here F(1⁻) = 2/10, F(1) = 4/10 (so P(X = 1) = 2/10), F(2⁻) = 7/10 and F(2) = 1 (so
P(X = 2) = 3/10). Then
P(1 < X < 2) = F(2) − F(1) − P(X = 2) = 3/10
P(1 < X ≤ 2) = F(2) − F(1) = 3/5
P(1 ≤ X < 2) = F(2) − F(1) − P(X = 2) + P(X = 1) = 1/2
P(1 ≤ X ≤ 2) = F(2) − F(1) + P(X = 1) = 4/5
Hence all four statements are TRUE.

10.8 REFERENCES
• Devore, J. (2012). Probability and Statistics for Engineers, 8th ed. Cengage Learning.
• Rice, J. A. (2007). Mathematical Statistics and Data Analysis, 3rd ed. Thomson Brooks/Cole.
• Larsen, R. and Marx, M. (2011). An Introduction to Mathematical Statistics and Its Applications. Prentice Hall.
• Miller, I. and Miller, M. (2017). J. Freund's Mathematical Statistics with Applications, 8th ed. Pearson.
• Kantarelis, D. and Asadoorian, M. O. (2009). Essentials of Inferential Statistics, 5th ed. University Press of America.
• Hogg, R., Tanis, E. and Zimmerman, D. (2021). Probability and Statistical Inference, 10th ed. Pearson.
10.9 SUGGESTED READINGS
• Gupta, S. C. and Kapoor, V. K. Fundamentals of Mathematical Statistics, 11th ed. Sultan Chand Publications.
• Agarwal, B. L. Programmed Statistics, 2nd ed. New Age International Publishers.
