STA732
Statistical Inference
Lecture 01: Course Introduction
Yuansi Chen
Spring 2023
Duke University
https://www2.stat.duke.edu/courses/Spring23/sta732.01/
Goal of Lecture 01
• Logistics
• Introduce “the problem”
• Discuss what it means to have “the best” estimator
• (If time permits) Review of measure theory basics
Logistics
Coordinates
• TA: Christine Shen, [email protected]
• Course websites:
• Main:
https://www2.stat.duke.edu/courses/Spring23/sta732.01/
• Sakai
Lectures and Office hours
• Lectures: Monday and Wednesday 3:30-4:45pm in Old Chem 025
• Office hours:
• Yuansi: MW 4:45-5:30
• Christine: see website
• Ed Discussion on Sakai
Sakai tools
• Announcements
• Zoom Meetings (for possible online office hours)
• Resources (for HW problem sets)
• Ed Discussion (for online discussions)
• Gradescope (for HW submission and exam grading)
Textbooks
• Keener, Theoretical Statistics: Topics for a Core Course, 2010
• Lehmann and Casella, Theory of Point Estimation, 1998
• Lehmann and Romano, Testing Statistical Hypotheses, 2005
All are available online via the Duke library website
Grading
• Weekly homework (due on Wednesdays at 11am)
• One midterm + one final
Homework 25%
Midterm 25%
Final 45%
Participation 5%
Scribing
Everyone is required to scribe notes for at least one lecture. Please sign
up via the link in Sakai.
Policies
Check website
• Duke Community Standard
• I will not lie, cheat, or steal in my academic endeavors
• I will conduct myself honorably in all of my endeavors
• I will act if the standard is compromised
• Plagiarism
• You may use online resources, but make sure you understand the
material in your own words, and make sure to cite them (code or theory)
• Answer sharing between groups or individuals is not allowed
• Homework Policy:
• Late HW: no homework more than two full days (48 hours) late
will be accepted. Each late day results in a one-level
downgrade (10% off) for that HW
• Regrade requests on Gradescope within 2 days
• Drop the HW with the lowest score for final grade
• Exam Policy: no makeup exams
HW0 released
• Designed for trying out Gradescope on Sakai
• Due Wednesday 18 at 11am
• Will not be counted toward the final grade
The statistical inference problem
Statistical inference in a dictionary
Oxford Dictionary of Statistics or Wikipedia:
“Statistical inference is the process of using data analysis to infer
properties of an underlying distribution of probability”
Statistical experiment
Statistical experiment
A statistical experiment is a procedure/process that generates a
collection of data, X
• For example, a coin tossing experiment: tossing a coin n times.
Sample space
The set of possible data values is called the sample space S
• For example, in the coin tossing experiment, S = {0, 1}n , the
sample space contains all length n string with 0s and 1s
Statistical model
Statistical model
A statistical model is a family of possible distributions {Pθ , θ ∈ Ω}
for X, where Ω is called the parameter space
• Note that the family can be very small (e.g. a single
distribution) or very large (e.g. all absolutely continuous
distributions)
• Bayesians also put assumptions on θ (we will deal with this later)
• A model is in essence a collection of assumptions regarding
the sampling distribution of the data
Event in the sample space
• E ⊂ S is called an event
• Each distribution in a model can specify the probability of an
event
Pθ (E) = Probθ (X ∈ E)
Example: coin tossing experiment
• Data: X = (X1 , X2 , . . . , Xn ) where
Xi = 1 if the ith toss is H, and Xi = 0 if the ith toss is T
• Statistical model: contains all joint distributions of n
independent Bernoulli distribution with equal head
probability θ, θ ∈ [0, 1]
Pθ (X1 = x1 , . . . , Xn = xn ) = θ^(∑ xi) (1 − θ)^(n − ∑ xi)
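As a quick sanity check of this model, the sketch below (plain Python; the function name joint_pmf is mine) verifies that the joint pmf sums to 1 over the sample space S = {0, 1}^n.

```python
from itertools import product

def joint_pmf(x, theta):
    """P_theta(X1 = x1, ..., Xn = xn) = theta^(sum xi) * (1 - theta)^(n - sum xi)."""
    s = sum(x)
    return theta ** s * (1 - theta) ** (len(x) - s)

# The probabilities over all 2^n binary strings must sum to 1
n, theta = 5, 0.3
total = sum(joint_pmf(x, theta) for x in product([0, 1], repeat=n))
print(abs(total - 1.0) < 1e-12)  # True
```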
Statistical inference
Inference
Inference about g(θ) (estimand) is making an “educated guess”
about g(θ) based on the data
• for example, blindly guessing θ to be 0 is a type of inference
• guessing θ to be a real number between 0 and 1 is also an
example of inference
Common types of statistical inference problems
1. Point estimation
2. Hypothesis testing
3. Interval estimation (confidence intervals or credible regions)
4. Prediction
Common types of statistical inference problems (2)
1. Point estimation: an estimator is a statistic (a function of the
data) for the purpose of guessing the value for some g(θ),
which is hopefully close to g(θ)
2. Hypothesis testing: let Ω = Ω0 ∪ Ω1 be a disjoint union. Ask
whether
H0 : θ ∈ Ω0 or H1 : θ ∈ Ω1
Common types of statistical inference problems (3)
3. Interval estimation: suppose g(θ) ∈ R. We want an interval
that contains g(θ) “with high probability”
• A level (1 − α) ∗ 100% confidence interval [l(X), u(X)]:
Pθ (g(θ) ∈ [l(X), u(X)]) ≥ 1 − α
(1 − α) is called the coverage (or confidence) level; α is the significance level
4. Prediction: what would a new data point look like?
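For the interval-estimation setting above, a small simulation can illustrate coverage. This is a sketch under assumptions not in the slides (n i.i.d. N(θ, 1) observations with known variance, and the interval X̄ ± z/√n); the function name coverage is mine.

```python
import random

def coverage(theta, n, z=1.96, trials=20000, seed=0):
    """Fraction of repeated experiments in which the interval
    [Xbar - z/sqrt(n), Xbar + z/sqrt(n)] contains theta (X_i iid N(theta, 1))."""
    rng = random.Random(seed)
    half = z / n ** 0.5
    hits = 0
    for _ in range(trials):
        xbar = sum(rng.gauss(theta, 1) for _ in range(n)) / n
        hits += (xbar - half <= theta <= xbar + half)
    return hits / trials

print(coverage(theta=2.0, n=25))  # close to the nominal level 0.95
```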
Based on our definition, the bar for doing inference is quite low. We
need a notion of “good inference” to compare inference methods, and to
rule out the clearly useless inference methods!
What does it mean to have “the best”
estimator?
Take point estimation as an example
Objective of point estimation:
Construct a statistic T(X) that is “close” to g(θ).
• What is a formal notion of “closeness”?
• Introduce a loss function
L(θ, d) = the loss incurred when estimating g(θ) by d
Note that d will be taken to be our estimator, which is a statistic
(it depends on X)
Examples of loss functions
• Squared error loss
L(θ, d) = (d − g(θ))2
• Lp loss
L(θ, d) = |d − g(θ)|p
• ϵ-step error loss, ϵ > 0
L(θ, d) = 1 if |d − g(θ)| > ϵ, and 0 otherwise
• In general, the loss does not have to be symmetric, e.g.
L(θ, d) = 3(d − g(θ)) if d − g(θ) > 0, and −(d − g(θ)) otherwise
We assume for simplicity that the minimum of L(θ, d) over d is attained at d = g(θ)
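The loss functions above can be written out directly; a minimal sketch (function names are mine, and theta_g stands for g(θ)):

```python
def squared_error(theta_g, d):
    """Squared error loss L(theta, d) = (d - g(theta))^2."""
    return (d - theta_g) ** 2

def lp_loss(theta_g, d, p):
    """L_p loss |d - g(theta)|^p."""
    return abs(d - theta_g) ** p

def eps_step(theta_g, d, eps):
    """eps-step loss: 1 if the estimate misses by more than eps, else 0."""
    return 1.0 if abs(d - theta_g) > eps else 0.0

def asymmetric(theta_g, d):
    """An asymmetric loss: over-estimation costs three times as much."""
    return 3 * (d - theta_g) if d > theta_g else (theta_g - d)

# All four losses attain their minimum value 0 at d = g(theta)
print([f(1.0, 1.0) for f in (squared_error, lambda g, d: lp_loss(g, d, 2),
                             lambda g, d: eps_step(g, d, 0.5), asymmetric)])
# [0.0, 0.0, 0.0, 0.0]
```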
Example: a normal experiment
• Data: X = (X1 , X2 , . . . , Xn ) i.i.d. N (θ, 1)
• Estimand: g(θ) = θ
• Loss fun: L(θ, d) = (d − θ)2 , the squared error loss
If we take d = δ(X) as our estimator, then the loss L(θ, δ(X))
• is random (under repeated experiments)
• depends on the unknown θ
How do we evaluate the performance of different estimators? Consider, say,
• the sample mean X̄ = (1/n) ∑ Xi
• the sample median med(X) = median(X1 , . . . , Xn )
Risk function
Risk function
The risk function R(θ, δ) for an estimator δ is the average loss under
repeated experiments (this is the frequentist perspective).
R(θ, δ) = EX∼Pθ [L(θ, δ(X))] .
In general, we want to find estimators that have “low” risk.
But how low is low? And low for which θ?
Example: a normal experiment (cont’d)
• The risk of the sample mean is
R(θ, X̄) = Eθ [(X̄ − θ)^2] = Varθ (X̄) = 1/n
• The risk of the sample median
• is also constant over all θ
• is larger than 1/n (we will prove this later)
In this case, the sample mean is preferred under the squared error
loss (and repeated experiments) for all θ. We say the sample mean
is uniformly better than the sample median.
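The comparison on this slide can be checked numerically. Below is a Monte Carlo sketch (the helper name mc_risk is mine, not from the slides) that approximates R(θ, δ) for the sample mean and the sample median.

```python
import random
import statistics

def mc_risk(estimator, theta, n, trials=20000, seed=1):
    """Monte Carlo approximation of R(theta, delta) = E_theta[(delta(X) - theta)^2]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x = [rng.gauss(theta, 1) for _ in range(n)]
        total += (estimator(x) - theta) ** 2
    return total / trials

n, theta = 9, 0.0
mean_risk = mc_risk(lambda x: sum(x) / len(x), theta, n)
median_risk = mc_risk(statistics.median, theta, n)
print(mean_risk)    # close to 1/n = 0.111...
print(median_risk)  # noticeably larger, consistent with the claim on the slide
```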
A natural question:
Given a loss function, does there exist a uniformly best estimator,
which has lower risk than any other estimator over all values of θ?
The answer to the previous question is NO in general!
Proof sketch: no uniformly best estimator
1. Suppose δ ∗ is uniformly best. Then for every constant estimator δc ≡ c,
R(θ, δ ∗ ) ≤ R(θ, δc ) = L(θ, c) for all θ
2. In particular, taking c = g(θ) for each fixed θ gives R(θ, δ ∗ ) ≤ L(θ, g(θ))
3. Since L(θ, g(θ)) is the minimum possible loss, L(θ, δ ∗ (X)) = L(θ, g(θ)) for all
θ and (almost) all X. This is a degenerate case
In other words, δ ∗ (X) = arg min_d L(θ, d) no matter what the data X is.
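The first step of the argument can also be seen numerically: the (useless) constant estimator δc is unbeatable at θ = c, so a uniformly best estimator would need zero risk everywhere. A sketch (the helper mc_risk is mine, not from the slides):

```python
import random

def mc_risk(estimator, theta, n, trials=5000, seed=2):
    """Monte Carlo approximation of E_theta[(delta(X) - theta)^2]."""
    rng = random.Random(seed)
    return sum((estimator([rng.gauss(theta, 1) for _ in range(n)]) - theta) ** 2
               for _ in range(trials)) / trials

# The constant estimator delta_c = 3 has zero risk when theta happens to equal 3 ...
print(mc_risk(lambda x: 3.0, theta=3.0, n=10))  # 0.0
# ... while the sample mean pays its usual 1/n there
print(mc_risk(lambda x: sum(x) / len(x), theta=3.0, n=10))  # close to 0.1
```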
Various approaches to define “good” estimator with “low” risk
Since no uniformly best estimator exists, we must be careful when
claiming that “we have the best estimator”. We need some
compromises or restrictions in defining what “the best” means.
Three general approaches
1. Restrict attention to a smaller (but hopefully reasonable) class
of estimators (avoid comparison to all estimators)
2. Apply global measures of risk and minimize those, rather
than trying to find an estimator with the lowest risk at every θ
(no need to be good at every θ)
3. Large sample (asymptotic) approach
I. Restrict attention to a smaller class of estimators
Strategy A: restrict attention to unbiased estimators
• bias: the bias of δ is EX∼Pθ [δ(X)] − g(θ)
• UMVU: uniformly minimum variance unbiased estimator
• If δ is unbiased, then for the squared error loss
R(θ, δ) = Eθ [(δ − g(θ))^2]
= Varθ (δ) + (Eθ [δ] − g(θ))^2
= Varθ (δ)
Strategy B: restrict attention to estimators with certain symmetry
It is sometimes reasonable to require an estimator to be equivariant
δ(X1 + c, . . . , Xn + c) = δ(X1 , . . . , Xn ) + c
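The equivariance condition is easy to check for a concrete estimator; a sketch (the data and shift c are chosen arbitrarily):

```python
import random

def sample_mean(x):
    return sum(x) / len(x)

# Shifting every observation by c shifts the sample mean by exactly c,
# so the sample mean satisfies the equivariance condition above
rng = random.Random(3)
x = [rng.gauss(0, 1) for _ in range(20)]
c = 5.0
print(abs(sample_mean([xi + c for xi in x]) - (sample_mean(x) + c)) < 1e-12)  # True
```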
II. Global approaches, A
Strategy A: minimax
• We want to minimize the maximum value of the risk function, i.e. find δ ∗
satisfying
sup_{θ∈Ω} R(θ, δ ∗ ) ≤ sup_{θ∈Ω} R(θ, δ) for any other δ
• Such an estimator is called minimax. It seems rather
pessimistic:
• An estimator might be good for a vast majority of θ, but bad for
a single value of θ. It will not be considered good under
the minimax strategy!
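A classical illustration (not from the slides; it appears in, e.g., Lehmann and Casella): for S ∼ Binomial(n, θ) under squared error loss, the estimator (S + √n/2)/(n + √n) has constant risk, and its worst-case risk beats that of the sample mean, whose risk peaks at θ = 1/2.

```python
def risk_mean(theta, n):
    """R(theta, S/n) = theta(1 - theta)/n for the Bernoulli sample mean."""
    return theta * (1 - theta) / n

def risk_shrunk(theta, n):
    """Risk of delta = (S + sqrt(n)/2)/(n + sqrt(n)), via bias-variance:
    [n*theta*(1-theta) + (n/4)*(1-2*theta)^2] / (n + sqrt(n))^2, constant in theta."""
    a = n ** 0.5 / 2
    return (n * theta * (1 - theta) + a * a * (1 - 2 * theta) ** 2) / (n + 2 * a) ** 2

n = 25
grid = [i / 1000 for i in range(1001)]
print(max(risk_mean(t, n) for t in grid))    # 1/(4n) = 0.01, attained at theta = 1/2
print(max(risk_shrunk(t, n) for t in grid))  # n/(4(n + sqrt(n))^2) ~ 0.0069 < 0.01
```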
II. Global approaches, B
Strategy B: minimize the average risk
• We want to minimize the risk averaged under some weight
function, i.e. find δ ∗ to minimize
∫ R(θ, δ) dΛ(θ)
where Λ is some measure over θ
• If Λ is taken as a probability distribution over the parameter
space Ω, then it is called the prior distribution
• Depending on the prior about θ, we weight the risk differently
• Such an estimator is called a Bayes estimator
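For a concrete sketch (assumptions mine: Bernoulli data, a Beta(2, 2) prior playing the role of Λ, squared error loss), the posterior mean (S + 2)/(n + 4) should have smaller average risk than the sample mean; a paired Monte Carlo check:

```python
import random

def avg_risk(estimator, a=2.0, b=2.0, n=20, trials=20000, seed=4):
    """Average (Bayes) risk: the integral of R(theta, delta) against a Beta(a, b)
    prior, approximated by Monte Carlo over (theta, X) pairs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        theta = rng.betavariate(a, b)                    # draw theta from the prior
        s = sum(rng.random() < theta for _ in range(n))  # draw S ~ Binomial(n, theta)
        total += (estimator(s, n) - theta) ** 2
    return total / trials

posterior_mean = lambda s, n: (s + 2.0) / (n + 4.0)  # Bayes estimator under Beta(2, 2)
sample_mean = lambda s, n: s / n
print(avg_risk(posterior_mean) < avg_risk(sample_mean))  # True
```

Because both calls use the same seed, the two estimators are evaluated on the same (θ, X) draws, which makes the comparison a low-noise paired one.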
III. Large sample approach
Intuition for the large sample approach: as n tends to infinity, the
risk simplifies, and we might be able to define which estimator is the
best without making too many compromises
Now we can understand why the Lehmann and Casella book
(Theory of Point Estimation) is organized as follows
• Preparations
• Unbiasedness
• Equivariance
• Average Risk Optimality
• Minimaxity and Admissibility
• Asymptotic Optimality
What is covered in this course?
• The first half: focus on the logic of Lehmann and Casella
• The second half: focus on
• hypothesis testing
• how classical theory studies the maximum likelihood estimator
• But before the first half, we need to have some probability
background and to build the basic language
• Measure theory basics
• Exponential families
• Sufficient statistics
• Rao-Blackwell theorem (a generic way to improve an estimator)
Review of measure theory basics
Measure theory overview
• Measure theory is the foundation of probability theory, on which
all rigorous statistical theory is built
• We will go through the basics, but it is recommended to review
the Sta 711 textbooks for a thorough understanding!
Measure
Given a set X , a measure µ maps subsets A ⊆ X to [0, ∞]
• Example 1: if X is countable (e.g. X = Z), the counting
measure #(A) equals the number of points in A
• Example 2: if X = Rn , the Lebesgue measure is
λ(A) = ∫ · · · ∫_A dx1 . . . dxn = Vol(A)
Due to pathological sets, λ(A) can only be defined for some subsets
A ⊆ Rn . This leads to the introduction of σ-algebra (or σ-field).
σ-algebra
A σ-algebra F on a set X is a collection of subsets of X satisfying
• it includes X and the empty set
• it is closed under complement
• it is closed under countable unions
• Example 1: if X is countable, F = 2X (all subsets)
• Example 2: if X = Rn , F is the Borel σ-field, B, the smallest
σ-algebra that contains all rectangles.
Formal definition of measure under σ-algebra
Given (X , F) (a measurable space), a measure is any map
µ : F → [0, ∞] such that
µ(∪_{i=1}^∞ Ai) = ∑_{i=1}^∞ µ(Ai) for disjoint Ai ∈ F
If in addition µ(X ) = 1, then µ is a probability measure
Integrals
We can now define integrals using measures. Intuitively, ∫ f dµ
means summing the values of f weighted by µ
• For the counting measure, ∫ f(x) d#(x) = ∑_{x∈X} f(x)
• For the Lebesgue measure, ∫ f(x) dλ(x) = ∫ · · · ∫ f(x) dx1 . . . dxn
• Indicator function 1_{x∈A}
• Simple function ∑_i ai 1_{x∈Ai}
• Measurable function (one whose pre-images lie in F): these can be
approximated by simple functions (Theorem 1.8 in Keener)
Densities
Given (X , F) and two measures µ, P, we say P is absolutely
continuous with respect to µ if P(A) = 0 whenever µ(A) = 0. We write
this as P ≪ µ.
• If P ≪ µ, we can define the density function
p = dP/dµ,
where P(A) = ∫_A p(x) dµ(x). This p is also called the Radon-Nikodym
derivative
• If µ is the counting measure, then p is a probability mass
function. If µ is the Lebesgue measure, then p is a probability
density function
According to the definition, the density function is not unique, but any
two versions agree almost everywhere
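A pmf as a density with respect to counting measure can be sketched directly (the choice of a Binomial(3, 1/2) distribution is mine):

```python
from math import comb

def p(x, n=3, theta=0.5):
    """pmf of Binomial(3, 1/2): the density dP/d# w.r.t. counting measure on {0,...,n}."""
    return comb(n, x) * theta ** x * (1 - theta) ** (n - x)

def P(A):
    """P(A) = integral of p over A against counting measure = sum of p over A."""
    return sum(p(x) for x in A)

print(P({0, 1, 2, 3}))  # 1.0: the density integrates to 1
print(P({0, 3}))        # 0.25
```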
Probability space
A probability space is the triple (Ω, F, P)
• Sample space Ω, which is the set of all possible outcomes
• Event space F; A ∈ F is called an event
• Probability function P, P(A) is the probability of A
Random variable
A random variable is a function Y : Ω → X
• We say Y has distribution Q (written Y ∼ Q) if
P(Y ∈ B) = P({w : Y(w) ∈ B}) = Q(B)
for every measurable B ⊆ X
Expectation
The expectation is an integral with respect to P
E[Y] = ∫_Ω Y(w) dP(w) = ∫_X x dQ(x).
Need to know more about measure theory?
• More in Keener Chap. 1
• More in Sta 711
Summary
What we have covered
• Statistical inference problem
• Intuitively how to argue for the best estimator
What is next lecture?
• Exponential families
Thank you