Meeting 10: Matching
Gumilang Aryo Sahadewo
Universitas Gadjah Mada
November 12, 2019
Outline
1 Introduction
Motivation
Matching Methods
Finding a match
Propensity Score Matching
2 Estimation
Estimation
Matching Methods
Syntax
The Problem of Counterfactual
Potential Outcome Framework
Recall the two potential outcomes:
Yi = Y1i if Di = 1
Yi = Y0i if Di = 0
so that Yi = Y0i + (Y1i − Y0i) Di
The causal effect of a program is (Y1i − Y0i)
What is the problem with estimating the causal effect?
The fundamental problem of evaluation is that we never observe the counterfactual
ATT = E [Y1i | Di = 1] − E [Y0i | Di = 1]
ATNT = E [Y1i | Di = 0] − E [Y0i | Di = 0]
ATE = E [Y1i − Y0i ] = P (D = 1) · ATT + P (D = 0) · ATNT
The methods that we’ve studied so far seek to construct a
valid comparison group
Matching is another method that applies statistical techniques
to construct a comparison group
Matching identifies average unobserved counterfactuals
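The ATE identity above follows from the law of total expectation; a short derivation in LaTeX, using the same notation as the slides:
% Decompose E[Y1i − Y0i] by treatment status
\begin{align*}
ATE &= E[Y_{1i} - Y_{0i}] \\
    &= E[Y_{1i} - Y_{0i} \mid D_i = 1]\, P(D = 1) + E[Y_{1i} - Y_{0i} \mid D_i = 0]\, P(D = 0) \\
    &= P(D = 1) \cdot ATT + P(D = 0) \cdot ATNT
\end{align*}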
Matching Methods
Selection on Observables
The identifying assumption is selection on observables:
(Y (0) , Y (1)) ⊥ D | X
This is equivalent to:
Pr (D = 1 | Y (0) , Y (1) , X ) = Pr (D = 1 | X )
E (D | Y (0) , Y (1) , X ) = E (D | X )
Differences between the treatment and comparison groups are captured in X
ATT : Y0i ⊥ D | X → E [Y0i | D = 1, X ] = E [Y0i | D = 0, X ]
ATNT : Y1i ⊥ D | X → E [Y1i | D = 1, X ] = E [Y1i | D = 0, X ]
ATE : Y0i , Y1i ⊥ D | X
What would be the threat to identification?
Just like in the standard OLS framework, the threat is that differences between the treatment and comparison groups are not captured in X
That is, the two groups differ in unobservable characteristics
Matching Methods
Common Support
We need to observe individuals with the same characteristics in both the treatment and non-treatment groups
ATT : P (D = 1 | X ) < 1
ATNT : 0 < P (D = 1 | X )
ATE : 0 < P (D = 1 | X ) < 1
Imagine a scenario: when X = x , Pr (D = 1 | X = x ) = 1.
We won’t observe individuals in the control group with X = x
Thus, we cannot obtain a valid comparison group for treated units with X = x
Matching Methods
Selection on Observables & Common Support
If the assumptions hold, we can use the observed average
outcome of the non-treatment units to estimate the
counterfactual outcome
Finding a match
The goal of matching is to approximate the characteristics
that explain individuals’ decisions to enroll
This procedure requires a large data set. Why?
Treated                        Untreated
Months unemployed    Poor      Months unemployed    Poor
 5                   1          2                   1
10                   0         12                   1
 3                   0          8                   1
20                   0         14                   0
 2                   1          4                   0
 8                   1          6                   1
 6                   1          1                   1
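A minimal Stata sketch of one-to-one matching on this toy table. The variable names treated, months, and poor are illustrative, and psmatch2 is the user-written command by Leuven and Sianesi (ssc install psmatch2):
* Enter the toy data from the table above
clear
input byte treated byte months byte poor
1  5 1
1 10 0
1  3 0
1 20 0
1  2 1
1  8 1
1  6 1
0  2 1
0 12 1
0  8 1
0 14 0
0  4 0
0  6 1
0  1 1
end

* Estimate a propensity score from the two observables and match each
* treated unit to its nearest untreated neighbor on that score
psmatch2 treated months poor, logit neighbor(1) common
list treated months poor _pscore _n1, sepby(treated)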
The problem of matching
It is difficult to identify a match for each of the units in the
treatment group
The list of observed characteristics is large
Each characteristic takes on many values
We can easily run into the curse of dimensionality
Dilemma:
Limit the set of observed characteristics, but then the selection-on-observables assumption becomes less plausible
Increase the number of observed characteristics, but then common support becomes harder to satisfy
Propensity Score Matching
Idea
A solution to the curse of dimensionality problem is propensity score matching
The method computes the probability that a unit enrolls in the program, given its observable characteristics
We do this for both treatment and non-treatment units
Note that we only use baseline or pre-treatment observable characteristics
The propensity score lies between 0 and 1
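A minimal Stata sketch of this step, assuming a treatment dummy treat and baseline covariates age, educ, and poor (all variable names are illustrative):
* Model enrollment as a function of pre-treatment observables
logit treat age educ poor
* The fitted probability is the estimated propensity score, bounded in (0, 1)
predict pscore, pr
summarize pscore if treat == 1
summarize pscore if treat == 0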
The propensity score is:
e (x) = P (D = 1 | X = x)
The score balances the observables: X ⊥ D | e (X )
Combining selection on observables with common support,
(Y1 , Y0 ) ⊥ D | X and 0 < e (x) < 1
implies (Y1 , Y0 ) ⊥ D | e (X )
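A sketch of why conditioning on the scalar e(X) is enough (the Rosenbaum–Rubin argument), written in LaTeX using the law of iterated expectations:
\begin{align*}
P\bigl(D = 1 \mid Y_1, Y_0, e(X)\bigr)
  &= E\bigl[\, P(D = 1 \mid Y_1, Y_0, X) \mid Y_1, Y_0, e(X) \,\bigr] \\
  &= E\bigl[\, P(D = 1 \mid X) \mid Y_1, Y_0, e(X) \,\bigr]
     \quad \text{(selection on observables)} \\
  &= E\bigl[\, e(X) \mid Y_1, Y_0, e(X) \,\bigr] = e(X)
\end{align*}
% Treatment status is therefore as good as random given e(X), so (Y1, Y0) ⊥ D | e(X).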
Match treatment and non-treatment units with the closest
propensity score.
The matched non-treatment units become the comparison
group
The average difference in outcomes between the treatment
and the matched comparison is the estimate of the impact
Propensity score matching mimics a randomized experiment:
treatment and comparison units have similar propensity scores.
Steps to PSM
Find representative surveys to identify the treatment and
non-treatment units
Pool the sample and estimate the probability that each
individual receives the treatment (based on observable
characteristics)
It is important to include relevant variables to avoid a biased estimate
Use theory and previous empirical findings
Use formal statistical tests
Always remember the tradeoff:
Too small a set of characteristics threatens the selection-on-observables assumption
Too large a set of characteristics creates common support problems
Obtain the propensity scores
Restrict the sample to units on the common support
For each enrolled unit, locate a subgroup of non-treated units with similar propensity scores
Test whether the means of the observable characteristics differ statistically between the treated and matched non-treated units
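A minimal Stata sketch of the common-support and balancing steps, continuing from the pscore variable estimated in the earlier sketch (all names illustrative):
* Common support: keep units whose score lies inside both groups' ranges
summarize pscore if treat == 1
scalar t_min = r(min)
scalar t_max = r(max)
summarize pscore if treat == 0
scalar c_min = r(min)
scalar c_max = r(max)
keep if pscore >= max(t_min, c_min) & pscore <= min(t_max, c_max)

* Balance test after matching (see the psmatch2 examples below):
* pstest age educ poor, both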
The measure of the impact is the average difference between the outcomes of the treated units and their matched comparisons.
Matching Strategy and ATT
The matching strategy is:
Pair each treatment unit i with one or more comparable
non-treated units
Associate with the observed outcome Yiobs a matched outcome Ŷi (0) given by the weighted outcomes of its neighbors:
Ŷi (0) = Σ j∈C(i) wij Yjobs
C (i) is the set of neighbors with D = 0 of the treated subject i
wij is the weight of non-treated j, with Σ j∈C(i) wij = 1
The ATT:
E [Yi (1) − Yi (0) | Di = 1]
is estimated as:
(1 / N T ) Σ i:Di =1 [ Yiobs − Ŷi (0) ]
N T is the number of matched treated units in the sample
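A purely hypothetical numerical illustration of the two formulas above (the outcomes and weights are made up): treated unit 1 is matched to two neighbors with equal weights, treated unit 2 to a single neighbor.
\begin{align*}
\hat{Y}_1(0) &= 0.5\,(10) + 0.5\,(14) = 12, \qquad Y_1^{obs} = 15 \\
\hat{Y}_2(0) &= 1.0\,(8) = 8, \qquad Y_2^{obs} = 9 \\
\widehat{ATT} &= \tfrac{1}{2}\bigl[(15 - 12) + (9 - 8)\bigr] = 2
\end{align*}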
Matching Strategy and ATT
One-to-one matching is still the most desirable
But it is difficult to find a treated and a non-treated unit with exactly the
same propensity score
Matching methods have been developed to deal with this
problem:
Nearest-neighbor matching
Radius matching
Kernel matching
Stratification matching
Nearest Neighbor Matching
The absolute difference between the estimated propensity scores of a treated unit and its matched control is minimized
The control match for treated individual i is selected such that:
C (Pi ) = minj |Pi − Pj |
where:
Pi is the estimated propensity score of treated individual i
Pj is the estimated propensity score of control individual j
Radius and Kernel Matching
Radius: each individual in the treatment group is matched
with individuals in the control group whose scores are within a
predefined interval of the treatment individuals’ propensity
score.
Kernel: each individual in the treatment group is matched with a weighted
average of control individuals’ outcomes, with weights that decline in the
distance between the propensity scores.
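Minimal Stata sketches of these variants, assuming a treatment dummy treat, outcome y, and covariates age, educ, and poor (all names illustrative); psmatch2 is the user-written command by Leuven and Sianesi:
* Nearest-neighbor matching (one neighbor) on the estimated propensity score
psmatch2 treat age educ poor, outcome(y) logit neighbor(1) common

* Radius matching: only neighbors within a caliper of 0.05 on the score
psmatch2 treat age educ poor, outcome(y) logit radius caliper(0.05) common

* Kernel matching: weighted average of all controls, weights decline with distance
psmatch2 treat age educ poor, outcome(y) logit kernel common

* Stratification matching is available through the user-written atts command
* (Becker and Ichino), not through psmatch2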
PSCORE
The Stata command pscore estimates propensity scores
pscore also tests the balancing hypothesis through this algorithm:
Split the sample into k equally spaced intervals of e (X )
Within each interval, test that the average e (X ) does not differ between treated and untreated units
If the test fails, split the interval and test again
Continue until, in all intervals, the average e (X ) of treated and untreated units does not differ
Within each interval, test that the means of each characteristic do not differ between treated and untreated units
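A minimal sketch of the pscore call, assuming a treatment dummy treat and baseline covariates age, educ, and poor (names illustrative); pscore is the user-written command by Becker and Ichino (findit pscore to install), so check help pscore for its exact option list:
* Estimate the score, run the balancing test, and store the results
pscore treat age educ poor, pscore(ps) blockid(blk) comsup logit
* ps  : the estimated propensity score
* blk : the block (interval) used in the balancing algorithm
* comsup restricts the balancing test to the common support region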
Estimation
Use the user-written Stata commands psmatch2 and pstest for estimation and
balance checks
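A minimal end-to-end sketch with psmatch2 and pstest, assuming a treatment dummy treat, outcome y, and baseline covariates age, educ, and poor (all names illustrative):
* Install the user-written commands (pstest ships with psmatch2)
ssc install psmatch2, replace

* Nearest-neighbor matching on the propensity score with common support;
* the output table reports the ATT for outcome y
psmatch2 treat age educ poor, outcome(y) logit neighbor(1) common

* Balance check: covariate means in the matched treated and comparison samples
pstest age educ poor, both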