Econometrics
Causal effect
A change in Y (outcome) is the causal effect of X (treatment) if we can be sure that the
change in Y is due to the change in X and to nothing else.
Correlation
A correlation between variables in the data does not imply a causal relationship between
them.
How to estimate causal effects: randomized controlled experiment (RCE)
• Binary treatment X : a person either receives (X = 1) or does not receive (X = 0) a
treatment (control)
• Suppose we randomly allocate individuals between treatment and control groups
• Then the observed difference in mean outcome Y between the two groups reflects receiving the
treatment (the change from X = 0 to X = 1) and nothing else. The change in mean
outcome Y is the causal effect of the change in X.
• This holds because all other things that may affect the outcome Y are, on average, the
same in both treatment and control groups by virtue of the random allocation of X.
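The argument above can be checked with a small simulation. This is a minimal sketch under assumed values: the true effect of 2.0, the unobserved factor called ability, and the function name simulate_rce are all hypothetical and chosen only for illustration.

```python
import random
from statistics import mean

def simulate_rce(n=20000, true_effect=2.0, seed=0):
    """Difference in mean outcomes between randomly assigned groups."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)        # unobserved factor that also moves Y
        x = 1 if rng.random() < 0.5 else 0   # random allocation, unrelated to ability
        y = 1.0 + true_effect * x + ability + rng.gauss(0.0, 1.0)
        (treated if x == 1 else control).append(y)
    return mean(treated) - mean(control)

estimated_effect = simulate_rce()
print(round(estimated_effect, 2))
```

Because X is assigned independently of ability, the difference in group means should land close to the assumed true effect; if assignment depended on ability, it generally would not.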
Types of data:
• Cross section: data on different entities or units (individuals, firms, countries, etc.)
collected at a single period of time.
• Time series: data on a single entity or unit (individual, firm, country, etc.) collected
over multiple time periods.
• Panel (longitudinal) data: data on different entities or units collected over multiple
time periods.
Graphical representation of data:
1. Cross sectional data: histogram
2. Time-series data: time line plot
Covariance: measure of the linear association between two variables, say X and Y, in the
sample. It depends on the unit of measurement.
Correlation coefficient: does not depend on the unit of measurement.
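The unit dependence can be illustrated numerically. The sketch below assumes the usual sample definitions with an n - 1 divisor, s_XY = (1/(n-1)) * sum((X_i - Xbar)(Y_i - Ybar)) and r_XY = s_XY / (s_X * s_Y); the height and weight figures are made-up illustrative numbers, not data from the course.

```python
from math import sqrt

def sample_cov(x, y):
    """Sample covariance with the n - 1 divisor."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)

def sample_corr(x, y):
    """Sample correlation: covariance divided by both standard deviations."""
    return sample_cov(x, y) / sqrt(sample_cov(x, x) * sample_cov(y, y))

# Hypothetical data: the same heights expressed in metres and in centimetres.
height_m = [1.60, 1.70, 1.75, 1.80, 1.90]
weight_kg = [55.0, 65.0, 72.0, 80.0, 90.0]
height_cm = [100 * h for h in height_m]

cov_m = sample_cov(height_m, weight_kg)
cov_cm = sample_cov(height_cm, weight_kg)
corr_m = sample_corr(height_m, weight_kg)
corr_cm = sample_corr(height_cm, weight_kg)
print(cov_m, cov_cm)    # covariance scales with the unit of X
print(corr_m, corr_cm)  # correlation is unchanged
```

Changing metres to centimetres multiplies the covariance by 100 while leaving the correlation coefficient as it was.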
Normal (Gaussian) distribution
Chi-squared distribution
Student's t distribution
Bernoulli distribution
• Linear regression model with single regressor
• Estimation of parameters in linear regression model
Solution for β0
Solution for β1
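The closed-form OLS solutions are not reproduced in these notes, so the following sketch assumes the standard textbook expressions: β1 equals the sum of (X_i - Xbar)(Y_i - Ybar) divided by the sum of (X_i - Xbar)^2, and β0 equals Ybar - β1 * Xbar. The function name ols_single and the five data points are hypothetical.

```python
def ols_single(x, y):
    """OLS intercept and slope for a regression of y on a single regressor x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    beta1 = sxy / sxx                # slope
    beta0 = y_bar - beta1 * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return beta0, beta1

# Hypothetical illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.2, 8.8, 11.0]
beta0_hat, beta1_hat = ols_single(x, y)
print(beta0_hat, beta1_hat)
```

A quick sanity check on any OLS fit with an intercept is that the residuals sum to zero.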
• Measure of fit
R squared
Total sum of squares (TSS)
Explained sum of squares (ESS)
Standard error of the regression (SER)
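The measures of fit listed above can be computed in a few lines. This sketch assumes the usual definitions: TSS is the sum of (Y_i - Ybar)^2, ESS is the sum of (fitted_i - Ybar)^2, R squared is ESS / TSS, and SER is the square root of SSR / (n - 2), where SSR is the sum of squared residuals. The function name fit_measures and the data are hypothetical.

```python
from math import sqrt

def fit_measures(x, y):
    """TSS, ESS, SSR, R squared and SER for a single-regressor OLS fit."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    beta1 = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
             / sum((a - x_bar) ** 2 for a in x))
    beta0 = y_bar - beta1 * x_bar
    fitted = [beta0 + beta1 * a for a in x]
    tss = sum((b - y_bar) ** 2 for b in y)           # total variation in Y
    ess = sum((f - y_bar) ** 2 for f in fitted)      # variation explained by X
    ssr = sum((b - f) ** 2 for b, f in zip(y, fitted))  # unexplained variation
    return {"TSS": tss, "ESS": ess, "SSR": ssr,
            "R2": ess / tss, "SER": sqrt(ssr / (n - 2))}

# Hypothetical illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.2, 8.8, 11.0]
measures = fit_measures(x, y)
print(measures)
```

With an intercept in the regression, TSS = ESS + SSR, so R squared can equivalently be written as 1 - SSR / TSS.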
• Properties of OLS
Assumption #1
Assumption #2
Assumption #3
Preliminary algebra
• Hypothesis tests about β1
• Confidence intervals about β1
• Regression when X is binary (0/1)
• Heteroskedasticity and homoskedasticity
• Gauss-Markov theorem
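Hypothesis tests and confidence intervals for β1 depend on which standard error is used. The sketch below computes both the homoskedasticity-only standard error and a heteroskedasticity-robust one. The robust formula used here is one common textbook form with an n - 2 degrees-of-freedom adjustment, which is an assumption about the convention these notes follow; the simulated data, the true slope of 2 and the function name ols_inference are hypothetical.

```python
import random
from math import sqrt

def ols_inference(x, y, null_value=0.0):
    """Slope, both standard errors, robust t-statistic and 95% CI for beta1."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    dx = [a - x_bar for a in x]
    sxx = sum(d * d for d in dx)
    beta1 = sum(d * (b - y_bar) for d, b in zip(dx, y)) / sxx
    beta0 = y_bar - beta1 * x_bar
    resid = [b - beta0 - beta1 * a for a, b in zip(x, y)]
    # Homoskedasticity-only: s_u^2 / sum((X - Xbar)^2)
    se_homo = sqrt(sum(u * u for u in resid) / (n - 2) / sxx)
    # Robust: weights each squared residual by its own (X - Xbar)^2
    se_robust = sqrt(n / (n - 2) * sum(d * d * u * u for d, u in zip(dx, resid))) / sxx
    t_stat = (beta1 - null_value) / se_robust
    ci95 = (beta1 - 1.96 * se_robust, beta1 + 1.96 * se_robust)
    return beta1, se_homo, se_robust, t_stat, ci95

# Simulated data whose error variance grows with X (heteroskedastic by design).
rng = random.Random(1)
x = [rng.uniform(0.0, 10.0) for _ in range(5000)]
y = [1.0 + 2.0 * a + a * rng.gauss(0.0, 1.0) for a in x]

beta1, se_homo, se_robust, t_stat, ci95 = ols_inference(x, y)
print(beta1, se_homo, se_robust, t_stat, ci95)
```

The null hypothesis β1 = null_value is rejected at the 5% level when the absolute t-statistic exceeds 1.96, which is the same as the null value falling outside the 95% confidence interval.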
Multiple linear regression model
• Omitted variable bias
• Multiple regression model
• Measures of fit
• Least squares assumptions
• Sampling distribution of the OLS estimator
• Using “dummy variables”
• Hypothesis tests and confidence intervals for a single coefficient
• Testing joint hypotheses
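Omitted variable bias, the first item in the list above, can be seen in a simulation. In this sketch the omitted variable W affects Y and is correlated with X; all coefficients are assumed values. To keep the code dependency-free, the coefficient on X in the regression that includes W is obtained by partialling out (the Frisch-Waugh result) instead of a matrix solver: X is first regressed on W, and Y is then regressed on the residual.

```python
import random

def slope(x, y):
    """OLS slope from a regression of y on x and a constant."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
            / sum((a - x_bar) ** 2 for a in x))

def residuals(y, x):
    """Residuals from an OLS regression of y on x and a constant."""
    n = len(x)
    b1 = slope(x, y)
    b0 = sum(y) / n - b1 * sum(x) / n
    return [b - b0 - b1 * a for a, b in zip(x, y)]

rng = random.Random(42)
n = 20000
w = [rng.gauss(0.0, 1.0) for _ in range(n)]                 # omitted variable
x = [0.5 * wi + rng.gauss(0.0, 1.0) for wi in w]            # X is correlated with W
y = [1.0 + 2.0 * xi + 3.0 * wi + rng.gauss(0.0, 1.0) for xi, wi in zip(x, w)]

beta1_short = slope(x, y)          # W omitted: biased
x_tilde = residuals(x, w)          # part of X not explained by W
beta1_long = slope(x_tilde, y)     # coefficient on X when W is controlled for
print(beta1_short, beta1_long)
```

Under these assumed values the short regression converges to 2 + 3 * cov(X, W) / var(X) = 3.2, while controlling for W recovers a value near the true coefficient of 2.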
Nonlinear models
• Nonlinear regression functions
• Polynomials (single regressor)
• Logarithms (single regressor)
• Interaction between variables (multiple regressors)
• Failures of the OLS
Why does external validity fail?
How to assess external validity?
Omitted variables observed or controls available
Omitted variables observed and controls not available
Incorrect specification
Error in variables
Measurement error bias
Measurement error in Y
Solutions to measurement error problem
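The contrast between measurement error in X and measurement error in Y can be illustrated by simulation. This sketch assumes classical measurement error (independent noise added to the true variable) and made-up parameter values; with equal variances for the true X and its noise, the slope is expected to shrink by roughly half.

```python
import random

def slope(x, y):
    """OLS slope from a regression of y on x and a constant."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
            / sum((a - x_bar) ** 2 for a in x))

rng = random.Random(7)
n = 20000
x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
y_true = [1.0 + 2.0 * a + rng.gauss(0.0, 1.0) for a in x_true]
x_noisy = [a + rng.gauss(0.0, 1.0) for a in x_true]   # X measured with error
y_noisy = [b + rng.gauss(0.0, 1.0) for b in y_true]   # Y measured with error

slope_clean = slope(x_true, y_true)      # no measurement error
slope_x_error = slope(x_noisy, y_true)   # biased toward zero
slope_y_error = slope(x_true, y_noisy)   # still centred on the true slope, less precise
print(slope_clean, slope_x_error, slope_y_error)
```

Under these assumptions the slope with a mismeasured X converges to 2 * var(X) / (var(X) + var(noise)) = 1, whereas noise in Y only inflates the variance of the estimator.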
Missing data and sample selection
Random missing data
Missing data depending on X’s
Missing data depending on X’s and Y
Simultaneity
Solutions
Inconsistent standard errors
Heteroskedasticity
Autocorrelation
• Instrumental variables
IV in the one regressor model
Properties
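For the one-regressor case, the IV estimator can be written as the ratio of two sample covariances, s_ZY / s_ZX. The simulation below is a sketch under assumed values: a common shock c makes X endogenous, and Z is an instrument that is relevant (it moves X) and exogenous (it is unrelated to the error). All variable names and coefficients are hypothetical.

```python
import random

def cov(a, b):
    """Sample covariance with the n - 1 divisor."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    return sum((p - a_bar) * (q - b_bar) for p, q in zip(a, b)) / (n - 1)

rng = random.Random(3)
n = 20000
z = [rng.gauss(0.0, 1.0) for _ in range(n)]                  # instrument
c = [rng.gauss(0.0, 1.0) for _ in range(n)]                  # shock shared by X and the error
x = [zi + ci + rng.gauss(0.0, 1.0) for zi, ci in zip(z, c)]
u = [2.0 * ci + rng.gauss(0.0, 1.0) for ci in c]             # error term, correlated with X
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

beta1_ols = cov(x, y) / cov(x, x)   # inconsistent because cov(X, u) != 0
beta1_iv = cov(z, y) / cov(z, x)    # uses only the variation in X that comes from Z
print(beta1_ols, beta1_iv)
```

Under these assumed values OLS converges to 2 + cov(X, u) / var(X), about 2.67, while the IV estimate should be close to the true slope of 2.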
General IV model
Conditions for identification
Role of exogenous variables
TSLS with single endogenous regressor
TSLS with multiple endogenous regressors
Identification conditions of the general regression model
Assumptions
Asymptotic properties of estimator
Asymptotic standard errors
Example
Instruments’ validity
Relevance, weak instruments
Problem 1
Problem 2
Testing
Solution
Exogeneity of instruments
J test
Where do instruments come from?
Grade
Wage regression
Quarter of birth
Other instruments
Selecting valid instruments
Crime and police
Instruments from natural experiments
• Panel data
Panel data with two time periods
Fixed effects regression
Regression with time fixed effects
Standard errors for fixed effects regression
Application to drunk driving and traffic safety
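Entity fixed effects regression can be computed by subtracting each entity's own mean from X and Y and running OLS on the demeaned data. The sketch below uses simulated panel data, not the drunk driving data set; the entity effect alpha_i is built to be correlated with X so that pooled OLS is biased. All names and parameter values are hypothetical.

```python
import random

def slope(x, y):
    """OLS slope from a regression of y on x and a constant."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
            / sum((a - x_bar) ** 2 for a in x))

def demean_by_entity(panel):
    """Subtract entity-specific means from x and y; panel rows are (entity, x, y)."""
    sums = {}
    for i, x, y in panel:
        sx, sy, k = sums.get(i, (0.0, 0.0, 0))
        sums[i] = (sx + x, sy + y, k + 1)
    return [(x - sums[i][0] / sums[i][2], y - sums[i][1] / sums[i][2])
            for i, x, y in panel]

rng = random.Random(11)
panel = []
for i in range(1000):                       # entities
    alpha_i = rng.gauss(0.0, 1.0)           # unobserved, time-invariant entity effect
    for _ in range(5):                      # time periods
        x_it = alpha_i + rng.gauss(0.0, 1.0)
        y_it = 1.5 * x_it + alpha_i + rng.gauss(0.0, 1.0)
        panel.append((i, x_it, y_it))

beta_pooled = slope([r[1] for r in panel], [r[2] for r in panel])
demeaned = demean_by_entity(panel)
beta_fe = slope([d[0] for d in demeaned], [d[1] for d in demeaned])
print(beta_pooled, beta_fe)
```

Demeaning removes alpha_i because it is constant within an entity, so the within estimate should be close to the assumed true slope of 1.5 while the pooled estimate is pulled upward.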
Limited dependent variables
Linear probability model
Inference and fit
Linear probability model
Probit regression
Probit regression with multiple regressors
Logit regression
Example
Estimation and inference
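The three models in this section differ in how they map the index β0 + β1 * X into a predicted probability. The sketch below assumes the standard forms: the linear probability model uses the index itself, probit passes it through the standard normal CDF, and logit passes it through the logistic function. The coefficient values are hypothetical, not estimates from the course example.

```python
from math import erf, exp, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def lpm_prob(b0, b1, x):
    """Linear probability model: the index itself, which is not bounded."""
    return b0 + b1 * x

def probit_prob(b0, b1, x):
    """Probit: standard normal CDF of the index."""
    return normal_cdf(b0 + b1 * x)

def logit_prob(b0, b1, x):
    """Logit: logistic function of the index."""
    return 1.0 / (1.0 + exp(-(b0 + b1 * x)))

# Hypothetical coefficients evaluated at a large value of X.
p_lpm = lpm_prob(-0.2, 0.15, 10.0)
p_probit = probit_prob(-2.0, 0.5, 10.0)
p_logit = logit_prob(-2.0, 0.5, 10.0)
print(p_lpm, p_probit, p_logit)
```

With these assumed numbers the linear probability model predicts a value above 1, while probit and logit stay strictly between 0 and 1 by construction.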
Review
Effect on wage of working one extra month
Effect on wage of working 50 extra hours of overtime
Do part-time workers earn more or less than full-time ones?
Is the wage difference between part-time and full-time workers significant?
State and test the hypothesis that the return to a month of work is 1200 euros
What is the R-squared of the model?
What is the Standard Error of the Regression?
What is the expected wage for somebody working 10 months full-time in a firm with
over 500 employees and working 15 hours of overtime?
What is the difference in wage between two identical workers, one at a firm with 100-499
employees, the other at a firm with fewer than 4 employees?
Test whether identical workers at firms with 5-9 employees and at firms with fewer than
4 employees earn the same wage
What is the difference in wage between two identical workers, one at a firm with 20-49
employees, the other at a firm with 5-15 employees?
Change of excluded group
What is the reduction in wage suffered by those working part-time?
Construct the 95% confidence interval for the elasticity of wage to month worked
Test whether the model is nonlinear in overtime hours
What is the effect on wage of one extra hour of overtime?
Is the effect of overtime on wage significantly different for part-time and full-time
workers?
What is the 10% CI for the price elasticity?
Is demand elasticity significantly different from 1.5?