
Econometrics Cheat Sheet - Assumptions and properties, Ordinary Least Squares

By Marcelo Moreno - King Juan Carlos University

Basic concepts

Definitions
Econometrics - a social science discipline with the objective of quantifying the relationships between economic agents, testing economic theories, and evaluating and implementing government and business policies.
Econometric model - a simplified representation of reality used to explain economic phenomena.
Ceteris paribus - all other relevant factors remain constant.

Data types
Cross section - data taken at a given moment in time, a static photo. Order does not matter.
Time series - observations of one or many variables across time. Order does matter.
Panel data - a time series for each observation of a cross section.
Pooled cross sections - combines cross sections from different time periods.

Phases of an econometric model
1. Specification. 2. Estimation. 3. Validation. 4. Utilization.

Regression analysis
Study and predict the mean value of a variable (dependent variable, y) on the basis of fixed values of other variables (independent variables, x's). In econometrics it is common to use Ordinary Least Squares (OLS) for regression analysis.

Econometric model assumptions
Under these assumptions, the estimators of the OLS parameters present good properties. Gauss-Markov assumptions, extended:
1. Parameters linearity (plus weak dependence in time series). y must be a linear function of the β's.
2. Random sampling. The sample has been randomly taken from the population. (Only for cross sections.)
3. No perfect collinearity:
   - No independent variable is constant: Var(xj) ≠ 0.
   - There is no exact linear relation between independent variables.
4. Conditional mean zero and correlation zero:
   a. There are no systematic errors: E(u|x1, ..., xk) = E(u) = 0 → strong exogeneity (a implies b).
   b. There are no relevant variables left out of the model: Cov(xj, u) = 0 for any j = 1, ..., k → weak exogeneity.
5. Homoscedasticity. The variability of the residuals is the same for all levels of x: Var(u|x1, ..., xk) = σu².
6. No autocorrelation. The residuals do not contain information about other residuals: Corr(ut, us|x) = 0 for any t ≠ s. (Typical of time series.)
7. Normality. The residuals are independent and identically distributed: u ~ N(0, σu²).
8. Data size. The number of observations available must be greater than the (k + 1) parameters to estimate. (Already satisfied in asymptotic situations.)

Ordinary Least Squares
Objective - minimize the Sum of Squared Residuals (SSR):
min Σ ûi², where ûi = yi − ŷi

Simple regression model
Equation: yi = β0 + β1·x1i + ui
Estimation: ŷi = β̂0 + β̂1·x1i
where:
β̂0 = ȳ − β̂1·x̄
β̂1 = Cov(y, x) / Var(x)

Multiple regression model
Equation: yi = β0 + β1·x1i + ... + βk·xki + ui
Estimation: ŷi = β̂0 + β̂1·x1i + ... + β̂k·xki
where:
β̂0 = ȳ − β̂1·x̄1 − ... − β̂k·x̄k
β̂j = Cov(y, residualized xj) / Var(residualized xj)
Matrix form: β̂ = (XᵀX)⁻¹(Xᵀy)

Interpretation of coefficients
Model       | Dependent | Independent | β1 interpretation
Level-level | y         | x           | Δy = β1·Δx
Level-log   | y         | log(x)      | Δy ≈ (β1/100)(%Δx)
Log-level   | log(y)    | x           | %Δy ≈ (100·β1)Δx
Log-log     | log(y)    | log(x)      | %Δy ≈ β1(%Δx)
Quadratic   | y         | x + x²      | Δy = (β1 + 2·β2·x)Δx
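The closed-form simple-regression estimators β̂1 = Cov(y, x)/Var(x) and β̂0 = ȳ − β̂1·x̄ can be sketched in plain Python. This is an illustrative addition, not part of the original sheet; the data are hypothetical and chosen to lie exactly on a line so OLS recovers it:

```python
# Sketch of the OLS simple-regression formulas:
# beta1_hat = Cov(y, x) / Var(x),  beta0_hat = mean(y) - beta1_hat * mean(x)

def ols_simple(x, y):
    """Return (beta0_hat, beta1_hat) for the model y_i = b0 + b1*x_i + u_i."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov_xy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n
    var_x = sum((xi - mx) ** 2 for xi in x) / n
    b1 = cov_xy / var_x
    b0 = my - b1 * mx
    return b0, b1

# Hypothetical data generated exactly on the line y = 1 + 2x,
# so OLS recovers beta0_hat = 1 and beta1_hat = 2.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
b0, b1 = ols_simple(x, y)
```

With multiple regressors, the same estimates come from the matrix form β̂ = (XᵀX)⁻¹(Xᵀy).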
Correlation analysis
Correlation analysis does not distinguish between dependent and independent variables.
- The simple correlation measures the degree of linear association between two variables:
  r = Cov(x, y) / (σx·σy) = Σ((xi − x̄)(yi − ȳ)) / sqrt(Σ(xi − x̄)² · Σ(yi − ȳ)²)
- The partial correlation measures the degree of linear association between two variables while controlling for a third.

Asymptotic properties of OLS
Under the econometric model assumptions and the Central Limit Theorem:
- Hold (1) to (4a): OLS is unbiased: E(β̂j) = βj.
- Hold (1) to (4): OLS is consistent: plim(β̂j) = βj. (With (4b) but without (4a), i.e. only weak exogeneity, OLS is biased but consistent.)
- Hold (1) to (5): asymptotic normality of OLS (then (7) is necessarily satisfied): u ~a N(0, σu²).
- Hold (1) to (6): unbiased estimate of σu²: E(σ̂u²) = σu².
- Hold (1) to (6): OLS is BLUE (Best Linear Unbiased Estimator), i.e. efficient.
- Hold (1) to (7): hypothesis testing and confidence intervals can be done reliably.

Error measures
Sum of Squared Residuals: SSR = Σ ûi² = Σ(yi − ŷi)²
Explained Sum of Squares: SSE = Σ(ŷi − ȳ)²
Total Sum of Squares: SST = SSE + SSR = Σ(yi − ȳ)²
Standard Error of the Regression: σ̂u = sqrt(SSR / (n − k − 1))
Standard Error of the β̂'s: se(β̂) = sqrt(σ̂u² · (XᵀX)⁻¹)
Mean Squared Error: MSE = Σ(yi − ŷi)² / n
Absolute Mean Error: AME = Σ|yi − ŷi| / n
Mean Percentage Error: MPE = (Σ|ûi / yi| / n) · 100
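The decomposition SST = SSE + SSR can be checked numerically. A minimal sketch with hypothetical data; the fitted values come from an OLS fit with an intercept, which is what makes the identity hold:

```python
import math

# Hypothetical sample; yhat are the OLS fitted values of y on x = 1..5
# (beta0_hat = 2.2, beta1_hat = 0.6), so the SST = SSE + SSR identity holds.
y    = [2.0, 4.0, 5.0, 4.0, 5.0]
yhat = [2.8, 3.4, 4.0, 4.6, 5.2]

n, k = len(y), 1
ybar = sum(y) / n

SSR = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # residual variation
SSE = sum((yh - ybar) ** 2 for yh in yhat)            # explained variation
SST = sum((yi - ybar) ** 2 for yi in y)               # total variation

assert abs(SST - (SSE + SSR)) < 1e-9

# Standard Error of the Regression
sigma_u_hat = math.sqrt(SSR / (n - k - 1))
```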
3.0-en - github.com/marcelomijas/econometrics-cheatsheet - CC BY 4.0
R-squared
A measure of the goodness of fit - how well the regression fits the data:
R² = SSE/SST = 1 − SSR/SST
- Measures the percentage of the variation of y that is linearly explained by the variations of the x's.
- Takes values between 0 (no linear explanation of the variations of y) and 1 (total explanation of the variations of y).
When the number of regressors increases, the R-squared increases as well, whether or not the new variables are relevant. To solve this problem, there is an adjusted R-squared, corrected by degrees of freedom:
R̄² = 1 − [(n − 1)/(n − k − 1)] · SSR/SST = 1 − [(n − 1)/(n − k − 1)] · (1 − R²)
For big sample sizes, R̄² ≈ R².

Hypothesis testing

The basics of hypothesis testing
A hypothesis test is a rule designed to decide, from a sample, whether or not there is evidence to reject a hypothesis made about one or more population parameters.
Elements of a hypothesis test:
- Null hypothesis (H0) - the hypothesis to be tested.
- Alternative hypothesis (H1) - the hypothesis that cannot be rejected when the null hypothesis is rejected.
- Test statistic - a random variable whose probability distribution under the null hypothesis is known.
- Critical value - the value against which the test statistic is compared to determine whether or not the null hypothesis is rejected. It marks the frontier between the acceptance and rejection regions of the null hypothesis.
- Significance level (α) - the probability of rejecting the null hypothesis when it is true (Type I error). It is chosen by whoever conducts the test, commonly 0.10, 0.05 or 0.01.
- p-value - the highest significance level at which the null hypothesis cannot be rejected.
The rule is: if the p-value is less than α, there is evidence to reject the null hypothesis at that α (there is evidence to accept the alternative hypothesis).

Individual tests
Test whether a parameter is significantly different from a given value ϑ:
H0: βj = ϑ
H1: βj ≠ ϑ
Under H0: t = (β̂j − ϑ) / se(β̂j) ~ t(n−k−1)
If |t| > t(n−k−1, α/2), there is evidence to reject H0.
Individual significance test - tests whether a parameter is significantly different from zero:
H0: βj = 0
H1: βj ≠ 0
Under H0: t = β̂j / se(β̂j) ~ t(n−k−1)
If |t| > t(n−k−1, α/2), there is evidence to reject H0.

The F test
Simultaneously tests multiple (linear) hypotheses about the parameters. It makes use of a non-restricted model and a restricted model:
- Non-restricted model - the model on which we want to test the hypothesis.
- Restricted model - the model on which the hypotheses we want to test have been imposed.
Then, looking at the errors:
- SSR = Σ ûi² is the Sum of Squared Residuals of the non-restricted model.
- SSRr = Σ ûr,i² is the Sum of Squared Residuals of the restricted model.
Under H0: F = [(SSRr − SSR)/SSR] · [(n − k − 1)/q] ~ F(q, n−k−1)
where k is the number of parameters of the non-restricted model and q is the number of linear hypotheses tested.
If F is greater than the critical value F(q, n−k−1), there is evidence to reject H0.
Global significance test - tests whether all the parameters associated with the x's are simultaneously equal to zero:
H0: β1 = β2 = ... = βk = 0
H1: β1 ≠ 0 and/or β2 ≠ 0 ... and/or βk ≠ 0
In this case, the F statistic simplifies to:
Under H0: F = [R²/(1 − R²)] · [(n − k − 1)/k] ~ F(k, n−k−1)
If F is greater than the critical value F(k, n−k−1), there is evidence to reject H0.

Confidence intervals
The confidence intervals at the (1 − α) confidence level can be calculated as:
β̂j ∓ t(n−k−1, α/2) · se(β̂j)

Dummy variables and structural change
Dummy (or binary) variables are used for qualitative information like sex, civil status, country, etc.
- They take the value 1 in a given category and 0 in the rest.
- They are used to analyze and model structural changes in the model parameters.
If a qualitative variable has m categories, only (m − 1) dummy variables need to be included.

Structural change
Structural change refers to changes in the values of the parameters of the econometric model produced by the effect of different sub-populations. Structural change can be included in the model through dummy variables.
The location of the dummy variable matters:
- On the intercept (β0) - represents the mean difference between the values produced by the structural change.
- On the parameters that determine the slope of the regression line (βj) - represents the effect (slope) difference between the values produced by the structural change.
Chow's structural test - to analyze the existence of structural changes in all the model parameters, it is common to use a particular expression of the F test known as Chow's test, where the null hypothesis is H0: no structural change.

Predictions
Two types of prediction:
- Of the mean value of y for a specific value of x.
- Of an individual value of y for a specific value of x.
If the values of the variables (x) are close to their mean values (x̄), the confidence interval of the prediction will be narrower.
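The R², adjusted R², t and global-F formulas above can be sketched together for a simple regression. The data are hypothetical; with a single regressor the global F statistic equals the squared t statistic of β1:

```python
import math

# Hypothetical sample; OLS fit of y on x (simple regression, k = 1).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n, k = len(y), 1

mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
b0 = my - b1 * mx
yhat = [b0 + b1 * xi for xi in x]

SSR = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
SST = sum((yi - my) ** 2 for yi in y)

r2 = 1 - SSR / SST                              # R-squared
r2_adj = 1 - (n - 1) / (n - k - 1) * (1 - r2)   # adjusted R-squared

sigma_u = math.sqrt(SSR / (n - k - 1))          # std. error of the regression
se_b1 = sigma_u / math.sqrt(sxx)                # se of beta1_hat
t_b1 = b1 / se_b1                               # individual significance t statistic

F = (r2 / (1 - r2)) * (n - k - 1) / k           # global significance F statistic
# In a simple regression, F == t_b1 ** 2.
```

Compare t_b1 with the critical value t(n−k−1, α/2) and F with the critical value F(k, n−k−1) to decide the tests.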



Multicollinearity
- Perfect multicollinearity - there are independent variables that are constant and/or there is an exact linear relation between independent variables. It breaks the third (3) econometric model assumption.
- Approximate multicollinearity - there are independent variables that are approximately constant and/or there is an approximately linear relation between independent variables. It does not break any econometric model assumption, but it has an effect on OLS.

Consequences
- Perfect multicollinearity - the OLS equation system cannot be solved due to infinite solutions.
- Approximate multicollinearity:
  - Small sample variations can induce big variations in the OLS estimates.
  - The variance of the OLS estimators of the collinear x's increases, so the inference on these parameters is affected: the estimates are very imprecise (big confidence intervals).

Detection
- Correlation analysis - look for high correlations (greater than 0.7) between independent variables.
- Variance Inflation Factor (VIF) - indicates the increase of Var(β̂j) caused by the multicollinearity:
  VIF(β̂j) = 1 / (1 − Rj²)
  where Rj² is the R-squared from a regression of xj on all the other x's.
  - Values between 4 and 10 suggest it is advisable to analyze in more depth whether there might be multicollinearity problems.
  - Values bigger than 10 indicate there are multicollinearity problems.
One typical characteristic of multicollinearity is that the regression coefficients of the model are not individually different from zero (due to high variances), but jointly they are different from zero.

Correction
- Delete one of the collinear variables.
- Perform factorial analysis (or any other dimension reduction technique) on the collinear variables.
- Interpret the coefficients with multicollinearity jointly.

Heteroscedasticity
The residuals ui of the population regression function do not have the same variance σu²:
Var(u|x) = Var(y|x) ≠ σu²
It breaks the fifth (5) econometric model assumption.

Consequences
- OLS estimators are still unbiased.
- OLS estimators are still consistent.
- OLS is no longer efficient, but it is still a LUE (Linear Unbiased Estimator).
- Variance estimates of the estimators are biased: the construction of confidence intervals and hypothesis testing is not reliable.

Detection
- Graphs - look for scatter patterns on x vs. u or x vs. y plots.
- Formal tests - White, Bartlett, Breusch-Pagan, etc. Commonly, the null hypothesis is H0: homoscedasticity.

Correction
- Use OLS with a variance-covariance matrix estimator robust to heteroscedasticity (HC), for example the one proposed by White.
- If the variance structure is known, make use of Weighted Least Squares (WLS) or Generalized Least Squares (GLS):
  - Supposing that Var(ui) = σu² · xi, divide the model variables by the square root of xi and apply OLS.
  - Supposing that Var(ui) = σu² · xi², divide the model variables by xi (the square root of xi²) and apply OLS.
- If the variance structure is not known, make use of Feasible Weighted Least Squares (FWLS), which estimates a possible variance, divides the model variables by it and then applies OLS.
- Make a new model specification, for example a logarithmic transformation (lower variance).

Autocorrelation
The residual of any observation, ut, is correlated with the residual of any other observation. The observations are not independent:
Corr(ut, us|x) ≠ 0 for any t ≠ s
The "natural" context of this phenomenon is time series. It breaks the sixth (6) econometric model assumption.

Consequences
- OLS estimators are still unbiased.
- OLS estimators are still consistent.
- OLS is no longer efficient, but it is still a LUE (Linear Unbiased Estimator).
- Variance estimates of the estimators are biased: the construction of confidence intervals and hypothesis testing is not reliable.

Detection
- Graphs - look for scatter patterns on ut−1 vs. ut plots, or make use of a correlogram. (The ut vs. ut−1 plot shows no pattern under no autocorrelation, a positive-slope pattern under positive autocorrelation, and a negative-slope pattern under negative autocorrelation.)
- Formal tests - Durbin-Watson, Breusch-Godfrey, etc. Commonly, the null hypothesis is H0: no autocorrelation.

Correction
- Use OLS with a variance-covariance matrix estimator robust to heteroscedasticity and autocorrelation (HAC), for example the one proposed by Newey-West.
- Use Generalized Least Squares. Supposing yt = β0 + β1·xt + ut, with ut = ρ·ut−1 + εt, where |ρ| < 1 and εt is white noise:
  - If ρ is known, create a quasi-differenced model where the error is white noise and estimate it by OLS.
  - If ρ is not known, estimate it by, for example, the Cochrane-Orcutt method, create a quasi-differenced model where the error is white noise, and estimate it by OLS.
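The VIF computation above can be sketched for two regressors. The data are hypothetical, chosen so that x2 is nearly a multiple of x1 (approximate multicollinearity):

```python
# VIF(beta_j_hat) = 1 / (1 - R_j^2), where R_j^2 is the R-squared of the
# auxiliary regression of x_j on the remaining regressors.

def r_squared_simple(x, y):
    """R-squared of an OLS simple regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    b0 = my - b1 * mx
    ssr = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - my) ** 2 for b in y)
    return 1 - ssr / sst

x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1]     # roughly 2 * x1: approximately collinear

r2_aux = r_squared_simple(x1, x2)   # auxiliary regression of x2 on x1
vif = 1 / (1 - r2_aux)              # far above 10 here: multicollinearity problem
```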
