
Econometrics Cheat Sheet - Assumptions and properties, Ordinary Least Squares

By Marcelo Moreno - King Juan Carlos University

Basic concepts

Definitions
Econometrics - a social science discipline with the objective of quantifying the relationships between economic agents, testing economic theories, and evaluating and implementing government and business policies.
Econometric model - a simplified representation of reality used to explain economic phenomena.
Ceteris paribus - all other relevant factors remain constant.

Data types
Cross section - data taken at a given moment in time, a static photo. Order does not matter.
Time series - observations of one or many variables across time. Order does matter.
Panel data - a time series for each observation of a cross section.
Pooled cross sections - combines cross sections from different time periods.

Phases of an econometric model
1. Specification. 2. Estimation. 3. Validation. 4. Utilization.

Regression analysis
Study and predict the mean value of a variable (dependent variable, y) on the basis of fixed values of other variables (independent variables, x's). In econometrics it is common to use Ordinary Least Squares (OLS) for regression analysis.

Econometric model assumptions
Under these assumptions, the estimators of the OLS parameters present good properties. Gauss-Markov assumptions, extended:
1. Parameters linearity (plus weak dependence in time series). y must be a linear function of the β's.
2. Random sampling. The sample has been randomly taken from the population. (Only for cross sections.)
3. No perfect collinearity:
   - No independent variable is constant: Var(xj) ≠ 0.
   - There is no exact linear relation between independent variables.
4. Conditional mean zero and correlation zero:
   a. There are no systematic errors: E(u|x1, ..., xk) = E(u) = 0 → strong exogeneity (a implies b).
   b. There are no relevant variables left out of the model: Cov(xj, u) = 0 for any j = 1, ..., k → weak exogeneity.
5. Homoscedasticity. The variability of the residuals is the same for all levels of x: Var(u|x1, ..., xk) = σu².
6. No autocorrelation. The residuals do not contain information about other residuals: Corr(ut, us|x) = 0 for any t ≠ s. (Typical of time series.)
7. Normality. The residuals are independent and identically distributed: u ~ N(0, σu²).
8. Data size. The number of observations available must be greater than the (k + 1) parameters to estimate. (Already satisfied in asymptotic situations.)

Ordinary Least Squares
Objective - minimize the Sum of Squared Residuals (SSR):
min Σ ûi², where ûi = yi − ŷi

Simple regression model
Equation: yi = β0 + β1·x1i + ui
Estimation: ŷi = β̂0 + β̂1·x1i
where:
β̂0 = ȳ − β̂1·x̄
β̂1 = Cov(y, x) / Var(x)

Multiple regression model
Equation: yi = β0 + β1·x1i + ... + βk·xki + ui
Estimation: ŷi = β̂0 + β̂1·x1i + ... + β̂k·xki
where:
β̂0 = ȳ − β̂1·x̄1 − ... − β̂k·x̄k
β̂j = Cov(y, residualized xj) / Var(residualized xj)
Matrix form: β̂ = (XᵀX)⁻¹(Xᵀy)

Interpretation of coefficients
Model       | Dependent | Independent | β1 interpretation
Level-level | y         | x           | Δy = β1·Δx
Level-log   | y         | log(x)      | Δy ≈ (β1/100)(%Δx)
Log-level   | log(y)    | x           | %Δy ≈ (100·β1)Δx
Log-log     | log(y)    | log(x)      | %Δy ≈ β1(%Δx)
Quadratic   | y         | x + x²      | Δy = (β1 + 2·β2·x)Δx
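The closed-form simple-regression estimators β̂1 = Cov(y, x)/Var(x) and β̂0 = ȳ − β̂1·x̄ can be sketched in plain Python. This is an illustrative addition, not part of the original sheet; the data are hypothetical and chosen to lie exactly on a line so OLS recovers it:

```python
# Sketch of the OLS simple-regression formulas:
# beta1_hat = Cov(y, x) / Var(x),  beta0_hat = mean(y) - beta1_hat * mean(x)

def ols_simple(x, y):
    """Return (beta0_hat, beta1_hat) for the model y_i = b0 + b1*x_i + u_i."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov_xy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n
    var_x = sum((xi - mx) ** 2 for xi in x) / n
    b1 = cov_xy / var_x
    b0 = my - b1 * mx
    return b0, b1

# Hypothetical data generated exactly on the line y = 1 + 2x,
# so OLS recovers beta0_hat = 1 and beta1_hat = 2.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
b0, b1 = ols_simple(x, y)
```

With multiple regressors, the same estimates come from the matrix form β̂ = (XᵀX)⁻¹(Xᵀy).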
Correlation analysis
Correlation analysis does not distinguish between dependent and independent variables.
- The simple correlation measures the degree of linear association between two variables:
  r = Cov(x, y) / (σx·σy) = Σ((xi − x̄)(yi − ȳ)) / sqrt(Σ(xi − x̄)² · Σ(yi − ȳ)²)
- The partial correlation measures the degree of linear association between two variables while controlling for a third.

Asymptotic properties of OLS
Under the econometric model assumptions and the Central Limit Theorem:
- Hold (1) to (4a): OLS is unbiased: E(β̂j) = βj.
- Hold (1) to (4): OLS is consistent: plim(β̂j) = βj. (With (4b) but without (4a), i.e. only weak exogeneity, OLS is biased but consistent.)
- Hold (1) to (5): asymptotic normality of OLS (then (7) is necessarily satisfied): u ~a N(0, σu²).
- Hold (1) to (6): unbiased estimate of σu²: E(σ̂u²) = σu².
- Hold (1) to (6): OLS is BLUE (Best Linear Unbiased Estimator), i.e. efficient.
- Hold (1) to (7): hypothesis testing and confidence intervals can be done reliably.

Error measures
Sum of Squared Residuals: SSR = Σ ûi² = Σ(yi − ŷi)²
Explained Sum of Squares: SSE = Σ(ŷi − ȳ)²
Total Sum of Squares: SST = SSE + SSR = Σ(yi − ȳ)²
Standard Error of the Regression: σ̂u = sqrt(SSR / (n − k − 1))
Standard Error of the β̂'s: se(β̂) = sqrt(σ̂u² · (XᵀX)⁻¹)
Mean Squared Error: MSE = Σ(yi − ŷi)² / n
Absolute Mean Error: AME = Σ|yi − ŷi| / n
Mean Percentage Error: MPE = (Σ|ûi / yi| / n) · 100
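The decomposition SST = SSE + SSR can be checked numerically. A minimal sketch with hypothetical data; the fitted values come from an OLS fit with an intercept, which is what makes the identity hold:

```python
import math

# Hypothetical sample; yhat are the OLS fitted values of y on x = 1..5
# (beta0_hat = 2.2, beta1_hat = 0.6), so the SST = SSE + SSR identity holds.
y    = [2.0, 4.0, 5.0, 4.0, 5.0]
yhat = [2.8, 3.4, 4.0, 4.6, 5.2]

n, k = len(y), 1
ybar = sum(y) / n

SSR = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # residual variation
SSE = sum((yh - ybar) ** 2 for yh in yhat)            # explained variation
SST = sum((yi - ybar) ** 2 for yi in y)               # total variation

assert abs(SST - (SSE + SSR)) < 1e-9

# Standard Error of the Regression
sigma_u_hat = math.sqrt(SSR / (n - k - 1))
```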
3.0-en - github.com/marcelomijas/econometrics-cheatsheet - CC BY 4.0
R-squared
A measure of the goodness of fit - how well the regression fits the data:
R² = SSE/SST = 1 − SSR/SST
- Measures the percentage of the variation of y that is linearly explained by the variations of the x's.
- Takes values between 0 (no linear explanation of the variations of y) and 1 (total explanation of the variations of y).
When the number of regressors increases, the R-squared increases as well, whether or not the new variables are relevant. To solve this problem, there is an adjusted R-squared, corrected by degrees of freedom:
R̄² = 1 − [(n − 1)/(n − k − 1)] · SSR/SST = 1 − [(n − 1)/(n − k − 1)] · (1 − R²)
For big sample sizes, R̄² ≈ R².

Hypothesis testing

The basics of hypothesis testing
A hypothesis test is a rule designed to decide, from a sample, whether or not there is evidence to reject a hypothesis made about one or more population parameters.
Elements of a hypothesis test:
- Null hypothesis (H0) - the hypothesis to be tested.
- Alternative hypothesis (H1) - the hypothesis that cannot be rejected when the null hypothesis is rejected.
- Test statistic - a random variable whose probability distribution under the null hypothesis is known.
- Critical value - the value against which the test statistic is compared to determine whether or not the null hypothesis is rejected. It marks the frontier between the acceptance and rejection regions of the null hypothesis.
- Significance level (α) - the probability of rejecting the null hypothesis when it is true (Type I error). It is chosen by whoever conducts the test, commonly 0.10, 0.05 or 0.01.
- p-value - the highest significance level at which the null hypothesis cannot be rejected.
The rule is: if the p-value is less than α, there is evidence to reject the null hypothesis at that α (there is evidence to accept the alternative hypothesis).

Individual tests
Test whether a parameter is significantly different from a given value ϑ:
H0: βj = ϑ
H1: βj ≠ ϑ
Under H0: t = (β̂j − ϑ) / se(β̂j) ~ t(n−k−1)
If |t| > t(n−k−1, α/2), there is evidence to reject H0.
Individual significance test - tests whether a parameter is significantly different from zero:
H0: βj = 0
H1: βj ≠ 0
Under H0: t = β̂j / se(β̂j) ~ t(n−k−1)
If |t| > t(n−k−1, α/2), there is evidence to reject H0.

The F test
Simultaneously tests multiple (linear) hypotheses about the parameters. It makes use of a non-restricted model and a restricted model:
- Non-restricted model - the model on which we want to test the hypothesis.
- Restricted model - the model on which the hypotheses we want to test have been imposed.
Then, looking at the errors:
- SSR = Σ ûi² is the Sum of Squared Residuals of the non-restricted model.
- SSRr = Σ ûr,i² is the Sum of Squared Residuals of the restricted model.
Under H0: F = [(SSRr − SSR)/SSR] · [(n − k − 1)/q] ~ F(q, n−k−1)
where k is the number of parameters of the non-restricted model and q is the number of linear hypotheses tested.
If F is greater than the critical value F(q, n−k−1), there is evidence to reject H0.
Global significance test - tests whether all the parameters associated with the x's are simultaneously equal to zero:
H0: β1 = β2 = ... = βk = 0
H1: β1 ≠ 0 and/or β2 ≠ 0 ... and/or βk ≠ 0
In this case, the F statistic simplifies to:
Under H0: F = [R²/(1 − R²)] · [(n − k − 1)/k] ~ F(k, n−k−1)
If F is greater than the critical value F(k, n−k−1), there is evidence to reject H0.

Confidence intervals
The confidence intervals at the (1 − α) confidence level can be calculated as:
β̂j ∓ t(n−k−1, α/2) · se(β̂j)

Dummy variables and structural change
Dummy (or binary) variables are used for qualitative information like sex, civil status, country, etc.
- They take the value 1 in a given category and 0 in the rest.
- They are used to analyze and model structural changes in the model parameters.
If a qualitative variable has m categories, only (m − 1) dummy variables need to be included.

Structural change
Structural change refers to changes in the values of the parameters of the econometric model produced by the effect of different sub-populations. Structural change can be included in the model through dummy variables.
The location of the dummy variable matters:
- On the intercept (β0) - represents the mean difference between the values produced by the structural change.
- On the parameters that determine the slope of the regression line (βj) - represents the effect (slope) difference between the values produced by the structural change.
Chow's structural test - to analyze the existence of structural changes in all the model parameters, it is common to use a particular expression of the F test known as Chow's test, where the null hypothesis is H0: no structural change.

Predictions
Two types of prediction:
- Of the mean value of y for a specific value of x.
- Of an individual value of y for a specific value of x.
If the values of the variables (x) are close to their mean values (x̄), the confidence interval of the prediction will be narrower.
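The R², adjusted R², t and global-F formulas above can be sketched together for a simple regression. The data are hypothetical; with a single regressor the global F statistic equals the squared t statistic of β1:

```python
import math

# Hypothetical sample; OLS fit of y on x (simple regression, k = 1).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n, k = len(y), 1

mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
b0 = my - b1 * mx
yhat = [b0 + b1 * xi for xi in x]

SSR = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
SST = sum((yi - my) ** 2 for yi in y)

r2 = 1 - SSR / SST                              # R-squared
r2_adj = 1 - (n - 1) / (n - k - 1) * (1 - r2)   # adjusted R-squared

sigma_u = math.sqrt(SSR / (n - k - 1))          # std. error of the regression
se_b1 = sigma_u / math.sqrt(sxx)                # se of beta1_hat
t_b1 = b1 / se_b1                               # individual significance t statistic

F = (r2 / (1 - r2)) * (n - k - 1) / k           # global significance F statistic
# In a simple regression, F == t_b1 ** 2.
```

Compare t_b1 with the critical value t(n−k−1, α/2) and F with the critical value F(k, n−k−1) to decide the tests.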



Multicollinearity
- Perfect multicollinearity - there are independent variables that are constant and/or there is an exact linear relation between independent variables. It breaks the third (3) econometric model assumption.
- Approximate multicollinearity - there are independent variables that are approximately constant and/or there is an approximately linear relation between independent variables. It does not break any econometric model assumption, but it has an effect on OLS.

Consequences
- Perfect multicollinearity - the OLS equation system cannot be solved due to infinite solutions.
- Approximate multicollinearity:
  - Small sample variations can induce big variations in the OLS estimates.
  - The variance of the OLS estimators of the collinear x's increases, so the inference on these parameters is affected: the estimates are very imprecise (big confidence intervals).

Detection
- Correlation analysis - look for high correlations (greater than 0.7) between independent variables.
- Variance Inflation Factor (VIF) - indicates the increase of Var(β̂j) caused by the multicollinearity:
  VIF(β̂j) = 1 / (1 − Rj²)
  where Rj² is the R-squared from a regression of xj on all the other x's.
  - Values between 4 and 10 suggest it is advisable to analyze in more depth whether there might be multicollinearity problems.
  - Values bigger than 10 indicate there are multicollinearity problems.
One typical characteristic of multicollinearity is that the regression coefficients of the model are not individually different from zero (due to high variances), but jointly they are different from zero.

Correction
- Delete one of the collinear variables.
- Perform factorial analysis (or any other dimension reduction technique) on the collinear variables.
- Interpret the coefficients with multicollinearity jointly.

Heteroscedasticity
The residuals ui of the population regression function do not have the same variance σu²:
Var(u|x) = Var(y|x) ≠ σu²
It breaks the fifth (5) econometric model assumption.

Consequences
- OLS estimators are still unbiased.
- OLS estimators are still consistent.
- OLS is no longer efficient, but it is still a LUE (Linear Unbiased Estimator).
- Variance estimates of the estimators are biased: the construction of confidence intervals and hypothesis testing is not reliable.

Detection
- Graphs - look for scatter patterns on x vs. u or x vs. y plots.
- Formal tests - White, Bartlett, Breusch-Pagan, etc. Commonly, the null hypothesis is H0: homoscedasticity.

Correction
- Use OLS with a variance-covariance matrix estimator robust to heteroscedasticity (HC), for example the one proposed by White.
- If the variance structure is known, make use of Weighted Least Squares (WLS) or Generalized Least Squares (GLS):
  - Supposing that Var(ui) = σu² · xi, divide the model variables by the square root of xi and apply OLS.
  - Supposing that Var(ui) = σu² · xi², divide the model variables by xi (the square root of xi²) and apply OLS.
- If the variance structure is not known, make use of Feasible Weighted Least Squares (FWLS), which estimates a possible variance, divides the model variables by it and then applies OLS.
- Make a new model specification, for example a logarithmic transformation (lower variance).

Autocorrelation
The residual of any observation, ut, is correlated with the residual of any other observation. The observations are not independent:
Corr(ut, us|x) ≠ 0 for any t ≠ s
The "natural" context of this phenomenon is time series. It breaks the sixth (6) econometric model assumption.

Consequences
- OLS estimators are still unbiased.
- OLS estimators are still consistent.
- OLS is no longer efficient, but it is still a LUE (Linear Unbiased Estimator).
- Variance estimates of the estimators are biased: the construction of confidence intervals and hypothesis testing is not reliable.

Detection
- Graphs - look for scatter patterns on ut−1 vs. ut plots, or make use of a correlogram. (The ut vs. ut−1 plot shows no pattern under no autocorrelation, a positive-slope pattern under positive autocorrelation, and a negative-slope pattern under negative autocorrelation.)
- Formal tests - Durbin-Watson, Breusch-Godfrey, etc. Commonly, the null hypothesis is H0: no autocorrelation.

Correction
- Use OLS with a variance-covariance matrix estimator robust to heteroscedasticity and autocorrelation (HAC), for example the one proposed by Newey-West.
- Use Generalized Least Squares. Supposing yt = β0 + β1·xt + ut, with ut = ρ·ut−1 + εt, where |ρ| < 1 and εt is white noise:
  - If ρ is known, create a quasi-differenced model where the error is white noise and estimate it by OLS.
  - If ρ is not known, estimate it by, for example, the Cochrane-Orcutt method, create a quasi-differenced model where the error is white noise, and estimate it by OLS.
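The VIF computation above can be sketched for two regressors. The data are hypothetical, chosen so that x2 is nearly a multiple of x1 (approximate multicollinearity):

```python
# VIF(beta_j_hat) = 1 / (1 - R_j^2), where R_j^2 is the R-squared of the
# auxiliary regression of x_j on the remaining regressors.

def r_squared_simple(x, y):
    """R-squared of an OLS simple regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    b0 = my - b1 * mx
    ssr = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - my) ** 2 for b in y)
    return 1 - ssr / sst

x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1]     # roughly 2 * x1: approximately collinear

r2_aux = r_squared_simple(x1, x2)   # auxiliary regression of x2 on x1
vif = 1 / (1 - r2_aux)              # far above 10 here: multicollinearity problem
```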
