Basic Econometrics Notes
The two-variable linear model, or simple regression analysis, is used for testing hypotheses about the
relationship between a dependent variable Y and an independent or explanatory variable X and for
prediction. Since the points are unlikely to fall precisely on the line, the exact linear relationship
includes a random disturbance, error, or stochastic term, ui. This results in the following equation:
Yi = b0 + b1Xi + ui
A key assumption of the model is homoskedasticity: the error term has constant variance given any
value of the explanatory variable.
The ordinary least-squares (OLS) method is a technique for fitting the "best" straight line to the sample
of XY observations. It involves minimizing the sum of the squared (vertical) deviations of the points from
the line:
Σ ei2 = Σ (Yi − Y^i)2
where Yi refers to the actual observations, and Y^i refers to the corresponding fitted values. Their
difference is the residual, ei = Yi − Y^i.
Solving this minimization problem with respect to b0 and b1, we get the following results:
b^1 = Σ (Xi − X̄)(Yi − Ȳ) / Σ (Xi − X̄)2,   b^0 = Ȳ − b^1 X̄
where X̄ and Ȳ denote the sample means of X and Y.
Examples
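As an illustration, the regression can be estimated numerically. A minimal sketch in Python, assuming the numpy and statsmodels packages are available; the data, variable names and values below are hypothetical:

import numpy as np
import statsmodels.api as sm

# Hypothetical data: X = years of schooling, Y = hourly wage
X = np.array([8, 10, 11, 12, 12, 13, 14, 16, 16, 18])
Y = np.array([6.5, 7.1, 8.0, 8.2, 9.1, 9.0, 10.4, 11.8, 12.1, 13.5])

X_design = sm.add_constant(X)          # adds the intercept column
results = sm.OLS(Y, X_design).fit()    # minimizes the sum of squared residuals

print(results.params)     # b0 (intercept) and b1 (slope)
print(results.resid)      # residuals ei = Yi - Y^i
print(results.summary())  # standard errors, t statistics, p-values, R-squared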
Significance Testing
We need to test whether the obtained parameters are statistically significant, i.e., whether the estimated
effects are unlikely to be due to chance alone. In order to test for the statistical significance of the
parameter estimates of the regression, the variance of the estimates of b0 and b1 is
required:
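Under the classical assumptions these take the standard textbook form, with σ2 denoting the error variance (estimated by σ^2 = Σ ei2 / (n − 2)):
var(b^1) = σ2 / Σ (Xi − X̄)2
var(b^0) = σ2 Σ Xi2 / [n Σ (Xi − X̄)2]
The standard errors used below are the square roots of these variances with σ2 replaced by its estimate.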
A t test is a statistical test that is used to compare the means of two groups. It is often used in
hypothesis testing to determine whether a process or treatment actually has an effect on the population
of interest, or whether two groups are different from one another.
To perform a t-test of significance on a parameter estimate of the regression equation, the following
steps can be performed:
1) Calculate the t-statistic: t = (β^ − β0) / SE(β^), where β^ is the estimated parameter, β0 is the
value we are testing against, and SE(β^) is the standard error of the estimate. In most cases β0 is
taken as 0, to test whether the variable should be included in the model or not.
2) The t-score is compared against the critical values of the t-distribution at the chosen significance
level. A confidence interval is a range of values that is likely to contain an unknown population
parameter: if you draw a random sample many times, a certain percentage of the resulting confidence
intervals will contain the true parameter value. This percentage is the confidence level, equal to
1 − α, where α is the significance level.
3) If the absolute value of the t-score is greater than the critical value, then the null hypothesis (that the
parameter equals β0) is rejected. Two types of tests can be conducted: a two-tail test,
where H1: β ≠ β0, and a single-tail test, where H1: β > β0 or H1: β < β0.
Example
Significance Testing using P-values
For the p-value approach, the p-value of the observed test statistic is compared to
the specified significance level (α) of the hypothesis test. The p-value is the probability of
observing sample data at least as extreme as the test statistic actually obtained. Small p-values provide
evidence against the null hypothesis. The smaller (closer to 0) the p-value, the stronger is the evidence
against the null hypothesis.
If the p-value is less than or equal to the specified significance level α, the null hypothesis is rejected;
otherwise, the null hypothesis is not rejected. In other words, if p ≤ α, reject H0; otherwise, if p > α do not
reject H0.
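As a small illustration of this rule, the two-sided p-value for a given t-score can be computed directly. A sketch in Python, assuming scipy is available; the t-score and degrees of freedom below are made-up numbers:

from scipy import stats

t_score = 2.40   # hypothetical t-statistic for a coefficient
df = 18          # hypothetical degrees of freedom
alpha = 0.05

p_value = 2 * stats.t.sf(abs(t_score), df)   # two-sided p-value
print(p_value)
print("reject H0" if p_value <= alpha else "do not reject H0")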
In consequence, by knowing the p-value any desired level of significance may be assessed. For example, if
the p-value of a hypothesis test is 0.01, the null hypothesis can be rejected at any significance level larger
than or equal to 0.01. It is not rejected at any significance level smaller than 0.01. Thus, the p-value is
commonly used to evaluate the strength of the evidence against the null hypothesis without reference to
significance level.
The following guidelines are commonly used for assessing the evidence against the null hypothesis:
p > 0.10: weak or no evidence against H0
0.05 < p ≤ 0.10: moderate evidence against H0
0.01 < p ≤ 0.05: strong evidence against H0
p ≤ 0.01: very strong evidence against H0
The closer the observations fall to the regression line (i.e., the smaller the residuals), the greater is the
variation in Y "explained" by the estimated regression equation. The total variation in Y is equal to
the explained plus the residual variation:
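In symbols, writing Ȳ for the sample mean of Y:
Σ (Yi − Ȳ)2 = Σ (Y^i − Ȳ)2 + Σ ei2   (total variation = explained + residual variation)
and the coefficient of determination, R2 = Σ (Y^i − Ȳ)2 / Σ (Yi − Ȳ)2 = 1 − Σ ei2 / Σ (Yi − Ȳ)2, measures the proportion of the total variation in Y explained by the regression.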
Example
Properties of OLS
Under the classical assumptions, the OLS estimators are unbiased, are efficient in the sense of having the
smallest variance among all linear unbiased estimators (they are BLUE, by the Gauss-Markov theorem),
and are consistent.
Multiple regression analysis is used for testing hypotheses about the relationship between a dependent
variable Y and two or more independent variables X1, X2, … and for prediction. The three-variable linear
regression model can be written as:
Yi = b0 + b1X1i + b2X2i + ui
Assumptions: The assumptions of SLR also apply here. The additional assumption is that there
is no exact linear relationship between the X variables, i.e., no perfect collinearity.
Example
Significance Testing
The method for testing significance is the same as the one for simple linear regression. In order to test for
the statistical significance of the parameter estimates of the multiple regression, the variance of the
estimates is required.
Once the standard errors are calculated, the t-test or p-value approach can be used to test for significance.
Similar to the R2 value in SLR, the coefficient of multiple determination is defined as the proportion of
total variation in Y "explained" by the multiple regression of Y on X1 and X2. It can be calculated as:
R2 = Σ (Y^i − Ȳ)2 / Σ (Yi − Ȳ)2 = 1 − Σ ei2 / Σ (Yi − Ȳ)2
Note: the inclusion of more independent variables never decreases the value of R2, so a high value is
not necessarily a sign of a good fit. The adjusted R2 penalizes the addition of more independent variables
and is a better measure in such cases:
adjusted R2 = 1 − (1 − R2)(n − 1) / (n − k)
where n is the number of observations and k is the number of parameters estimated.
Example
Test of Overall Significance of the Regression
While the other tests of significance are run on individual parameters of the regression, we also need to
test the joint significance of the parameters. In some cases the parameters are
individually insignificant but have a jointly significant effect on the dependent variable, and thus should
not be excluded from the model.
Joint significance testing is done using the F-statistic, the ratio of the explained to the
unexplained variance:
F = [Σ (Y^i − Ȳ)2 / (k − 1)] / [Σ ei2 / (n − k)]
This follows an F distribution with k − 1 and n − k degrees of freedom, where n is the number of
observations and k is the number of parameters estimated (including the intercept).
The procedure for conducting the test using the F-statistic is the same as the one explained previously for
the t-statistic.
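In practice the overall F-test is reported automatically by regression software. A minimal sketch in Python, assuming statsmodels and a fitted OLS results object like the one in the earlier sketch:

# results: a fitted OLS results object, e.g. from sm.OLS(Y, X_design).fit()
print(results.fvalue)    # F-statistic for the joint significance of all slope coefficients
print(results.f_pvalue)  # its p-value; reject joint insignificance if below the chosen alpha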
The partial-correlation coefficient measures the net correlation between the dependent variable and one
independent variable after excluding the common influence of (i.e., holding constant) the other
independent variables in the model. For example, rYX1.X2 is the partial correlation between Y and X1,
after removing the influence of X2 from both Y and X1:
rYX1.X2 = (rYX1 − rYX2 rX1X2) / sqrt[(1 − rYX2²)(1 − rX1X2²)]
where rYX1 = simple-correlation coefficient between Y and X1, and rYX2 and rX1X2 are analogously
defined. Partial-correlation coefficients range in value from -1 to +1 (as do simple-correlation
coefficients), have the sign of the corresponding estimated parameter, and are used to determine the
relative importance of the different explanatory variables in a multiple regression.
Matrix Notation
Calculations increase substantially as the number of independent variables increases. Matrix notation
can aid in solving larger regressions algebraically: writing the model as Y = Xb + u, the OLS solution is
b^ = (X'X)−1 X'Y,   with var(b^) = σ2 (X'X)−1
This solution works with any number of independent variables and is therefore extremely flexible.
Functional Forms
The OLS method fits linear relationships only. Therefore, in cases where there is a non-linear
relationship between the variables, certain transformations can be made to linearize it before
performing the regression analysis. Accordingly, the interpretation of the regression coefficients also
changes. Some common transformations are:
Level-level: Y = b0 + b1X; a one-unit change in X changes Y by b1 units.
Level-log: Y = b0 + b1 ln(X); a 1% change in X changes Y by about b1/100 units.
Log-level: ln(Y) = b0 + b1X; a one-unit change in X changes Y by about 100·b1 percent (semi-elasticity).
Log-log: ln(Y) = b0 + b1 ln(X); a 1% change in X changes Y by about b1 percent (elasticity).
Example
Interaction Terms
Sometimes it is natural for the partial effect, elasticity, or semi-elasticity of the dependent variable with
respect to an explanatory variable to depend on the magnitude of yet another explanatory variable. For
example, in the model
price = b0 + b1·sqrft + b2·bdrms + b3·sqrft·bdrms + u
the partial effect of bdrms on price (holding all other variables fixed) is b2 + b3·sqrft.
Example
Dummy Variables
Qualitative explanatory variables (such as wartime vs. peacetime, strike vs. non-strike periods, males
vs. females, etc.) can be introduced into regression analysis by assigning the value 1 to one
classification (e.g., wartime) and 0 to the other (e.g., peacetime). These are called dummy variables
and are treated like any other variable. Dummy variables can be used to capture changes (shifts) in the
intercept, changes in slope, and changes in both intercept and slope:
Y = b0 + b1X + b2D + u (intercept shift)
Y = b0 + b1X + b2DX + u (slope shift)
Y = b0 + b1X + b2D + b3DX + u (both)
where D is 1 for one classification and 0 otherwise and X is the usual quantitative explanatory variable.
Dummy variables can also be used to capture differences among more than two classifications, such as
seasons and regions:
Y = b0 + b1X + b2D1 + b3D2 + b4D3 + u
where b0 is the intercept for the first season or region and D1, D2, and D3 refer, respectively, to season
or region 2, 3, and 4. Note that for any number of classifications k, k − 1 dummies are required.
Example
Binary Choice Models
These models are used when the dependent/outcome variable is binary, i.e., it takes only two values (yes
or no). To estimate the model, we first set up an underlying (latent) model for which Y acts like a dummy
variable:
Yi* = b0 + b1Xi + ui
Here, Yi* is considered an underlying propensity for the dummy variable to take the value of 1 and is
a continuous variable, so that:
Yi = 1 if Yi* > 0, and Yi = 0 otherwise.
Instead of using OLS, another estimation technique is used: the maximum-likelihood estimates of the
coefficients are calculated by setting up the log-likelihood function:
ln L = Σ1 ln F(b0 + b1Xi) + Σ0 ln[1 − F(b0 + b1Xi)]
where F is the cumulative distribution function used, Σ1 and Σ0 indicate sums over the data points where
Yi = 1 and Yi = 0, respectively, and b^0 and b^1 are chosen to maximize the log-likelihood function.
If the standard normal distribution
is used to find the probabilities, it is a probit model; if the logistic distribution is used, it is a logit model.
Since these functions are nonlinear, estimation by computer is usually required.
The interpretation of b1 changes in a binary choice model: b1 is the effect of X on the latent variable Y*.
The marginal effect of X on P(Y = 1) is easier to interpret and is given by:
∂P(Y = 1)/∂X = f(b0 + b1X) · b1
where f is the density function of the distribution used (standard normal for probit, logistic for logit).
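A minimal sketch of estimating such a model in Python, assuming statsmodels; the data and variable interpretation below are hypothetical:

import numpy as np
import statsmodels.api as sm

# Hypothetical data: y = 1 if a household owns a car, x = income
x = np.array([1.2, 1.8, 2.5, 3.0, 3.6, 4.1, 4.8, 5.5, 6.0, 7.2])
y = np.array([0, 0, 0, 1, 0, 1, 1, 1, 1, 1])

X = sm.add_constant(x)
logit_res = sm.Logit(y, X).fit()     # logistic distribution
probit_res = sm.Probit(y, X).fit()   # standard normal distribution

print(logit_res.params)                   # b^0 and b^1 (effects on Y*)
print(logit_res.get_margeff().summary())  # average marginal effects on P(Y = 1)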
Multicollinearity
Multicollinearity refers to the case in which two or more explanatory variables in the regression model
are highly correlated, making it difficult or impossible to isolate their individual effects on the
dependent variable. With multicollinearity, the estimated OLS coefficients may be statistically
insignificant (and may even have the wrong sign) even though R2 may be "high." Multicollinearity can
sometimes be overcome or reduced by extending the size of the sample, utilizing a priori information,
transforming the functional relationship, or dropping one of the highly collinear variables.
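One common way to detect multicollinearity is the variance inflation factor (VIF). A sketch in Python, assuming statsmodels and a design matrix X_design (a NumPy array whose first column is the constant, as produced by sm.add_constant):

from statsmodels.stats.outliers_influence import variance_inflation_factor

# One VIF per column of the design matrix; values above roughly 10 are often
# taken as a sign of severe multicollinearity.
vifs = [variance_inflation_factor(X_design, i) for i in range(X_design.shape[1])]
print(vifs)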
Heteroskedasticity
If the OLS assumption of homoskedasticity (that the variance of the error term is constant for all
observations) does not hold, we face the problem of heteroskedasticity. This leads to unbiased but
inefficient (i.e., larger than minimum variance) estimates of the coefficients, as well as biased estimates
of the standard errors (and, thus, incorrect statistical tests and confidence intervals). Graphically, the
scatter plot becomes more (or less) dispersed as the value of X increases.
One test for heteroskedasticity (the Goldfeld-Quandt test) involves arranging the data from small to large
values of the independent variable X and running two regressions, one for small values of X and one for
large values, omitting, say, one-fifth of the middle observations. Then we test whether the ratio of the
error sum of squares (ESS) of the second regression to that of the first is significantly greater than 1,
using the F table with (n − d − 2k)/2 degrees of freedom in the numerator and denominator, where n is
the total number of observations, d is the number of omitted observations, and k is the number of
estimated parameters. If the error variance is proportional to X2 (often the case), heteroskedasticity can
be overcome by dividing every term of the model by X and then re-estimating the regression using the
transformed variables.
Regression techniques such as Weighted Least Squares (WLS), or Generalized Least Squares (GLS)
can also be used in place of OLS, for cases where heteroskedasticity is present.
(Self-study: Tests for heteroskedasticity: Breusch-Pagan Test, Lagrange Multiplier, White Test)
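As one example from that list, the Breusch-Pagan test is available in statsmodels. A sketch, assuming a fitted OLS results object as in the earlier sketches:

from statsmodels.stats.diagnostic import het_breuschpagan

# The test is based on regressing the squared residuals on the explanatory variables.
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(results.resid, results.model.exog)
print(lm_pvalue)   # a small p-value is evidence of heteroskedasticity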
Autocorrelation
When the error term in one time period is positively correlated with the error term in the previous time
period, we face the problem of (positive first-order) autocorrelation. This is common in time-series
analysis and leads to downward-biased standard errors (and, thus, to incorrect statistical tests and
confidence intervals). (Will be discussed further in Time Series Analysis)
Errors in Variables
Errors in variables refer to the case in which the variables in the regression model include measurement
errors. Measurement errors in the dependent variable are incorporated into the disturbance term and do
not create any special problem. However, errors in the explanatory variables lead to biased and
inconsistent parameter estimates. One method of obtaining consistent OLS parameter estimates is to
replace the explanatory variable subject to measurement errors with another variable (called an
instrumental variable) that is highly correlated with the original explanatory variable but is independent
of the error term. This is often difficult to do and somewhat arbitrary. The simplest instrumental variable
is usually the lagged explanatory variable in question. Another method used when only X is subject to
measurement errors involves regressing X on Y.
Proxy Variables
These are used in place of variables for which there is insufficient data to run a regression analysis. The
proxy variable (Z) is hypothesized to be linearly related to the missing explanatory variable (say X2):
X2 = a + cZ
Therefore, the regression equation is rewritten as:
Y = b0 + b1X1 + b2(a + cZ) + … + bkXk + u
So only the intercept and the coefficient on the proxy variable change; the coefficients on the remaining
variables remain unbiased and their standard errors remain the same.
However, if a poor proxy variable is chosen, it can introduce measurement error, and the other explanatory
variables may act as proxies for the missing variable, which can lead to omitted variable bias (OVB).
Variable Misspecifications in Regression Analysis
This can occur when a variable that should have been included in the model is omitted. This makes the
coefficients biased (called Omitted Variable Bias or OVB) and the standard errors become invalid.
Consequently, all the tests of significance also become invalid.
Reason for the bias: the included variable (say X2) that is most closely related to the omitted variable
(say X3) starts acting as a mimic for it. The strength of this proxy effect depends on (i) the strength of the
effect of the omitted variable on Y and (ii) the ability of X2 to mimic X3.
The RESET test can be used to test for OVB.
Example
Inclusion of an irrelevant variable
Here, the coefficients remain unbiased, but the estimates become inefficient: the variances (and standard
errors) of the estimators are larger than they need to be. The standard errors are still valid, but the loss of
precision means a loss of efficiency.
Time Series Analysis
Time Series Data
The first step to understanding time series analysis is to understand the data that we’ll be working on.
The data used until now was cross-sectional data, i.e., data on a number of units at a given point in time.
In time series analysis, however, we use time series data, which is data collected over time on one
particular unit. Due to its nature, the ordering of the data matters in time series, since it is defined
chronologically. Shuffling observations in cross-sectional data does not result in a loss of information for
estimation; in time series data, however, it can obscure the possible existence of dynamic relationships
between the variables.
Time Series data have the following components: trend, cyclical, seasonal and random.
A univariate time series is a sequence of measurements of the same variable collected over time. Most
often, the measurements are made at regular time intervals. The basic objective usually is to determine
a model that describes the pattern of the time series. Uses for such a model are:
1. To describe the important features of the time series pattern.
2. To explain how the past affects the future or how two time series can “interact”.
3. To forecast future values of the series.
4. To possibly serve as a control standard for a variable that measures the quality of product in
some manufacturing situations.
Some important features of time series data are:
• Trend: on average, do the measurements tend to increase (or decrease) over time?
• Seasonality: is there a regularly repeating pattern of highs and lows related to calendar time
such as seasons, quarters, months, days of the week, and so on?
• Outliers: In regression, outliers are far away from your line. With time series data, your outliers
are far away from your other data.
Stationarity
A series is (weakly) stationary if it has a constant mean, a constant variance, and autocovariances that
depend only on the lag between two observations and not on time itself. A white noise process is a series
satisfying three conditions: a constant mean, a constant variance, and zero autocovariance between
observations at different times.
Thus, a white noise process has constant mean and variance, and zero autocovariances, except at lag
zero. Another way to state this last condition is to say that each observation is uncorrelated with
all other values in the sequence. Hence the autocorrelation function for a white noise process will be
zero apart from a single peak of 1 at lag s = 0. If μ = 0, and the three conditions hold, the process is known
as zero-mean white noise.
In the following models, the error terms are assumed to follow a white noise process.
For convenience in writing and solving time series equations, which can involve many lagged terms,
a lag operator (B or L) is used. When applied to a time series variable, it lags the variable by the number
of periods equal to the power of the operator. For example:
BXt = Xt-1
B2Xt = Xt-2
So, in general, BkXt = Xt-k.
Thus, an AR or MA model equation can be expressed in the form of a polynomial of B, which can be
solved to find the roots.
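For example, an AR(1) equation Xt = φXt−1 + εt can be written with the lag operator as (1 − φB)Xt = εt. The root of the lag polynomial 1 − φz is z = 1/φ, which lies outside the unit circle (|z| > 1) exactly when |φ| < 1, the familiar stationarity condition for an AR(1) process.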
Moving Average (MA) Model
This is the simplest class of time series models. The notation MA(q) refers to the moving average model
of order q:
Xt = µ + εt + θ1εt−1 + θ2εt−2 + … + θqεt−q
where µ is the mean of the series, the θi are the parameters of the model, and the ε are white noise error
terms. The value of q is called the order of the MA model.
A moving-average model is conceptually a linear regression of the current value of the series against
current and previous (observed) white noise error terms or random shocks. The random shocks at each
point are assumed to be mutually independent and to come from the same distribution, typically a
normal distribution, with location at zero and constant scale.
Example
Invertibility of MA Models
Invertibility is a restriction programmed into time series software used to estimate the coefficients of
models with MA terms. It’s not something that we check for in the data analysis.
Autoregressive (AR) Model
Here, the current value of a variable depends only on the values that the variable took in previous periods,
plus an error term. The notation AR(p) indicates an autoregressive model of order p. The AR(p) model
is defined as:
Xt = c + φ1Xt−1 + φ2Xt−2 + … + φpXt−p + εt
Stationarity is a desirable property in AR models. For this condition to be satisfied, the roots of the
model's lag polynomial, 1 − φ1z − φ2z2 − … − φpzp = 0, must lie outside the unit circle.
Assumptions:
• Error terms are independently distributed with a normal distribution that has 0 mean and
constant variance.
• Properties of the error terms are independent of xt.
• The series Xt is (weakly) stationary.
ARMA and ARIMA Models
Combining the AR and MA models, we get the ARMA model. The notation ARMA(p, q) refers to the model
with p autoregressive terms and q moving-average terms; it contains the AR(p) and MA(q) models:
Xt = c + φ1Xt−1 + … + φpXt−p + εt + θ1εt−1 + … + θqεt−q
If this model does not satisfy the stationarity condition, stationarity can sometimes be achieved by
differencing the series, i.e., replacing Xt by Xt − Xt−1. Differencing removes changes in the level of a time
series, eliminating trend and seasonality and consequently stabilizing the mean of the time series.
If this results in a stationary series, then the series is Integrated of order 1. Similarly, if this process
needs to be repeated ‘d’ number of times before stationarity is achieved, it is Integrated of order d. Such
a series is then denoted by ARIMA(p,d,q).
Autocorrelation Function (ACF)
The sample autocorrelation function (ACF) for a series gives the correlations between the series xt and
lagged values of the series at lags 1, 2, 3, and so on. The lagged values can be written as xt−1, xt−2,
xt−3, and so on. The ACF thus gives the correlation between xt and xt−1, between xt and xt−2, and so on.
The ACF can be used to identify the possible structure of time series data. It can be used to test the
significance of each lag so that it can be determined whether they should be included in the model or
not. It can also be used to test the autocorrelation between error terms (or residuals). Since there should
be no autocorrelation, they should be insignificant at all lags.
The sample ACF at lag k can be calculated as:
rk = Σ (xt − x̄)(xt+k − x̄) / Σ (xt − x̄)2
where the numerator sums over t = 1, …, n − k, the denominator sums over all n observations, and x̄ is the
sample mean.
Many stationary series have recognizable ACF patterns. Most series that we encounter in practice,
however, are not stationary. A continual upward trend, for example, is a violation of the requirement that
the mean is the same for all t. Distinct seasonal patterns also violate that requirement.
ACF of AR model
The ACF of an AR model decays exponentially toward 0 as the lag increases.
ACF of MA Model
The ACF of an MA model shows a "sudden death" of significance after a certain lag (the order q of the
model) instead of an exponential decline.
Partial Autocorrelation Function (PACF)
The PACF measures the correlation between the current observation and the observation k periods ago
after controlling for the observations at intermediate lags. Equivalently, the PACF at lag k is the
coefficient on xt−k in a regression of xt on xt−1, xt−2, …, xt−k.
The plot of PACF values is very convenient for identifying an AR process: for an AR(p) series, the PACF
shows a "sudden death" after lag p, the order of the model. For an MA model, the PACF instead tapers off,
so its identification is best done with the ACF plot as discussed previously.
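These patterns can be checked by simulating a series and plotting its ACF and PACF. A sketch in Python, assuming numpy, matplotlib and statsmodels are available; the AR coefficient and sample size are arbitrary:

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Simulate an AR(1) process with coefficient 0.7; the lag polynomials are given
# with a leading 1 and the AR coefficients negated.
ar_poly = np.array([1, -0.7])
ma_poly = np.array([1])
x = ArmaProcess(ar_poly, ma_poly).generate_sample(nsample=500)

plot_acf(x, lags=20)    # should tail off gradually (exponential decay)
plot_pacf(x, lags=20)   # should cut off after lag 1
plt.show()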
Ljung-Box Test
This is an important test of significance for checking how many lags should be included in the model. In
general, the model should contain as few terms as possible (a parsimonious model). The Ljung-Box test
statistic is given by:
Q = n(n + 2) Σ ρk2 / (n − k), summing over k = 1, …, h
where n is the sample size, ρk is the sample autocorrelation at lag k, and h is the number of lags
being tested.
The decision rule works like that of other significance tests: if the Q-statistic is greater than the critical
value (from a chi-squared distribution with h degrees of freedom) at the chosen significance level, the
null hypothesis is rejected in favour of the alternative hypothesis; if not, the null hypothesis cannot be
rejected. For this test, the null and alternative hypotheses are:
H0: The data are independently distributed (i.e., the correlations in the population from
which the sample is taken are 0, so that any observed correlations in the data result from
randomness of the sampling process).
Ha: The data are not independently distributed; they exhibit serial correlation.
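A sketch of the test in Python, assuming statsmodels and a vector of residuals (resid) from a fitted model:

from statsmodels.stats.diagnostic import acorr_ljungbox

# resid: residuals from an estimated time series model
lb = acorr_ljungbox(resid, lags=[10], return_df=True)
print(lb)   # Q statistic ('lb_stat') and its p-value ('lb_pvalue') at lag 10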
Model Building
• Plot the data: this does not give any information about the correct model or its parameters, but
it gives an indication of whether the series is stationary, whether there is a trend or seasonality,
and what further steps should be taken.
• Plot the ACF and PACF of the series: since MA and AR processes both have identifiable plots, this step
is used to decide which model fits the series best. It is possible that neither a pure MA nor a pure AR
model fits best, in which case an ARMA or ARIMA model may be appropriate.
• Model diagnostics: this involves a series of checks on whether the model specified and
estimated is adequate. One useful method is to compare information criteria values.
Information Criteria
Information criteria embody two factors: a term which is a function of the residual sum of squares
(RSS), and some penalty for the loss of degrees of freedom from adding extra parameters. So, adding a
new variable or an additional lag to a model will have two competing effects on the information criteria:
the residual sum of squares will fall but the value of the penalty term will increase.
There are several types of IC, the difference between them being how heavily they penalize the addition
of extra parameters. To build a parsimonious model, the one with the least value of IC is chosen.
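A sketch of comparing candidate models by information criteria in Python, assuming statsmodels and a stationary series x; the candidate orders below are illustrative only:

from statsmodels.tsa.arima.model import ARIMA

# Fit a few candidate specifications and compare their information criteria.
for order in [(1, 0, 0), (0, 0, 1), (1, 0, 1)]:
    fit = ARIMA(x, order=order).fit()
    print(order, round(fit.aic, 1), round(fit.bic, 1))
# Other things being equal, the specification with the lowest AIC/BIC is preferred.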
Seasonality
Seasonality in a time series is a regular pattern of changes that repeats over S time periods, where S
defines the number of time periods until the pattern repeats again.
For example, there is seasonality in monthly data for which high values tend always to occur in some
particular months and low values tend always to occur in other particular months. In this case, S = 12
(months per year) is the span of the periodic seasonal behaviour. For quarterly data, S = 4 time periods
per year.
In a seasonal ARIMA model, seasonal AR and MA terms predict xt using data values and errors at
times with lags that are multiples of S (the span of the seasonality).
• With monthly data (and S = 12), a seasonal first order autoregressive model would use xt-12 to
predict xt. For instance, if we were selling cooling fans we might predict this August’s sales
using last August’s sales. (This relationship of predicting using last year’s data would hold for
any month of the year.)
• A seasonal second order autoregressive model would use xt-12 and xt-24 to predict xt. Here we
would predict this August’s values from the past two Augusts.
• A seasonal first order MA(1) model (with S = 12) would use wt-12 as a predictor. A seasonal
second order MA(2) model would use wt-12 and wt-24.
Differencing
Almost by definition, it may be necessary to examine differenced data when we have seasonality.
Seasonality usually causes the series to be nonstationary because the average values at some particular
times within the seasonal span (months, for example) may be different than the average values at other
times. For instance, our sales of cooling fans will always be higher in the summer months.
Seasonal Differencing
Seasonal differencing is defined as a difference between a value and a value with lag that is a multiple
of S. The differences (from the previous year) may be about the same for each month of the year giving
us a stationary series.
With S = 12, which may occur with monthly data, a seasonal difference is:
(1-B12)Xt = Xt – Xt-12
Where B is the lag operator.
Seasonal differencing removes seasonal trend and can also get rid of seasonal random walk type of
nonstationarity.
Non-seasonal differencing
If trend is present in the data, we may also need non-seasonal differencing. Often (not always) a first
difference (non-seasonal) will “detrend” the data. That is, we use (1-B)Xt = Xt – Xt-1 in the presence of
trend.
When both trend and seasonality are present, we may need to apply both a non-seasonal first difference
and a seasonal difference. That is, we may need to examine the ACF and PACF of:
(1 − B)(1 − B12)Xt = (Xt − Xt−1) − (Xt−12 − Xt−13)
Seasonal ARIMA Models
The seasonal ARIMA model incorporates both non-seasonal and seasonal factors in a multiplicative
model. One shorthand notation for the model is
ARIMA(p,d,q) x (P,D,Q)S
with p = non-seasonal AR order, d = non-seasonal differencing, q = non-seasonal MA order, P =
seasonal AR order, D = seasonal differencing, Q = seasonal MA order, and S = time span of repeating
seasonal pattern.
Without differencing operations, the model can be written more formally as
Φ(B^S) φ(B) (Xt − µ) = Θ(B^S) θ(B) εt
where φ(B) and θ(B) are the non-seasonal AR and MA polynomials in the lag operator and Φ(B^S) and
Θ(B^S) are the corresponding seasonal polynomials.
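A sketch of fitting such a model in Python, assuming statsmodels and a monthly series y (S = 12); the orders used here are illustrative only:

from statsmodels.tsa.statespace.sarimax import SARIMAX

# ARIMA(1,0,0) x (1,1,0)_12: one non-seasonal AR term, one seasonal AR term,
# and one seasonal difference.
model = SARIMAX(y, order=(1, 0, 0), seasonal_order=(1, 1, 0, 12))
res = model.fit(disp=False)
print(res.summary())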
Decomposition Models
Decomposition procedures are used in time series to describe the trend and seasonal factors in a time
series. More extensive decompositions might also include long-run cycles, holiday effects, day of week
effects and so on. Here, we’ll only consider trend and seasonal decompositions.
One of the main objectives for a decomposition is to estimate seasonal effects that can be used to create
and present seasonally adjusted values. A seasonally adjusted value removes the seasonal effect from a
value so that trends can be seen more clearly. For instance, in many regions of the U.S. unemployment
tends to decrease in the summer due to increased employment in agricultural areas. Thus, a drop in the
unemployment rate in June compared to May doesn’t necessarily indicate that there’s a trend toward
lower unemployment in the country. To see whether there is a real trend, we should adjust for the fact
that unemployment is always lower in June than in May.
Basic Structures
The following two structures are considered for basic decomposition models:
1. Additive: Xt = Trend + Seasonal + Random
2. Multiplicative: Xt = Trend * Seasonal * Random
The additive model is useful when the seasonal variation is relatively constant over time. The
multiplicative model is useful when the seasonal variation increases over time.
A time series that would benefit from an additive decomposition shows seasonal swings of roughly
constant size over time, while one that would benefit from a multiplicative decomposition shows seasonal
swings that grow (or shrink) with the level of the series.
The seasonal effects are usually adjusted so that they average to 0 for an additive decomposition or they
average to 1 for a multiplicative decomposition.
Basic Steps in Decomposition
1. The first step is to estimate the trend. Two different approaches could be used for this (with
many variations of each).
• One approach is to estimate the trend with a smoothing procedure such as moving
averages.
• The second approach is to model the trend with a regression equation.
2. The second step is to “de-trend” the series. For an additive decomposition, this is done by
subtracting the trend estimates from the series. For a multiplicative decomposition, this is done
by dividing the series by the trend values.
3. The third step is to estimate the seasonal factors from the de-trended series, typically by averaging
the de-trended values for each season (e.g., each month) and then adjusting the factors so that they
average to 0 (additive) or 1 (multiplicative).
4. The final step is to determine the random (irregular) component. For an additive decomposition,
random = series − trend − seasonal; for a multiplicative decomposition, random = series /
(trend × seasonal).
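A sketch of a basic decomposition in Python, assuming statsmodels and a monthly series stored in a pandas Series called sales (a hypothetical name):

from statsmodels.tsa.seasonal import seasonal_decompose

# Classical decomposition: trend via a centred moving average, seasonal factors
# averaged by period, and the remainder as the random component.
result = seasonal_decompose(sales, model='additive', period=12)
print(result.trend.head())
print(result.seasonal.head())
print(result.resid.head())
result.plot()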
Smoothing Time Series
Smoothing is usually done to help us better see patterns (trends, for example) in a time series. Generally,
we smooth out the irregular roughness to see a clearer signal. For seasonal data, we might smooth out the
seasonality so that we can identify the trend. Smoothing doesn't provide us with a model, but it can be
a good first step in describing various components of the series.
The term filter is sometimes used to describe a smoothing procedure. For instance, if the smoothed
value for a particular time is calculated as a linear combination of observations for surrounding times,
it might be said that we’ve applied a linear filter to the data (not the same as saying the result is a straight
line, by the way).
Moving Averages
The traditional use of the term moving average is that at each point in time we determine (possibly
weighted) averages of observed values that surround a particular time.
To take away seasonality from a series so we can better see trend, we would use a moving average with
a length = seasonal span. Thus, in the smoothed series, each smoothed value has been averaged across
all seasons. This might be done by looking at a “one-sided” moving average in which you average all
values for the previous years’ worth of data or a centred moving average in which you use values both
before and after the current time.
Example of a one-sided filter (for monthly data, S = 12): x^t = (xt + xt−1 + … + xt−11) / 12, the average of
the current value and the previous eleven months.
A centred moving average creates a bit of a difficulty when we have an even number of time periods in
the seasonal span (as we usually do).
Example of a centred filter (for monthly data, S = 12): x^t = (0.5xt−6 + xt−5 + … + xt+5 + 0.5xt+6) / 12,
where the two end values receive half weight so that the thirteen terms span exactly one seasonal cycle.
Single Exponential Smoothing
The basic forecasting equation for single exponential smoothing is often given as
x^t+1 = αxt + (1 − α)x^t
where x^t denotes the smoothed (forecast) value at time t.
The value of α is called the smoothing constant. With a relatively small value of α, the smoothing will
be relatively more extensive. With a relatively large value of α, the smoothing is relatively less extensive
as more weight will be put on the observed value.
Double Exponential Smoothing
Double exponential smoothing might be used when there is a trend (either long run or short run), but no
seasonality.
Essentially the method creates a forecast by combining exponentially smoothed estimates of the trend
(slope of a straight line) and the level (basically, the intercept of a straight line).
Two different weights, or smoothing parameters, are used to update these two components at each time.
The smoothed “level” is more or less equivalent to a simple exponential smoothing of the data values
and the smoothed trend is more or less equivalent to a simple exponential smoothing of the first
differences.
The procedure is equivalent to fitting an ARIMA(0,2,2) model with no constant, so it can also be carried
out via an ARIMA(0,2,2) fit.
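A sketch of both procedures in Python, assuming statsmodels and a series y; the smoothing constant in the first fit is fixed arbitrarily for illustration:

from statsmodels.tsa.holtwinters import SimpleExpSmoothing, Holt

# Single exponential smoothing with a fixed smoothing constant alpha = 0.2
ses_fit = SimpleExpSmoothing(y).fit(smoothing_level=0.2, optimized=False)

# Double exponential smoothing (Holt's method): smoothed level plus smoothed trend
holt_fit = Holt(y).fit()

print(ses_fit.fittedvalues[:5])
print(holt_fit.forecast(5))   # forecasts extrapolating the smoothed level and trend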
The Periodogram
Any time series can be expressed as a combination of cosine and sine waves with differing periods (how
long it takes to complete a full cycle) and amplitudes (maximum/minimum value during the cycle). This
fact can be utilized to examine the periodic (cyclical) behavior in a time series.
A periodogram is used to identify the dominant periods (or frequencies) of a time series. This can be a
helpful tool for identifying the dominant cyclical behavior in a series, particularly when the cycles are
not related to the commonly encountered monthly or quarterly seasonality.
Suppose that we have observed data at n distinct time points, and for convenience we assume that n is
even. Our goal is to identify important frequencies in the data. To pursue the investigation, we consider
the set of possible frequencies wj = j/n for j = 1, 2,…, n/2. These are called the harmonic frequencies.
To do so, consider the harmonic regression model
xt = Σj [β1(j/n) cos(2πt·j/n) + β2(j/n) sin(2πt·j/n)]
a sum of sine and cosine functions at the harmonic frequencies.
Think of the β1(j/n) and β2(j/n) as regression parameters. Then there is a total of n parameters because
we let j move from 1 to n/2. This means that we have n data points and n parameters, so the fit of this
regression model will be exact.
The first step in the creation of the periodogram is the estimation of the β1(j/n) and β2(j/n) parameters.
It's actually not necessary to carry out this regression to estimate the β1(j/n) and β2(j/n) parameters.
Instead, a mathematical device called the Fast Fourier Transform (FFT) is used.
After the parameters have been estimated, we define:
P(j/n) = β^1(j/n)2 + β^2(j/n)2
This is the value of the sum of squared “regression” coefficients at the frequency j/n.
A relatively large value of P(j/n) indicates relatively more importance for the frequency j/n (or near j/n)
in explaining the oscillation in the observed series. P(j/n) is proportional to the squared correlation
between the observed series and a cosine wave with frequency j/n. The dominant frequencies might be
used to fit cosine (or sine) waves to the data or might be used simply to describe the important
periodicities in the series.
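A sketch of computing a periodogram in Python, assuming scipy and a series x sampled at regular intervals:

from scipy.signal import periodogram

freqs, power = periodogram(x)             # frequencies in cycles per observation
dominant = freqs[1:][power[1:].argmax()]  # skip the zero frequency
print("dominant frequency:", dominant)
print("corresponding period:", 1.0 / dominant)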
Regression with Autocorrelated Errors
When we do regressions using time series variables, it is common for the errors (residuals) to have a
time series structure. This violates the usual assumption of independent errors made in ordinary least
squares regression. The consequence is that the estimates of coefficients and their standard errors will
be wrong if the time series structure of the errors is ignored.
It is possible, though, to adjust estimated regression coefficients and standard errors when the errors
have an AR structure. More generally, we will be able to make adjustments when the errors have a
general ARIMA structure.
Suppose that yt and xt are time series variables. A simple linear regression model with autoregressive
errors can be written as:
yt = β0 + β1xt + εt, with εt = φ1εt−1 + wt
where the errors εt follow an AR(1) process and wt is white noise.
To adjust for this, first estimate the model by OLS, estimate φ1 from the residuals, form the transformed
variables y*t = yt − φ^1yt−1 and x*t = xt − φ^1xt−1, and re-estimate the regression using them. This
procedure is attributed to Cochrane and Orcutt (1949) and is repeated until the estimates converge, that
is, until we observe a very small difference in the estimates between iterations. When the errors exhibit
an AR(1) pattern, the cochrane.orcutt function found within the orcutt package in R iterates this
procedure.
For a higher-order AR, the adjustment variables are calculated in the same manner with more lags. For
instance, suppose the residuals were found to have an AR(2) structure with estimated coefficients 0.9
and −0.2. Then the y- and x-variables for the adjustment regression would be
y*t = yt − 0.9yt−1 + 0.2yt−2 and x*t = xt − 0.9xt−1 + 0.2xt−2
Here, the purpose is to adjust the regression estimates for the fact that the residuals have an ARIMA
structure.
The basic steps are:
1. Use OLS regression to estimate the model:
yt = β0 + β1xt + β2t + εt
(the β2t term represents the possibility of a trend component, and hence the series may
require de-trending)
2. Examine the ARIMA structure (if any) of the sample residuals from the model in step 1.
3. Re-estimate the regression, this time specifying the ARIMA structure identified in step 2 for the
errors (i.e., fit a regression model with ARIMA errors).
4. Examine the ARIMA structure (if any) of the sample residuals from the model in step 3. If
white noise is present, then the model is complete. If not, continue to adjust the ARIMA model
for the errors until the residuals are white noise.
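One convenient way to carry out these steps is to estimate the regression and the ARIMA error structure jointly by maximum likelihood. A sketch in Python, assuming statsmodels, a response y and a regressor x, with an illustrative AR(1) error structure:

from statsmodels.tsa.statespace.sarimax import SARIMAX

# Regression of y on x with AR(1) errors, estimated in one step
res = SARIMAX(y, exog=x, order=(1, 0, 0)).fit(disp=False)
print(res.summary())   # regression coefficient on x plus the AR(1) error coefficient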
Lagged Regression and the Cross-Correlation Function (CCF)
The basic problem we're considering is the description and modelling of the relationship between two
time series.
In the relationship between two time series (yt and xt), the series yt may be related to past lags of the x-
series. The sample cross correlation function (CCF) is helpful for identifying lags of the x-variable
that might be useful predictors of yt.
The sample CCF is defined as the set of sample correlations between xt+h and yt for h = 0, ±1, ±2, ±3,
and so on. A negative value for h is a correlation between the x-variable at a time before t and the y-
variable at time t. For instance, consider h = −2. The CCF value would give the correlation between
xt−2 and yt.
• When one or more xt+h, with h negative, are predictors of yt, it is sometimes said that x leads y.
• When one or more xt+h, with h positive, are predictors of yt, it is sometimes said that x lags y.
In some problems, the goal may be to identify which variable is leading and which is lagging. In many
of the problems we consider, though, we treat the x-variable(s) as leading variables of the y-variable,
because we want to use values of the x-variable to predict future values of y.
The CCF pattern is affected by the underlying time series structures of the two variables and the trend
each series has. It often (perhaps most often) is helpful to de-trend and/or take into account the
univariate ARIMA structure of the x-variable before graphing the CCF.
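A sketch of computing sample cross-correlations directly in Python, assuming numpy and two equal-length series x and y:

import numpy as np

def ccf_at(x, y, h):
    """Sample correlation between x at time t+h and y at time t."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if h < 0:
        return np.corrcoef(x[:h], y[-h:])[0, 1]   # pairs (x[t+h], y[t])
    if h > 0:
        return np.corrcoef(x[h:], y[:-h])[0, 1]
    return np.corrcoef(x, y)[0, 1]

# Negative lags pair past x with current y; large values there suggest x leads y.
print([round(ccf_at(x, y, h), 2) for h in range(-3, 4)])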
ARCH/GARCH Models
An ARCH (autoregressive conditionally heteroscedastic) model is a model for the variance of a time
series. ARCH models are used to describe a changing, possibly volatile variance. Although an ARCH
model could possibly be used to describe a gradually increasing variance over time, most often it is
used in situations in which there may be short periods of increased variation. (Gradually increasing
variance connected to a gradually increasing mean level might be better handled by transforming the
variable.)
ARCH models were created in the context of econometric and finance problems having to do with the
amount that investments or stocks increase (or decrease) per time period, so there’s a tendency to
describe them as models for that type of variable. For that reason, the authors of our text suggest that
the variable of interest in these problems might either be yt = (xt – xt-1)/xt-1, the proportion gained or lost
since the last time, or log((xt – xt-1)/xt-1) =log((xt – xt-1)) – log(xt-1), the logarithm of the ratio of this
time’s value to last time’s value. It’s not necessary that one of these be the primary variable of
interest. An ARCH model could be used for any series that has periods of increased or decreased
variance. This might, for example, be a property of residuals after an ARIMA model has been fit to the
data.
Suppose that we are modelling the variance of a series yt. The ARCH(1) model for the variance of yt is
that, conditional on yt−1, the variance at time t is:
Var(yt | yt−1) = σt2 = α0 + α1 y2t−1
The variance at time t is connected to the value of the series at time t − 1. A relatively large value of
y2t−1 gives a relatively large variance at time t, so yt is less predictable following a large value of
y2t−1 than following a small one.
If we assume that the series has mean 0 (this can always be arranged by centering), the ARCH(1) model
can be written as:
yt = σt εt, with σt2 = α0 + α1 y2t−1
where the εt are independent with mean 0 and variance 1.
For inference (and maximum likelihood estimation) we would also assume that the εt are normally
distributed.
Two potentially useful theoretical properties of the ARCH(1) model as written above are the following:
the squared series y2t follows an AR(1) process, and (for α1 < 1) the unconditional variance of yt is
α0 / (1 − α1).
An ARCH(m) process is one for which the variance at time t is conditional on observations at the
previous m times, and the relationship is
σt2 = α0 + α1 y2t−1 + α2 y2t−2 + … + αm y2t−m
With certain constraints imposed on the coefficients, the yt series squared will theoretically be AR(m).
A GARCH (generalized autoregressive conditionally heteroscedastic) model uses values of the past
squared observations and past variances to model the variance at time t. As an example, a GARCH(1,1)
model is:
σt2 = α0 + α1 y2t−1 + β1 σ2t−1
In the GARCH notation, the first subscript refers to the order of the y2 terms on the right side, and the
second subscript refers to the order of the σ2 terms.
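A sketch of fitting a GARCH(1,1) model in Python, assuming the third-party arch package and a series of (percentage) returns, here called returns:

from arch import arch_model

# Constant mean with GARCH(1,1) conditional variance
am = arch_model(returns, mean='Constant', vol='GARCH', p=1, q=1)
res = am.fit(disp='off')
print(res.summary())                     # omega (alpha_0), alpha[1] and beta[1] estimates
print(res.conditional_volatility[:5])    # fitted sigma_t values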
Vector Autoregressive (VAR) Models
VAR models (vector autoregressive models) are used for multivariate time series. The structure is that
each variable is a linear function of past lags of itself and past lags of the other variables.
As an example, suppose that we measure three different time series variables, denoted by x1,t, x2,t,
and x3,t.
In a VAR(1) model, each variable is a linear function of the lag 1 values of all variables in the set.
In a VAR(2) model, the lag 2 values of all variables are added to the right-hand sides of the equations. In
the case of three variables (or time series), there would be six predictors on the right-hand side of each
equation: three lag 1 terms and three lag 2 terms.
In general, for a VAR(p) model, the first p lags of each variable in the system would be used as
regression predictors for each variable.
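A sketch of estimating a VAR in Python, assuming statsmodels, pandas and three series x1, x2, x3 (hypothetical names) collected in a DataFrame:

import pandas as pd
from statsmodels.tsa.api import VAR

data = pd.DataFrame({'x1': x1, 'x2': x2, 'x3': x3})   # three time series
res = VAR(data).fit(2)   # VAR(2): two lags of every variable in each equation
print(res.summary())
print(res.forecast(data.values[-2:], steps=4))        # forecasts from the last 2 observations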
VAR models are a specific case of more general VARMA models. VARMA models for multivariate
time series include the VAR structure above along with moving average terms for each variable. More
generally yet, these are special cases of ARMAX models that allow for the addition of other predictors
that are outside the multivariate set of principal interest.