R Code Part 2

The document covers various statistical methods for analyzing data, including heteroscedasticity, autocorrelation, instrumental variables, and panel data analysis. It provides examples of using R functions to perform linear regression, compute robust standard errors, and conduct tests for stationarity and cointegration. Additionally, it discusses model comparisons and the application of fixed and random effects in panel data models.

Uploaded by

stan.rooseleer32

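# packages used below (assumed; the script itself does not show its library() calls)
library(readxl)       # read_xlsx
library(sandwich)     # vcovHC, vcovHAC
library(lmtest)       # coeftest, dwtest, bgtest
library(prais)        # prais_winsten
library(Hmisc)        # Lag (assumed source of the capital-L Lag function)
library(AER)          # ivreg
library(aTSA)         # adf.test (assumed: its output tabulates lags per model type)
library(plm)          # pdata.frame, plm, pFtest, phtest
library(clubSandwich) # vcovCR

#----------------------------------------------------------------------------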

# heteroscedasticity

#----------------------------------------------------------------------------

foodexp = read.table( "data/foodexp.txt", header = TRUE)

head(foodexp)

plot(foodexp$INCOME,foodexp$FOOD_EXP,xlab="INCOME",ylab="FOOD_EXP")

reg = lm(FOOD_EXP~INCOME, data = foodexp)

summary(reg)

# compute heteroskedasticity-robust standard errors

# cf. vcovHC = variance-covariance matrix robust to HeterosCedasticity

whitecov = vcovHC(reg, type = "HC0")

coeftest(reg, vcov. = whitecov)

#remark that several refinements of HC0 with better small-sample behaviour exist

#and can be applied by choosing type = "HC1", "HC2", "HC3" or "HC4"
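
#illustration: a minimal sketch of what HC0 and HC1 compute, written in base R on
#simulated data (the sim_ names and the data are hypothetical, not the foodexp data);
#vcovHC should give the same matrices, but that comparison is not run here

```r
# Simulated heteroskedastic data (hypothetical): the error sd grows with x
set.seed(1)
sim_n <- 200
sim_x <- runif(sim_n, 1, 10)
sim_y <- 2 + 0.5 * sim_x + rnorm(sim_n, sd = 0.3 * sim_x)
sim_fit <- lm(sim_y ~ sim_x)

sim_X <- model.matrix(sim_fit)          # n x 2 design matrix
sim_e <- residuals(sim_fit)             # OLS residuals
sim_bread <- solve(crossprod(sim_X))    # (X'X)^{-1}
sim_meat <- crossprod(sim_X * sim_e)    # X' diag(e^2) X
sim_hc0 <- sim_bread %*% sim_meat %*% sim_bread      # White (HC0) covariance
sim_hc1 <- sim_hc0 * sim_n / (sim_n - ncol(sim_X))   # HC1: degrees-of-freedom correction

# classical versus robust standard errors
cbind(classical = sqrt(diag(vcov(sim_fit))),
      HC0 = sqrt(diag(sim_hc0)),
      HC1 = sqrt(diag(sim_hc1)))
```

#HC1 only rescales HC0 by n/(n-k), so the two differ little unless n is small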

rm(list=ls())

#----------------------------------------------------------------------------

# autocorrelation

#----------------------------------------------------------------------------

infldata = read_xlsx("data/inflation.xlsx")

head(infldata)

infldata$time = 1:nrow(infldata)

#one can also use ts.plot to plot the time series but I need the time variable anyway
plot(infldata$time,infldata$INFLN,type="l")

dwtest(INFLN~time,data=infldata)

#standard OLS inference is not valid but we want to check the residuals:

ols = lm(INFLN~time,data=infldata)

summary(ols)

#we can also look at the autocorrelation function (ACF): corr(u_t, u_{t-s}) as a function of s

resnw = ols$residuals

#notice the significant first autocorrelation in the plot

acf(resnw,main="check autocorrelation in residuals")
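
#illustration: a minimal base-R sketch, on simulated data, of how the Durbin-Watson
#statistic relates to the first autocorrelation shown by acf (the sim_ names and the
#AR(1) coefficient 0.6 are assumptions, not the inflation data)

```r
# Simulated AR(1) errors around a linear trend (hypothetical data)
set.seed(2)
sim_T <- 150
sim_time <- 1:sim_T
sim_u <- as.numeric(arima.sim(list(ar = 0.6), n = sim_T))
sim_infl <- 1 + 0.02 * sim_time + sim_u
sim_res <- unname(residuals(lm(sim_infl ~ sim_time)))

# Durbin-Watson statistic: sum of squared first differences over sum of squares
sim_dw <- sum(diff(sim_res)^2) / sum(sim_res^2)

# lag-1 sample autocorrelation of the residuals, the first spike in the acf plot
sim_r1 <- acf(sim_res, plot = FALSE)$acf[2]

# d is close to 2 * (1 - r1); values well below 2 point to positive autocorrelation
c(DW = sim_dw, approx = 2 * (1 - sim_r1))
```

#the two numbers differ only by the end-point term (e_1^2 + e_T^2)/sum(e^2)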

#the FGLS approach:

#iterated Prais-Winsten procedure; as the prais_winsten function cannot deal
#with the missing value in the first observation, we remove it first

infldatapw = infldata[-1,]

#the index option specifies the variable with the time information (here it is also the explanatory variable)

#twostep = FALSE iterates until convergence; twostep = TRUE stops after the first iteration

pw = prais_winsten(INFLN~time, data = infldatapw, index = "time", twostep = FALSE)

summary(pw)

#check whether the iterated procedure needs more than one iteration here by comparing
#with twostep = TRUE, which stops after the first iteration

#using corrected standard errors:

# compute HeterosCedasticity and Autocorrelation robust standard errors

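# by default vcovHAC uses Andrews' quadratic spectral kernel weights;
# NeweyWest(ols) gives the Newey-West (Bartlett kernel) estimator instead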

NWcov = vcovHAC(ols)

coeftest(ols, vcov. = NWcov)

#getting rid of autocorrelation by adding lags of the response:

#Breusch-Godfrey test for AR(1) residuals

regmod = lm(INFLN~time,data=infldata)

bgtest(regmod, order = 1)

#add one lag

infldata$laginfln= Lag(infldata$INFLN, 1)

regmod2 = lm(INFLN~time+laginfln,data=infldata)

bgtest(regmod2, order = 1)

#add another lag

infldata$lag2infln= Lag(infldata$laginfln, 1) #or Lag(infldata$INFLN, 2)

regmod3 = lm(INFLN~time+laginfln+lag2infln,data=infldata)

bgtest(regmod3, order = 1)

summary(regmod3)

#the significance of time is borderline: slightly significant with Prais-Winsten and
#by adding lags, slightly not significant with the corrected standard errors...

rm(list=ls())

#----------------------------------------------------------------------------
# instrumental variables

#----------------------------------------------------------------------------

datIV <- read.csv("data/mroz2.csv")

head(datIV)

atwork = subset(datIV,wage > 0)

#as wage is right skewed, people often transform it to get a more
#symmetrically (closer to normally) distributed variable; remark that the estimators
#will asymptotically be normal anyway, but we need fewer observations to reach normality if
#the variables are more normally distributed

atwork$lnwage = log(atwork$wage) #log() in R is the natural logarithm (ln)

summary(lm(lnwage~educ, data = atwork))

#put all the explanatory variables before the | in ivreg, whether they are endogenous or exogenous

#put all the instrumental variables and all the exogenous regressors after the | in ivreg

summary(ivreg(lnwage~educ|mothereduc, data = atwork), diagnostics = TRUE)

#illustration: ratio of standard errors approximately equal to 1/correlation(x,z)

stdev_OLS = 0.0144

stdev_IV = 0.03823

stdev_IV/stdev_OLS

1/cor(atwork$educ,atwork$mothereduc)
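
#the same comparison as a minimal base-R sketch on simulated data, with one regressor,
#one instrument and homoskedastic errors (the sim_ names and the data-generating values
#are hypothetical; x is exogenous here, so OLS and IV estimate the same error variance)

```r
# Simulated data with a regressor x and an instrument z (hypothetical)
set.seed(3)
sim_n <- 2000
sim_z <- rnorm(sim_n)
sim_x <- 0.5 * sim_z + rnorm(sim_n)
sim_y <- 1 + 0.1 * sim_x + rnorm(sim_n)

# slope estimates
sim_b_ols <- cov(sim_x, sim_y) / var(sim_x)
sim_b_iv  <- cov(sim_z, sim_y) / cov(sim_z, sim_x)

# residuals, each estimator with its own intercept
sim_e_ols <- (sim_y - mean(sim_y)) - sim_b_ols * (sim_x - mean(sim_x))
sim_e_iv  <- (sim_y - mean(sim_y)) - sim_b_iv  * (sim_x - mean(sim_x))

# standard errors of the slope under homoskedasticity
sim_sxx <- sum((sim_x - mean(sim_x))^2)
sim_r   <- cor(sim_x, sim_z)
sim_se_ols <- sqrt(sum(sim_e_ols^2) / (sim_n - 2) / sim_sxx)
sim_se_iv  <- sqrt(sum(sim_e_iv^2)  / (sim_n - 2) / (sim_sxx * sim_r^2))

c(ratio = sim_se_iv / sim_se_ols, inverse_cor = 1 / abs(sim_r))
```

#the ratio equals (s_IV/s_OLS)/|cor(x,z)|, so it is never below 1/|cor(x,z)|:
#a weak instrument (small correlation) inflates the IV standard error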

#use fathereducation as extra IV

summary(ivreg(lnwage~educ|mothereduc+fathereduc, data = atwork), diagnostics = TRUE)

rm(list=ls())

#----------------------------------------------------------------------------

# Augmented Dickey-Fuller test

#----------------------------------------------------------------------------

timeseries = read_xlsx("data/stationarity.xlsx")

head(timeseries)

ts.plot(timeseries$Fseries, xlab="time" )

adf.test(timeseries$Fseries)

#remark that the lags in this table refer to the 'extra' lags that have to be added
#to get rid of autocorrelated errors

#no matter whether we need an intercept and a deterministic trend or not,
#no matter whether we need 0, 1, 2 or 3 lags to get rid of autocorrelation,
#the H0 of a unit root is not rejected, so the series is UNIT ROOT NONSTATIONARY

#----------------------------------------------------------------------------

#sometimes the decision depends on the exact number of lags you need to include
#and on whether an intercept and trend are needed or not:

#to choose the number of lags that gets rid of autocorrelated errors,
#we use the Breusch-Godfrey test for autocorrelation

#remark that you have to use the Lag function with capital L!

#in this example, the dataset contains a "time" variable; if not, it has to be created with
#timeseries$time = 1:length(timeseries$Fseries)

#first we test whether we need any extra lag

timeseries$L1F = Lag(timeseries$Fseries, 1)

regmod1 = lm(Fseries~0+L1F,data=timeseries) #no drift (use 0 or -1), no trend

regmod2 = lm(Fseries~ L1F,data=timeseries) #with drift, no trend

regmod3 = lm(Fseries~time+L1F,data=timeseries) #with drift and trend

bgtest(regmod1, order = 1)

bgtest(regmod2, order = 1)

bgtest(regmod3, order = 1)

#so the residuals are autocorrelated and we need at least 1 lag extra

#check whether 1 extra lag is enough:

timeseries$L2F = Lag(timeseries$Fseries, 2)

regmod1 = lm(Fseries~0+L1F+L2F,data=timeseries)

regmod2 = lm(Fseries~L1F+L2F,data=timeseries)

regmod3 = lm(Fseries~time+L1F+L2F,data=timeseries)

bgtest(regmod1, order = 1)

bgtest(regmod2, order = 1)

bgtest(regmod3, order = 1)
#the residuals are no longer autocorrelated

#as the diff function removes the first observation, we have to create a

#difference variable ourselves:

timeseries$D1F = timeseries$Fseries-Lag(timeseries$Fseries,1)
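
#illustration: a minimal base-R sketch of the lag-and-subtract construction on a toy
#vector; sim_lag is a hypothetical stand-in for Lag, not the function used in this script

```r
# Hypothetical base-R stand-in for Lag(): shift by k and pad the start with NA
sim_lag <- function(x, k = 1) c(rep(NA, k), head(x, -k))

sim_series <- c(5, 7, 6, 9, 12)
sim_L1 <- sim_lag(sim_series, 1)        # NA 5 7 6 9
sim_D1 <- sim_series - sim_L1           # NA 2 -1 3 3

# diff() gives the same differences but is one element shorter
diff(sim_series)                        # 2 -1 3 3
```

#the padded version keeps the same length as the original column, so it can be
#stored in the data frame next to the other variables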

timeseries$L1DF = Lag(timeseries$D1F, 1)

summary(lm(D1F~ time +L1F + L1DF, data = timeseries))

#as intercept and trend are significant, we need intercept + trend
#and 1 lag, and the appropriate tau in the adf.test output is -3.14
#with p-value = 0.109, so the Fseries is unit root nonstationary

#----------------------------------------------------------------------------

#test whether the series is I(1) (so whether D1F is stationary):

ts.plot(timeseries$D1F, xlab="time")

#remark that we can also define D1F by diff(Fseries,1) if the

#variable is to be used in the adf.test function:

adf.test(timeseries$D1F)

#as almost all p-values are small, we might conclude that the differenced variable is
#stationary and therefore the Fseries is I(1)

#to check which test statistic to use, we proceed as before:

regmod_D_1 = lm(D1F~0+L1DF,data=timeseries) #no drift, no trend

regmod_D_2 = lm(D1F~ L1DF,data=timeseries) #with drift, no trend

regmod_D_3 = lm(D1F~ time+L1DF,data=timeseries) #with drift and trend

bgtest(regmod_D_1, order = 1)

bgtest(regmod_D_2, order = 1)

bgtest(regmod_D_3, order = 1)

#so the residuals are not autocorrelated and we do not need extra lags

timeseries$D1DF = timeseries$D1F-Lag(timeseries$D1F,1)

timeseries$L1DDF = Lag(timeseries$D1DF, 1)

summary(lm(D1DF~ time +L1DF , data = timeseries))

#so we need the model without an intercept or trend

summary(lm(D1DF~ -1 +L1DF , data = timeseries))

#the appropriate tau = -4.007 with corrected p-value of 0.01

#so D1F is stationary and the Fseries is I(1)

#----------------------------------------------------------------------------

#test whether Fseries and Bseries are cointegrated

#----------------------------------------------------------------------------

#test first whether Bseries is I(0) (is not the case) or I(1) (is the case):

ts.plot(timeseries$Bseries, xlab="time")

adf.test(timeseries$Bseries)

timeseries$D1B = timeseries$Bseries - Lag(timeseries$Bseries,1)

ts.plot(timeseries$D1B, xlab="time")
adf.test(timeseries$D1B)

reg = lm(Fseries~Bseries,data = timeseries)

res = reg$residuals

adf.test(res)

#we reject the null hypothesis of unit root nonstationarity of the residuals,
#so Fseries and Bseries appear to be cointegrated
#(you can check that you need 1 lag but no intercept or trend in this case)

rm(list=ls())

#----------------------------------------------------------------------------

# panel data

#----------------------------------------------------------------------------

paneldata = read_xlsx("data/paneldata.xlsx")

head(paneldata)

# to get the appropriate tests, we use the plm function

# with the options:

# - pooling: simple OLS with 1 intercept,

# - random: the random effects model with FGLS or RE estimator

# - within: the fixed effects model with the within group estimator

#to use plm, we first need to convert the data with pdata.frame

#to indicate which variable identifies the subjects and which the time

panel <- pdata.frame(paneldata, index=c("I","T"), row.names=TRUE)

#pooled model

plm_pooled = plm(investments~value+capital, data=panel, model="pooling")

summary(plm_pooled)

#perform the tests using the cluster-corrected covariance matrix:

coeftest(plm_pooled, vcov=vcovCR(plm_pooled,type="CR0"))

#FGLS estimators

plm_random <- plm(investments~value+capital, data=panel, model="random")

summary(plm_random)

#remark that the variance components are estimated slightly differently
#than in the slides but the parameter estimates + tests are OK

#fixed effects estimators

plm_fixed <- plm(investments~value+capital, data=panel, model="within")

summary(plm_fixed)

#perform the tests using the cluster-corrected covariance matrix:

coeftest(plm_fixed, vcov=vcovCR(plm_fixed,type="CR0"))
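
#illustration: a minimal base-R sketch of what the within group estimator does, on a
#simulated balanced panel (the sim_ names and all numbers are hypothetical, not the
#paneldata file); the slope after demeaning equals the slope with one dummy per firm

```r
# Simulated balanced panel: 5 firms, 10 periods (hypothetical data)
set.seed(5)
sim_id <- rep(1:5, each = 10)
sim_alpha <- rep(c(-2, 0, 1, 3, 5), each = 10)      # firm-specific intercepts
sim_value <- rnorm(50) + sim_alpha                  # regressor correlated with the effects
sim_inv <- sim_alpha + 0.8 * sim_value + rnorm(50)

# within transformation: subtract the firm mean from every variable
sim_inv_w <- sim_inv - ave(sim_inv, sim_id)
sim_value_w <- sim_value - ave(sim_value, sim_id)
sim_within <- coef(lm(sim_inv_w ~ 0 + sim_value_w))

# least-squares dummy variables: one intercept per firm
sim_lsdv <- coef(lm(sim_inv ~ 0 + factor(sim_id) + sim_value))["sim_value"]

c(within = unname(sim_within), LSDV = unname(sim_lsdv))
```

#the standard errors that lm reports for the demeaned regression use too many degrees
#of freedom (the firm means are not counted), which is one reason to rely on plm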

#test whether the intercepts in the fixed effects model are equal

#so compare the pooled and the fixed effects model

pFtest(plm_fixed,plm_pooled)

#Hausman test

phtest(plm_random,plm_fixed)

#test and p-value differ from the slides but conclusion is similar

rm(list=ls())
