0% found this document useful (0 votes)
64 views27 pages

Spatial Regression Analysis in R

This document discusses spatial regression modeling. It introduces three main spatial regression models: spatial error model, spatial lag model, and spatial autoregressive model. The spatial error model incorporates spatial effects through the error term. The spatial lag model includes a spatially lagged dependent variable as a predictor. The spatial autoregressive model combines aspects of the spatial error and lag models. The document provides examples of implementing these spatial regression models in R using the Boston housing price data, including checking for spatial autocorrelation, model selection using LM tests, and comparing model fit.

Uploaded by

Up Toyou
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
64 views27 pages

Spatial Regression Analysis in R

This document discusses spatial regression modeling. It introduces three main spatial regression models: spatial error model, spatial lag model, and spatial autoregressive model. The spatial error model incorporates spatial effects through the error term. The spatial lag model includes a spatially lagged dependent variable as a predictor. The spatial autoregressive model combines aspects of the spatial error and lag models. The document provides examples of implementing these spatial regression models in R using the Boston housing price data, including checking for spatial autocorrelation, model selection using LM tests, and comparing model fit.

Uploaded by

Up Toyou
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Spatial Regression

Praktikum 6 | Statistika Spasial


rahmaanisa@[Link]
Spatial autocorrelation in residuals
Spatial error model
• Incorporates spatial effects through error term

y  x  
  W  
• Where:
 is the vector of error terms, spatially weighted using
the weights matrix (W)
 is the spatial error coefficient
ξ is a vector of uncorrelated error terms
• If there is no spatial correlation between the errors,
then  = 0
Spatial autocorrelation in Dependent Variable
Spatial lag model
• Incorporates spatial effects by including a spatially
lagged dependent variable as an additional predictor
y  Wy  x  
• Where:
Wy is the spatially lagged DVs for weight s matrix W
x is a matrix of observatio ns on the explanatory variables
 is a vector of error terms
 is the spatial coefficient
• If there is no spatial dependence, and y does no
depend on neighboring y values,  = 0
Spatial Model

y  Wy  XB  u; u  Wu  
 ~ N (0,  2 I )

 0  0 0
 0 0  0

OLS SEM SAR


y  XB   y  XB  u y  Wy  XB  
Spatial Regression in R
Example: Housing Prices in Boston
CRIM per capita crime rate by town

ZN proportion of residential land zoned for lots over 25,000 ft2

INDUS proportion of non-retail business acres per town

CHAS Charles River dummy variable (=1 if tract bounds river; 0 otherwise)

NOX Nitrogen oxide concentration (parts per 10 million)

RM average number of rooms per dwelling

AGE proportion of owner-occupied units built prior to 1940

DIS weighted distances to five Boston employment centres

RAD index of accessibility to radial highways

TAX full-value property-tax rate per $10,000

PTRATIO pupil-teacher ratio by town

B 1000(Bk - 0.63)2 where Bk is the proportion of blacks by town

LSTAT % lower status of the population

MEDV Median value of owner-occupied homes in $1000's


Spatial Regression in R
1. Read in [Link]
2. Define neighbors (k nearest w/point data)
3. Create weights matrix
4. Moran’s test of DV, Moran scatterplot
5. Run OLS regression
6. Check residuals for spatial dependence
7. Determine which SR model to use w/LM tests
8. Run spatial regression model
Boston Data Set
• Silahkan download di alamat:
[Link]

• Pilih file dengan nama: [Link]

• Extract di working directory Anda masing2


Load the packages
library(rgdal)
library(maptools)
library(spdep)
Define neighbors and create weights matrix
> setwd("D:\\[Link]\\Statistika Spasial (S2)\\TA 2017-
2018\\Praktkum RA\\Reference\\Elisabeth Root")
> boston<-readOGR(dsn="boston",layer="boston")
> class(boston)

> boston$LOGMEDV<-log(boston$CMEDV)

> coords<-coordinates(boston)
> IDs<-[Link](as(boston, "[Link]"))

> bost_kd1<-dnearneigh(coords, d1=0, d2=3.973, [Link]=IDs)


> plot(boston)
> plot(bost_kd1, coords, add=T)

> bost_kd1_w<- nb2listw(bost_kd1)


Moran’s I on the DV
> [Link](boston$LOGMEDV, listw=bost_kd1_w)

Moran's I test under randomisation

data: boston$LOGMEDV
weights: bost_kd1_w

Moran I statistic standard deviate = 24.5658, p-value < 2.2e-16


alternative hypothesis: greater
sample estimates:
Moran I statistic Expectation Variance
0.3273430100 -0.0019801980 0.0001797138
Moran Plot for the DV
> [Link](boston$LOGMEDV,
bost_kd1_w,
labels=[Link](boston$ID))
OLS Regression
bostlm<-lm(LOGMEDV~RM + LSTAT + CRIM + ZN + CHAS + DIS, data=boston)

Residuals:
Min 1Q Median 3Q Max
-0.71552 -0.11248 -0.02159 0.10678 0.93024

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.8718878 0.1316376 21.817 < 2e-16 ***
RM 0.1153095 0.0172813 6.672 6.70e-11 ***
LSTAT -0.0345160 0.0019665 -17.552 < 2e-16 ***
CRIM -0.0115726 0.0012476 -9.276 < 2e-16 ***
ZN 0.0019330 0.0005512 3.507 0.000494 ***
CHAS 0.1342672 0.0370521 3.624 0.000320 ***
DIS -0.0302262 0.0066230 -4.564 6.33e-06 ***
---
Residual standard error: 0.2081 on 499 degrees of freedom
Multiple R-squared: 0.7433, Adjusted R-squared: 0.7402
F-statistic: 240.8 on 6 and 499 DF, p-value: < 2.2e-16
Checking residuals for spatial autocorrelation
> boston$lmresid<-residuals(bostlm)
> [Link](bostlm, bost_kd1_w)

Global Moran's I for regression residuals

Moran I statistic standard deviate = 5.8542, p-value = 2.396e-09


alternative hypothesis: greater
sample estimates:
Observed Moran's I Expectation Variance
0.0700808323 -0.0054856590 0.0001666168
Determining the type of dependence
> [Link](bostlm, bost_kd1_w, test="all")

Lagrange multiplier diagnostics for spatial dependence

LMerr = 26.1243, df = 1, p-value = 3.201e-07


LMlag = 46.7233, df = 1, p-value = 8.175e-12
RLMerr = 5.0497, df = 1, p-value = 0.02463
RLMlag = 25.6486, df = 1, p-value = 4.096e-07
SARMA = 51.773, df = 2, p-value = 5.723e-12

• Robust tests used to find a proper alternative


• Only use robust forms when BOTH LMErr and LMLag are significant
One more diagnostic…
> [Link](“lmtest”)
> library(lmtest)

> bptest(bostlm)

studentized Breusch-Pagan test

data: bostlm
BP = 70.9173, df = 6, p-value = 2.651e-13

• Indicates errors are heteroskedastic


• Not surprising since we have spatial dependence
Running a spatial lag model
> bostlag<-lagsarlm(LOGMEDV~RM + LSTAT + CRIM + ZN + CHAS + DIS,
data=boston, bost_kd1_w)
Type: lag
Coefficients: (asymptotic standard errors)
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.94228260 0.19267675 10.0805 < 2.2e-16
RM 0.10158292 0.01655116 6.1375 8.382e-10
LSTAT -0.03227679 0.00192717 -16.7483 < 2.2e-16
CRIM -0.01033127 0.00120283 -8.5891 < 2.2e-16
ZN 0.00166558 0.00052968 3.1445 0.001664
CHAS 0.07238573 0.03608725 2.0059 0.044872
DIS -0.04285133 0.00655158 -6.5406 6.127e-11

Rho: 0.34416, LR test value:37.426, p-value:9.4936e-10


Asymptotic standard error: 0.051967
z-value: 6.6226, p-value: 3.5291e-11
Wald statistic: 43.859, p-value: 3.5291e-11

Log likelihood: 98.51632 for lag model


ML residual variance (sigma squared): 0.03944, (sigma: 0.1986)
AIC: -179.03, (AIC for lm: -143.61)
A few more diagnostics
LM test for residual autocorrelation
test value: 1.9852, p-value: 0.15884

> [Link](bostlag)

studentized Breusch-Pagan test


data:
BP = 60.0237, df = 6, p-value = 4.451e-11

• LM test suggests there is no more spatial autocorrelation in the


data
• BP test indicates remaining heteroskedasticity in the residuals
• Most likely due to misspecification
Running a spatial error model
> bosterr<-errorsarlm(LOGMEDV~RM + LSTAT + CRIM + ZN + CHAS + DIS,
data=boston, listw=bost_kd1_w)
Type: error
Coefficients: (asymptotic standard errors)
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.96330332 0.13381870 22.1442 < 2.2e-16
RM 0.09816980 0.01700824 5.7719 7.838e-09
LSTAT -0.03413153 0.00194289 -17.5674 < 2.2e-16
CRIM -0.01055839 0.00125282 -8.4277 < 2.2e-16
ZN 0.00200686 0.00062018 3.2359 0.001212
CHAS 0.06527760 0.03766168 1.7333 0.083049
DIS -0.02780598 0.01064794 -2.6114 0.009017

Lambda: 0.59085, LR test value: 24.766, p-value: 6.4731e-07


Asymptotic standard error: 0.086787
z-value: 6.8081, p-value: 9.8916e-12
Wald statistic: 46.35, p-value: 9.8918e-12

Log likelihood: 92.18617 for error model


ML residual variance (sigma squared): 0.03989, (sigma: 0.19972)
AIC: -166.37, (AIC for lm: -143.61)
Latihan
• Input data berikut:
kemiskinan<-[Link]("[Link]

• Input data bobot berikut:


bobot<-[Link]("[Link]
Memodelkan dengan regresi biasa (OLS)

[Link]<-kemiskinan[,-(1:4)]
[Link]<-lm(Y~.,data=[Link])
[Link]<-residuals([Link])
Uji asumsi
library(nortest)
library(car)

[Link]([Link])
hist([Link])
qqnorm([Link],datax=T)
qqline(rnorm(length([Link]),
mean([Link]),sd([Link])),datax=T, col="red")

durbinWatsonTest([Link])
[Link]([Link])
Uji Asumsi
Uji LM
bobot<-[Link]("[Link]
bobot<-[Link](bobot)
LM<-[Link]([Link], mat2listw(bobot, style="W"),
test=c("LMerr", "LMlag","RLMerr","RLMlag","SARMA"))
Lagrange multiplier diagnostics for spatial dependence

data:
model: lm(formula = Y ~ ., data = [Link])
weights: mat2listw(bobot, style = "W")

LMerr = 15.399, df = 1, p-value = 8.702e-05


LMlag = 20.303, df = 1, p-value = 6.61e-06
RLMerr = 0.10805, df = 1, p-value = 0.7424
RLMlag = 5.0118, df = 1, p-value = 0.02518
SARMA = 20.411, df = 2, p-value = 3.697e-05
Model SAR
sar1<-lagsarlm(Y~.,
data=[Link],
mat2listw(bobot))
err.sar1<-residuals(sar1)
summary(sar1)
Next Steps
 Diagnostics Checking for all models
 Compare the goodness of fits
 Select the best model

You might also like