We start our demo with an application of pspatreg to the analysis of cross-sectional data. In particular, we use Ames dataset (included in the package AmesHousing) which contains data on 2,930 properties in Ames, Iowa. The dataset includes information related to house characteristics (bedrooms, garages, fireplaces, pools, porches, etc.), location characteristics (neighborhood), lot information (zoning, shape, size, etc.), ratings of condition and quality and sale price (from 2006 to 2010). The section is organized as follows:
Description of dataset, spatial weights matrix and model specifications;
Estimation results of linear spatial models and comparison with the results obtained with the package spatialreg;
Estimation results of semiparametric spatial models.
The dataset is a spatial point dataset. It contains cross-sectional
information on 2,930 properties in Ames, Iowa. The raw dataset
(ames
) has been transformed in a spatial point dataset of
class sf
as follows:
library(pspatreg)
library(spdep)
library(sf)
library(ggplot2)
library(dplyr)
library(dbscan)
<- AmesHousing::make_ames()# Raw Ames Housing Data
ames <- st_as_sf(ames, coords = c("Longitude", "Latitude"))
ames_sf $Longitude <- ames$Longitude
ames_sf$Latitude <- ames$Latitude ames_sf
The dependent variable in the regression analysis is Sale_Price, while we selected the following variables as covariates:
- Lot_Area: Lot size in square feet
- Total_Bsmt_SF: Total square feet of basement area
- Garage_Cars: Size of garage in car capacity
- Gr_Liv_Area: Above grade (ground) living area square feet
- Fireplaces: Number of fireplaces
Due to the skewed distribution of the dependent variable Sale_Price, we use the log-transformation:
ggplot(data = ames_sf) + geom_histogram(mapping = aes(x = Sale_Price))
ggplot(data = ames_sf) + geom_histogram(mapping = aes(x = log(Sale_Price)))
$lnSale_Price <- log(ames_sf$Sale_Price)
ames_sf$lnLot_Area <- log(ames_sf$Lot_Area)
ames_sf$lnTotal_Bsmt_SF <- log(ames_sf$Total_Bsmt_SF+1)
ames_sf$lnGr_Liv_Area <- log(ames_sf$Gr_Liv_Area) ames_sf
Creating spatial weights is a necessary step when using areal data. To do so, it is necessary to choose a criterion to define the neighbours, and then to assign weights to each of them.
In particular, we have used a graph-based neighbors list (a Delauney
triangulation neighbor list) after eliminating duplicates in coordinates
values (thus the final sf
object used in the analysis is
ames_sf1
):
<- ames_sf[duplicated(ames_sf$Longitude)==FALSE,]
ames_sf1 <- cbind(ames_sf1$Longitude, ames_sf1$Latitude)
coord_sf1 # Delauney triangulation neighbours (symmetric)
<- row.names(as(ames_sf1, "sf"))
ID <- tri2nb(coord_sf1)
col.tri.nb <- graph2nb(soi.graph(col.tri.nb,coord_sf1), row.names=ID)
soi_nb is.symmetric.nb(soi_nb,verbose = TRUE, force = FALSE)
## [1] TRUE
<- nb2listw(soi_nb, style = "W", zero.policy = FALSE) listW
We define different formula for linear and nonlinear (semiparametric) models with and without a spatial trend. The Durbin formula is used for types “sdm”, “slx” or “sdem”.
In the case of semiparametric terms, in 3d or in 2d (as it is the case of spatial trend), the number of knots used to construct the B-splines basis needs to be specified.
# Linear Model
<- lnSale_Price ~ lnLot_Area+lnTotal_Bsmt_SF+
formlin +Garage_Cars+Fireplaces
lnGr_Liv_Area
<- ~ lnLot_Area+lnTotal_Bsmt_SF+
durbinformlin +Garage_Cars+Fireplaces
lnGr_Liv_Area
# Semiparametric model without spatial trend
<- lnSale_Price ~ Fireplaces + Garage_Cars +
formgam pspl(lnLot_Area, nknots = 20) +
pspl(lnTotal_Bsmt_SF, nknots = 20) +
pspl(lnGr_Liv_Area, nknots = 20)
# Semiparametric model with spatial trend in 2d
<- lnSale_Price ~ Fireplaces + Garage_Cars +
form2d pspl(lnLot_Area, nknots = 20) +
pspl(lnTotal_Bsmt_SF, nknots = 20) +
pspl(lnGr_Liv_Area, nknots = 20) +
pspt(Longitude,Latitude, nknots = c(10, 10), psanova = FALSE)
# Semiparametric model with PS-ANOVA spatial trend in 2d
<- lnSale_Price ~ Fireplaces + Garage_Cars +
form2d_psanova pspl(lnLot_Area, nknots = 20) +
pspl(lnTotal_Bsmt_SF, nknots = 20) +
pspl(lnGr_Liv_Area, nknots = 20) +
pspt(Longitude,Latitude, nknots = c(10, 10), psanova = TRUE)
<- ~ Fireplaces + Garage_Cars + pspl(lnLot_Area, nknots = 20) durbinformnonlin
We first estimate standard spatial linear autoregressive models using
the function pspatfit()
included in the package
pspatreg (based, in the default, on the REML
estimators) and compare them with the results obtained using the
functions provided by the package spatialreg (based on
the ML estimators).
pspatfit()
The SAR model for cross-sectional data can be specified as: \[y_{i}=\rho \sum_{j=1}^N w_{ij,N} y_{j} + \sum_{k=1}^K \beta_k x_{k,i} + \epsilon_{i}\]
\[\epsilon_{i} \sim
i.i.d.(0,\sigma^2_\epsilon)\] To estimate this model, we use the
option type="sar"
:
<- pspatfit(formlin, data = ames_sf1,listw = listW, method = "eigen", type = "sar")
linsar summary(linsar)
##
## Call
## pspatfit(formula = formlin, data = ames_sf1, listw = listW, type = "sar",
## method = "eigen")
##
## Parametric Terms
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.6976628 0.1034715 35.7361 < 2.2e-16
## lnLot_Area 0.0327109 0.0074716 4.3780 1.242e-05
## lnTotal_Bsmt_SF 0.0505658 0.0030507 16.5753 < 2.2e-16
## lnGr_Liv_Area 0.3993883 0.0136386 29.2837 < 2.2e-16
## Garage_Cars 0.1194316 0.0053948 22.1383 < 2.2e-16
## Fireplaces 0.0557000 0.0061150 9.1088 < 2.2e-16
## rho 0.3791002 0.0110024 34.4562 < 2.2e-16
##
## (Intercept) ***
## lnLot_Area ***
## lnTotal_Bsmt_SF ***
## lnGr_Liv_Area ***
## Garage_Cars ***
## Fireplaces ***
## rho ***
## ---
## Signif. codes:
## 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Goodness-of-Fit
##
## EDF Total: 7
## Sigma: 0.208602
## AIC: -6438.86
## BIC: -6397.35
All \(\beta's\) are significant and positive as expected. The estimated spatial autoregressive (0.38) is also positive and significant.
Extract coefficients:
coef(linsar)
## rho (Intercept) lnLot_Area
## 0.37910018 3.69766281 0.03271090
## lnTotal_Bsmt_SF lnGr_Liv_Area Garage_Cars
## 0.05056578 0.39938825 0.11943160
## Fireplaces
## 0.05570003
Extract fitted values and residuals:
<- fitted(linsar)
fits plot(fits, ames_sf1$lnSale_Price)
<- residuals(linsar)
resids plot(fits, resids)
Extract log-likelihood and restricted log-likelihood:
logLik(linsar)
## 'log Lik.' 3226.428 (df=7)
logLik(linsar, REML = TRUE)
## 'log Lik.' 3195.181 (df=7)
Extract the covariance matrix of estimated coefficients. Argument
bayesian
allows to get frequentist (default) or bayesian
covariances:
vcov(linsar)
## (Intercept) lnLot_Area
## (Intercept) 1.070634e-02 -3.455908e-04
## lnLot_Area -3.455908e-04 5.582523e-05
## lnTotal_Bsmt_SF -3.956736e-05 3.375567e-07
## lnGr_Liv_Area -1.073722e-03 -2.107449e-05
## Garage_Cars 2.103679e-04 -4.427323e-06
## Fireplaces 2.394866e-04 -6.365671e-06
## lnTotal_Bsmt_SF lnGr_Liv_Area
## (Intercept) -3.956736e-05 -1.073722e-03
## lnLot_Area 3.375567e-07 -2.107449e-05
## lnTotal_Bsmt_SF 9.306556e-06 -2.958465e-06
## lnGr_Liv_Area -2.958465e-06 1.860109e-04
## Garage_Cars -2.172059e-06 -2.823525e-05
## Fireplaces -1.475256e-06 -2.602909e-05
## Garage_Cars Fireplaces
## (Intercept) 2.103679e-04 2.394866e-04
## lnLot_Area -4.427323e-06 -6.365671e-06
## lnTotal_Bsmt_SF -2.172059e-06 -1.475256e-06
## lnGr_Liv_Area -2.823525e-05 -2.602909e-05
## Garage_Cars 2.910392e-05 -2.952703e-06
## Fireplaces -2.952703e-06 3.739314e-05
vcov(linsar, bayesian = TRUE)
## (Intercept) lnLot_Area
## (Intercept) 1.070634e-02 -3.455908e-04
## lnLot_Area -3.455908e-04 5.582523e-05
## lnTotal_Bsmt_SF -3.956736e-05 3.375567e-07
## lnGr_Liv_Area -1.073722e-03 -2.107449e-05
## Garage_Cars 2.103679e-04 -4.427323e-06
## Fireplaces 2.394866e-04 -6.365671e-06
## lnTotal_Bsmt_SF lnGr_Liv_Area
## (Intercept) -3.956736e-05 -1.073722e-03
## lnLot_Area 3.375567e-07 -2.107449e-05
## lnTotal_Bsmt_SF 9.306556e-06 -2.958465e-06
## lnGr_Liv_Area -2.958465e-06 1.860109e-04
## Garage_Cars -2.172059e-06 -2.823525e-05
## Fireplaces -1.475256e-06 -2.602909e-05
## Garage_Cars Fireplaces
## (Intercept) 2.103679e-04 2.394866e-04
## lnLot_Area -4.427323e-06 -6.365671e-06
## lnTotal_Bsmt_SF -2.172059e-06 -1.475256e-06
## lnGr_Liv_Area -2.823525e-05 -2.602909e-05
## Garage_Cars 2.910392e-05 -2.952703e-06
## Fireplaces -2.952703e-06 3.739314e-05
A print method to get printed coefficients, standard errors and p-values of parametric terms:
print(linsar)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.6977 0.1035 35.7361 0
## lnLot_Area 0.0327 0.0075 4.3780 0
## lnTotal_Bsmt_SF 0.0506 0.0031 16.5753 0
## lnGr_Liv_Area 0.3994 0.0136 29.2837 0
## Garage_Cars 0.1194 0.0054 22.1383 0
## Fireplaces 0.0557 0.0061 9.1088 0
## rho 0.3791 0.0110 34.4562 0
Computing average direct, indirect and total marginal impacts:
<- impactspar(linsar, list_varpar)
imp_parvar_sar summary(imp_parvar_sar)
##
## Total Parametric Impacts (sar)
## Estimate Std. Error t value
## lnLot_Area 0.0521190 0.0123449 4.2218921
## lnTotal_Bsmt_SF 0.0816164 0.0052111 15.6618939
## lnGr_Liv_Area 0.6432398 0.0252946 25.4299434
## Garage_Cars 0.1928710 0.0093188 20.6970820
## Fireplaces 0.0899593 0.0105895 8.4951155
## Pr(>|t|)
## lnLot_Area 0
## lnTotal_Bsmt_SF 0
## lnGr_Liv_Area 0
## Garage_Cars 0
## Fireplaces 0
##
## Direct Parametric Impacts (sar)
## Estimate Std. Error t value
## lnLot_Area 0.0351155 0.0082951 4.2333125
## lnTotal_Bsmt_SF 0.0549917 0.0034134 16.1103355
## lnGr_Liv_Area 0.4333902 0.0153661 28.2043851
## Garage_Cars 0.1299500 0.0058991 22.0288023
## Fireplaces 0.0606106 0.0070538 8.5926044
## Pr(>|t|)
## lnLot_Area 0
## lnTotal_Bsmt_SF 0
## lnGr_Liv_Area 0
## Garage_Cars 0
## Fireplaces 0
##
## Indirect Parametric Impacts (sar)
## Estimate Std. Error t value
## lnLot_Area 0.0170035 0.0040849 4.1625660
## lnTotal_Bsmt_SF 0.0266247 0.0019695 13.5181776
## lnGr_Liv_Area 0.2098497 0.0116605 17.9966276
## Garage_Cars 0.0629210 0.0038987 16.1388336
## Fireplaces 0.0293486 0.0036501 8.0404851
## Pr(>|t|)
## lnLot_Area 0
## lnTotal_Bsmt_SF 0
## lnGr_Liv_Area 0
## Garage_Cars 0
## Fireplaces 0
As expected, all marginal impacts are strongly significant and
spillover impacts are rather high. We compare these results with those
obtained using ML estimates with lagsarlm()
(package
spatialreg):
<- spatialreg::lagsarlm(formlin, data = ames_sf1, listw = listW, method = "eigen")
spatregsar summary(spatregsar)
##
## Call:
## spatialreg::lagsarlm(formula = formlin, data = ames_sf1, listw = listW,
## method = "eigen")
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.198850 -0.083134 0.013940 0.103622 0.833629
##
## Type: lag
## Coefficients: (asymptotic standard errors)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 3.6848760 0.1375643 26.7866 < 2.2e-16
## lnLot_Area 0.0326465 0.0074952 4.3557 1.327e-05
## lnTotal_Bsmt_SF 0.0505013 0.0030733 16.4322 < 2.2e-16
## lnGr_Liv_Area 0.3989056 0.0143884 27.7241 < 2.2e-16
## Garage_Cars 0.1191327 0.0056556 21.0645 < 2.2e-16
## Fireplaces 0.0556015 0.0061491 9.0422 < 2.2e-16
##
## Rho: 0.37934, LR test value: 881.09, p-value: < 2.22e-16
## Asymptotic standard error: 0.011434
## z-value: 33.178, p-value: < 2.22e-16
## Wald statistic: 1100.8, p-value: < 2.22e-16
##
## Log likelihood: 674.5428 for lag model
## ML residual variance (sigma squared): 0.033306, (sigma: 0.1825)
## Number of observations: 2777
## Number of parameters estimated: 8
## AIC: -1333.1, (AIC for lm: -454)
## LM test for residual autocorrelation
## test value: 24.394, p-value: 7.8509e-07
<- as(listW, "CsparseMatrix")
W <- spatialreg::trW(W, type="mult")
trMatc set.seed(1)
::impacts(spatregsar,listw=listW) spatialreg
## Impact measures (lag, exact):
## Direct Indirect Total
## lnLot_Area 0.03543224 0.01716761 0.05259985
## lnTotal_Bsmt_SF 0.05481059 0.02655679 0.08136738
## lnGr_Liv_Area 0.43294388 0.20976970 0.64271358
## Garage_Cars 0.12929822 0.06264749 0.19194571
## Fireplaces 0.06034592 0.02923877 0.08958469
<- spatialreg::impacts(spatregsar, tr=trMatc, R=200)
SAR.impact <- as.character(names(summary(linsar)$bfixed[-1]))
list_varpar <- impactspar(linsar, list_varpar)
imp_parvar summary(imp_parvar)
##
## Total Parametric Impacts (sar)
## Estimate Std. Error t value
## lnLot_Area 0.0527053 0.0126207 4.1761050
## lnTotal_Bsmt_SF 0.0815465 0.0050342 16.1985605
## lnGr_Liv_Area 0.6436259 0.0252074 25.5331721
## Garage_Cars 0.1924182 0.0094206 20.4252207
## Fireplaces 0.0893026 0.0099155 9.0063271
## Pr(>|t|)
## lnLot_Area 0
## lnTotal_Bsmt_SF 0
## lnGr_Liv_Area 0
## Garage_Cars 0
## Fireplaces 0
##
## Direct Parametric Impacts (sar)
## Estimate Std. Error t value
## lnLot_Area 0.0354985 0.0084924 4.1800142
## lnTotal_Bsmt_SF 0.0549195 0.0032628 16.8320907
## lnGr_Liv_Area 0.4334517 0.0149992 28.8982965
## Garage_Cars 0.1295867 0.0059223 21.8810737
## Fireplaces 0.0601447 0.0066199 9.0854955
## Pr(>|t|)
## lnLot_Area 0
## lnTotal_Bsmt_SF 0
## lnGr_Liv_Area 0
## Garage_Cars 0
## Fireplaces 0
##
## Indirect Parametric Impacts (sar)
## Estimate Std. Error t value
## lnLot_Area 0.0172068 0.0041656 4.1306973
## lnTotal_Bsmt_SF 0.0266271 0.0019574 13.6031501
## lnGr_Liv_Area 0.2101742 0.0119501 17.5876251
## Garage_Cars 0.0628315 0.0039944 15.7298777
## Fireplaces 0.0291579 0.0034229 8.5184981
## Pr(>|t|)
## lnLot_Area 0
## lnTotal_Bsmt_SF 0
## lnGr_Liv_Area 0
## Garage_Cars 0
## Fireplaces 0
# Let's compare direct impacts
round(data.frame(spatialreg_direct = summary(SAR.impact, zstats = TRUE, short = TRUE)$res$direct,
sptpreg_direct = summary(imp_parvar_sar)$dir_table[,1]), 3)
## spatialreg_direct sptpreg_direct
## lnLot_Area 0.035 0.035
## lnTotal_Bsmt_SF 0.055 0.055
## lnGr_Liv_Area 0.433 0.433
## Garage_Cars 0.129 0.130
## Fireplaces 0.060 0.061
# Let's compare indirect impacts
round(data.frame(spatialreg_indirect = summary(SAR.impact, zstats = TRUE, short = TRUE)$res$indirect,
sptpreg_indirect = summary(imp_parvar_sar)$ind_table[,1]),3)
## spatialreg_indirect sptpreg_indirect
## lnLot_Area 0.017 0.017
## lnTotal_Bsmt_SF 0.027 0.027
## lnGr_Liv_Area 0.210 0.210
## Garage_Cars 0.063 0.063
## Fireplaces 0.029 0.029
# Let's compare total impacts
round(data.frame(spatialreg_total = summary(SAR.impact, zstats = TRUE, short = TRUE)$res$total,
sptpreg_total = summary(imp_parvar_sar)$tot_table[,1]), 3)
## spatialreg_total sptpreg_total
## lnLot_Area 0.053 0.052
## lnTotal_Bsmt_SF 0.081 0.082
## lnGr_Liv_Area 0.643 0.643
## Garage_Cars 0.192 0.193
## Fireplaces 0.090 0.090
pspatfit()
We now estimate the SLX model that only captures local spatial spillovers through the spatial lags of the explanatory variables:
\[y_{i}= \sum_{k=1}^K \beta_k x_{k,i}
+\sum_{k=1}^K \delta_k \sum_{j=1}^N w_{ij,N}x_{k,j}+
\epsilon_{i}\] \[\epsilon_{i} \sim
i.i.d.(0,\sigma^2_\epsilon)\] This model is estimated with the
function pspatfit()
using the option
type = "slx"
and specifying the set of spatial lags through
Durbin = durbinformlin
:
<- pspatfit(formlin, data = ames_sf1, listw = listW, method = "eigen",
linslx type = "slx", Durbin = durbinformlin)
summary(linslx)
##
## Call
## pspatfit(formula = formlin, data = ames_sf1, listw = listW, type = "slx",
## method = "eigen", Durbin = durbinformlin)
##
## Parametric Terms
## Estimate Std. Error t value
## (Intercept) 7.0751382 0.1528354 46.2925
## lnLot_Area 0.0644581 0.0130493 4.9396
## lnTotal_Bsmt_SF 0.0544297 0.0035152 15.4843
## lnGr_Liv_Area 0.4297845 0.0171579 25.0489
## Garage_Cars 0.1234317 0.0068662 17.9767
## Fireplaces 0.0631958 0.0071814 8.7999
## Wlag.lnLot_Area -0.0333047 0.0144459 -2.3055
## Wlag.lnTotal_Bsmt_SF 0.0339716 0.0046119 7.3661
## Wlag.lnGr_Liv_Area 0.0588254 0.0210296 2.7973
## Wlag.Garage_Cars 0.1404327 0.0082123 17.1002
## Wlag.Fireplaces 0.0305388 0.0090539 3.3730
## Pr(>|t|)
## (Intercept) < 2.2e-16 ***
## lnLot_Area 8.296e-07 ***
## lnTotal_Bsmt_SF < 2.2e-16 ***
## lnGr_Liv_Area < 2.2e-16 ***
## Garage_Cars < 2.2e-16 ***
## Fireplaces < 2.2e-16 ***
## Wlag.lnLot_Area 0.0212136 *
## Wlag.lnTotal_Bsmt_SF 2.307e-13 ***
## Wlag.lnGr_Liv_Area 0.0051895 **
## Wlag.Garage_Cars < 2.2e-16 ***
## Wlag.Fireplaces 0.0007537 ***
## ---
## Signif. codes:
## 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Goodness-of-Fit
##
## EDF Total: 11
## Sigma: 0.203923
## AIC: -6042.92
## BIC: -5977.7
Now we compute impacts for the SLX model. In this case, contrary to the case of the SAR and SDM models, we do not need simulations to make inference on these marginal impacts. We only need to properly compute the variance of total impact using this formula:
\[ Var(\hat{\beta_k})\_{tot} = Var(\hat{\beta_k}) + Var(\hat{\delta_k}) + 2* Cov(\hat{\beta_k}, \hat{\delta_k}) \]
<- impactspar(linslx, listw = listW)
imp_parvar_slx summary(imp_parvar_slx)
##
## Parametric Impacts (slx)
## Direct Indirect Total
## lnLot_Area 0.06446 -0.03330 0.03115
## lnTotal_Bsmt_SF 0.05443 0.03397 0.08840
## lnGr_Liv_Area 0.42978 0.05883 0.48861
## Garage_Cars 0.12343 0.14043 0.26386
## Fireplaces 0.06320 0.03054 0.09373
##
## Standard errors:
## Direct Indirect Total
## lnLot_Area 0.013049 0.014446 0.009491
## lnTotal_Bsmt_SF 0.003515 0.004612 0.005097
## lnGr_Liv_Area 0.017158 0.021030 0.020880
## Garage_Cars 0.006866 0.008212 0.008077
## Fireplaces 0.007181 0.009054 0.009734
##
## Z-values:
## Direct Indirect Total
## lnLot_Area 4.94 -2.305 3.282
## lnTotal_Bsmt_SF 15.48 7.366 17.343
## lnGr_Liv_Area 25.05 2.797 23.400
## Garage_Cars 17.98 17.100 32.670
## Fireplaces 8.80 3.373 9.630
##
## p-values:
## Direct Indirect Total
## lnLot_Area 3.915e-07 1.057e-02 5.149e-04
## lnTotal_Bsmt_SF 2.215e-54 8.785e-14 1.115e-67
## lnGr_Liv_Area 8.984e-139 2.577e-03 2.114e-121
## Garage_Cars 1.484e-72 7.393e-66 2.053e-234
## Fireplaces 6.847e-19 3.718e-04 2.990e-22
We compare the non-nested models linsar
and
linslx
using the function anova()
with the
argument lrtest = FALSE
:
anova(linsar, linslx, lrtest = FALSE)
## logLik rlogLik edf AIC BIC
## linsar 3226.4 3195.2 7 -6438.9 -6334.9
## linslx 3032.5 2978.7 11 -6042.9 -5870.3
It emerges that, from a statistical point of view, the SAR model outperforms the SLX model, suggesting a global spatial diffusion of idiosyncratic shocks.
Now, we compare the results obtained with pspatfit()
with those obtained using ML estimates with lmSLX()
(package spatialreg):
<- spatialreg::lmSLX(formlin, data = ames_sf1,
spatregslx listw = listW)
<- spatialreg::impacts(spatregslx)
SLX.impact # Let's compare direct impacts
round(data.frame(spatialreg_direct = summary(SLX.impact)$impacts$direct,
sptpreg_direct = summary(imp_parvar_slx)$mimpacts[,1]), 3)
## spatialreg_direct sptpreg_direct
## lnLot_Area 0.064 0.064
## lnTotal_Bsmt_SF 0.054 0.054
## lnGr_Liv_Area 0.430 0.430
## Garage_Cars 0.123 0.123
## Fireplaces 0.063 0.063
# Let's compare indirect impacts
round(data.frame(spatialreg_indirect = summary(SLX.impact)$impacts$indirect,
sptpreg_indirect = summary(imp_parvar_slx)$mimpacts[,2]), 3)
## spatialreg_indirect sptpreg_indirect
## lnLot_Area -0.033 -0.033
## lnTotal_Bsmt_SF 0.034 0.034
## lnGr_Liv_Area 0.059 0.059
## Garage_Cars 0.140 0.140
## Fireplaces 0.031 0.031
# Let's compare total impacts
round(data.frame(spatialreg_total = summary(SLX.impact)$impacts$total,
sptpreg_total = summary(imp_parvar_slx)$mimpacts[,3]), 3)
## spatialreg_total sptpreg_total
## lnLot_Area 0.031 0.031
## lnTotal_Bsmt_SF 0.088 0.088
## lnGr_Liv_Area 0.489 0.489
## Garage_Cars 0.264 0.264
## Fireplaces 0.094 0.094
pspatfit()
:The SDM specification encompasses both SAR and SLX: \[y_{i}=\rho \sum_{j=1}^N w_{ij,N} y_{j} + \sum_{k=1}^K \beta_k x_{k,i} +\sum_{k=1}^K \delta_k \sum_{j=1}^N w_{ij,N}x_{k,j}+ \epsilon_{i}\]
\[\epsilon_{i} \sim i.i.d.(0,\sigma^2_\epsilon)\]
We can estimate this model using the option
type = "sdm"
:
<- pspatfit(formlin, data = ames_sf1, listw = listW, method = "eigen", type = "sdm")
linsdm summary(linsdm)
##
## Call
## pspatfit(formula = formlin, data = ames_sf1, listw = listW, type = "sdm",
## method = "eigen")
##
## Parametric Terms
## Estimate Std. Error t value
## (Intercept) 4.4726568 0.1342919 33.3055
## lnLot_Area 0.0736741 0.0114661 6.4254
## lnTotal_Bsmt_SF 0.0480986 0.0030887 15.5726
## lnGr_Liv_Area 0.4197858 0.0150761 27.8445
## Garage_Cars 0.0977829 0.0060331 16.2076
## Fireplaces 0.0573061 0.0063101 9.0816
## Wlag.lnLot_Area -0.0554372 0.0126932 -4.3675
## Wlag.lnTotal_Bsmt_SF 0.0078896 0.0040523 1.9469
## Wlag.lnGr_Liv_Area -0.1107810 0.0184781 -5.9953
## Wlag.Garage_Cars 0.0689136 0.0072159 9.5502
## Wlag.Fireplaces 0.0018016 0.0079554 0.2265
## rho 0.3706482 0.0143361 25.8542
## Pr(>|t|)
## (Intercept) < 2.2e-16 ***
## lnLot_Area 1.543e-10 ***
## lnTotal_Bsmt_SF < 2.2e-16 ***
## lnGr_Liv_Area < 2.2e-16 ***
## Garage_Cars < 2.2e-16 ***
## Fireplaces < 2.2e-16 ***
## Wlag.lnLot_Area 1.303e-05 ***
## Wlag.lnTotal_Bsmt_SF 0.05165 .
## Wlag.lnGr_Liv_Area 2.296e-09 ***
## Wlag.Garage_Cars < 2.2e-16 ***
## Wlag.Fireplaces 0.82085
## rho < 2.2e-16 ***
## ---
## Signif. codes:
## 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Goodness-of-Fit
##
## EDF Total: 12
## Sigma: 0.204557
## AIC: -6555.69
## BIC: -6484.54
LR test for nested models and ANOVA tables:
anova(linsar, linsdm, lrtest = TRUE)
## logLik rlogLik edf AIC BIC LRtest
## linsar 3226.4 3195.2 7 -6438.9 -6334.9
## linsdm 3289.8 3234.7 12 -6555.7 -6374.3 78.991
## p.val
## linsar
## linsdm 1.3644e-15
anova(linslx, linsdm, lrtest = TRUE)
## logLik rlogLik edf AIC BIC LRtest
## linslx 3032.5 2978.7 11 -6042.9 -5870.3
## linsdm 3289.8 3234.7 12 -6555.7 -6374.3 511.92
## p.val
## linslx
## linsdm 2.4194e-113
The LR test suggests that the parametric SDM model outperforms both SAR and SLX.
Computing average direct and indirect marginal impacts:
<- impactspar(linsdm, list_varpar)
imp_parvar_sdm summary(imp_parvar_sdm)
##
## Total Parametric Impacts (sdm)
## Estimate Std. Error t value
## lnLot_Area 0.0280348 0.0130094 2.1549672
## lnTotal_Bsmt_SF 0.0890978 0.0072098 12.3578037
## lnGr_Liv_Area 0.4909620 0.0320430 15.3219618
## Garage_Cars 0.2656510 0.0124283 21.3746147
## Fireplaces 0.0936987 0.0139967 6.6943653
## Pr(>|t|)
## lnLot_Area 0.0312
## lnTotal_Bsmt_SF 0.0000
## lnGr_Liv_Area 0.0000
## Garage_Cars 0.0000
## Fireplaces 0.0000
##
## Direct Parametric Impacts (sdm)
## Estimate Std. Error t value
## lnLot_Area 0.0674193 0.0104178 6.4715501
## lnTotal_Bsmt_SF 0.0537648 0.0032167 16.7142793
## lnGr_Liv_Area 0.4293906 0.0156969 27.3551022
## Garage_Cars 0.1209794 0.0059681 20.2709144
## Fireplaces 0.0621563 0.0062509 9.9436250
## Pr(>|t|)
## lnLot_Area 0
## lnTotal_Bsmt_SF 0
## lnGr_Liv_Area 0
## Garage_Cars 0
## Fireplaces 0
##
## Indirect Parametric Impacts (sdm)
## Estimate Std. Error t value
## lnLot_Area -0.0393844 0.0134482 -2.9286087
## lnTotal_Bsmt_SF 0.0353330 0.0054891 6.4369121
## lnGr_Liv_Area 0.0615714 0.0256972 2.3960389
## Garage_Cars 0.1446716 0.0100585 14.3830237
## Fireplaces 0.0315424 0.0109690 2.8755961
## Pr(>|t|)
## lnLot_Area 0.0034
## lnTotal_Bsmt_SF 0.0000
## lnGr_Liv_Area 0.0166
## Garage_Cars 0.0000
## Fireplaces 0.0040
Comparing the results with those obtained using ML estimates with
lagsarlm()
function of package
spatialreg:
<- spatialreg::lagsarlm(formlin, data = ames_sf1, listw = listW,
spatregsdm method = "eigen", Durbin = TRUE)
<- as(listW, "CsparseMatrix")
W <- spatialreg::trW(W, type = "mult")
trMatc set.seed(1)
<- spatialreg::impacts(spatregsdm, tr = trMatc, R = 200)
SDM.impact # Let's compare direct impacts
round(data.frame(spatialreg_direct = summary(SDM.impact, zstats = TRUE, short = TRUE)$res$direct,
sptpreg_direct = summary(imp_parvar_sdm)$dir_table[,1]),3)
## spatialreg_direct sptpreg_direct
## lnLot_Area 0.068 0.067
## lnTotal_Bsmt_SF 0.054 0.054
## lnGr_Liv_Area 0.429 0.429
## Garage_Cars 0.120 0.121
## Fireplaces 0.062 0.062
# Let's compare indirect impacts
round(data.frame(spatialreg_indirect = summary(SDM.impact, zstats = TRUE, short = TRUE)$res$indirect,
sptpreg_indirect = summary(imp_parvar_sdm)$ind_table[,1]), 3)
## spatialreg_indirect sptpreg_indirect
## lnLot_Area -0.039 -0.039
## lnTotal_Bsmt_SF 0.035 0.035
## lnGr_Liv_Area 0.060 0.062
## Garage_Cars 0.144 0.145
## Fireplaces 0.031 0.032
# Let's compare indirect impacts
round(data.frame(spatialreg_indirect = summary(SDM.impact, zstats=TRUE, short = TRUE)$res$total,
sptpreg_indirect = summary(imp_parvar_sdm)$tot_table[,1]), 3)
## spatialreg_indirect sptpreg_indirect
## lnLot_Area 0.029 0.028
## lnTotal_Bsmt_SF 0.089 0.089
## lnGr_Liv_Area 0.490 0.491
## Garage_Cars 0.264 0.266
## Fireplaces 0.094 0.094
pspatfit()
The last parametric specification considered here is the SEM. This model that spatial spillovers occurs only for the unobserved random shocks:
\[y_{i}=\sum_{k=1}^K \beta_k x_{k,i} +
\epsilon_{i}\] \[\epsilon_{i}=\theta
\sum_{j=1}^N w_{ij,N}\epsilon_{j}+u_{i}\] \[u_{i} \sim i.i.d.(0,\sigma^2_u)\] We
estimate this model using the option type = "sem"
:
<- pspatfit(formlin, data = ames_sf1, listw = listW, method = "eigen", type = "sem")
linsem summary(linsem)
##
## Call
## pspatfit(formula = formlin, data = ames_sf1, listw = listW, type = "sem",
## method = "eigen")
##
## Parametric Terms
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.2271291 0.1293879 55.8563 < 2.2e-16
## lnLot_Area 0.0747966 0.0106472 7.0250 2.684e-12
## lnTotal_Bsmt_SF 0.0505039 0.0032908 15.3469 < 2.2e-16
## lnGr_Liv_Area 0.4853052 0.0158438 30.6306 < 2.2e-16
## Garage_Cars 0.1206543 0.0063310 19.0578 < 2.2e-16
## Fireplaces 0.0645938 0.0067493 9.5704 < 2.2e-16
## delta 0.4459877 0.0158826 28.0803 < 2.2e-16
##
## (Intercept) ***
## lnLot_Area ***
## lnTotal_Bsmt_SF ***
## lnGr_Liv_Area ***
## Garage_Cars ***
## Fireplaces ***
## delta ***
## ---
## Signif. codes:
## 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Goodness-of-Fit
##
## EDF Total: 7
## Sigma: 0.234186
## AIC: -6060.65
## BIC: -6019.15
anova(linsem,linsdm, lrtest = TRUE)
## logLik rlogLik edf AIC BIC LRtest
## linsem 3037.3 3007.7 7 -6060.7 -5959.9
## linsdm 3289.8 3234.7 12 -6555.7 -6374.3 453.99
## p.val
## linsem
## linsdm 6.7678e-96
The spatial spillover parameter \(\delta\) is rather high (0.45) and statistically significant. As well known, the SEM is also nested in the SDM, so we can use a LR test to compare the two models. The results suggest again that the SDM is the best parametric specification.
Comparing the results with those obtained using ML estimates with
errorsarlm()
function of package
spatialreg:
<- spatialreg::errorsarlm(formlin, data = ames_sf1, listw = listW, method = "eigen")
spatregsem summary(spatregsem)
##
## Call:
## spatialreg::errorsarlm(formula = formlin, data = ames_sf1, listw = listW,
## method = "eigen")
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.318632 -0.089159 0.012209 0.108057 0.968884
##
## Type: error
## Coefficients: (asymptotic standard errors)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 7.2360703 0.1290059 56.0910 < 2.2e-16
## lnLot_Area 0.0752097 0.0106174 7.0836 1.404e-12
## lnTotal_Bsmt_SF 0.0502507 0.0032804 15.3185 < 2.2e-16
## lnGr_Liv_Area 0.4841243 0.0157950 30.6505 < 2.2e-16
## Garage_Cars 0.1193857 0.0063115 18.9155 < 2.2e-16
## Fireplaces 0.0642931 0.0067281 9.5560 < 2.2e-16
##
## Lambda: 0.44654, LR test value: 502.92, p-value: < 2.22e-16
## Asymptotic standard error: 0.014098
## z-value: 31.674, p-value: < 2.22e-16
## Wald statistic: 1003.2, p-value: < 2.22e-16
##
## Log likelihood: 485.4588 for error model
## ML residual variance (sigma squared): 0.036888, (sigma: 0.19206)
## Number of observations: 2777
## Number of parameters estimated: 8
## AIC: -954.92, (AIC for lm: -454)
::Hausman.test(spatregsem)# Test OLS vs. SEM spatialreg
##
## Spatial Hausman test (asymptotic)
##
## data: NULL
## Hausman test = 408.6, df = 6, p-value < 2.2e-16
::LR.Sarlm(spatregsdm,spatregsem)## Common factor test spatialreg
##
## Likelihood ratio for spatial linear models
##
## data:
## Likelihood ratio = 505.03, df = 5, p-value <
## 2.2e-16
## sample estimates:
## Log likelihood of spatregsdm
## 737.9734
## Log likelihood of spatregsem
## 485.4588
round(data.frame(sptpsar = summary(linsem)$bfixed, spatregsem = summary(spatregsem)$coefficients), 3)
## sptpsar spatregsem
## (Intercept) 7.227 7.236
## lnLot_Area 0.075 0.075
## lnTotal_Bsmt_SF 0.051 0.050
## lnGr_Liv_Area 0.485 0.484
## Garage_Cars 0.121 0.119
## Fireplaces 0.065 0.064
We now provide examples of the estimation of semiparametric models. Let’s start with a simple semiparametric model without spatial trends and without spatially lagged terms: \[y_{i}=\sum_{k=1}^K \beta^*_k x^*_{k,it} +\sum_{\delta=1}^\Delta g_\delta(x_{\delta, it}) + \epsilon_{i}\]
\[\epsilon_{i}\sim
i.i.d.(0,\sigma^2_\epsilon)\] In particular, we introduce the
discrete variables Fireplaces and Garage_Cars as
linear terms and the continuous variables lnLot_Area,
lnTotal_Bsmt_SF, and lnGr_Liv_Area as smooth terms,
using the function pspl()
with 20 knots:
<- lnSale_Price ~ Fireplaces + Garage_Cars +
formgam pspl(lnLot_Area, nknots = 20) +
pspl(lnTotal_Bsmt_SF, nknots = 20) +
pspl(lnGr_Liv_Area, nknots = 20)
<- pspatfit(formgam, data = ames_sf1)
gam summary(gam)
##
## Call
## pspatfit(formula = formgam, data = ames_sf1)
##
## Parametric Terms
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.7419169 0.1940518 55.356 < 2.2e-16 ***
## Fireplaces 0.0706684 0.0069634 10.149 < 2.2e-16 ***
## Garage_Cars 0.1578378 0.0063725 24.769 < 2.2e-16 ***
## ---
## Signif. codes:
## 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Non-Parametric Terms
## EDF
## pspl(lnLot_Area, nknots = 20) 11.6416
## pspl(lnTotal_Bsmt_SF, nknots = 20) 6.1814
## pspl(lnGr_Liv_Area, nknots = 20) 13.8279
##
## Goodness-of-Fit
##
## EDF Total: 34.6509
## Sigma: 0.202927
## AIC: -5876.5
## BIC: -5671.05
The EDF numbers clearly suggest that the three continuout variables enter the model nonlinearly.
Now, we introduce the spatial lag of the dependent variable, thus specifying a semiparametric SAR model: \[y_{i}=\rho \sum_{j=1}^N w_{ij,N} y_{j} +\sum_{k=1}^K \beta^*_k x^*_{k,i} + \sum_{\delta=1}^\Delta g_\delta(x_{\delta, i}) +\epsilon_{i}\] \[\epsilon_{i}\sim i.i.d.(0,\sigma^2_\epsilon)\]
<- pspatfit(formgam, data = ames_sf1, listw = listW, method = "eigen", type = "sar")
gamsar summary(gamsar)
##
## Call
## pspatfit(formula = formgam, data = ames_sf1, listw = listW, type = "sar",
## method = "eigen")
##
## Parametric Terms
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.9278045 0.2321497 29.8420 < 2.2e-16 ***
## Fireplaces 0.0505388 0.0058912 8.5787 < 2.2e-16 ***
## Garage_Cars 0.1033614 0.0053970 19.1515 < 2.2e-16 ***
## rho 0.3432733 0.0110722 31.0031 < 2.2e-16 ***
## ---
## Signif. codes:
## 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Non-Parametric Terms
## EDF
## pspl(lnLot_Area, nknots = 20) 14.4067
## pspl(lnTotal_Bsmt_SF, nknots = 20) 7.1898
## pspl(lnGr_Liv_Area, nknots = 20) 16.1097
##
## Goodness-of-Fit
##
## EDF Total: 41.7062
## Sigma: 0.189394
## AIC: -6585.59
## BIC: -6338.31
anova(linsar, gamsar, lrtest = TRUE)
## logLik rlogLik edf AIC BIC LRtest
## linsar 3226.4 3195.2 7.000 -6438.9 -6334.9
## gamsar 3334.5 3321.7 41.706 -6585.6 -6313.3 252.96
## p.val
## linsar
## gamsar 5.6473e-35
The spatial spillover parameter is now 0.34, a bit lower than the one estimated with the linear SAR (0.38) and SDM (0.37), confirming the trade off between nonlinearities and spatial dependence (Basile et al. 2014). The log-likelihood of the semiparametric SAR is higher than that of the linear SAR, and the LR test also suggests that this difference is statistically significant (notice that the linear SAR model is nested in the semiparametric SAR). Moreover, the AIC value of the semiparametric model is lower than that of the linear SAR, confirming that the goodness of fit of the semiparametric model is higher that that of the linear model. However, the BIC value works in favor of the linear specification. This is because the BIC penalizes more strongly more complex models than the AIC.
Let’s now introduce also a spatial trend 2d (without the ANOVA decomposition) in order to control for unobserved spatial heterogeneity:
\[y_{i}=\rho \sum_{j=1}^N w_{ij,N} y_{j}+ \sum_{k=1}^K \beta^*_k x^*_{k,i} + \sum_{\delta=1}^\Delta g_\delta(x_{\delta, i}) + \widetilde{ f}(s_{1i},s_{2i})+\epsilon_{i}\]
\[\epsilon_{i}\sim i.i.d.(0,\sigma^2_\epsilon)\] To speed up the computational time, we compute the spatial Jacobian using the Chebyshev transformation.
<- pspatfit(form2d, data = ames_sf1, listw = listW, method = "Chebyshev", type = "sar")
sp2dsar summary(sp2dsar)
##
## Call
## pspatfit(formula = form2d, data = ames_sf1, listw = listW, type = "sar",
## method = "Chebyshev")
##
## Parametric Terms
## Estimate Std. Error t value Pr(>|t|)
## Fireplaces 0.0566376 0.0057981 9.7684 < 2.2e-16 ***
## Garage_Cars 0.0654183 0.0056882 11.5007 < 2.2e-16 ***
## rho 0.1807030 0.0132934 13.5934 < 2.2e-16 ***
## ---
## Signif. codes:
## 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Non-Parametric Terms
## EDF
## pspl(lnLot_Area, nknots = 20) 14.849
## pspl(lnTotal_Bsmt_SF, nknots = 20) 7.296
## pspl(lnGr_Liv_Area, nknots = 20) 16.207
##
## Non-Parametric Spatio-Temporal Trend
## EDF
## f(sp1, sp2) 48.402
##
## Goodness-of-Fit
##
## EDF Total: 93.7541
## Sigma: 0.174099
## AIC: -6873.89
## BIC: -6318.01
anova(gamsar, sp2dsar, lrtest=TRUE)
## logLik rlogLik edf AIC BIC LRtest
## gamsar 3334.5 3321.7 41.706 -6585.6 -6313.3
## sp2dsar 3530.7 3524.3 93.754 -6873.9 -6308.4 405.22
## p.val
## gamsar
## sp2dsar 3.6448e-56
The estimated spatial spillover parameter \(\rho\) (0.19) is much lower than the one estimated above, suggesting that the SAR model without spatial trend (both linear and nonlinear) actually captures spatial autocorrelated unobserved heterogeneity.
The marginal (direct, indirect and total) impacts for parametric
terms are computed as usual with the function
impactspar()
:
<- c("Fireplaces", "Garage_Cars")
list_varpar <- impactspar(sp2dsar, list_varpar)
imp_parvar summary(imp_parvar)
##
## Total Parametric Impacts (sar)
## Estimate Std. Error t value Pr(>|t|)
## Fireplaces 0.0692105 0.0074557 9.2829108 0
## Garage_Cars 0.0801063 0.0071279 11.2383344 0
##
## Direct Parametric Impacts (sar)
## Estimate Std. Error t value Pr(>|t|)
## Fireplaces 0.0576530 0.0061305 9.4042300 0
## Garage_Cars 0.0667326 0.0058558 11.3960476 0
##
## Indirect Parametric Impacts (sar)
## Estimate Std. Error t value Pr(>|t|)
## Fireplaces 0.0115575 0.0015763 7.3322498 0
## Garage_Cars 0.0133737 0.0016118 8.2972527 0
As for the three non-parametric terms, we can plot the estimated smooth impact functions using the algorithms described in the vignette A:
<- c("lnLot_Area", "lnTotal_Bsmt_SF",
list_varnopar "lnGr_Liv_Area")
<- impactsnopar(sp2dsar, listw = listW, viewplot = TRUE,smooth = FALSE)
sp2dsar_impnopar plot_impactsnopar(sp2dsar_impnopar, data = ames_sf1, smooth = TRUE)
Now, an example with the ANOVA decomposition of the spatial trend (PS-ANOVA):
\[y_{i}=\rho \sum_{j=1}^N w_{ij,N} y_{j}+ \sum_{k=1}^K \beta^*_k x^*_{k,i} + \sum_{\delta=1}^\Delta g_\delta(x_{\delta, i}) + f_1(s_{1i})+f_2(s_{2i})+f_{1,2}(s_{1i},s_{2i})+\epsilon_{i}\]
\[\epsilon_{i}\sim
i.i.d.(0,\sigma^2_\epsilon)\] This model is estimated using the
option psanova = TRUE
within the function
pspt()
for the spatial trend:
# Semiparametric model with PS-ANOVA spatial trend in 2d
<- lnSale_Price ~ Fireplaces + Garage_Cars +
form2d_psanova pspl(lnLot_Area, nknots = 20) +
pspl(lnTotal_Bsmt_SF, nknots = 20) +
pspl(lnGr_Liv_Area, nknots = 20) +
pspt(Longitude,Latitude, nknots = c(10, 10), psanova = TRUE)
<- pspatfit(form2d_psanova, data = ames_sf1, listw = listW,
sp2danovasar method = "Chebyshev", type = "sar")
summary(sp2danovasar)
##
## Call
## pspatfit(formula = form2d_psanova, data = ames_sf1, listw = listW,
## type = "sar", method = "Chebyshev")
##
## Parametric Terms
## Estimate Std. Error t value Pr(>|t|)
## Intercept 8.9920451 0.2321429 38.7350 < 2.2e-16 ***
## Fireplaces 0.0570771 0.0058221 9.8035 < 2.2e-16 ***
## Garage_Cars 0.0649741 0.0057239 11.3514 < 2.2e-16 ***
## rho 0.1759478 0.0133925 13.1378 < 2.2e-16 ***
## ---
## Signif. codes:
## 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Non-Parametric Terms
## EDF
## pspl(lnLot_Area, nknots = 20) 15.0875
## pspl(lnTotal_Bsmt_SF, nknots = 20) 7.3422
## pspl(lnGr_Liv_Area, nknots = 20) 16.2796
##
## Non-Parametric Spatio-Temporal Trend
## EDF
## f1 0.826
## f2 6.985
## f12 77.470
##
## Goodness-of-Fit
##
## EDF Total: 130.991
## Sigma: 0.195116
## AIC: -6786.08
## BIC: -6009.42
anova(sp2dsar, sp2danovasar, lrtest=FALSE)
## logLik rlogLik edf AIC BIC
## sp2dsar 3530.7 3524.3 93.754 -6873.9 -6308.4
## sp2danovasar 3524.0 3505.9 130.991 -6786.1 -5979.4
Plot of non-parametric direct, indirect and total impacts:
<- impactsnopar(sp2danovasar, listw = listW, viewplot = FALSE)
sp2danovasarimpnopar plot_impactsnopar(sp2danovasarimpnopar, data = ames_sf1, smooth = TRUE)
Parametric direct, indirect and total impacts:
<- as.character(names(summary(sp2danovasar)$bfixed[1]))
list_varpar <- impactspar(sp2danovasar, list_varpar)
imp_parvar summary(imp_parvar)
##
## Total Parametric Impacts (sar)
## Estimate Std. Error t value Pr(>|t|)
## Fireplaces 0.0692412 0.0069141 10.0145083 0
## Garage_Cars 0.0787322 0.0071085 11.0757132 0
##
## Direct Parametric Impacts (sar)
## Estimate Std. Error t value Pr(>|t|)
## Fireplaces 0.0579769 0.0057039 10.1645145 0
## Garage_Cars 0.0659238 0.0058361 11.2957841 0
##
## Indirect Parametric Impacts (sar)
## Estimate Std. Error t value Pr(>|t|)
## Fireplaces 0.0112643 0.0015043 7.4880408 0
## Garage_Cars 0.0128083 0.0016253 7.8808174 0
Now, we show how to plot spatial trends using spatial coordinates.
Notice, that the database is an sf
object and excludes
duplicated spatial points. For the model without the ANOVA
decomposition, we can only plot the 2d smooth interaction effect of
latitude and longitude:
plot_sp2d(sp2dsar, data = ames_sf1)
For the model with the ANOVA decomposition, we plot either the 2d spatial trend or its decomposition in main effects and interaction effect:
plot_sp2d(sp2danovasar, data = ames_sf1, addmain = TRUE, addint = TRUE)
Finally, we estimate a semiparametric Spatial Durbin Model with spatial trend, selecting the spatial lag covariates. The results, however, are in favor of the SAR model with spatial trend.
# Semiparametric model with PS-ANOVA spatial trend in 2d
<- lnSale_Price ~ Fireplaces + Garage_Cars +
form2d_psanova pspl(lnLot_Area, nknots = 20) +
pspl(lnTotal_Bsmt_SF, nknots = 20) +
pspl(lnGr_Liv_Area, nknots = 20) +
pspt(Longitude,Latitude, nknots = c(10, 10), psanova = TRUE)
<- ~ Fireplaces + Garage_Cars +
durbinformnonlin pspl(lnLot_Area, nknots = 20)
<- pspatfit(form2d_psanova, data = ames_sf1, listw = listW, method = "Chebyshev", type = "sdm", Durbin = durbinformnonlin)
sp2danovasdm summary(sp2danovasdm)
##
## Call
## pspatfit(formula = form2d_psanova, data = ames_sf1, listw = listW,
## type = "sdm", method = "Chebyshev", Durbin = durbinformnonlin)
##
## Parametric Terms
## Estimate Std. Error t value Pr(>|t|)
## Intercept 8.9922595 0.2326829 38.6460 < 2.2e-16
## Fireplaces 0.0561448 0.0058888 9.5342 < 2.2e-16
## Garage_Cars 0.0617895 0.0057924 10.6674 < 2.2e-16
## Wlag.Fireplaces 0.0103328 0.0073488 1.4061 0.159823
## Wlag.Garage_Cars 0.0198640 0.0073226 2.7127 0.006717
## rho 0.1788981 0.0149778 11.9442 < 2.2e-16
##
## Intercept ***
## Fireplaces ***
## Garage_Cars ***
## Wlag.Fireplaces
## Wlag.Garage_Cars **
## rho ***
## ---
## Signif. codes:
## 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Non-Parametric Terms
## EDF
## pspl(lnLot_Area, nknots = 20) 14.8846
## pspl(lnTotal_Bsmt_SF, nknots = 20) 7.3656
## pspl(lnGr_Liv_Area, nknots = 20) 16.2196
## pspl(Wlag.lnLot_Area, nknots = 20) 13.8968
##
## Non-Parametric Spatio-Temporal Trend
## EDF
## f1 0.116
## f2 6.937
## f12 60.128
##
## Goodness-of-Fit
##
## EDF Total: 128.548
## Sigma: 0.180651
## AIC: -6774.94
## BIC: -6012.77
anova(sp2danovasdm, sp2danovasar, lrtest = FALSE)
## logLik rlogLik edf AIC BIC
## sp2danovasdm 3516 3486.0 128.55 -6774.9 -5958.8
## sp2danovasar 3524 3505.9 130.99 -6786.1 -5979.4