pspl_terms {pspatreg} | R Documentation |
Functions to include non-parametric continuous covariates and spatial or spatio-temporal trends in semiparametric regression models.
Description
The pspl() and pspt() functions allow the inclusion of non-parametric continuous covariates and spatial or spatio-temporal trends in semiparametric regression models. Both types of terms are modelled using P-splines.
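As a rough illustration of what modelling a term with P-splines involves, the base-R sketch below builds a B-spline basis on equally spaced knots, forms a difference penalty, and solves the penalized least squares problem for a fixed smoothing parameter. It is not pspatreg's internal code: nseg, lambda and the simulated data are assumptions made for this example, and the package's own estimation procedure is not reproduced here.

```r
library(splines)

set.seed(1)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.3)

# Illustrative settings. bdeg and pord mirror the pspl() argument names;
# nseg (number of equal-width segments) and lambda are assumptions.
nseg   <- 10
bdeg   <- 3
pord   <- 2
lambda <- 1

# Interval limits computed as in the documented pspl() defaults
xl <- min(x) - 0.01 * abs(min(x))
xr <- max(x) + 0.01 * abs(max(x))

# Equally spaced knots, extended bdeg segments beyond each end
dx    <- (xr - xl) / nseg
knots <- xl + dx * (-bdeg:(nseg + bdeg))

# B-spline basis: one column per basis function (nseg + bdeg columns)
B <- splineDesign(knots, x, ord = bdeg + 1, outer.ok = TRUE)

# Difference matrix of order pord acting on adjacent coefficients
D <- diff(diag(ncol(B)), differences = pord)

# Penalized least squares for a fixed lambda
coef_hat <- solve(crossprod(B) + lambda * crossprod(D), crossprod(B, y))
fit <- drop(B %*% coef_hat)
```

Larger values of lambda pull the fit towards a polynomial of degree pord - 1, while lambda near zero gives an unpenalized B-spline regression.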
pspl(): This function allows the inclusion of terms for non-parametric covariates in semiparametric models. Each non-parametric covariate must be included with its own pspl() term in the formula.
pspt(): This function allows the inclusion of a spatial or spatio-temporal trend in the formula of a semiparametric spatial or spatio-temporal model. The trend can be decomposed, ANOVA-style, into main and interaction effects.
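One way to read the ANOVA decomposition is sketched below. The notation is an assumption chosen to match the argument names f1_main, f2_main, ft_main, f12_int, f1t_int, f2t_int and f12t_int, with s_1 and s_2 the spatial coordinates and t the temporal coordinate; the intercept is omitted.

```latex
f(s_1, s_2, t) =
    f_1(s_1) + f_2(s_2) + f_t(t)
  + f_{1,2}(s_1, s_2) + f_{1,t}(s_1, t) + f_{2,t}(s_2, t)
  + f_{1,2,t}(s_1, s_2, t)
```

For a purely spatial trend (no time argument), only the terms f_1, f_2 and f_{1,2} would remain under this reading.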
Usage
pspl(
  x,
  xl = min(x) - 0.01 * abs(min(x)),
  xr = max(x) + 0.01 * abs(max(x)),
  nknots = 10,
  bdeg = 3,
  pord = 2,
  decom = 2,
  scale = TRUE
)
pspt(
  sp1,
  sp2,
  time = NULL,
  scale = TRUE,
  ntime = NULL,
  xl_sp1 = min(sp1) - 0.01 * abs(min(sp1)),
  xr_sp1 = max(sp1) + 0.01 * abs(max(sp1)),
  xl_sp2 = min(sp2) - 0.01 * abs(min(sp2)),
  xr_sp2 = max(sp2) + 0.01 * abs(max(sp2)),
  xl_time = min(time) - 0.01 * abs(min(time)),
  xr_time = max(time) + 0.01 * abs(max(time)),
  nknots = c(10, 10, 5),
  bdeg = c(3, 3, 3),
  pord = c(2, 2, 2),
  decom = 2,
  psanova = FALSE,
  nest_sp1 = 1,
  nest_sp2 = 1,
  nest_time = 1,
  f1_main = TRUE,
  f2_main = TRUE,
  ft_main = TRUE,
  f12_int = TRUE,
  f1t_int = TRUE,
  f2t_int = TRUE,
  f12t_int = TRUE
)
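The default interval limits in the signatures above widen the observed range by 1% of the absolute value of each endpoint. The short base-R sketch below evaluates those default expressions on made-up vectors (x_pos, x_neg and x_zero are hypothetical data, not part of the package).

```r
# Default limits as written in the pspl() signature
default_limits <- function(x) {
  c(xl = min(x) - 0.01 * abs(min(x)),
    xr = max(x) + 0.01 * abs(max(x)))
}

x_pos  <- c(2, 5, 9)
x_neg  <- c(-4, -1)
x_zero <- c(0, 3)

lim_pos  <- default_limits(x_pos)   # xl = 1.98,  xr = 9.09
lim_neg  <- default_limits(x_neg)   # xl = -4.04, xr = -0.99
lim_zero <- default_limits(x_zero)  # xl = 0,     xr = 3.03
```

Because of abs(), the interval is extended outwards on both sides whatever the sign of the data. An endpoint that is exactly 0 is not extended, so supplying xl or xr explicitly may be preferable in that case.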
Arguments
x |
Name of the covariate. |
xl |
Minimum of the interval for the continuous covariate. |
xr |
Maximum of the interval for the continuous covariate. |
nknots |
Number of knots for the spline bases. For pspl(), a single value (default = 10). For pspt(), a vector including the number of knots of each
coordinate (default = c(10, 10, 5)). The order of the knots
in the vector follows the order of the specified spatio-temporal parameters
so the first value of the vector is the number of knots for |
bdeg |
Degree of the B-spline bases. Default = 3 for pspl() and c(3, 3, 3) for pspt(). |
pord |
Order of the penalty for the difference matrices in the P-spline. Default = 2 for pspl() and c(2, 2, 2) for pspt(). |
decom |
Type of decomposition of fixed part when P-spline
term is expressed as a mixed model. If |
scale |
Logical value to scale the spatial and temporal coordinates before the estimation of the semiparametric model. Default = 'TRUE'. |
sp1 |
Name of the first spatial coordinate. |
sp2 |
Name of the second spatial coordinate. |
time |
Name of the temporal coordinate. It must be specified only for spatio-temporal trends when using panel data. Default = 'NULL'. |
ntime |
Number of temporal periods in panel data. |
xl_sp1 |
Minimum of the interval for the first spatial coordinate. |
xr_sp1 |
Maximum of the interval for the first spatial coordinate. |
xl_sp2 |
Minimum of the interval for the second spatial coordinate. |
xr_sp2 |
Maximum of the interval for the second spatial coordinate. |
xl_time |
Minimum of the interval for the temporal coordinate. |
xr_time |
Maximum of the interval for the temporal coordinate. |
psanova |
Logical value to choose an ANOVA decomposition
of the spatial or spatio-temporal trend. Default = 'FALSE'.
If 'TRUE', you must specify the divisors for
main and interaction effects. More in |
nest_sp1 |
Vector including the divisors of the knots for main and interaction effects for the first spatial coordinate. It is used for ANOVA decomposition models including nested bases. Default = 1 (no nested bases). The values must be divisors of the corresponding nknots value, and the result of the division should not be smaller than 4. |
nest_sp2 |
Vector including the divisors of the knots for main and interaction effects for the second spatial coordinate. It is used for ANOVA decomposition models including nested bases. Default = 1 (no nested bases). The values must be divisors of the corresponding nknots value, and the result of the division should not be smaller than 4. |
nest_time |
Vector including the divisors of the knots for main and interaction effects for the temporal coordinate. It is used for ANOVA decomposition models including nested bases. Default = 1 (no nested bases). The values must be divisors of the corresponding nknots value, and the result of the division should not be smaller than 4. |
f1_main |
Logical value to include main effect for the first spatial coordinate in ANOVA models. Default = 'TRUE'. |
f2_main |
Logical value to include main effect for the second spatial coordinate in ANOVA models. Default = 'TRUE'. |
ft_main |
Logical value to include main effect for the temporal coordinate in ANOVA models. Default = 'TRUE'. |
f12_int |
Logical value to include second-order interaction effect between first and second spatial coordinates in ANOVA models. Default = 'TRUE'. |
f1t_int |
Logical value to include second-order interaction effect between first spatial and temporal coordinates in ANOVA models. Default = 'TRUE'. |
f2t_int |
Logical value to include second-order interaction effect between second spatial and temporal coordinates in ANOVA models. Default = 'TRUE'. |
f12t_int |
Logical value to include third-order interaction effect between the first and second spatial coordinates and the temporal coordinate in ANOVA models. Default = 'TRUE'. |
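The two conditions stated for nest_sp1, nest_sp2 and nest_time (each value divides the number of knots, and the quotient is at least 4) can be checked before fitting. The helper below is a hypothetical convenience written for this illustration; check_nest() is not a pspatreg function.

```r
# Hypothetical helper: TRUE for each divisor that satisfies both conditions
check_nest <- function(nknots, nest) {
  is_divisor <- nknots %% nest == 0   # value must divide nknots exactly
  big_enough <- nknots / nest >= 4    # quotient must not be smaller than 4
  is_divisor & big_enough
}

ok_sp   <- check_nest(18, c(1, 2, 3))  # TRUE TRUE TRUE  (18, 9 and 6 knots)
ok_time <- check_nest(8,  c(1, 2, 2))  # TRUE TRUE TRUE  (8, 4 and 4 knots)
bad_div <- check_nest(10, c(1, 2, 3))  # TRUE TRUE FALSE (3 does not divide 10)
bad_min <- check_nest(10, 5)           # FALSE           (10 / 5 = 2 < 4)
```

The first two calls use the nknots and nest values that appear in the spatio-temporal examples of this page.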
Value
pspl(): An object of class bs including:
B | Matrix including B-spline basis for the covariate |
a | List including nknots, knots, bdeg, pord and decom. |
pspt(): An object of class bs including:
B | Matrix including B-spline basis for the covariate |
a | List including sp1, sp2, time, nknots, bdeg, pord, decom, psanova, nest_sp1, nest_sp2, nest_time, f1_main, f2_main, ft_main, f12_int, f1t_int, f2t_int, and f12t_int. |
References
Eilers, P. and Marx, B. (1996). Flexible Smoothing with B-Splines and Penalties. Statistical Science, (11), 89-121.
Eilers, P. and Marx, B. (2021). Practical Smoothing. The Joys of P-Splines. Cambridge University Press.
Fahrmeir, L.; Kneib, T.; Lang, S.; and Marx, B. (2021). Regression. Models, Methods and Applications (2nd Ed.). Springer.
Lee, D. and Durban, M. (2011). P-Spline ANOVA Type Interaction Models for Spatio-Temporal Smoothing. Statistical Modelling, (11), 49-69. <doi:10.1177/1471082X1001100104>
Lee, D. J., Durban, M., and Eilers, P. (2013). Efficient two-dimensional smoothing with P-spline ANOVA mixed models and nested bases. Computational Statistics & Data Analysis, (61), 22-37. <doi:10.1016/j.csda.2012.11.013>
Minguez, R.; Basile, R. and Durban, M. (2020). An Alternative Semiparametric Model for Spatial Panel Data. Statistical Methods and Applications, (29), 669-708. <doi:10.1007/s10260-019-00492-8>
Wood, S.N. (2017). Generalized Additive Models. An Introduction with R (second edition). CRC Press, Boca Raton.
See Also
pspatfit: estimate semiparametric spatial or spatio-temporal regression models.
Examples
library(pspatreg)
###############################################
# Examples using spatial data of Ames Houses.
###############################################
library(spdep)
library(sf)
ames <- AmesHousing::make_ames() # Raw Ames Housing Data
ames_sf <- st_as_sf(ames, coords = c("Longitude", "Latitude"))
ames_sf$Longitude <- ames$Longitude
ames_sf$Latitude <- ames$Latitude
ames_sf$lnSale_Price <- log(ames_sf$Sale_Price)
ames_sf$lnLot_Area <- log(ames_sf$Lot_Area)
ames_sf$lnTotal_Bsmt_SF <- log(ames_sf$Total_Bsmt_SF+1)
ames_sf$lnGr_Liv_Area <- log(ames_sf$Gr_Liv_Area)
# Keep one observation per distinct longitude value
ames_sf1 <- ames_sf[!duplicated(ames_sf$Longitude), ]
#### Pure GAM with pspatreg
form1 <- lnSale_Price ~ Fireplaces + Garage_Cars +
  pspl(lnLot_Area, nknots = 20) +
  pspl(lnTotal_Bsmt_SF, nknots = 20) +
  pspl(lnGr_Liv_Area, nknots = 20)
gampure <- pspatfit(form1, data = ames_sf1)
summary(gampure)
########### Constructing the spatial weights matrix
coord_sf1 <- cbind(ames_sf1$Longitude, ames_sf1$Latitude)
k5nb <- knn2nb(knearneigh(coord_sf1, k = 5,
                          longlat = TRUE, use_kd_tree = FALSE),
               sym = TRUE)
lw_ames <- nb2listw(k5nb, style = "W",
                    zero.policy = FALSE)
##################### GAM + SAR Model
gamsar <- pspatfit(form1, data = ames_sf1,
                   type = "sar", listw = lw_ames,
                   method = "Chebyshev")
summary(gamsar)
### Models with 2d spatial trend
form2 <- lnSale_Price ~ Fireplaces + Garage_Cars +
  pspl(lnLot_Area, nknots = 20) +
  pspl(lnTotal_Bsmt_SF, nknots = 20) +
  pspl(lnGr_Liv_Area, nknots = 20) +
  pspt(Longitude, Latitude,
       nknots = c(10, 10),
       psanova = FALSE)
##################### GAM + GEO Model
gamgeo2d <- pspatfit(form2, data = ames_sf1)
summary(gamgeo2d)
gamgeo2dsar <- pspatfit(form2, data = ames_sf1,
                        type = "sar",
                        listw = lw_ames,
                        method = "Chebyshev")
summary(gamgeo2dsar)
### Models with psanova 2d spatial trend
form3 <- lnSale_Price ~ Fireplaces + Garage_Cars +
  pspl(lnLot_Area, nknots = 20) +
  pspl(lnTotal_Bsmt_SF, nknots = 20) +
  pspl(lnGr_Liv_Area, nknots = 20) +
  pspt(Longitude, Latitude,
       nknots = c(10, 10),
       psanova = TRUE)
gamgeo2danovasar <- pspatfit(form3, data = ames_sf1,
                             type = "sar",
                             listw = lw_ames, method = "Chebyshev")
summary(gamgeo2danovasar)
###############################################
###################### Examples using a panel data of rate of
###################### unemployment for 103 Italian provinces in 1996-2019.
###############################################
## load spatial panel and Wsp_it
## 103 Italian provinces. Period 1996-2019
data(unemp_it, package = "pspatreg")
## Wsp_it is a matrix. Create a neighbours list
lwsp_it <- spdep::mat2listw(Wsp_it, style = "W")
### Spatio-temporal semiparametric ANOVA model
### Interaction terms f12, f1t, f2t and f12t with nested bases
### Remark: nest_sp1, nest_sp2 and nest_time must be divisors of nknots
form4 <- unrate ~ partrate + agri + cons +
  pspl(serv, nknots = 15) +
  pspl(empgrowth, nknots = 20) +
  pspt(long, lat, year,
       nknots = c(18, 18, 8),
       psanova = TRUE,
       nest_sp1 = c(1, 2, 2),
       nest_sp2 = c(1, 2, 2),
       nest_time = c(1, 2, 2))
sptanova <- pspatfit(form4, data = unemp_it)
summary(sptanova)
################################################
### Interaction term f1t not included in ANOVA decomposition
form5 <- unrate ~ partrate + agri + cons +
  pspl(serv, nknots = 15) +
  pspl(empgrowth, nknots = 20) +
  pspt(long, lat, year,
       nknots = c(18, 18, 8),
       psanova = TRUE,
       nest_sp1 = c(1, 2, 3),
       nest_sp2 = c(1, 2, 3),
       nest_time = c(1, 2, 2),
       f1t_int = FALSE)
## Add sar specification and ar1 temporal correlation
sptanova2_sar_ar1 <- pspatfit(form5, data = unemp_it,
                              listw = lwsp_it,
                              type = "sar",
                              cor = "ar1")
summary(sptanova2_sar_ar1)