besf {spmoran} | R Documentation |
Spatial regression with RE-ESF for very large samples
Description
Parallel and memory-free implementation of RE-ESF-based spatial regression for very large samples. This model estimates residual spatial dependence, constant coefficients, and non-spatially varying coefficients (NVC; coefficients varying with respect to explanatory variable value).
Usage
besf( y, x = NULL, nvc = FALSE, nvc_sel = TRUE, coords, s_id = NULL,
covmodel="exp", enum = 200, method = "reml", penalty = "bic", nvc_num = 5,
maxiter = 30, bsize = 4000, ncores = NULL )
Arguments
y |
Vector of explained variables (N x 1) |
x |
Matrix of explanatory variables (N x K) |
nvc |
If TRUE, NVCs are assumed on x. Otherwise, constant coefficients are assumed. Default is FALSE |
nvc_sel |
If TRUE, the type of each coefficient (NVC or constant) is selected through BIC (default) or AIC minimization. If FALSE, NVCs are assumed across x. Alternatively, nvc_sel can be given by column number(s) of x. For example, if nvc_sel = 2, the coefficient on the second explanatory variable in x is an NVC and the other coefficients are constants. Default is TRUE |
coords |
Matrix of spatial point coordinates (N x 2) |
s_id |
Optional. ID specifying the groups used to model the spatially dependent process (N x 1). If it is specified, a group-level spatial process is estimated. It is useful, e.g., for multilevel modeling (s_id is given by the group ID) and panel data modeling (s_id is given by the individual location ID). Default is NULL |
covmodel |
Type of kernel to model spatial dependence. The currently available options are "exp" for the exponential kernel, "gau" for the Gaussian kernel, and "sph" for the spherical kernel. Default is "exp" |
enum |
Number of Moran eigenvectors to be used for spatial process modeling (scalar). Default is 200 |
method |
Estimation method. Restricted maximum likelihood method ("reml") and maximum likelihood method ("ml") are available. Default is "reml" |
penalty |
Penalty used to select the type of coefficients (NVC or constant) and to stabilize the estimates. The current options are "bic" for the Bayesian information criterion-type penalty (K x log(N)) and "aic" for the Akaike information criterion-type penalty (2K) (see Muller et al., 2013). Default is "bic" |
nvc_num |
Number of basis functions used to model NVC. An intercept and nvc_num natural spline basis functions are used to model each NVC. Default is 5 |
maxiter |
Maximum number of iterations. Default is 30 |
bsize |
Block/batch size. bsize x bsize elements are iteratively processed during the parallelized computation. Default is 4000 |
ncores |
Number of cores used for the parallel computation. If ncores = NULL, the number of available cores is detected. Default is NULL |
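The nvc_num argument above describes each NVC as an intercept plus nvc_num natural spline basis functions of the explanatory variable. The following is a minimal sketch of that idea using splines::ns from the splines package that ships with R. It is not taken from the besf source: the variable x1, the weights gamma, and the use of ns() with df = nvc_num are assumptions made for illustration only.

```r
library(splines)  # distributed with R

set.seed(1)
x1 <- runif(100)   # hypothetical explanatory variable
nvc_num <- 5       # same value as the besf default

# Intercept column plus nvc_num natural spline basis functions of x1
basis <- cbind(1, ns(x1, df = nvc_num))
dim(basis)         # 100 rows, 1 + nvc_num columns

# An NVC is then a smooth function of x1: basis %*% gamma
gamma <- c(0.5, rnorm(nvc_num, sd = 0.2))  # made-up weights, not estimates
beta_x1 <- drop(basis %*% gamma)           # one coefficient value per observation
```

A larger nvc_num gives a more flexible coefficient curve at the cost of more parameters, which is where the BIC/AIC-type penalty comes in.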
Value
b |
Matrix with columns for the estimated coefficients on x, their standard errors, z-values, and p-values (K x 4). Effective if nvc = FALSE |
c_vc |
Matrix of estimated NVCs on x (N x K). Effective if nvc = TRUE |
cse_vc |
Matrix of standard errors for the NVCs on x (N x K). Effective if nvc = TRUE |
ct_vc |
Matrix of t-values for the NVCs on x (N x K). Effective if nvc = TRUE |
cp_vc |
Matrix of p-values for the NVCs on x (N x K). Effective if nvc = TRUE |
s |
Vector of estimated variance parameters (2 x 1). The first and second elements denote the standard deviation and the Moran's I value of the estimated spatially dependent component, respectively. The Moran's I value is scaled to take a value between 0 (no spatial dependence) and 1 (the maximum possible spatial dependence). Based on Griffith (2003), the scaled Moran's I value is interpretable as follows: 0.25-0.50: weak; 0.50-0.70: moderate; 0.70-0.90: strong; 0.90-1.00: marked |
e |
Vector whose elements are residual standard error (resid_SE), adjusted conditional R2 (adjR2(cond)), restricted log-likelihood (rlogLik), Akaike information criterion (AIC), and Bayesian information criterion (BIC). When method = "ml", restricted log-likelihood (rlogLik) is replaced with log-likelihood (logLik) |
vc |
List indicating whether the NVCs are removed or not during the BIC/AIC minimization. 1 indicates not removed, whereas 0 indicates removed |
r |
Vector of estimated random coefficients on Moran's eigenvectors (L x 1) |
sf |
Vector of the estimated spatially dependent component (N x 1) |
pred |
Vector of predicted values (N x 1) |
resid |
Vector of residuals (N x 1) |
other |
List of other outputs, which are internally used |
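The interpretation ranges quoted for the scaled Moran's I value in s can be wrapped in a small helper. The sketch below is illustrative only: label_moran is a hypothetical function, not part of spmoran, and two choices are assumptions, namely that a boundary value such as 0.50 falls into the stronger class and that values below 0.25 carry no label because the ranges above do not name one.

```r
# Hypothetical helper: label a scaled Moran's I value using the
# ranges quoted from Griffith (2003) in the description of s.
label_moran <- function(mi) {
  stopifnot(is.numeric(mi), all(mi >= 0 & mi <= 1))
  cut(mi,
      breaks = c(0, 0.25, 0.50, 0.70, 0.90, 1),
      labels = c("below 0.25 (no label given)", "weak",
                 "moderate", "strong", "marked"),
      include.lowest = TRUE,  # keeps mi = 1 in the last class
      right = FALSE)          # assumption: boundary values go to the stronger class
}

as.character(label_moran(c(0.10, 0.30, 0.65, 0.95)))

# With a fitted object res, and assuming s is the 2 x 1 vector described above:
# label_moran(res$s[2])
```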
Author(s)
Daisuke Murakami
References
Griffith, D. A. (2003). Spatial autocorrelation and spatial filtering: gaining understanding through theory and scientific visualization. Springer Science & Business Media.
Murakami, D. and Griffith, D.A. (2015) Random effects specifications in eigenvector spatial filtering: a simulation study. Journal of Geographical Systems, 17 (4), 311-331.
Murakami, D. and Griffith, D.A. (2019) A memory-free spatial additive mixed modeling for big spatial data. Japan Journal of Statistics and Data Science. DOI:10.1007/s42081-019-00063-x.
See Also
Examples
require(spdep)
require(spmoran)
data(boston)
y <- boston.c[, "CMEDV" ]
x <- boston.c[,c("CRIM","ZN","INDUS", "CHAS", "NOX","RM", "AGE",
"DIS" ,"RAD", "TAX", "PTRATIO", "B", "LSTAT")]
xgroup <- boston.c[,"TOWN"]
coords <- boston.c[,c("LON", "LAT")]
######## Regression considering spatially dependent residuals
#res <- besf(y = y, x = x, coords=coords)
#res
######## Regression considering spatially dependent residuals and NVC
######## (coefficients or NVC is selected)
#res2 <- besf(y = y, x = x, coords=coords, nvc = TRUE)
######## Regression considering spatially dependent residuals and NVC
######## (all the coefficients are NVCs)
#res3 <- besf(y = y, x = x, coords=coords, nvc = TRUE, nvc_sel=FALSE)
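The nvc_sel argument also accepts column numbers of x. The sketch below looks those numbers up by column name so that the call does not depend on hard-coded positions. The data frame x_demo is a toy stand-in invented for illustration, and the besf call is left commented out in the same way as the examples above. Passing nvc = TRUE together with a numeric nvc_sel is an assumption based on the argument descriptions, not a verified requirement.

```r
# Toy stand-in for x; only the column names matter here
x_demo <- data.frame(CRIM = runif(10), ZN = runif(10),
                     NOX = runif(10), RM = runif(10))

# Column positions of the variables that should receive NVCs
nvc_cols <- match(c("NOX", "RM"), colnames(x_demo))
nvc_cols   # 3 4

######## Regression considering spatially dependent residuals and NVC
######## (NVCs only on the selected columns; assumption: nvc = TRUE is needed)
#res4 <- besf(y = y, x = x, coords = coords, nvc = TRUE, nvc_sel = nvc_cols)
```

With the Boston data above, the same match() call on colnames(x) would give the positions to pass as nvc_sel.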