addlearn_local {spmoran} | R Documentation |
Additional learning of local processes and prediction for large samples
This function performs additional learning of local variations in spatially varying coefficients (SVCs). The SVC model implemented in resf_vc or besf_vc can be less accurate for large samples (e.g., n > 5,000) because of a degeneracy/over-smoothing problem. The additional learning mitigates this problem by synthesizing/averaging that model with local SVC models. The resulting spatial prediction is expected to be more accurate than that of the resf_vc function.
addlearn_local( mod, meig0 = NULL, x0 = NULL, xconst0=NULL, xgroup0=NULL,
cl_num=NULL, cl=NULL, parallel=FALSE, ncores=NULL )
mod |
meig0 |
Moran eigenvectors at prediction sites. Output from |
x0 |
Matrix of explanatory variables at prediction sites whose coefficients are allowed to vary across geographical space (N_0 x K). Default is NULL |
xconst0 |
Matrix of explanatory variables at prediction sites whose coefficients are assumed constant (or NVC) across space (N_0 x K_const). Default is NULL |
xgroup0 |
Matrix of group indices at prediction sites that may be group IDs (integers) or group names (N_0 x K_g). Default is NULL |
cl_num |
Number of local sub-models being aggregated/averaged. If NULL, the number is determined so that the number of samples per sub-model equals approximately 600. Default is NULL |
cl |
Vector of cluster IDs for each sample (N x 1). If specified, the local sub-models are defined by these IDs. If NULL, k-means clustering based on spatial coordinates is performed to obtain spatial clusters, each of which contains approximately 600 samples. Default is NULL |
parallel |
If TRUE, the model is estimated through parallel computation. The default is FALSE for |
ncores |
Number of cores used for the parallel computation. If ncores = NULL and parallel = TRUE, the number of available cores is detected. Default is NULL |
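The default clustering described for cl_num and cl can be approximated by hand when a custom cl vector is wanted. The following is a minimal sketch, not the package's internal code. It assumes synthetic coordinates, a target of about 600 samples per cluster, and the base R kmeans function; the names coords_demo, target_size, cl_num_demo, and cl_demo are hypothetical.

```r
# Sketch: build a cluster-ID vector comparable to the default behaviour
# described above (k-means on coordinates, about 600 samples per cluster).
# Illustration only; this is not the internal implementation of addlearn_local.
set.seed(1)
n <- 6000
coords_demo <- cbind(long = runif(n), lat = runif(n))  # synthetic coordinates

target_size <- 600                                # approximate samples per sub-model
cl_num_demo <- max(1, round(n / target_size))     # number of local sub-models
km <- kmeans(coords_demo, centers = cl_num_demo, nstart = 5, iter.max = 100)
cl_demo <- km$cluster                             # one integer cluster ID per sample

table(cl_demo)  # cluster sizes; k-means does not force them to be equal
```

A vector such as cl_demo could then be passed through the cl argument, provided its length matches the number of samples in the fitted model.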
b_vc |
Matrix of estimated spatially varying coefficients (SVCs) on x (N x K) |
bse_vc |
Matrix of standard errors for the SVCs on x (N x K) |
z_vc |
Matrix of z-values for the SVCs on x (N x K) |
p_vc |
Matrix of p-values for the SVCs on x (N x K) |
c |
Matrix with columns for the estimated coefficients on xconst, their standard errors, z-values, and p-values (K_const x 4) |
b_g |
List of K_g matrices with columns for the estimated group effects, their standard deviations, and t-values |
s |
List of 2 elements summarizing the variance parameters that characterize the SVCs of each local sub-model. The first element contains the standard deviation of each SVC, while the second element contains their Moran's I values, scaled to lie between 0 (no spatial dependence) and 1 (strongest positive spatial dependence). Based on Griffith (2003), the scaled Moran's I value is interpretable as follows: 0.25-0.50: weak; 0.50-0.70: moderate; 0.70-0.90: strong; 0.90-1.00: marked |
s_global |
The same variance parameters for the global sub-model |
s_g |
Vector of standard deviations of the group effects |
e |
Error statistics. It includes residual standard error (resid_SE), adjusted conditional R2 (adjR2(cond)), restricted log-likelihood (rlogLik), Akaike information criterion (AIC), and Bayesian information criterion (BIC) |
pred |
Matrix of predicted values for y (pred) and their standard errors (pred_se) (N x 2) |
resid |
Vector of residuals (N x 1) |
cl |
Vector of cluster IDs used (N x 1) |
pred0 |
Matrix of predicted values for y (pred) and their standard errors (pred_se) at prediction sites (N_0 x 2) |
b_vc0 |
Matrix of estimated spatially varying coefficients (SVCs) at prediction sites (N_0 x K) |
bse_vc0 |
Matrix of standard errors for the SVCs at prediction sites (N_0 x K) |
z_vc0 |
Matrix of z-values for the SVCs at prediction sites (N_0 x K) |
p_vc0 |
Matrix of p-values for the SVCs at prediction sites (N_0 x K) |
other |
List of other outputs, which are internally used |
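The interpretation bands for the scaled Moran's I listed under s can be applied programmatically. The helper below is a hypothetical convenience function, not part of spmoran. The text does not say which band a boundary value belongs to, so assigning it to the upper band is an assumption made here, and values below 0.25 are left unlabeled because the text gives no label for them.

```r
# Hypothetical helper: label scaled Moran's I values with the bands quoted above.
# Assumption: a boundary value (e.g. 0.50) is assigned to the upper band.
# Values below 0.25 have no label in the text and are returned as NA.
moran_label <- function(mi) {
  cut(mi,
      breaks = c(0.25, 0.50, 0.70, 0.90, 1.00),
      labels = c("weak", "moderate", "strong", "marked"),
      right = FALSE, include.lowest = TRUE)
}

mi_demo <- c(0.10, 0.30, 0.65, 0.85, 0.97)  # made-up values for illustration
as.character(moran_label(mi_demo))
```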
Daisuke Murakami
Murakami, D., Sugasawa, S., Seya, H., and Griffith, D.A. (2024) Sub-model aggregation-based scalable eigenvector spatial filtering: application to spatially varying coefficient modeling. arXiv.
See Also
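library(spmoran)
library(sp)                      # assumption: 'house' is an sp object with @coords and @data slots
data(house, package = "spData")  # assumption: 'house' is the housing data set shipped with spData
dat0 <- data.frame(house@coords,house@data)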
dat <- dat0[dat0$yrbuilt>=1980,]
###### purpose 1: improve SVC modeling accuracy ######
###### (i.e., addressing the over-smoothing problem) #
y <- log(dat[,"price"])
x <- dat[,c("age","rooms")]
xconst <- dat[,c("lotsize","s1994","s1995","s1996","s1997","s1998")]
coords <- dat[ ,c("long","lat")]
meig <- meigen_f( coords )
## Not run: Remove # and run
# res0 <- resf_vc(y = y,x = x, xconst = xconst, meig = meig)
# res <- addlearn_local(res0) # It adjusts SVCs to model local patterns
# res
####### parallel version for very large samples (e.g., n > 100,000)
# bes0 <- besf_vc(y = y,x = x, xconst = xconst, coords=coords)
# bes <- addlearn_local( bes0 )
####### purpose 2: improve predictive accuracy ########
#samp <- sample( dim( dat )[ 1 ], 2500)
#d <- dat[ samp, ] ## Data at observed sites
#y <- log(d[,"price"])
#x <- d[,c("age","rooms")]
#xconst <- d[,c("lotsize","s1994","s1995","s1996","s1997","s1998")]
#coords <- d[ ,c("long","lat")]
#d0 <- dat[-samp, ] ## Data at unobserved (prediction) sites
#y0 <- log(d0[,"price"])
#x0 <- d0[,c("age","rooms")]
#xconst0 <- d0[,c("lotsize","s1994","s1995","s1996","s1997","s1998")]
#coords0 <- d0[ ,c("long","lat")]
#meig <- meigen_f( coords )
#res0 <- resf_vc(y = y,x = x, xconst = xconst, meig = meig)
#meig0 <- meigen0( meig=meig, coords0=coords0 )
#res <- addlearn_local(res0, meig0=meig0, x0=x0, xconst0=xconst0)
#pred <- res$pred0 ## Predicted values and their standard errors at prediction sites
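
The held-out responses (y0 in the example above) can be compared with the predictions to check out-of-sample accuracy. The sketch below uses made-up numbers in place of y0 and res$pred0 so that it runs without spmoran. The names y0_demo and pred_demo are hypothetical, and the column names pred and pred_se are assumed to follow the description of pred0.

```r
# Sketch: hold-out accuracy for predictions shaped like res$pred0.
# Toy values stand in for the real objects; they are not results from the house data.
y0_demo   <- c(11.2, 10.8, 11.9, 12.3)                   # made-up observed values
pred_demo <- cbind(pred    = c(11.0, 11.0, 12.0, 12.0),  # made-up predictions
                   pred_se = c(0.2, 0.2, 0.3, 0.3))      # made-up standard errors

err  <- y0_demo - pred_demo[, "pred"]  # prediction errors at the held-out sites
rmse <- sqrt(mean(err^2))              # root mean squared error
mae  <- mean(abs(err))                 # mean absolute error
c(RMSE = rmse, MAE = mae)
```

With the real objects, the same lines would use y0 and pred[, "pred"] in place of the toy values.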