besf_vc {spmoran}R Documentation

Spatially and non-spatially varying coefficient (SNVC) modeling for very large samples

Description

Parallel and memory-free implementation of SNVC modeling for very large samples. The model estimates residual spatial dependence, constant coefficients, spatially varying coefficients (SVCs), non-spatially varying coefficients (NVC; coefficients varying with respect to explanatory variable value), and SNVC (= SVC + NVC). Type of coefficients can be selected through BIC/AIC minimization. By default, it estimates a SVC model. SNVCs can be mapped just like SVCs. Unlike SVC models, SNVC model is robust against spurious correlation (multicollinearity), so, stable (see Murakami and Griffith, 2020).

Note: The SVC model can be less accurate for large samples due to a degeneracy/over-smoothing problem (see Murakami et al., 2023). The addlearn_local is useful to mitigate this problem (See the coding example below).

Usage

besf_vc( y, x, xconst = NULL, coords, s_id = NULL, x_nvc = FALSE, xconst_nvc = FALSE,
         x_sel = TRUE, x_nvc_sel = TRUE, xconst_nvc_sel = TRUE, nvc_num=5,
         method = "reml", penalty = "bic", maxiter = 30,
         covmodel="exp",enum = 200, bsize = 4000, ncores=NULL )

Arguments

y

Vector of explained variables (N x 1)

x

Matrix of explanatory variables with spatially varying coefficients (SVC) (N x K)

xconst

Matrix of explanatory variables with constant coefficients (N x K_c). Default is NULL

coords

Matrix of spatial point coordinates (N x 2)

s_id

Optional. ID specifying groups modeling spatially dependent process (N x 1). If it is specified, group-level spatial process is estimated. It is useful for multilevel modeling (s_id is given by the group ID) and panel data modeling (s_id is given by individual location id). Default is NULL

x_nvc

If TRUE, SNVCs are assumed on x. Otherwise, SVCs are assumed. Default is FALSE

xconst_nvc

If TRUE, NVCs are assumed on xconst. Otherwise, constant coefficients are assumed. Default is FALSE

x_sel

If TRUE, type of coefficient (SVC or constant) on x is selected through a BIC (default) or AIC minimization. If FALSE, SVCs are assumed across x. Alternatively, x_sel can be given by column number(s) of x. For example, if x_sel = 2, the coefficient on the second explanatory variable in x is SVC and the other coefficients are constants. The Default is TRUE

x_nvc_sel

If TRUE, type of coefficient (NVC or constant) on x is selected through the BIC (default) or AIC minimization. If FALSE, NVCs are assumed across x. Alternatively, x_nvc_sel can be given by column number(s) of x. For example, if x_nvc_sel = 2, the coefficient on the second explanatory variable in x is NVC and the other coefficients are constants. The Default is TRUE

xconst_nvc_sel

If TRUE, type of coefficient (NVC or constant) on xconst is selected through the BIC (default) or AIC minimization. If FALSE, NVCs are assumed across xconst. Alternatively, xconst_nvc_sel can be given by column number(s) of xconst. For example, if xconst_nvc_sel = 2, the coefficient on the second explanatory variable in xconst is NVC and the other coefficients are constants.The Default is TRUE

nvc_num

Number of basis functions used to model NVC. An intercept and nvc_num natural spline basis functions are used to model each NVC. Default is 5

method

Estimation method. Restricted maximum likelihood method ("reml") and maximum likelihood method ("ml") are available. Default is "reml"

penalty

Penalty to select type of coefficients (SNVC, SVC, NVC, or constant) to stablize the estimates. The current options are "bic" for the Baysian information criterion-type penalty (N x log(K)) and "aic" for the Akaike information criterion (2K) (see Muller et al., 2013). Default is "bic"

maxiter

Maximum number of iterations. Default is 30

covmodel

Type of kernel to model spatial dependence. The currently available options are "exp" for the exponential kernel, "gau" for the Gaussian kernel, and "sph" for the spherical kernel

enum

Number of Moran eigenvectors to be used for spatial process modeling (scalar). Default is 200

bsize

Block/badge size. bsize x bsize elements are iteratively processed during the parallelized computation. Default is 4000

ncores

Number of cores used for the parallel computation. If ncores = NULL, the number of available cores is detected. Default is NULL

Value

b_vc

Matrix of estimated SNVC (= SVC + NVC) on x (N x K)

bse_vc

Matrix of standard errors for the SNVCs on x (N x k)

z_vc

Matrix of z-values for the SNVCs on x (N x K)

p_vc

Matrix of p-values for the SNVCs on x (N x K)

B_vc_s

List summarizing estimated SVCs (in SNVC) on x. The four elements are the SVCs (N x K), the standard errors (N x K), z-values (N x K), and p-values (N x K), respectively

B_vc_n

List summarizing estimated NVCs (in SNVC) on x. The four elements are the NVCs (N x K), the standard errors (N x K), z-values (N x K), and p-values (N x K), respectively

c

Matrix with columns for the estimated coefficients on xconst, their standard errors, z-values, and p-values (K_c x 4). Effective if xconst_nvc = FALSE

c_vc

Matrix of estimated NVCs on xconst (N x K_c). Effective if xconst_nvc = TRUE

cse_vc

Matrix of standard errors for the NVCs on xconst (N x K_c). Effective if xconst_nvc = TRUE

cz_vc

Matrix of z-values for the NVCs on xconst (N x K_c). Effective if xconst_nvc = TRUE

cp_vc

Matrix of p-values for the NVCs on xconst (N x K_c). Effective if xconst_nvc = TRUE

s

List of variance parameters in the SNVC (SVC + NVC) on x. The first element is a 2 x K matrix summarizing variance parameters for SVC. The (1, k)-th element is the standard deviation of the k-th SVC, while the (2, k)-th element is the Moran's I value that is scaled to take a value between 0 (no spatial dependence) and 1 (strongest spatial dependence). Based on Griffith (2003), the scaled Moran'I value is interpretable as follows: 0.25-0.50:weak; 0.50-0.70:moderate; 0.70-0.90:strong; 0.90-1.00:marked. The second element of s is the vector of standard deviations of the NVCs

s_c

Vector of standard deviations of the NVCs on xconst

vc

List indicating whether SVC/NVC are removed or not during the BIC/AIC minimization. 1 indicates not removed (replaced with constant) whreas 0 indicates removed

e

Vector whose elements are residual standard error (resid_SE), adjusted conditional R2 (adjR2(cond)), restricted log-likelihood (rlogLik), Akaike information criterion (AIC), and Bayesian information criterion (BIC). When method = "ml", restricted log-likelihood (rlogLik) is replaced with log-likelihood (logLik)

pred

Vector of predicted values (N x 1)

resid

Vector of residuals (N x 1)

other

List of other outputs, which are internally used

Author(s)

Daisuke Murakami

References

Muller, S., Scealy, J.L., and Welsh, A.H. (2013) Model selection in linear mixed models. Statistical Science, 28 (2), 136-167.

Murakami, D., Yoshida, T., Seya, H., Griffith, D.A., and Yamagata, Y. (2017) A Moran coefficient-based mixed effects approach to investigate spatially varying relationships. Spatial Statistics, 19, 68-89.

Murakami, D., and Griffith, D.A. (2019). Spatially varying coefficient modeling for large datasets: Eliminating N from spatial regressions. Spatial Statistics, 30, 39-64.

Murakami, D. and Griffith, D.A. (2019) A memory-free spatial additive mixed modeling for big spatial data. Japan Journal of Statistics and Data Science. DOI:10.1007/s42081-019-00063-x.

Murakami, D., and Griffith, D.A. (2020) Balancing spatial and non-spatial variations in varying coefficient modeling: a remedy for spurious correlation. ArXiv.

See Also

resf_vc, addlearn_local

Examples

require(spdep)
data(boston)
y	      <- boston.c[, "CMEDV"]
x       <- boston.c[,c("CRIM", "AGE")]
xconst  <- boston.c[,c("ZN","DIS","RAD","NOX",  "TAX","RM", "PTRATIO", "B")]
xgroup  <- boston.c[,"TOWN"]
coords  <- boston.c[,c("LON", "LAT")]

############## SVC modeling1 #################
######## (SVC on x; Constant coefficients on xconst)
#res	    <- besf_vc(y=y,x=x,xconst=xconst,coords=coords, x_sel = FALSE )
#res
#plot_s(res,0) # Spatially varying intercept
#plot_s(res,1) # 1st SVC
#plot_s(res,2) # 2nd SVC
#
######## For large samples (n > 5,000), the following additional learning
######## mitigates an degeneracy/over-smoothing problem in SVCs
#res1     <- addlearn_local(res)
#res1
#plot_s(res1,0) # Spatially varying intercept
#plot_s(res1,1) # 1st SVC
#plot_s(res1,2) # 2nd SVC


############## SVC modeling2 #################
######## (SVC or constant coefficients on x; Constant coefficients on xconst)
#res2	  <- besf_vc(y=y,x=x,xconst=xconst,coords=coords )

############## SVC modeling3 #################
######## - Group-level SVC or constant coefficients on x
######## - Constant coefficients on xconst
#res3	  <- besf_vc(y=y,x=x,xconst=xconst,coords=coords, s_id=xgroup)

############## SNVC modeling1 #################
######## - SNVC, SVC, NVC, or constant coefficients on x
######## - Constant coefficients on xconst

#res4	  <- besf_vc(y=y,x=x,xconst=xconst,coords=coords, x_nvc =TRUE)

############## SNVC modeling2 #################
######## - SNVC, SVC, NVC, or constant coefficients on x
######## - NVC or Constant coefficients on xconst

#res5	  <- besf_vc(y=y,x=x,xconst=xconst,coords=coords, x_nvc =TRUE, xconst_nvc=TRUE)
#plot_s(res5,0)            # Spatially varying intercept
#plot_s(res5,1)            # 1st SNVC (SVC + NVC)
#plot_s(res5,1,btype="svc")# SVC in the 1st SNVC
#plot_n(res5,1,xtype="x")  # NVC in the 1st NVC on x
#plot_n(res5,6,xtype="xconst")# NVC in the 6t NVC on xcnost



[Package spmoran version 0.2.3 Index]