lagsarlmtree {lagsarlmtree}

R Documentation

Spatial Lag Model Trees

Description

Model-based recursive partitioning based on linear regression adjusting for a (global) spatial simultaneous autoregressive lag.

Usage

lagsarlmtree(formula, data, listw = NULL, method = "eigen",
  zero.policy = NULL, interval = NULL, control = list(),
  rhowystart = NULL, abstol = 0.001, maxit = 100, 
  dfsplit = TRUE, verbose = FALSE, plot = FALSE, ...)

Arguments

`formula`	formula specifying the response variable and regressors (before `|`) and the partitioning variables (after `|`), e.g. `y ~ x1 + x2 | z1 + z2`. See the Examples below.
`data`	data.frame to be used for estimating the model tree.
`listw`	a weights object for the spatial lag part of the model.
`method`	the method used to compute the Jacobian (log-determinant) term. With "eigen" (default), the Jacobian is computed as the product of (1 - rho*eigenvalue) using `eigenw`. "spam" and "Matrix_J" are for strictly symmetric weights lists of styles "B" and "C", or for styles "W" and "S" made symmetric by similarity (Ord, 1975, Appendix C) where possible; they use code from the spam or Matrix packages to calculate the determinant. "Matrix" and "spam_update" provide updating Cholesky decomposition methods, and "LU" provides an alternative sparse matrix decomposition approach. Approximate log-determinant methods are available as "Chebyshev" and Monte Carlo "MC", and the Smirnov/Anselin (2009) trace approximation as "moments". Three experimental methods, "SE_classic", "SE_whichMin", and "SE_interp", use grids of log-determinant values: the first attempts to emulate the behaviour of the Spatial Econometrics toolbox ML fitting functions, and the latter two attempt to ameliorate some features of "SE_classic".
`zero.policy`	default `NULL`, in which case the global option value is used; if `TRUE`, assign zero to the lagged value of zones without neighbours; if `FALSE`, assign `NA`, causing `lagsarlm()` to terminate with an error.
`interval`	search interval for the autoregressive parameter; `NULL` by default.
`control`	list of extra control arguments - see `lagsarlm`
`rhowystart`	numeric. A vector of length `nrow(data)`, to be used as an offset in estimation of the first tree. `NULL` by default, which results in an initialization with the root model (without partitioning).
`abstol`	numeric. The convergence criterion used for estimation of the model. When the difference in log-likelihoods of the model from two consecutive iterations is smaller than `abstol`, estimation of the model tree has converged.
`maxit`	numeric. The maximum number of iterations to be performed in estimation of the model tree.
`dfsplit`	logical or numeric. `as.integer(dfsplit)` is the degrees of freedom per selected split employed when extracting the log-likelihood.
`verbose`	Should the log-likelihood value of the estimated model be printed for every iteration of the estimation?
`plot`	Should the tree be plotted at every iteration of the estimation? Note that selecting this option slows down execution of the function.
`...`	Additional arguments to be passed to `lmtree()`. See `mob_control` documentation for details.
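
As a small illustration of the `method = "eigen"` description above, the following base R sketch computes the log-determinant of (I - rho*W) from the eigenvalues of W and compares it with a direct computation. It is not package code: the 4-region chain, the value of `rho`, and all object names (`B`, `W`, `lambda`, `logdet_eigen`, `logdet_direct`) are assumptions made for this example only.

```r
## Toy illustration (not package code): log|I - rho * W| from the
## eigenvalues of W, the quantity described for method = "eigen".

## Hypothetical 4-region chain 1 - 2 - 3 - 4 (binary contiguity)
B <- matrix(0, 4, 4)
B[cbind(1:3, 2:4)] <- 1
B <- B + t(B)

## Row-standardise the weights (style "W")
W <- B / rowSums(B)

## Assumed value of the autoregressive parameter
rho <- 0.4

## Eigenvalues of W; real here because W is similar to a symmetric matrix
lambda <- Re(eigen(W, only.values = TRUE)$values)

## Log-determinant as the sum of log(1 - rho * lambda_i)
logdet_eigen <- sum(log(1 - rho * lambda))

## Direct computation for comparison
logdet_direct <- as.numeric(
  determinant(diag(4) - rho * W, logarithm = TRUE)$modulus
)

## The two values should agree up to numerical tolerance
all.equal(logdet_eigen, logdet_direct)
```

Because the eigenvalues do not depend on rho, they only need to be computed once and can then be reused for every candidate value of rho during the search.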

Details

Spatial lag trees learn a tree where each terminal node is associated with different regression coefficients while adjusting for a (global) spatial simultaneous autoregressive lag. This allows for detection of subgroup-specific coefficients with respect to selected covariates, while adjusting for spatial correlations in the data. The estimation algorithm iterates between (1) estimation of the tree given an offset of the spatial lag effect, and (2) estimation of the spatial lag model given the tree structure.
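
In model terms, the description above can be read as a spatial lag regression whose coefficients depend on the terminal node. The notation below is an assumption made for illustration and does not appear in the package documentation: `y_i` is the response, `w_{ij}` are the spatial weights from `listw`, `x_i` the regressors, `g(i)` the terminal node of observation `i`, and `rho` the global autoregressive parameter.

```latex
y_i = \rho \sum_{j} w_{ij}\, y_j + x_i^\top \beta_{g(i)} + \varepsilon_i
```

Under this reading, step (1) fits the tree with the current estimate of the spatial lag term as an offset, and step (2) re-estimates `rho` and the node-specific coefficients given the node assignments. The two steps alternate until the change in log-likelihood falls below `abstol` or `maxit` iterations are reached.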

The code is still under development and might change in future versions.

Value

The function returns a list with the following objects:

`formula`	The formula as specified with the `formula` argument.
`call`	The matched call.
`tree`	The final `lmtree`.
`lagsarlm`	The final `lagsarlm` model.
`data`	The dataset specified with the `data` argument including added auxiliary variables `.rhowy` and `.tree` from the last iteration.
`nobs`	Number of observations.
`loglik`	The log-likelihood value of the last iteration.
`df`	Degrees of freedom.
`dfsplit`	Degrees of freedom per selected split as specified with the `dfsplit` argument.
`iterations`	The number of iterations used to estimate the `lagsarlmtree`.
`maxit`	The maximum number of iterations specified with the `maxit` argument.
`rhowystart`	Offset in estimation of the first tree as specified in the `rhowystart` argument.
`abstol`	The prespecified value for the change in log-likelihood to evaluate convergence, as specified with the `abstol` argument.
`listw`	The `listw` object used.
`mob.control`	A list containing control parameters passed to `lmtree()`, as specified with `...`.

References

Wagner M, Zeileis A (2019). Heterogeneity and Spatial Dependence of Regional Growth in the EU: A Recursive Partitioning Approach. German Economic Review, 20(1), 67–82. doi: 10.1111/geer.12146 https://eeecon.uibk.ac.at/~zeileis/papers/Wagner+Zeileis-2019.pdf

Examples

## data and spatial weights
data("GrowthNUTS2", package = "lagsarlmtree")
data("WeightsNUTS2", package = "lagsarlmtree")

## spatial lag model tree
system.time(tr <- lagsarlmtree(ggdpcap ~ gdpcap0 + shgfcf + shsh + shsm |
    gdpcap0 + accessrail + accessroad + capital + regboarder + regcoast + regobj1 + cee + piigs,
  data = GrowthNUTS2, listw = WeightsNUTS2$invw,
  minsize = 12, alpha = 0.05))
print(tr)
plot(tr, tp_args = list(which = 1))

## query coefficients
coef(tr, model = "tree")
coef(tr, model = "rho")
coef(tr, model = "all")

## same tree, passing precomputed eigenvalues of the weights matrix via control
system.time({
  ev <- eigenw(WeightsNUTS2$invw)
  tr1 <- lagsarlmtree(ggdpcap ~ gdpcap0 + shgfcf + shsh + shsm |
      gdpcap0 + accessrail + accessroad + capital + regboarder + regcoast + regobj1 + cee + piigs,
    data = GrowthNUTS2, listw = WeightsNUTS2$invw, method = "eigen",
    control = list(pre_eig = ev), minsize = 12, alpha = 0.05)
})
coef(tr1, model = "rho")

[Package lagsarlmtree version 1.0-1 Index]