R: Fit a sparse SGLMM.

sparse.sglmm {ngspatial}

R Documentation

Fit a sparse SGLMM.

Description

Fit a sparse SGLMM.

Usage

sparse.sglmm(formula, family = gaussian, data, offset, A, method = c("BSF",
  "RSR"), attractive = 50, repulsive = 0, tol = 0.01, minit = 10000,
  maxit = 1e+06, tune = list(), hyper = list(), model = TRUE,
  x = FALSE, y = FALSE, verbose = FALSE, parallel = FALSE)

Arguments

`formula`	an object of class `formula`: a symbolic description of the model to be fitted.
`family`	a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function, or the result of a call to a family function. (See `family` for details of family functions.) Supported families are `binomial`, `gaussian` (default), `negbinomial`, and `poisson`.
`data`	an optional data frame, list, or environment (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. If not found in `data`, the variables are taken from `environment(formula)`, typically the environment from which `sparse.sglmm` is called.
`offset`	this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be `NULL` or a numeric vector of length equal to the number of cases. One or more `offset` terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See `model.offset`.
`A`	the adjacency matrix for the underlying graph.
`method`	the basis to use. The options are Bayesian spatial filtering (“`BSF`”) and restricted spatial regression (“`RSR`”).
`attractive`	the number of attractive Moran eigenvectors to use. The default is 50. See ‘Details’ for more information.
`repulsive`	the number of repulsive Moran eigenvectors to use. The default is 0. See ‘Details’ for more information.
`tol`	a tolerance. If all Monte Carlo standard errors are smaller than `tol`, no more samples are drawn from the posterior. The default is 0.01.
`minit`	the minimum sample size. This should be large enough to permit accurate estimation of Monte Carlo standard errors. The default is 10,000.
`maxit`	the maximum sample size. Sampling from the posterior terminates when all Monte Carlo standard errors are smaller than `tol` or when `maxit` samples have been drawn, whichever happens first. The default is 1,000,000.
`tune`	(where relevant) a list containing `sigma.s`, `sigma.h`, and `sigma.theta`. These are the standard deviations for the `\gamma`, `\delta`, and `\theta` proposals, respectively.
`hyper`	a list containing `sigma.b`, the prior standard deviation for `\beta`, and (where relevant) `a.h` and `b.h`, the parameters of the gamma prior for `\tau_h`.
`model`	a logical value indicating whether the model frame should be included as a component of the returned value.
`x`	a logical value indicating whether the model matrix used in the fitting process should be returned as a component of the returned value.
`y`	a logical value indicating whether the response vector used in the fitting process should be returned as a component of the returned value.
`verbose`	a logical value indicating whether to print MCMC progress to the screen. Defaults to `FALSE`.
`parallel`	(for parallelized computation of the Moran operator) a list containing `type` and `nodes`, the cluster type and number of slave nodes, respectively. The former must be one of “`FORK`”, “`MPI`”, “`NWS`”, “`PSOCK`”, or “`SOCK`” (default). The latter must be a whole number greater than 1. This argument defaults to `FALSE`, in which case the matrix multiplications are not parallelized.

Details

This function fits the sparse restricted spatial regression model of Hughes and Haran (2013), or the Bayesian spatial filtering model of Hughes (2017). The first stage of the model is

g(\mu_i)=x_i^\prime\beta+m_i^\prime\gamma\hspace{1 cm}(i=1,\dots,n)

or, in vectorized form,

g(\mu)=X\beta+M\gamma,

where X is the design matrix, \beta is a p-vector of regression coefficients, the columns of M are q eigenvectors of the Moran operator, and \gamma are spatial random effects. Arguments attractive and repulsive can be used to control the number of eigenvectors used. The default values are 50 and 0, respectively, which corresponds to pure spatial smoothing. Inclusion of some repulsive eigenvectors can be advantageous in certain applications.

The second stage, i.e., the prior for \gamma, is

p(\gamma\mid\tau_s)\propto\tau_s^{q/2}\exp\left(-\frac{\tau_s}{2}\gamma^\prime M^\prime QM\gamma\right),

where \tau_s is a smoothing parameter and Q is the graph Laplacian.

The prior for \beta is spherical p-variate normal with mean zero and common standard deviation sigma.b, which defaults to 1,000. The prior for \tau_s is gamma with parameters 0.5 and 2,000. The same prior is used for \theta (when family is negbinomial).

When the response is normally distributed, the identity link is assumed, in which case the first stage is

\mu=X\beta+M\gamma+M\delta,

where \delta are heterogeneity random effects. When the response is Poisson distributed, heterogeneity random effects are optional. In any case, the prior on \delta is spherical q-variate normal with mean zero and common variance 1/\tau_h. The prior for \tau_h is gamma with parameters a_h and b_h, the values of which are controlled by the user through argument hyper.

If the response is Bernoulli, negative binomial, or Poisson, \beta and \gamma are updated using Metropolis-Hastings random walks with normal proposals. The proposal covariance matrix for \beta is the estimated asymptotic covariance matrix from a glm fit to the data (see vcov). The proposal for \gamma is spherical normal with common standard deviation sigma.s.

The updates for \tau_s and \tau_h are Gibbs updates irrespective of the response distribution.

If the response is Poisson distributed and heterogeneity random effects are included, those random effects are updated using a Metropolis-Hastings random walk with a spherical normal proposal. The common standard deviation is sigma.h.

If the response is normally distributed, all updates are Gibbs updates.

If the response is negative binomial, the dispersion parameter \theta is updated using a Metropolis-Hastings random walk with a normal proposal. Said proposal has standard deviation sigma.theta, which can be provided by the user as an element of argument tune.

Value

sparse.sglmm returns an object of class “sparse.sglmm”, which is a list containing the following components.

`coefficients`	the estimated regression coefficients.
`fitted.values`	the fitted mean values, obtained by transforming the linear predictors by the inverse of the link function.
`linear.predictors`	the linear fit on link scale.
`residuals`	the response residuals.
`iter`	the size of the posterior sample.
`beta.sample`	an `iter` by `p` matrix containing the posterior samples for `\beta`.
`gamma.sample`	an `iter` by `q` matrix containing the posterior samples for `\gamma`.
`delta.sample`	(where relevant) an `iter` by `q` matrix containing the posterior samples for `\delta`.
`theta.sample`	(where relevant) a vector containing the posterior samples for `\theta`.
`tau.s.sample`	a vector containing the posterior samples for `\tau_s`.
`tau.h.sample`	(where relevant) a vector containing the posterior samples for `\tau_h`.
`gamma.est`	the estimate of `\gamma`.
`delta.est`	(where relevant) the estimate of `\delta`.
`tau.s.est`	the estimate of `\tau_s`.
`tau.h.est`	(where relevant) the estimate of `\tau_h`.
`theta.est`	(where relevant) the estimate of `\theta`.
`beta.mcse`	the Monte Carlo standard errors for `\beta`.
`gamma.mcse`	the Monte Carlo standard errors for `\gamma`.
`delta.mcse`	(where relevant) the Monte Carlo standard errors for `\delta`.
`tau.s.mcse`	the Monte Carlo standard error for `\tau_s`.
`tau.h.mcse`	(where relevant) the Monte Carlo standard error for `\tau_h`.
`theta.mcse`	(where relevant) the Monte Carlo standard error for `\theta`.
`D.bar`	the goodness of fit component of the DIC.
`pD`	the penalty component of the DIC.
`dic`	the deviance information criterion.
`beta.accept`	the acceptance rate for `\beta`.
`gamma.accept`	the acceptance rate for `\gamma`.
`delta.accept`	(where relevant) the acceptance rate for `\delta`.
`theta.accept`	(where relevant) the acceptance rate for `\theta`.
`y`	if requested (the default), the `y` vector used.
`X`	if requested, the model matrix.
`M`	if requested, the matrix of Moran eigenvectors.
`eigen.values`	if requested, the spectrum of the Moran operator.
`hyper`	a list containing the names and values of the hyperparameters.
`tune`	a list containing the names and values of the tuning parameters.
`model`	if requested (the default), the model frame.
`call`	the matched call.
`formula`	the formula supplied.
`terms`	the `terms` object used.
`data`	the `data` argument.
`offset`	the offset vector used.
`xlevels`	(where relevant) a record of the levels of the factors used in fitting.

References

Hughes, J. and Haran, M. (2013) Dimension reduction and alleviation of confounding for spatial generalized linear mixed models. Journal of the Royal Statistical Society, Series B, 75(1), 139–159.

Examples

## Not run: 

The following code duplicates the analysis described in (Hughes and Haran, 2013). The data are
infant mortality data for 3,071 US counties. We do a spatial Poisson regression with offset.

data(infant)
infant$low_weight = infant$low_weight / infant$births
attach(infant)
Z = deaths
X = cbind(1, low_weight, black, hispanic, gini, affluence, stability)
data(A)
set.seed(123456)
fit = sparse.sglmm(Z ~ X - 1 + offset(log(births)), family = poisson, A = A, method = "RSR",
                   tune = list(sigma.s = 0.02), verbose = TRUE)
summary(fit)

## End(Not run)

[Package ngspatial version 1.2-2 Index]

Fit a sparse SGLMM.

Description

Usage

Arguments

Details

Value

References

See Also

Examples