autologistic {ngspatial} | R Documentation |
Fit a centered autologistic model using maximum pseudolikelihood estimation or MCMC for Bayesian inference.
Description
Fit a centered autologistic model using maximum pseudolikelihood estimation or MCMC for Bayesian inference.
Usage
autologistic(formula, data, A, method = c("PL", "Bayes"), model = TRUE,
x = FALSE, y = FALSE, verbose = FALSE, control = list())
Arguments
formula |
an object of class |
data |
an optional data frame, list, or environment (or object coercible by |
A |
the adjacency matrix for the underlying graph. The matrix need not be binary, but it must be numeric and symmetric. |
method |
the method to use for inference. “ |
model |
a logical value indicating whether the model frame should be included as a component of the returned value. |
x |
a logical value indicating whether the model matrix used in the fitting process should be returned as a component of the returned value. |
y |
a logical value indicating whether the response vector used in the fitting process should be returned as a component of the returned value. |
verbose |
a logical value indicating whether to print various messages to the screen, including progress updates when |
control |
a list of the following control parameters.
|
Details
This function fits the centered autologistic model of Caragea and Kaiser (2009) using maximum pseudolikelihood estimation or MCMC for Bayesian inference. The joint distribution for the centered autologistic model is
\pi(Z\mid\theta)=c(\theta)^{-1}\exp\left(Z^\prime X\beta - \eta Z^\prime A\mu + \frac{\eta}{2}Z^\prime AZ\right),
where \theta = (\beta^\prime, \eta)^\prime
is the parameter vector, c(\theta)
is an intractable normalizing function, Z
is the response vector, X
is the design matrix,
\beta
is a (p-1)
-vector of regression coefficients, A
is the adjacency matrix for the underlying graph, \mu
is the vector of independence expectations,
and \eta
is the spatial dependence parameter.
Maximum pseudolikelihood estimation sidesteps the intractability of c(\theta)
by maximizing the product of the conditional likelihoods.
Confidence intervals are then obtained using a parametric bootstrap or a normal approximation, i.e., sandwich estimation. The bootstrap datasets are generated by perfect sampling (rautologistic
).
The bootstrap samples can be generated in parallel using the parallel package.
Bayesian inference is obtained using the auxiliary variable algorithm of Moller et al. (2006).
The auxiliary variables are generated by perfect sampling.
The prior distributions are (1) zero-mean normal with independent coordinates for \beta
, and (2) uniform for \eta
.
The common standard deviation for the normal prior can be supplied by the user as control parameter sigma
. The default is 1,000. The uniform prior has support [0, 2] by default, but the right endpoint can be supplied (as control parameter eta.max
) by the user.
The posterior covariance matrix of \theta
is estimated using samples obtained during a training run. The default number of iterations for the training run is 100,000, but this can be controlled by the user (via parameter trainit
). The estimated covariance matrix is then used as the proposal variance for a Metropolis-Hastings random walk. The proposal distribution is normal. The posterior samples obtained during the second run are used for inference. The length of the run can be controlled by the user via parameters minit
, maxit
, and tol
. The first determines the minimum number of iterations. If minit
has been reached, the sampler will terminate when maxit
is reached or all Monte Carlo standard errors are smaller than tol
, whichever happens first. The default values for minit
, maxit
, and tol
are 100,000; 1,000,000; and 0.01, respectively.
Value
autologistic
returns an object of class “autologistic
”, which is a list containing the following components.
coefficients |
the point estimate of |
fitted.values |
the fitted mean values, obtained by transforming the linear predictors by the inverse of the link function. |
linear.predictors |
the linear fit on link scale. |
residuals |
the response residuals. |
iter |
the size of the bootstrap/posterior sample. |
sample |
(where relevant) an |
mcse |
(where relevant) a |
S |
(where relevant) the estimated sandwich matrix. |
accept |
(Bayes) the acceptance rate for the MCMC sampler. |
y |
if requested (the default), the |
X |
if requested, the model matrix. |
model |
if requested (the default), the model frame. |
call |
the matched call. |
formula |
the formula supplied. |
method |
the method used for inference. |
convergence |
the integer code returned by |
message |
a character string to go along with |
terms |
the |
data |
the |
xlevels |
(where relevant) a record of the levels of the factors used in fitting. |
control |
a list containing the names and values of the control parameters. |
value |
the value of the negative log pseudolikelihood at its minimum. |
References
Caragea, P. and Kaiser, M. (2009) Autologistic models with interpretable parameters. Journal of Agricultural, Biological, and Environmental Statistics, 14(3), 281–300.
Hughes, J., Haran, M. and Caragea, P. C. (2011) Autologistic models for binary data on a lattice. Environmetrics, 22(7), 857–871.
Moller, J., Pettitt, A., Berthelsen, K., and Reeves, R. (2006) An efficient Markov chain Monte Carlo method for distributions with intractable normalising constants. Biometrika, 93(2), 451–458.
See Also
rautologistic
, residuals.autologistic
, summary.autologistic
, vcov.autologistic
Examples
# Use the 20 x 20 square lattice as the underlying graph.
n = 20
A = adjacency.matrix(n)
# Assign coordinates to each vertex such that the coordinates are restricted to the unit square
# centered at the origin.
x = rep(0:(n - 1) / (n - 1), times = n) - 0.5
y = rep(0:(n - 1) / (n - 1), each = n) - 0.5
X = cbind(x, y) # Use the vertex locations as spatial covariates.
beta = c(2, 2) # These are the regression coefficients.
col1 = "white"
col2 = "black"
colfunc = colorRampPalette(c(col1, col2))
# Simulate a dataset with the above mentioned regression component and eta equal to 0.6. This
# value of eta corresponds to dependence that is moderate in strength.
theta = c(beta, eta = 0.6)
set.seed(123456)
Z = rautologistic(X, A, theta)
# Find the MPLE, and do not compute confidence intervals.
fit = autologistic(Z ~ X - 1, A = A, control = list(confint = "none"))
summary(fit)
# The following examples are not executed by default since the computation is time consuming.
# Compute confidence intervals based on the normal approximation. Estimate the "filling" in the
# sandwich matrix using a parallel parametric bootstrap, where the computation is distributed
# across six cores on the host workstation.
# set.seed(123456)
# fit = autologistic(Z ~ X - 1, A = A, verbose = TRUE,
# control = list(confint = "sandwich", nodes = 6))
# summary(fit)
# Compute confidence intervals based on a parallel parametric bootstrap. Use a bootstrap sample
# of size 500, and distribute the computation across six cores on the host workstation.
# set.seed(123456)
# fit = autologistic(Z ~ X - 1, A = A, verbose = TRUE,
# control = list(confint = "bootstrap", bootit = 500, nodes = 6))
# summary(fit)
# Do MCMC for Bayesian inference. The length of the training run will be 10,000, and
# the length of the subsequent inferential run will be at least 10,000.
# set.seed(123456)
# fit = autologistic(Z ~ X - 1, A = A, verbose = TRUE, method = "Bayes",
# control = list(trainit = 10000, minit = 10000, sigma = 1000))
# summary(fit)