CAR_INLA {bigDM} | R Documentation |
Fit a (scalable) spatial Poisson mixed model to areal count data, where several CAR prior distributions can be specified for the spatial random effect.
Description
Fit a spatial Poisson mixed model to areal count data. The linear predictor is modelled as
$$\log{r_{i}}=\alpha+\mathbf{x}_i^{'}\boldsymbol{\beta} + \xi_i, \quad \text{for} \quad i=1,\ldots,n;$$
where $\alpha$ is a global intercept, $\mathbf{x}_i^{'}=(x_{i1},\ldots,x_{ip})$ is a p-vector of standardized covariates in the i-th area, $\boldsymbol{\beta}=(\beta_1,\ldots,\beta_p)$ is the p-vector of fixed effects coefficients, and $\xi_i$ is a spatially structured random effect.
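As a minimal sketch of the linear predictor above, the following base R code evaluates it for a toy data set. All numeric values (the intercept, the coefficients, the random effect and the covariate matrix) are made up for illustration, and the code says nothing about how CAR_INLA builds the model internally.

```r
# Toy evaluation of log(r_i) = alpha + x_i' beta + xi_i (all values are hypothetical)
set.seed(1)
n <- 5                                    # number of areas
p <- 2                                    # number of covariates
X.raw <- matrix(rnorm(n * p, mean = 10, sd = 3), nrow = n, ncol = p)
X <- scale(X.raw)                         # standardize: column mean 0, column sd 1
alpha <- -0.1                             # global intercept
beta <- c(0.3, -0.2)                      # fixed effects coefficients
xi <- c(0.05, -0.10, 0.00, 0.08, -0.03)   # stand-in spatial random effect
log.r <- as.vector(alpha + X %*% beta + xi)
r <- exp(log.r)                           # values on the r_i scale
```

Because the covariates are standardized, each coefficient in `beta` is read as the change in the log scale per one standard deviation of the corresponding covariate.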
Several conditional autoregressive (CAR) prior distributions can be specified for the spatial random effect, such as the intrinsic CAR prior (Besag et al. 1991), the convolution or BYM prior (Besag et al. 1991),
the CAR prior proposed by Leroux et al. (1999), and the reparameterization of the BYM model given by Dean et al. (2001) named BYM2 (Riebler et al. 2016).
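To make the difference between these priors concrete, here is a small base R sketch of the precision matrices usually written for the intrinsic CAR and Leroux priors. It assumes the common textbook form tau * (lambda * (D - W) + (1 - lambda) * I); the symbols `tau` and `lambda`, the helper `leroux.precision` and the 4-area map are illustrative assumptions, not the parameterization used internally by R-INLA or bigDM.

```r
# Binary adjacency matrix of a hypothetical 4-area map arranged in a line: 1-2-3-4
W <- matrix(0, 4, 4)
W[cbind(1:3, 2:4)] <- 1
W <- W + t(W)

# Hypothetical helper: Leroux-type precision tau * (lambda * (D - W) + (1 - lambda) * I)
leroux.precision <- function(W, tau, lambda) {
  R <- diag(rowSums(W)) - W               # structure matrix D - W (rows sum to zero)
  tau * (lambda * R + (1 - lambda) * diag(nrow(W)))
}

Q.icar <- leroux.precision(W, tau = 1, lambda = 1)    # reduces to the intrinsic CAR
Q.iid  <- leroux.precision(W, tau = 1, lambda = 0)    # independent random effects
Q.mix  <- leroux.precision(W, tau = 2, lambda = 0.5)  # in between the two extremes
```

With `lambda = 1` the matrix is rank deficient, which is why the intrinsic CAR prior is improper, while any `lambda` strictly below 1 gives a positive definite precision.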
If covariates are included in the model, two different approaches can be used to address the potential confounding issues between the fixed effects and the spatial random effects of the model: restricted regression and the use of orthogonality constraints.
At the moment, only continuous covariates can be included in the model as potential risk factors; they are automatically standardized before fitting the model. See Adin et al. (2021) for further details.
Three main modelling approaches can be considered:
- the usual model with a global spatial random effect whose dependence structure is based on the whole neighbourhood graph of the areal units (model="global" argument);
- a Disjoint model based on a partition of the whole spatial domain where independent spatial CAR models are simultaneously fitted in each partition (model="partition" and k=0 arguments);
- a modelling approach where k-order neighbours are added to each partition to avoid border effects in the Disjoint model (model="partition" and k>0 arguments).
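The idea of adding k-order neighbours to a partition can be sketched in base R as follows. The function `extend.partition`, the 6-area map and the group labels are hypothetical and serve only to illustrate the concept; this is not the routine the package uses to build its sub-domains.

```r
# Hypothetical map: 6 areas on a line (1-2-3-4-5-6) split into two partitions
W <- matrix(0, 6, 6)
W[cbind(1:5, 2:6)] <- 1
W <- W + t(W)
group <- c("A", "A", "A", "B", "B", "B")

# Areas of partition `g` plus all of their neighbours up to order k
extend.partition <- function(W, group, g, k) {
  inside <- group == g
  for (step in seq_len(k)) {
    # areas adjacent to at least one area currently in the set
    touched <- as.vector(W %*% as.numeric(inside)) > 0
    inside <- inside | touched
  }
  which(inside)
}

extend.partition(W, group, "A", k = 0)  # areas 1 2 3 (Disjoint model)
extend.partition(W, group, "A", k = 1)  # areas 1 2 3 4
extend.partition(W, group, "A", k = 2)  # areas 1 2 3 4 5
```

With `k = 0` each partition is fitted on its own areas only; with `k > 0` areas near a border also appear in the neighbouring sub-domain, which is the overlap that lets the k-order model reduce border effects.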
For both the Disjoint and k-order neighbour models, parallel or distributed computation strategies can be used to speed up computations through the 'future' package (Bengtsson 2021).
Inference is conducted in a fully Bayesian setting using the integrated nested Laplace approximation (INLA; Rue et al. (2009)) technique through the R-INLA package (https://www.r-inla.org/). For the scalable model proposals (Orozco-Acosta et al. 2021), approximate values of the Deviance Information Criterion (DIC) and Watanabe-Akaike Information Criterion (WAIC) can also be computed.
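For readers unfamiliar with these criteria, the sketch below computes DIC and WAIC from posterior samples of a Poisson model using their standard definitions. The counts, the expected cases and the "posterior samples" are simulated stand-ins rather than INLA output, and the exact approximation implemented in the package may differ from this illustration.

```r
set.seed(123)
S <- 1000                                  # number of posterior samples
O <- c(3, 7, 2, 10, 5, 4)                  # observed counts (made up)
E <- c(4.0, 6.5, 2.5, 8.0, 5.5, 4.2)       # expected counts (made up)
n <- length(O)

# Stand-in for S posterior draws of log(r_i), one column per area
log.r <- matrix(rnorm(S * n, mean = rep(log(O / E), each = S), sd = 0.2), nrow = S)
mu <- exp(log.r) * rep(E, each = S)        # Poisson means, S x n

# Pointwise log-likelihood of each observation under each draw (S x n)
loglik <- matrix(dpois(rep(O, each = S), lambda = as.vector(mu), log = TRUE), nrow = S)

# DIC = mean deviance + effective number of parameters
mean.dev <- mean(-2 * rowSums(loglik))
dev.hat <- -2 * sum(dpois(O, lambda = colMeans(mu), log = TRUE))
pD <- mean.dev - dev.hat
DIC <- mean.dev + pD

# WAIC = -2 * (log pointwise predictive density - variance-based penalty)
lppd <- sum(log(colMeans(exp(loglik))))
pW <- sum(apply(loglik, 2, var))
WAIC <- -2 * (lppd - pW)
```

Here the deviance is evaluated at the posterior mean of the Poisson means; other plug-in choices are possible and lead to slightly different DIC values.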
The function also supports the hybrid approximate method that combines the Laplace method with a low-rank Variational Bayes correction to the posterior mean (van Niekerk et al. 2023), which is enabled by setting the inla.mode="compact" argument.
Usage
CAR_INLA(
  carto = NULL,
  ID.area = NULL,
  ID.group = NULL,
  O = NULL,
  E = NULL,
  X = NULL,
  confounding = NULL,
  W = NULL,
  prior = "Leroux",
  model = "partition",
  k = 0,
  strategy = "simplified.laplace",
  PCpriors = FALSE,
  merge.strategy = "original",
  compute.intercept = NULL,
  compute.DIC = TRUE,
  n.sample = 1000,
  compute.fitted.values = FALSE,
  save.models = FALSE,
  plan = "sequential",
  workers = NULL,
  inla.mode = "classic",
  num.threads = NULL
)
Arguments
carto |
object of class |
ID.area |
character; name of the variable that contains the IDs of spatial areal units. |
ID.group |
character; name of the variable that contains the IDs of the spatial partition (grouping variable).
Only required if |
O |
character; name of the variable that contains the observed number of disease cases for each areal unit. |
E |
character; name of the variable that contains either the expected number of disease cases or the population at risk for each areal unit. |
X |
a character vector containing the names of the covariates within the |
confounding |
one of either |
W |
optional argument with the binary adjacency matrix of the spatial areal units. If |
prior |
one of either |
model |
one of either |
k |
numeric value with the neighbourhood order used for the partition model. Usually k=2 or 3 is enough to get good results.
If k=0 (default) the Disjoint model is considered. Only required if |
strategy |
one of either |
PCpriors |
logical value (default |
merge.strategy |
one of either |
compute.intercept |
CAUTION! This argument is deprecated from version 0.5.2. |
compute.DIC |
logical value; if |
n.sample |
numeric; number of samples to generate from the posterior marginal distribution of the linear predictor when computing approximate DIC/WAIC values. Defaults to 1000. |
compute.fitted.values |
logical value (default |
save.models |
logical value (default |
plan |
one of either |
workers |
character or vector (default |
inla.mode |
one of either |
num.threads |
maximum number of threads the inla-program will use. See |
Details
For a full model specification and further details see the vignettes accompanying this package.
Value
This function returns an object of class inla. See the mergeINLA function for details.
References
Adin A, Goicoa T, Hodges JS, Schnell P, Ugarte MD (2021). “Alleviating confounding in spatio-temporal areal models with an application on crimes against women in India.” Statistical Modelling, 1471082X211015452. doi:10.1177/1471082X211015452.
Bengtsson H (2021). “A unifying framework for parallel and distributed processing in R using futures.” The R Journal, 13(2), 273–291. doi:10.32614/RJ-2021-048.
Besag J, York J, Mollié A (1991). “Bayesian image restoration, with two applications in spatial statistics.” Annals of the Institute of Statistical Mathematics, 43(1), 1–20. doi:10.1007/bf00116466.
Dean CB, Ugarte MD, Militino AF (2001). “Detecting interaction between random region and fixed age effects in disease mapping.” Biometrics, 57(1), 197–202. doi:10.1111/j.0006-341x.2001.00197.x.
Leroux BG, Lei X, Breslow N (1999). “Estimation of disease rates in small areas: A new mixed model for spatial dependence.” In Halloran ME, Berry D (eds.), Statistical Models in Epidemiology, the Environment, and Clinical Trials, 179–191. Springer-Verlag: New York.
Riebler A, Sørbye SH, Simpson D, Rue H (2016). “An intuitive Bayesian spatial model for disease mapping that accounts for scaling.” Statistical Methods in Medical Research, 25(4), 1145–1165. doi:10.1177/0962280216660421.
Rue H, Martino S, Chopin N (2009). “Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(2), 319–392. doi:10.1111/j.1467-9868.2008.00700.x.
Orozco-Acosta E, Adin A, Ugarte MD (2021). “Scalable Bayesian modeling for smoothing disease mapping risks in large spatial data sets using INLA.” Spatial Statistics, 41, 100496. doi:10.1016/j.spasta.2021.100496.
van Niekerk J, Krainski E, Rustand D, Rue H (2023). “A new avenue for Bayesian inference with INLA.” Computational Statistics & Data Analysis, 181, 107692. doi:10.1016/j.csda.2023.107692.
Examples
## Not run:
if(require("INLA", quietly=TRUE)){

  ## Load the Spain colorectal cancer mortality data ##
  data(Carto_SpainMUN)

  ## Global model with a Leroux CAR prior distribution ##
  Global <- CAR_INLA(carto=Carto_SpainMUN, ID.area="ID", O="obs", E="exp",
                     prior="Leroux", model="global", strategy="gaussian")
  summary(Global)

  ## Disjoint model with a Leroux CAR prior distribution ##
  ## using 4 local clusters to fit the models in parallel ##
  Disjoint <- CAR_INLA(carto=Carto_SpainMUN, ID.area="ID", ID.group="region", O="obs", E="exp",
                       prior="Leroux", model="partition", k=0, strategy="gaussian",
                       plan="cluster", workers=rep("localhost",4))
  summary(Disjoint)

  ## 1st-order neighbourhood model with a Leroux CAR prior distribution ##
  ## using 4 local clusters to fit the models in parallel ##
  order1 <- CAR_INLA(carto=Carto_SpainMUN, ID.area="ID", ID.group="region", O="obs", E="exp",
                     prior="Leroux", model="partition", k=1, strategy="gaussian",
                     plan="cluster", workers=rep("localhost",4))
  summary(order1)

  ## 2nd-order neighbourhood model with a Leroux CAR prior distribution ##
  ## using 4 local clusters to fit the models in parallel ##
  order2 <- CAR_INLA(carto=Carto_SpainMUN, ID.area="ID", ID.group="region", O="obs", E="exp",
                     prior="Leroux", model="partition", k=2, strategy="gaussian",
                     plan="cluster", workers=rep("localhost",4))
  summary(order2)
}
## End(Not run)