MCAR_INLA {bigDM} | R Documentation |
Fit a (scalable) spatial multivariate Poisson mixed model to areal count data where dependence between spatial patterns of the diseases is addressed through the use of M-models (Botella-Rocamora et al. 2015).
Description
Fit a spatial multivariate Poisson mixed model to areal count data. The linear predictor is modelled as
\log{r_{ij}}=\alpha_j + \theta_{ij}, \quad \mbox{for} \quad i=1,\ldots,I; \quad j=1,\ldots,J
where \alpha_j is a disease-specific intercept and \theta_{ij} is the spatial main effect of area i for the j-th disease.
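For orientation, the likelihood behind this linear predictor can be sketched as follows. The symbols O_{ij} and E_{ij} (observed and expected counts, matching the O and E arguments) and \mu_{ij} are notation assumed here, following the usual disease-mapping convention rather than anything stated on this page:

```latex
O_{ij} \mid r_{ij} \sim \mbox{Poisson}(\mu_{ij} = E_{ij} \, r_{ij}), \qquad
\log{\mu_{ij}} = \log{E_{ij}} + \log{r_{ij}} = \log{E_{ij}} + \alpha_j + \theta_{ij}
```

Under this reading, \log{E_{ij}} acts as an offset and r_{ij} is the relative risk of the j-th disease in area i.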
Following Botella-Rocamora et al. (2015), we rearrange the spatial effects into the matrix \mathbf{\Theta} = \{ \theta_{ij}: i=1, \ldots, I; j=1, \ldots, J \}, whose columns are the disease-specific spatial random effects and whose joint distribution specifies how within-disease and between-disease dependence is defined.
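The M-model construction of Botella-Rocamora et al. (2015) can be sketched as a matrix factorisation. The symbols \mathbf{\Phi}, \mathbf{M} and \mathbf{\Sigma}_b are notation assumed for this sketch and are not defined elsewhere on this page:

```latex
\mathbf{\Theta} = \mathbf{\Phi} \mathbf{M}, \qquad
\mathbf{\Phi} = (\boldsymbol{\phi}_1, \ldots, \boldsymbol{\phi}_J) \in \mathbb{R}^{I \times J}, \qquad
\mathbf{M} \in \mathbb{R}^{J \times J}, \qquad
\mathbf{\Sigma}_b = \mathbf{M}^{\prime} \mathbf{M}
```

Here the columns of \mathbf{\Phi} are independent spatial vectors, each following a CAR prior (within-disease dependence), while \mathbf{M} mixes those columns so that \mathbf{M}^{\prime} \mathbf{M} plays the role of the between-disease covariance matrix.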
Several conditional autoregressive (CAR) prior distributions can be specified to deal with spatial dependence within-diseases, such as the intrinsic CAR prior (Besag et al. 1991), the CAR prior proposed by Leroux et al. (1999), and the proper CAR prior distribution.
As in the CAR_INLA function, three main modelling approaches can be considered:
the usual model with a global spatial random effect whose dependence structure is based on the whole neighbourhood graph of the areal units (model="global" argument);
a Disjoint model based on a partition of the whole spatial domain where independent spatial CAR models are simultaneously fitted in each partition (model="partition" and k=0 arguments);
a modelling approach where k-order neighbours are added to each partition to avoid border effects in the Disjoint model (model="partition" and k>0 arguments).
For both the Disjoint and k-order neighbour models, parallel or distributed computation strategies can be used to speed up computations via the 'future' package (Bengtsson 2021).
Inference is conducted in a fully Bayesian setting using the integrated nested Laplace approximation (INLA; Rue et al. (2009)) technique through the R-INLA package (https://www.r-inla.org/). For the scalable model proposals (Orozco-Acosta et al. 2021), approximate values of the Deviance Information Criterion (DIC) and Watanabe-Akaike Information Criterion (WAIC) can also be computed.
The function also supports the hybrid approximate method that combines the Laplace method with a low-rank Variational Bayes correction to the posterior mean (van Niekerk et al. 2023), enabled by setting the inla.mode="compact" argument.
Usage
MCAR_INLA(
carto = NULL,
data = NULL,
ID.area = NULL,
ID.disease = NULL,
ID.group = NULL,
O = NULL,
E = NULL,
W = NULL,
prior = "intrinsic",
model = "partition",
k = 0,
strategy = "simplified.laplace",
merge.strategy = "original",
compute.intercept = NULL,
compute.DIC = TRUE,
n.sample = 1000,
compute.fitted.values = FALSE,
save.models = FALSE,
plan = "sequential",
workers = NULL,
inla.mode = "classic",
num.threads = NULL
)
Arguments
carto |
object of class |
data |
object of class |
ID.area |
character; name of the variable that contains the IDs of spatial areal units. The values of this variable must match those given in the |
ID.disease |
character; name of the variable that contains the IDs of the diseases. |
ID.group |
character; name of the variable that contains the IDs of the spatial partition (grouping variable). Only required if |
O |
character; name of the variable that contains the observed number of cases for each areal unit and disease. |
E |
character; name of the variable that contains either the expected number of cases or the population at risk for each areal unit and disease. |
W |
optional argument with the binary adjacency matrix of the spatial areal units. If |
prior |
one of either |
model |
one of either |
k |
numeric value with the neighbourhood order used for the partition model. Usually k=2 or 3 is enough to get good results. If k=0 (default) the Disjoint model is considered. Only required if |
strategy |
one of either |
merge.strategy |
one of either |
compute.intercept |
CAUTION! This argument is deprecated from version 0.5.2. |
compute.DIC |
logical value; if |
n.sample |
numeric; number of samples to generate from the posterior marginal distribution of the linear predictor when computing approximate DIC/WAIC values. Defaults to 1000. |
compute.fitted.values |
logical value (default |
save.models |
logical value (default |
plan |
one of either |
workers |
character or vector (default |
inla.mode |
one of either |
num.threads |
maximum number of threads the inla-program will use. See |
Details
For a full model specification and further details see the vignettes accompanying this package.
Value
This function returns an object of class inla. See the mergeINLA function for details.
References
Bengtsson H (2021). “A unifying framework for parallel and distributed processing in R using futures.” The R Journal, 13(2), 273–291. doi:10.32614/RJ-2021-048.
Besag J, York J, Mollié A (1991). “Bayesian image restoration, with two applications in spatial statistics.” Annals of the Institute of Statistical Mathematics, 43(1), 1–20. doi:10.1007/bf00116466.
Botella-Rocamora P, Martinez-Beneito MA, Banerjee S (2015). “A unifying modeling framework for highly multivariate disease mapping.” Statistics in Medicine, 34(9), 1548–1559. doi:10.1002/sim.6423.
Leroux BG, Lei X, Breslow N (1999). “Estimation of disease rates in small areas: A new mixed model for spatial dependence.” In Halloran ME, Berry D (eds.), Statistical Models in Epidemiology, the Environment, and Clinical Trials, 179–191. Springer-Verlag: New York.
Rue H, Martino S, Chopin N (2009). “Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(2), 319–392. doi:10.1111/j.1467-9868.2008.00700.x.
Vicente G, Adin A, Goicoa T, Ugarte MD (2023). “High-dimensional order-free multivariate spatial disease mapping.” Statistics and Computing, 33(5), 104. doi:10.1007/s11222-023-10263-x.
van Niekerk J, Krainski E, Rustand D, Rue H (2023). “A new avenue for Bayesian inference with INLA.” Computational Statistics & Data Analysis, 181, 107692. doi:10.1016/j.csda.2023.107692.
Examples
## Not run:
if(require("INLA", quietly=TRUE)){

  ## Load the sf object that contains the spatial polygons of the municipalities of Spain ##
  data(Carto_SpainMUN)
  str(Carto_SpainMUN)

  ## Load the simulated cancer mortality data (three diseases) ##
  data(Data_MultiCancer)
  str(Data_MultiCancer)

  ## Fit the Global model with an iCAR prior for the within-disease random effects ##
  Global <- MCAR_INLA(carto=Carto_SpainMUN, data=Data_MultiCancer,
                      ID.area="ID", ID.disease="disease", O="obs", E="exp",
                      prior="intrinsic", model="global", strategy="gaussian")
  summary(Global)

  ## Fit the Disjoint model with an iCAR prior for the within-disease random effects ##
  ## using 4 local clusters to fit the models in parallel ##
  Disjoint <- MCAR_INLA(carto=Carto_SpainMUN, data=Data_MultiCancer,
                        ID.area="ID", ID.disease="disease", O="obs", E="exp", ID.group="region",
                        prior="intrinsic", model="partition", k=0, strategy="gaussian",
                        plan="cluster", workers=rep("localhost",4))
  summary(Disjoint)

  ## 1st-order neighbourhood model with an iCAR prior for the within-disease random effects ##
  ## using 4 local clusters to fit the models in parallel ##
  order1 <- MCAR_INLA(carto=Carto_SpainMUN, data=Data_MultiCancer,
                      ID.area="ID", ID.disease="disease", O="obs", E="exp", ID.group="region",
                      prior="intrinsic", model="partition", k=1, strategy="gaussian",
                      plan="cluster", workers=rep("localhost",4))
  summary(order1)
}
## End(Not run)