fit_model {disaggregation} | R Documentation |
Fit the disaggregation model
Description
fit_model function takes a disag_data object created by
prepare_data
and performs a Bayesian disaggregation fit.
Usage
fit_model(
data,
priors = NULL,
family = "gaussian",
link = "identity",
iterations = 100,
field = TRUE,
iid = TRUE,
hess_control_parscale = NULL,
hess_control_ndeps = 1e-04,
silent = TRUE
)
disag_model(
data,
priors = NULL,
family = "gaussian",
link = "identity",
iterations = 100,
field = TRUE,
iid = TRUE,
hess_control_parscale = NULL,
hess_control_ndeps = 1e-04,
silent = TRUE
)
Arguments
data |
disag_data object returned by |
priors |
list of prior values |
family |
likelihood function: gaussian, binomial or poisson |
link |
link function: logit, log or identity |
iterations |
number of iterations to run the optimisation for |
field |
logical. Flag the spatial field on or off |
iid |
logical. Flag the iid effect on or off |
hess_control_parscale |
Argument to scale parameters during the calculation of the Hessian.
Must be the same length as the number of parameters. See |
hess_control_ndeps |
Argument to control step sizes during the calculation of the Hessian.
Either length 1 (same step size applied to all parameters) or the same length as the number of parameters.
Default is 1e-3, try setting a smaller value if you get NaNs in the standard error of the parameters.
See |
silent |
logical. Suppress verbose output. |
Details
The model definition
The disaggregation model makes predictions at the pixel level:
link(pred_i) = \beta_0 + \beta X + GP(s_i) + u_i
And then aggregates these predictions to the polygon level using the weighted sum (via the aggregation raster, agg_i
):
cases_j = \sum_{i \epsilon j} pred_i \times agg_i
rate_j = \frac{\sum_{i \epsilon j} pred_i \times agg_i}{\sum_{i \epsilon j} agg_i}
The different likelihood correspond to slightly different models (y_j
is the response count data):
Gaussian: If
\sigma
is the dispersion of the pixel data,\sigma_j
is the dispersion of the polygon data, where\sigma_j = \sigma \sqrt{\sum agg_i^2} / \sum agg_i
dnorm(y_j/\sum agg_i, rate_j, \sigma_j)
- predicts incidence rate.
Binomial: For a survey in polygon j,
y_j
is the number positive andN_j
is the number tested.dbinom(y_j, N_j, rate_j)
- predicts prevalence rate.
Poisson:
dpois(y_j, cases_j)
- predicts incidence count.
Specify priors for the regression parameters, field and iid effect as a single list. Hyperpriors for the field
are given as penalised complexity priors you specify \rho_{min}
and \rho_{prob}
for the range of the field
where P(\rho < \rho_{min}) = \rho_{prob}
, and \sigma_{min}
and \sigma_{prob}
for the variation of the field
where P(\sigma > \sigma_{min}) = \sigma_{prob}
. Also, specify pc priors for the iid effect
The family and link arguments are used to specify the likelihood and link function respectively. The likelihood function can be one of gaussian, poisson or binomial. The link function can be one of logit, log or identity. These are specified as strings.
The field and iid effect can be turned on or off via the field and iid logical flags. Both are default TRUE.
The iterations argument specifies the maximum number of iterations the model can run for to find an optimal point.
The silent argument can be used to publish/suppress verbose output. Default TRUE.
Value
A list is returned of class disag_model
.
The functions summary, print and plot can be used on disag_model
.
The list of class disag_model
contains:
obj |
The TMB model object returned by |
opt |
The optimized model object returned by |
sd_out |
The TMB object returned by |
data |
The disag_data object used as an input to the model. |
model_setup |
A list of information on the model setup. Likelihood function (family), link function(link), logical: whether a field was used (field) and logical: whether an iid effect was used (iid). |
References
Nanda et al. (2023) disaggregation: An R Package for Bayesian Spatial Disaggregation Modeling. <doi:10.18637/jss.v106.i11>
Examples
## Not run:
polygons <- list()
n_polygon_per_side <- 10
n_polygons <- n_polygon_per_side * n_polygon_per_side
n_pixels_per_side <- n_polygon_per_side * 2
for(i in 1:n_polygons) {
row <- ceiling(i/n_polygon_per_side)
col <- ifelse(i %% n_polygon_per_side != 0, i %% n_polygon_per_side, n_polygon_per_side)
xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row
polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin),
c(ymax, ymax, ymin, ymin, ymax)))
}
polys <- lapply(polygons,sf::st_polygon)
N <- floor(runif(n_polygons, min = 1, max = 100))
response_df <- data.frame(area_id = 1:n_polygons, response = runif(n_polygons, min = 0, max = 1000))
spdf <- sf::st_sf(response_df, geometry = polys)
# Create raster stack
r <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side)
terra::ext(r) <- terra::ext(spdf)
r[] <- sapply(1:terra::ncell(r), function(x){
rnorm(1, ifelse(x %% n_pixels_per_side != 0, x %% n_pixels_per_side, n_pixels_per_side), 3))}
r2 <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side)
terra::ext(r2) <- terra::ext(spdf)
r2[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ceiling(x/n_pixels_per_side), 3))
cov_stack <- c(r, r2)
names(cov_stack) <- c('layer1', 'layer2')
test_data <- prepare_data(polygon_shapefile = spdf,
covariate_rasters = cov_stack)
result <- fit_model(test_data, iterations = 2)
## End(Not run)