fit_sae {tipsae}    R Documentation

Fitting a Small Area Model

Description

fit_sae() fits Beta-based small area models, such as the classical Beta, the zero and/or one inflated Beta and the Flexible Beta models. The random effect part can incorporate a temporal and/or a spatial dependency structure, specified through the prior settings. In addition, different prior assumptions can be specified for the unstructured random effects, allowing for robust and shrinking priors, and different parametrizations of the dispersion parameter can be set up.

Usage

fit_sae(
  formula_fixed,
  data,
  domains = NULL,
  disp_direct,
  type_disp = c("neff", "var"),
  domain_size = NULL,
  likelihood = c("beta", "flexbeta", "Infbeta0", "Infbeta1", "Infbeta01"),
  prior_reff = c("normal", "t", "VG"),
  spatial_error = FALSE,
  spatial_df = NULL,
  domains_spatial_df = NULL,
  temporal_error = FALSE,
  temporal_variable = NULL,
  scale_prior = list(Unstructured = 2.5, Spatial = 2.5, Temporal = 2.5, Coeff. = 2.5),
  adapt_delta = 0.95,
  max_treedepth = 10,
  init = "0",
  ...
)

Arguments

formula_fixed

An object of class "formula" specifying the linear regression fixed part at the linking level.

data

An object of class "data.frame" containing all relevant quantities.

domains

Name of the data column containing the domain names. If NULL (default), the domains are numbered progressively.

disp_direct

Name of the data column containing the sampling dispersion values for each domain. In out-of-sample areas, the dispersion must be NA.

type_disp

Parametrization of the dispersion parameter. The choices are the variance ("var") or the \phi_d + 1 parameter ("neff").
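The two parametrizations can be related through the usual mean-precision form of the Beta distribution, in which a variable with mean \mu_d and precision \phi_d has variance \mu_d (1 - \mu_d) / (\phi_d + 1). The following base R sketch shows the conversion, assuming this is the parametrization behind "neff"; the helpers var_to_neff and neff_to_var are hypothetical and not part of tipsae, and the numbers are illustrative.

```r
# Hypothetical helpers (not part of tipsae): convert between a sampling
# variance and the quantity phi + 1 under the mean-precision Beta form,
# where Var(Y) = mu * (1 - mu) / (phi + 1).
var_to_neff <- function(mu, v) mu * (1 - mu) / v
neff_to_var <- function(mu, neff) mu * (1 - mu) / neff

mu <- c(0.10, 0.25, 0.40)        # illustrative proportions
v  <- c(0.0009, 0.0020, 0.0030)  # illustrative sampling variances

neff   <- var_to_neff(mu, v)     # values on the "neff" scale
v_back <- neff_to_var(mu, neff)  # recovers v up to floating-point error
```

Whichever scale is stored in the data column, type_disp should be set to match it.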

domain_size

Name of the data column containing the domain sizes (optional). In out-of-sample areas, the sizes must be NA.
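As an illustration of the NA convention for out-of-sample areas, the following base R sketch builds a toy data frame and checks that dispersion and size are missing for the same domains. The column names and values are hypothetical, and setting the response to NA in those rows is an assumption rather than something stated here.

```r
# Toy data frame with hypothetical column names; domains "C" and "E" are
# treated as out-of-sample, so their dispersion and size are set to NA.
# Assumption: the response (hcr) is also NA there, covariates are observed.
toy <- data.frame(
  id   = c("A", "B", "C", "D", "E"),
  hcr  = c(0.12, 0.20, NA, 0.31, NA),
  x    = c(0.5, 1.1, 0.8, 1.7, 0.3),
  vars = c(0.0010, 0.0015, NA, 0.0022, NA),
  n    = c(120, 95, NA, 60, NA)
)

# Consistency check: dispersion and size should be missing for the same rows.
out_of_sample <- is.na(toy$vars)
stopifnot(identical(out_of_sample, is.na(toy$n)))

# Names of the domains flagged as out-of-sample
oos_domains <- toy$id[out_of_sample]
```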

likelihood

Sampling likelihood to be used. The choices are "beta" (default), "flexbeta", "Infbeta0", "Infbeta1" and "Infbeta01".
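One way to pick among these labels is to check whether the direct estimates include exact zeros or ones. The sketch below is a hypothetical helper, not part of tipsae, and it assumes that "Infbeta0", "Infbeta1" and "Infbeta01" denote the zero-inflated, one-inflated and zero-and-one-inflated Beta models, as their names suggest. It never returns "flexbeta", which is a modelling choice not tied to boundary values.

```r
# Hypothetical helper (not part of tipsae): suggest a likelihood label from
# the boundary values observed among the non-missing direct estimates.
suggest_likelihood <- function(y) {
  y <- y[!is.na(y)]
  has0 <- any(y == 0)
  has1 <- any(y == 1)
  if (has0 && has1) {
    "Infbeta01"
  } else if (has0) {
    "Infbeta0"
  } else if (has1) {
    "Infbeta1"
  } else {
    "beta"
  }
}

# Illustrative direct estimates containing an exact zero
lik <- suggest_likelihood(c(0, 0.15, 0.30, NA))
```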

prior_reff

Prior distribution of the unstructured random effect. The choices are "normal" (default), "t" and "VG".

spatial_error

Logical indicating whether to include a spatially structured random effect.

spatial_df

Object of class SpatialPolygonsDataFrame or sf with the shapefile of the studied region. Required if spatial_error = TRUE.

domains_spatial_df

Name of the column of the spatial_df object containing the domain names. Required if spatial_error = TRUE.
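Before fitting a spatial model it can help to compare the domain names in the data with those in the spatial object. The base R sketch below uses plain data frames as stand-ins for data and for the attribute table of spatial_df. The column names are hypothetical, and that the two sets of names need to correspond is an assumption.

```r
# Stand-ins with hypothetical column names: `toy` plays the role of `data`
# and `shp_attr` the attribute table of `spatial_df`.
toy      <- data.frame(id = c("A", "B", "C"))
shp_attr <- data.frame(NAME_DISTRICT = c("A", "B", "D"))

# Domains present in one source but not in the other
missing_in_shp  <- setdiff(toy$id, shp_attr$NAME_DISTRICT)
missing_in_data <- setdiff(shp_attr$NAME_DISTRICT, toy$id)
```

Empty results from both setdiff() calls indicate that the two sets of names coincide.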

temporal_error

Logical indicating whether to include a temporally structured random effect.

temporal_variable

Name of the data column containing the temporal variable. Required if temporal_error = TRUE.

scale_prior

List with the values of the prior scales. Four named elements must be provided: "Unstructured", "Spatial", "Temporal" and "Coeff.". Default: all equal to 2.5.
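Since all four names are required, one convenient pattern is to start from the defaults and overwrite only the scales of interest. A minimal base R sketch, where the replacement values 1 and 5 are purely illustrative:

```r
# Start from the documented defaults and change only selected scales.
default_scales <- list(Unstructured = 2.5, Spatial = 2.5,
                       Temporal = 2.5, Coeff. = 2.5)
my_scales <- modifyList(default_scales, list(Unstructured = 1, Coeff. = 5))

# All four names must still be present.
stopifnot(setequal(names(my_scales),
                   c("Unstructured", "Spatial", "Temporal", "Coeff.")))
```

The resulting list can then be supplied as scale_prior = my_scales.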

adapt_delta

HMC option: target average proposal acceptance probability. See the stan documentation.

max_treedepth

HMC option: maximum tree depth allowed for the NUTS sampler. See the stan documentation.

init

Initial values specification. See the detailed documentation for the init argument in stan.

...

Arguments passed to sampling (e.g. iter, chains).

Value

A list of class fitsae containing the following objects:

model_settings

A list summarizing all the assumptions of the model: sampling likelihood, presence of intercept, dispersion parametrization, random effects priors and possible structures.

data_obj

A list containing input objects including in-sample and out-of-sample relevant quantities.

stanfit

A stanfit object, the output of the sampling function, containing the full posterior draws. For details, see the stan documentation.

pars_interest

A vector containing the names of parameters whose posterior samples are stored.

call

Image of the function call that produced the fitsae object.

References

Janicki R (2020). “Properties of the beta regression model for small area estimation of proportions and application to estimation of poverty rates.” Communications in Statistics-Theory and Methods, 49(9), 2264–2284.

Carpenter B, Gelman A, Hoffman MD, Lee D, Goodrich B, Betancourt M, Brubaker M, Guo J, Li P, Riddell A (2017). “Stan: A probabilistic programming language.” Journal of Statistical Software, 76(1), 1–32.

Morris M, Wheeler-Martin K, Simpson D, Mooney SJ, Gelman A, DiMaggio C (2019). “Bayesian hierarchical spatial models: Implementing the Besag York Mollié model in stan.” Spatial and Spatio-Temporal Epidemiology, 31, 100301.

De Nicolò S, Ferrante MR, Pacei S (2023). “Small area estimation of inequality measures using mixtures of Beta.” https://doi.org/10.1093/jrsssa/qnad083.

De Nicolò S, Gardini A (2024). “The R Package tipsae: Tools for Mapping Proportions and Indicators on the Unit Interval.” Journal of Statistical Software, 108(1), 1–36. doi:10.18637/jss.v108.i01.

See Also

sampling for sampler options and summary.fitsae for handling the output.

Examples

library(tipsae)

# loading toy cross sectional dataset
data("emilia_cs")

# fitting a cross sectional model
fit_beta <- fit_sae(formula_fixed = hcr ~ x, data = emilia_cs, domains = "id",
                    type_disp = "var", disp_direct = "vars", domain_size = "n",
                    # MCMC setting to obtain a fast example. Remove next line for reliable results.
                    chains = 1, iter = 150, seed = 0)


# Spatio-temporal model: fitting may take some time
## Not run: 
# loading toy panel dataset
data("emilia")
# loading the shapefile of the concerned areas
data("emilia_shp")

# fitting a spatio-temporal model
fit_ST <- fit_sae(formula_fixed = hcr ~ x,
                  domains = "id",
                  disp_direct = "vars",
                  type_disp = "var",
                  domain_size = "n",
                  data = emilia,
                  spatial_error = TRUE,
                  spatial_df = emilia_shp,
                  domains_spatial_df = "NAME_DISTRICT",
                  temporal_error = TRUE,
                  temporal_variable = "year",
                  max_treedepth = 15,
                  seed = 0)

## End(Not run)


[Package tipsae version 1.0.1 Index]