R: Benchmarking Procedure for Model-Based Estimates

benchmark {tipsae}

R Documentation

Benchmarking Procedure for Model-Based Estimates

Description

The benchmark() function performs a benchmarking procedure on model-based estimates. Benchmarking can target the point estimates alone (single benchmarking) or, alternatively, also the ensemble variability (double benchmarking). An estimate of the overall posterior risk, aggregated over all areas, is also provided. This value is returned only when in-sample areas are treated and single benchmarking is performed.

Usage

benchmark(
  x,
  bench,
  share,
  method = c("raking", "ratio", "double"),
  H = NULL,
  time = NULL,
  areas = NULL
)

Arguments

`x`	Object of class `summary_fitsae`.
`bench`	A numeric value denoting the benchmark for the whole set of areas or a subset of areas.
`share`	A numeric vector of area weights. For proportions, it denotes the population shares.
`method`	The benchmarking method, one of `"raking"`, `"ratio"` or `"double"`; see Details.
`H`	A numeric value denoting an additional benchmark, to be specified when the `"double"` method is selected, corresponding to the ensemble variability.
`time`	For temporal models, a character string indicating the time period to be considered. A benchmark can be specified for only one time period at a time.
`areas`	If `NULL` (the default), benchmarking is done on the whole set of areas. Alternatively, it can be restricted to a subset by supplying a vector with the names of the areas in that subset.

Details

The function implements three benchmarking methods, selected through the method argument.

The "ratio" and "raking" methods provide benchmarked estimates that minimize the posterior expectation of the weighted squared error loss; see Datta et al. (2011) and the tipsae vignette.
The "double" method imposes a further benchmark on the weighted ensemble variability, where H is a prespecified value for the variability of the estimators.
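The constraints described above can be illustrated with a minimal base R sketch. It assumes the standard constrained Bayes forms from Datta et al. (2011) with weights summing to one; it is not the package's internal code. The function names (`bench_single`, `bench_double`), the loss weights `phi`, and all numeric values are hypothetical and chosen only for illustration. How the two single-benchmarking choices of `phi` map onto the package's `"raking"` and `"ratio"` labels is also an assumption and should be checked against the tipsae vignette.

```r
# Hypothetical posterior means for four areas (illustrative values only)
theta_hat <- c(0.10, 0.15, 0.20, 0.12)
# Hypothetical population shares; assumed to sum to 1
w <- c(0.4, 0.3, 0.2, 0.1)
# Benchmark for the weighted mean of the estimates
bench <- 0.13

# Single benchmarking: minimizer of the posterior expectation of
# sum(phi * (theta - delta)^2) subject to sum(w * delta) == bench
bench_single <- function(theta_hat, w, bench, phi) {
  r <- w / phi
  theta_hat + (bench - sum(w * theta_hat)) * r / sum(w * r)
}

# phi = w yields a constant additive shift of every estimate
est_additive <- bench_single(theta_hat, w, bench, phi = w)

# phi = w / theta_hat yields a multiplicative (ratio) rescaling
est_ratio <- bench_single(theta_hat, w, bench, phi = w / theta_hat)

# Double benchmarking (sketch with phi = w): also fix the weighted
# ensemble variability sum(w * (delta - bench)^2) at the value H
bench_double <- function(theta_hat, w, bench, H) {
  wm <- sum(w * theta_hat)
  a <- sqrt(H / sum(w * (theta_hat - wm)^2))
  bench + a * (theta_hat - wm)
}

H <- 0.002  # hypothetical target for the ensemble variability
est_double <- bench_double(theta_hat, w, bench, H)

# All three sets of estimates reproduce the benchmark
all.equal(sum(w * est_additive), bench)
all.equal(sum(w * est_ratio), bench)
all.equal(sum(w * est_double), bench)
# The double-benchmarked estimates also reproduce H
all.equal(sum(w * (est_double - bench)^2), H)
```

In this sketch the ratio form rescales every estimate by the same factor, `bench / sum(w * theta_hat)`, while the additive form shifts every estimate by the same amount, `bench - sum(w * theta_hat)`.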

Value

A benchmark_fitsae object, which is a list with the following elements:

bench_est: A vector including the benchmarked estimates for each considered domain.
post_risk: A numeric value indicating an estimate of the overall posterior risk, aggregated over all areas. This value is returned only when in-sample areas are treated and single benchmarking is performed.
method: The benchmarking method performed as selected in the input argument.
time: The time considered as selected in the input argument.
areas: The areas considered as selected in the input argument.
data_obj: A list containing input objects including in-sample and out-of-sample relevant quantities.
model_settings: A list summarizing all the assumptions of the model: sampling likelihood, presence of intercept, dispersion parametrization, random effects priors and possible structures.
model_estimates: Posterior summaries of target parameters for in-sample areas.
model_estimates_oos: Posterior summaries of target parameters for out-of-sample areas.
is_oos: Logical vector defining whether each domain is out-of-sample or not.
direct_est: Vector of direct estimates for in-sample areas.

References

Datta GS, Ghosh M, Steorts R, Maples J (2011). “Bayesian benchmarking with applications to small area estimation.” Test, 20(3), 574–588.

De Nicolò S, Gardini A (2024). “The R Package tipsae: Tools for Mapping Proportions and Indicators on the Unit Interval.” Journal of Statistical Software, 108(1), 1–36. doi:10.18637/jss.v108.i01.

Examples

library(tipsae)

# loading toy dataset
data("emilia_cs")

# fitting a model

fit_beta <- fit_sae(formula_fixed = hcr ~ x, data = emilia_cs, domains = "id",
                    type_disp = "var", disp_direct = "vars", domain_size = "n",
                    # MCMC setting to obtain a fast example. Remove next line for reliable results.
                    chains = 1, iter = 150, seed = 0)

# summarize the fit; the resulting summary_fitsae object is the input to benchmark()
summ_beta <- summary(fit_beta)

# creating a subset of the areas whose estimates have to be benchmarked
subset <- c("RIMINI", "RICCIONE", "RUBICONE", "CESENA - VALLE DEL SAVIO")

# creating population shares of the subset areas
pop <- emilia_cs$pop[emilia_cs$id %in% subset]
shares_subset <- pop / sum(pop)

# perform benchmarking procedure
bmk_subset <- benchmark(x = summ_beta,
                        bench = 0.13,
                        share = shares_subset,
                        method = "raking",
                        areas = subset)

# check benchmarked estimates and posterior risk
bmk_subset$bench_est
bmk_subset$post_risk

[Package tipsae version 1.0.2 Index]