smoothing {tipsae}    R Documentation

Variance Smoothing and Effective Sample Sizes Estimation

Description

The smoothing() function implements three methods that yield refined estimates of either the variance or the effective sample size, accounting for indicators with different variance functions. The output estimates are ready to be used as known parameters in an area-level model, but they must first be added to the data.frame object under analysis. All three methods estimate the effective sample sizes, whereas only "ols" and "gls" also perform variance smoothing.
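As a rough illustration of how a variance function links a variance estimate to an effective sample size, the sketch below uses the proportion case, assuming the relation Var(p) = p(1 - p) / n_eff. The values p_hat and v_hat are made up, and the relation is an assumption for illustration; the package may apply a different convention internally.

```r
# Variance function for a proportion (the documented default of var_function)
var_fun <- function(x) x * (1 - x)

# Hypothetical direct estimates and variance estimates for three areas
p_hat <- c(0.10, 0.25, 0.40)
v_hat <- c(0.0020, 0.0030, 0.0015)

# Effective sample size implied by Var(p_hat) = p (1 - p) / n_eff
# (assumed relation, not taken from the package source)
n_eff <- var_fun(p_hat) / v_hat
n_eff
```

A larger variance estimate for the same proportion implies a smaller effective sample size, which is why the two quantities can be used interchangeably as dispersion inputs.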

Usage

smoothing(
  data,
  direct_estimates,
  area_id = NULL,
  raw_variance = NULL,
  areas_sample_sizes = NULL,
  additional_covariates = NULL,
  method = c("ols", "gls", "kish"),
  var_function = NULL,
  survey_data = NULL,
  survey_area_id = NULL,
  weights = NULL,
  sizes = NULL
)

Arguments

data

A data.frame object including the direct estimates.

direct_estimates

Character string specifying the variable in data denoting the direct estimates.

area_id

Character string indicating the variable with domain names included in data, to be specified if method "kish" is selected.

raw_variance

Character string indicating the variable in data that holds the raw variance estimates, to be specified if method "ols" or "gls" is selected.

areas_sample_sizes

Character string indicating the variable in data that holds the domain sample sizes, to be specified if method "ols" or "gls" is selected.

additional_covariates

A vector of character strings indicating the variable names of possible additional covariates, included in data, to be added to the smoothing procedure if methods "ols" or "gls" are selected.

method

The method to be used. The choices are "ols", "gls" and "kish".

var_function

An object of class function denoting the variance function of the response variable. The default option (NULL) corresponds to the proportion case, i.e. function(x) x * (1 - x). If an alternative function is specified, only variance estimates are returned.

survey_data

An additional dataset to be specified when method "kish" is selected, defined at sampling unit level (e.g., households) and comprising sampling weights, unit sizes and domain names.

survey_area_id

Character string indicating the variable denoting the domain names included in the survey_data object.

weights

Character string indicating the variable in survey_data that holds the sampling weights.

sizes

Character string indicating the variable in survey_data that holds the unit sizes.
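To give a feel for the kind of computation behind a weight-based effective sample size, the sketch below applies the classical Kish approximation, (sum of weights)^2 / (sum of squared weights), within each domain. The data frame survey and its columns area and w are hypothetical, and the sketch ignores unit sizes; how the "kish" method in this package combines weights and sizes may differ.

```r
# Hypothetical unit-level survey data: two domains with household weights
survey <- data.frame(
  area = c("A", "A", "A", "B", "B", "B", "B"),
  w    = c(10, 10, 10, 5, 15, 5, 15)
)

# Classical Kish approximation of the effective sample size
# (illustrative helper, not a function of the tipsae package)
kish_n_eff <- function(w) sum(w)^2 / sum(w^2)

# One effective sample size per domain
n_eff <- tapply(survey$w, survey$area, kish_n_eff)
n_eff
```

With equal weights (domain A) the effective sample size equals the number of sampled units, while unequal weights (domain B) reduce it below the nominal count.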

Value

An object of class smoothing_fitsae, i.e. a list of vectors containing dispersion estimates: the variances and, when no alternative variance function is specified, the effective sample sizes. When the "ols" or "gls" method is selected, the list also includes an object of class gls from the nlme package.

References

Kish L (1992). “Weighting for Unequal Pi.” Journal of Official Statistics, 8(2), 183.

Fabrizi E, Ferrante MR, Pacei S, Trivisano C (2011). “Hierarchical Bayes multivariate estimation of poverty rates based on increasing thresholds for small domains.” Computational Statistics & Data Analysis, 55(4), 1736–1747.

De Nicolò S, Gardini A (2024). “The R Package tipsae: Tools for Mapping Proportions and Indicators on the Unit Interval.” Journal of Statistical Software, 108(1), 1–36. doi:10.18637/jss.v108.i01.

See Also

gls for details on the estimation procedure used by the "ols" and "gls" methods.
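For readers unfamiliar with nlme::gls, the following is a generic sketch of regression-based smoothing on simulated data. The model form (raw effective sample size regressed on the area sample size through the origin) and the variable names are assumptions chosen for illustration, not the package's documented model.

```r
library(nlme)  # recommended package shipped with standard R installations

set.seed(1)
# Hypothetical toy data: 30 areas with sample sizes, proportions, noisy raw variances
toy <- data.frame(n = round(runif(30, 20, 200)))
toy$p <- runif(30, 0.1, 0.4)
toy$raw_var <- toy$p * (1 - toy$p) / (0.7 * toy$n) * exp(rnorm(30, sd = 0.3))

# Raw effective sample size implied by the proportion variance function
toy$n_eff_raw <- toy$p * (1 - toy$p) / toy$raw_var

# Homoscedastic fit (ordinary least squares analogue), no intercept
fit_ols <- gls(n_eff_raw ~ n - 1, data = toy)

# Heteroscedastic fit: error variance assumed proportional to n
fit_gls <- gls(n_eff_raw ~ n - 1, data = toy, weights = varFixed(~ n))

# Smoothed effective sample sizes and the variances they imply
toy$n_eff_smooth <- as.numeric(fitted(fit_gls))
toy$var_smooth <- toy$p * (1 - toy$p) / toy$n_eff_smooth
head(toy)
```

The two fits differ only in the assumed error structure, which is the usual distinction between an ordinary and a generalized least squares smoother.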

Examples


library(tipsae)

# loading toy dataset
data("emilia_cs")

# perform smoothing procedure
smoo <- smoothing(emilia_cs, direct_estimates = "hcr", area_id = "id",
                  raw_variance = "vars", areas_sample_sizes = "n",
                  var_function = NULL, method = "ols")


[Package tipsae version 1.0.1 Index]