calibration {icarus} | R Documentation |
Calibration on margins
Description
Performs calibration on margins with several methods and customizable parameters
Usage
calibration(
data,
marginMatrix,
colWeights,
method = "linear",
bounds = NULL,
q = NULL,
costs = NULL,
gap = NULL,
popTotal = NULL,
pct = FALSE,
scale = NULL,
description = TRUE,
maxIter = 2500,
check = TRUE,
calibTolerance = 1e-06,
uCostPenalized = 1,
lambda = NULL,
precisionBounds = 1e-04,
forceSimplex = FALSE,
forceBisection = FALSE,
colCalibratedWeights,
exportDistributionImage = NULL,
exportDistributionTable = NULL
)
Arguments
data |
The dataframe containing the survey data |
marginMatrix |
The matrix giving the margins for each column variable included in the calibration problem |
colWeights |
The name of the column containing the initial weights in the survey dataframe |
method |
The method used to calibrate. Can be "linear", "raking", "logit" |
bounds |
Two-element vector containing the lower and upper bounds for bounded methods ("logit") |
q |
Vector of q_k weights described in Deville and Sarndal (1992) |
costs |
The penalized calibration method will be used, using costs defined by this vector. Must match the number of rows of marginMatrix. Negative of non-finite costs are given an infinite cost (coefficient of C^-1 matrix is 0) |
gap |
Only useful for penalized calibration. Sets the maximum gap between max and min calibrated weights / initial weights ratio (and thus is similar to the "bounds" parameter used in regular calibration) |
popTotal |
Precise the total population if margins are defined by relative value in marginMatrix (percentages) |
pct |
If TRUE, margins for categorical variables are considered to be entered as percentages. popTotal must then be set. (FALSE by default) |
scale |
If TRUE, stats (including bounds) on ratio calibrated weights / initial weights are done on a vector multiplied by the weighted non-response ratio (ratio population total / total of initial weights). Has same behavior as "ECHELLE=0" in Calmar. |
description |
If TRUE, output stats about the calibration process as well as the graph of the density of the ratio calibrated weights / initial weights |
maxIter |
The maximum number of iterations before stopping |
check |
performs a few check about the dataframe. TRUE by default |
calibTolerance |
Tolerance for the distance to an exact solution. Could be useful when there is a huge number of margins as the risk of inadvertently setting incompatible constraints is higher. Set to 1e-06 by default. |
uCostPenalized |
Unary cost by which every cost is "costs" column is multiplied |
lambda |
The initial ridge lambda used in penalized calibration. By default, the initial lambda is automatically chosen by the algorithm, but you can speed up the search for the optimum if you already know a lambda close to the lambda_opt corresponding to the gap you set. Be careful, the search zone is reduced when a lambda is set by the user, so the program may not converge if the lambda set is too far from the lambda_opt. |
precisionBounds |
Only used for calibration on minimum bounds. Desired precision for lower and upper reweighting factor, both bounds being as close to 1 as possible |
forceSimplex |
Only used for calibration on tight bounds.Bisection algorithm is used for matrices whose size exceed 1e8. forceSimplex = TRUE forces the use of the simplex algorithm whatever the size of the problem (you might want to set this parameter to TRUE if you have a large memory size) |
forceBisection |
Only used for calibration on tight bounds. Forces the use of the bisection algorithm to solve calibration on tight bounds |
colCalibratedWeights |
Deprecated. Only used in the scope of calibration function |
exportDistributionImage |
File name to which the density plot shown when description is TRUE is exported. Requires package "ggplot2" |
exportDistributionTable |
File name to which the distribution table of before/after weights shown when description is TRUE is exported. Requires package "xtable" |
Value
column containing the final calibrated weights
References
Deville, Jean-Claude, and Carl-Erik Sarndal. "Calibration estimators in survey sampling." Journal of the American statistical Association 87.418 (1992): 376-382.
Bocci, J., and C. Beaumont. "Another look at ridge calibration." Metron 66.1 (2008): 5-20.
Vanderhoeft, Camille. Generalised calibration at statistics Belgium: SPSS Module G-CALIB-S and current practices. Inst. National de Statistique, 2001.
Le Guennec, Josiane, and Olivier Sautory. "Calmar 2: Une nouvelle version de la macro calmar de redressement d'echantillon par calage." Journees de Methodologie Statistique, Paris. INSEE (2002).
Examples
N <- 300 ## population total
## Horvitz Thompson estimator of the mean: 1.666667
weightedMean(data_employees$movies, data_employees$weight, N)
## Enter calibration margins:
mar1 <- c("category",3,80,90,60)
mar2 <- c("sex",2,140,90,0)
mar3 <- c("department",2,100,130,0)
mar4 <- c("salary", 0, 470000,0,0)
margins <- rbind(mar1, mar2, mar3, mar4)
## Compute calibrated weights with raking ratio method
wCal <- calibration(data=data_employees, marginMatrix=margins, colWeights="weight"
, method="raking", description=FALSE)
## Calibrated estimate: 2.471917
weightedMean(data_employees$movies, wCal, N)