calib_el {jointCalib} | R Documentation |
An internal function for calibration of weights using empirical likelihood method
Description
calib_el
performs calibration using empirical likelihood (EL) method. The function is taken from Wu (2005), if algorithm has problem with convergence codes from Zhang, Han and Wu (2022) using constrOptim
is used.
In (pseudo) EL the following (pseudo) EL function is maximized
\[\sum_{i \in r} d_i\log(p_i),\]under the following constraint
\[\sum_{i \in r} p_i = 1,\]with constraints on quantiles (with notation as in Harms and Duchesne (2006))
\[\sum_{i \in r} p_i(a_{i} - \alpha/N) = 0,\]where \(a_{i}\) is created using joint_calib_create_matrix
function, and possibly means
where \(\mu_{x}\) is known population mean of X. For simplicity of notation we assume only one quantile and one mean is known. This can be generalized to multiple quantiles and means.
Usage
calib_el(X, d, totals, maxit = 50, tol = 1e-08, eps = .Machine$double.eps, ...)
Arguments
X |
matrix of variables for calibration of quantiles and totals (first column should be intercept), |
d |
initial d-weights for calibration (e.g. design-weights), |
totals |
vector of totals (where 1 element is the population size), |
maxit |
a numeric value giving the maximum number of iterations, |
tol |
the desired accuracy for the iterative procedure, |
eps |
the desired accuracy for computing the Moore-Penrose generalized inverse (see |
... |
arguments passed to stats::optim via stats::constrOptim. |
Value
Returns a vector of empirical likelihood g-weights
Author(s)
Maciej Beręsewicz based on Wu (2005) and Zhang, Han and Wu (2022)
References
Wu, C. (2005). Algorithms and R codes for the pseudo empirical likelihood method in survey sampling. Survey Methodology, 31(2), 239 (code is taken from https://sas.uwaterloo.ca/~cbwu/Rcodes/LagrangeM2.txt).
Zhang, S., Han, P., and Wu, C. (2023) Calibration Techniques Encompassing Survey Sampling, Missing Data Analysis and Causal Inference. International Statistical Review, 91: 165–192. https://doi.org/10.1111/insr.12518 (code is taken from Supplementary Materials).
Examples
## generate data based on Haziza and Lesage (2016)
set.seed(123)
N <- 1000
x <- runif(N, 0, 80)
y <- exp(-0.1 + 0.1*x) + rnorm(N, 0, 300)
p <- rbinom(N, 1, prob = exp(-0.2 - 0.014*x))
totals_known <- c(N=N, x=sum(x))
df <- data.frame(x, y, p)
df_resp <- df[df$p == 1, ]
df_resp$d <- N/nrow(df_resp)
res <- calib_el(X = model.matrix(~x, df_resp),
d = df_resp$d,
totals = totals_known)
data.frame(known = totals_known, estimated=colSums(res*df_resp$d*model.matrix(~x, df_resp)))