popsize_cond {drpop} | R Documentation |
Estimate the total population size and capture probability using a user-provided set of models, conditioned on an attribute.
popsize_cond(
data,
K = 2,
filterrows = FALSE,
funcname = c("rangerlogit"),
condvar,
nfolds = 2,
margin = 0.005,
sl.lib = c("SL.gam", "SL.glm", "SL.glm.interaction", "SL.ranger", "SL.glmnet"),
TMLE = TRUE,
PLUGIN = TRUE,
Nmin = 100,
...
)
data |
The data frame in capture-recapture format for which the total population size is to be estimated. The first K columns are the capture history indicators for the K lists. The remaining columns are covariates in numeric format. |
K |
The number of lists in the data. typically the first |
filterrows |
A logical value denoting whether to remove all rows with only zeroes. |
funcname |
The vector of estimation function names to obtain the population size. |
condvar |
The covariate for which conditional estimates are required. |
nfolds |
The number of folds to be used for cross fitting. |
margin |
The minimum value the estimates can attain to bound them away from zero. |
sl.lib |
Algorithm library for |
TMLE |
A logical value indicating whether the TMLE should be computed. |
PLUGIN |
A logical value indicating whether the plug-in estimates should be returned. |
Nmin |
The minimum sample size required to perform doubly robust estimation. Below this cutoff, the Petersen estimator is returned instead. |
... |
Any extra arguments passed into the function. See |
A list of estimates containing the following components for each list-pair, model and method (PI = plug-in, DR = doubly-robust, TMLE = targeted maximum likelihood estimate):
result |
A dataframe of the below estimated quantities.
|
N |
The number of data points used in the estimation after removing rows with missing data. |
ifvals |
The estimated influence function values for the observed data. |
nuis |
The estimated nuisance functions (q12, q1, q2) for each element in funcname. |
nuistmle |
The estimated nuisance functions (q12, q1, q2) from TMLE for each element in funcname. |
idfold |
The division of the rows into sets (folds) for cross-fitting. |
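The nuisance functions named above can be tied back to the target quantities with a short derivation. This is only a sketch, under assumptions this page does not state: there are two lists; q_1(x) and q_2(x) are the probabilities of appearing on list 1 and on list 2 given covariates x and capture on at least one list; q_12(x) is the corresponding probability of appearing on both; and the two lists are independent given x. The symbols gamma, psi and n are notation introduced here for illustration.

```latex
\gamma(x) = \frac{q_{12}(x)}{q_1(x)\, q_2(x)}, \qquad
\frac{1}{\psi} = \mathbb{E}\!\left[ \frac{q_1(X)\, q_2(X)}{q_{12}(X)} \,\middle|\, \text{captured} \right], \qquad
\widehat{N} = \frac{n}{\widehat{\psi}}
```

Here gamma(x) is the probability of being captured at all given x, psi is the marginal capture probability, and n is the number of observed individuals. Replacing the q functions with fitted models and the expectation with a sample average would give a plug-in style estimate; the DR and TMLE variants reported by the function refine this and are not reproduced here.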
Das, M., Kennedy, E. H., & Jewell, N. P. (2021). Doubly robust capture-recapture methods for estimating population size. arXiv preprint arXiv:2104.14091.
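# Simulate capture-recapture data that includes a categorical covariate
data <- simuldata(n = 10000, l = 2, categorical = TRUE)$data
# Fit the "logit" and "gam" models, conditioning on the column catcov
psin_estimate <- popsize_cond(data = data, funcname = c("logit", "gam"),
                              condvar = 'catcov', PLUGIN = TRUE, TMLE = TRUE)
# This returns the plug-in, the bias-corrected, and the TMLE estimates for the
# two models, conditioned on the column catcov.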
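The Nmin argument mentions falling back to the Petersen estimator for small samples. As a rough illustration of what that estimator computes for two lists, the sketch below works it out by hand in base R. The data frame `toy` and the variable names are hypothetical, and drpop's internal fallback may differ in detail (for example, in how it treats the conditioning variable).

```r
# Toy two-list capture history (hypothetical data, not produced by drpop).
# Each row is an observed individual; y1 and y2 flag presence on each list.
toy <- data.frame(
  y1 = c(1, 1, 1, 0, 0, 1, 1, 0, 1, 1),
  y2 = c(1, 0, 1, 1, 1, 0, 1, 1, 0, 1)
)

n1 <- sum(toy$y1)            # number captured on list 1
n2 <- sum(toy$y2)            # number captured on list 2
m  <- sum(toy$y1 * toy$y2)   # number captured on both lists

# Lincoln-Petersen estimate of the total population size.
petersen_N <- n1 * n2 / m

# Implied capture probability: observed count over estimated total.
n_obs <- nrow(toy)
petersen_psi <- n_obs / petersen_N

print(c(N = petersen_N, psi = petersen_psi))
```

With these toy values, n1 = 7, n2 = 7 and m = 4, so the estimated population size is 49 / 4 = 12.25. The estimate is undefined when no individual appears on both lists (m = 0).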