est_predictiveness_cv {vimp} | R Documentation
Estimate a nonparametric predictiveness functional using cross-fitting
Description
Compute nonparametric estimates of the chosen measure of predictiveness.
Usage
est_predictiveness_cv(
fitted_values,
y,
full_y = NULL,
folds,
type = "r_squared",
C = rep(1, length(y)),
Z = NULL,
folds_Z = folds,
ipc_weights = rep(1, length(C)),
ipc_fit_type = "external",
ipc_eif_preds = rep(1, length(C)),
ipc_est_type = "aipw",
scale = "identity",
na.rm = FALSE,
...
)
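A minimal usage sketch for the cross-fitted, no-coarsening case. The simulated data, the choice of lm as the learner, and the returned field name point_est are illustrative assumptions, not part of this page; the vimp package must be installed.

```r
library(vimp)

set.seed(1234)
n <- 1000
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- 1 + 0.5 * x$x1 + rnorm(n)

# V-fold cross-fitting: train on V - 1 folds, predict on the held-out fold
V <- 5
folds <- sample(rep(seq_len(V), length.out = n))
fitted_values <- lapply(seq_len(V), function(v) {
  fit <- lm(y ~ x1 + x2, data = data.frame(y = y, x)[folds != v, ])
  predict(fit, newdata = x[folds == v, ])
})

# cross-fitted R-squared-based predictiveness of this regression
est <- est_predictiveness_cv(
  fitted_values = fitted_values, y = y, folds = folds, type = "r_squared"
)
est$point_est  # point estimate (field name may vary by vimp version)
```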
Arguments
fitted_values
    fitted values from a regression function using the observed data; a list of length V, where each object is a set of predictions on the validation data, or a vector of the same length as y.

y
    the observed outcome.

full_y
    the observed outcome (from the entire dataset, for cross-fitted estimates).

folds
    the cross-validation folds for the observed data.

type
    which parameter are you estimating (defaults to "r_squared", for R-squared-based variable importance)?

C
    the indicator of coarsening (1 denotes observed, 0 denotes unobserved).

Z
    either NULL (the default, in which case the argument C above must be all ones) or a character vector specifying the variable(s) among y and x that are thought to play a role in the coarsening.

folds_Z
    either the cross-validation folds for the observed data (no coarsening) or a vector of folds for the fully observed data Z.

ipc_weights
    weights for inverse probability of coarsening (e.g., inverse weights from a two-phase sample) to use in weighted estimation. Assumed to be already inverted (i.e., ipc_weights = 1 / [estimated probability weights]).

ipc_fit_type
    if "external", then use ipc_eif_preds; if "SL", fit a SuperLearner to estimate the correction to the efficient influence function.

ipc_eif_preds
    if ipc_fit_type = "external", the fitted values from a regression of the full-data efficient influence function on the fully observed covariates/outcome; otherwise, not used.

ipc_est_type
    the type of IPC correction: either "ipw" (for classical inverse probability weighting) or "aipw" (for augmented inverse probability weighting; the default).

scale
    if doing an IPC correction, the scale on which the correction should be computed (e.g., "identity"; or "logit" to logit-transform, apply the correction, and back-transform).

na.rm
    logical; should NA's be removed in computation? (defaults to FALSE)

...
    other arguments to SuperLearner, if ipc_fit_type = "SL".
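As a sketch of the coarsening arguments, suppose a two-phase sample in which the expensive measurements are taken on a random subset of participants. The sampling probability p = 0.6 and the variable names here are illustrative assumptions:

```r
set.seed(4747)
n <- 500
p <- 0.6                             # phase-two sampling probability (assumed known)
C <- rbinom(n, size = 1, prob = p)   # 1 = fully observed, 0 = coarsened
ipc_weights <- rep(1 / p, n)         # already-inverted weights, as the argument requires
```

These would then be passed as C = C and ipc_weights = ipc_weights, with Z naming the fully observed variables thought to drive the coarsening.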
Details
See the paper by Williamson, Gilbert, Simon, and Carone for more details on the mathematics behind this function and the definition of the parameter of interest. If sample-splitting is also requested (recommended, since in this case inferences will be valid even if the variable has zero true importance), then the prediction functions are trained as if 2K-fold cross-validation were run, but are evaluated on only K sets (independent between the full and reduced nuisance regression).
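The sample-splitting fold scheme described above can be sketched as follows. The specific assignment convention (odd folds evaluate the full regression, even folds the reduced one) is an illustrative assumption; the point is that the two evaluation sets are disjoint:

```r
set.seed(1234)
K <- 5
n <- 200
# create folds as if 2K-fold cross-validation were run
outer_folds <- sample(rep(seq_len(2 * K), length.out = n))
# the full regression is evaluated on K of the 2K folds ...
full_eval <- which(outer_folds %% 2 == 1)
# ... and the reduced regression on the other, independent K folds
reduced_eval <- which(outer_folds %% 2 == 0)
```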
Value
The estimated measure of predictiveness.