est_predictiveness_cv {vimp} | R Documentation
Estimate a nonparametric predictiveness functional using cross-fitting
Description
Compute nonparametric estimates of the chosen measure of predictiveness.
Usage
est_predictiveness_cv(
fitted_values,
y,
full_y = NULL,
folds,
type = "r_squared",
C = rep(1, length(y)),
Z = NULL,
folds_Z = folds,
ipc_weights = rep(1, length(C)),
ipc_fit_type = "external",
ipc_eif_preds = rep(1, length(C)),
ipc_est_type = "aipw",
scale = "identity",
na.rm = FALSE,
...
)
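A minimal usage sketch for the cross-fitted, no-coarsening case. The simulated data, the choice of lm as the learner, and the returned field name point_est are illustrative assumptions, not part of this page; the vimp package must be installed.

```r
library(vimp)

set.seed(1234)
n <- 1000
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- 1 + 0.5 * x$x1 + rnorm(n)

# V-fold cross-fitting: train on V - 1 folds, predict on the held-out fold
V <- 5
folds <- sample(rep(seq_len(V), length.out = n))
fitted_values <- lapply(seq_len(V), function(v) {
  fit <- lm(y ~ x1 + x2, data = data.frame(y = y, x)[folds != v, ])
  predict(fit, newdata = x[folds == v, ])
})

# cross-fitted R-squared-based predictiveness of this regression
est <- est_predictiveness_cv(
  fitted_values = fitted_values, y = y, folds = folds, type = "r_squared"
)
est$point_est  # point estimate (field name may vary by vimp version)
```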
Arguments
fitted_values
    fitted values from a regression function using the observed data; a list of length V, where each object is a set of predictions on the validation data, or a vector of the same length as y.

y
    the observed outcome.

full_y
    the observed outcome (from the entire dataset, for cross-fitted estimates).

folds
    the cross-validation folds for the observed data.

type
    which parameter are you estimating (defaults to "r_squared", for R-squared-based variable importance)?

C
    the indicator of coarsening (1 denotes observed, 0 denotes unobserved).

Z
    either NULL (the default, in which case the argument C above must be all ones) or a character vector specifying the variable(s) among y and x that are thought to play a role in the coarsening.

folds_Z
    either the cross-validation folds for the observed data (no coarsening) or a vector of folds for the fully observed data Z.

ipc_weights
    weights for inverse probability of coarsening (e.g., inverse weights from a two-phase sample) to use in weighted estimation. Assumed to be already inverted (i.e., ipc_weights = 1 / [estimated probability weights]).

ipc_fit_type
    if "external", then use ipc_eif_preds; if "SL", fit a SuperLearner to estimate the correction to the efficient influence function.

ipc_eif_preds
    if ipc_fit_type = "external", the fitted values from a regression of the full-data efficient influence function on the fully observed covariates/outcome; otherwise, not used.

ipc_est_type
    the type of IPC correction: either "ipw" (for classical inverse probability weighting) or "aipw" (for augmented inverse probability weighting; the default).

scale
    if doing an IPC correction, the scale on which the correction should be computed (e.g., "identity"; or "logit" to logit-transform, apply the correction, and back-transform).

na.rm
    logical; should NA's be removed in computation? (defaults to FALSE)

...
    other arguments to SuperLearner, if ipc_fit_type = "SL".
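As a sketch of the coarsening arguments, suppose a two-phase sample in which the expensive measurements are taken on a random subset of participants. The sampling probability p = 0.6 and the variable names here are illustrative assumptions:

```r
set.seed(4747)
n <- 500
p <- 0.6                             # phase-two sampling probability (assumed known)
C <- rbinom(n, size = 1, prob = p)   # 1 = fully observed, 0 = coarsened
ipc_weights <- rep(1 / p, n)         # already-inverted weights, as the argument requires
```

These would then be passed as C = C and ipc_weights = ipc_weights, with Z naming the fully observed variables thought to drive the coarsening.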
Details
See the paper by Williamson, Gilbert, Simon, and Carone for more details on the mathematics behind this function and the definition of the parameter of interest. If sample-splitting is also requested (recommended, since in this case inferences will be valid even if the variable has zero true importance), then the prediction functions are trained as if 2K-fold cross-validation were run, but are evaluated on only K sets (independent between the full and reduced nuisance regression).
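The sample-splitting fold scheme described above can be sketched as follows. The specific assignment convention (odd folds evaluate the full regression, even folds the reduced one) is an illustrative assumption; the point is that the two evaluation sets are disjoint:

```r
set.seed(1234)
K <- 5
n <- 200
# create folds as if 2K-fold cross-validation were run
outer_folds <- sample(rep(seq_len(2 * K), length.out = n))
# the full regression is evaluated on K of the 2K folds ...
full_eval <- which(outer_folds %% 2 == 1)
# ... and the reduced regression on the other, independent K folds
reduced_eval <- which(outer_folds %% 2 == 0)
```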
Value
The estimated measure of predictiveness.