cvCovEst {cvCovEst}    R Documentation

Cross-Validated Covariance Matrix Estimator Selector

Description

cvCovEst() identifies the optimal covariance matrix estimator from among a set of candidate estimators.

Usage

cvCovEst(
  dat,
  estimators = c(linearShrinkEst, thresholdingEst, sampleCovEst),
  estimator_params = list(linearShrinkEst = list(alpha = 0), thresholdingEst =
    list(gamma = 0)),
  cv_loss = cvMatrixFrobeniusLoss,
  cv_scheme = "v_fold",
  mc_split = 0.5,
  v_folds = 10L,
  center = TRUE,
  scale = FALSE,
  parallel = FALSE
)

Arguments

dat

A numeric data.frame, matrix, or similar object.

estimators

A list of estimator functions to be considered in the cross-validated estimator selection procedure.

estimator_params

A named list of arguments corresponding to the hyperparameters of the covariance matrix estimators in estimators. The name of each list element should match the name of an estimator passed to estimators. Each element of estimator_params is itself a named list, with the names corresponding to a given estimator's hyperparameter(s). The hyperparameter(s) may be given as a single numeric or as a numeric vector. If an estimator needs no hyperparameter, it need not be listed.
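
As a minimal sketch of the structure described above, the following builds an estimator_params list in base R. The hyperparameter values are arbitrary illustrations, and the candidate count assumes (this page does not state it) that each value in a numeric vector is treated as a separate candidate.

```r
# Hyperparameter grids, keyed by estimator name. The names must match the
# estimator functions passed to `estimators`.
estimator_params <- list(
  linearShrinkEst = list(alpha = c(0.1, 0.5, 0.9)),  # numeric vector
  thresholdingEst = list(gamma = 0.2)                # single numeric
  # sampleCovEst has no hyperparameter, so it is not listed
)

# Assumption: one candidate per supplied hyperparameter value.
n_candidates <- vapply(estimator_params, function(p) length(p[[1]]), integer(1))
print(n_candidates)
```

The resulting list can then be passed as the estimator_params argument alongside estimators = c(linearShrinkEst, thresholdingEst, sampleCovEst).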

cv_loss

A function indicating the loss function to be used. This defaults to the Frobenius loss, cvMatrixFrobeniusLoss(). An observation-based version, cvFrobeniusLoss(), is also available. Additionally, cvScaledMatrixFrobeniusLoss() is included for situations in which dat's variables are of different scales.
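
To give a feel for what a Frobenius-type loss measures, the base R sketch below compares an estimate computed on one half of the data against the sample covariance of the other half. It is an illustration only, not the package's implementation: the loss functions named above may differ in their exact form, and the single random split and the variable names here are assumptions made for the example.

```r
# Illustrative only: split the rows once, estimate on the training half,
# and score against the validation half's sample covariance.
set.seed(1)
dat <- scale(as.matrix(mtcars), center = TRUE, scale = FALSE)
valid_idx <- sample(nrow(dat), size = floor(nrow(dat) / 2))

train_est <- cov(dat[-valid_idx, ])  # candidate estimate (here: sample covariance)
valid_cov <- cov(dat[valid_idx, ])   # validation-set sample covariance

# Squared Frobenius norm: the sum of squared entrywise differences.
frob_loss <- norm(train_est - valid_cov, type = "F")^2
stopifnot(isTRUE(all.equal(frob_loss, sum((train_est - valid_cov)^2))))
```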

cv_scheme

A character indicating the cross-validation scheme to be employed. There are two options: (1) V-fold cross-validation, via "v_fold"; and (2) Monte Carlo cross-validation, via "mc". Defaults to V-fold cross-validation, as shown in Usage.

mc_split

A numeric between 0 and 1 indicating the proportion of observations to be included in the validation set of each Monte Carlo cross-validation fold.

v_folds

An integer larger than or equal to 1 indicating the number of folds to use for cross-validation. The default is 10, regardless of the choice of cross-validation scheme.
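
The difference between the two schemes can be sketched in base R as follows. This is a hypothetical illustration of how validation indices might be drawn, not the package's internal fold construction; the variable names are chosen for the example.

```r
set.seed(42)
n <- nrow(mtcars)  # 32 observations

# V-fold: every observation lands in exactly one of v_folds validation sets.
v_folds <- 4L
fold_id <- sample(rep(seq_len(v_folds), length.out = n))
v_fold_valid <- split(seq_len(n), fold_id)

# Monte Carlo: each fold draws a fresh random validation set whose size is
# the mc_split proportion of the data, so sets may overlap across folds.
mc_split <- 0.5
mc_valid <- replicate(v_folds, sample(n, size = round(mc_split * n)),
                      simplify = FALSE)

lengths(v_fold_valid)  # 8 rows per fold, since 32 / 4 = 8
lengths(mc_valid)      # 16 rows per fold, since 0.5 * 32 = 16
```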

center

A logical indicating whether to center the columns of dat to have mean zero.

scale

A logical indicating whether to scale the columns of dat to have unit variance.
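
The effect of center and scale on a covariance estimate can be checked with base R's scale(), used here as a stand-in; whether the package preprocesses dat in exactly this way is an assumption.

```r
x <- as.matrix(mtcars)

centered     <- scale(x, center = TRUE, scale = FALSE)  # mean-zero columns
standardized <- scale(x, center = TRUE, scale = TRUE)   # mean zero, unit variance

# Centering alone leaves the sample covariance unchanged...
stopifnot(isTRUE(all.equal(cov(centered), cov(x))))
# ...while standardizing turns it into the sample correlation matrix.
stopifnot(isTRUE(all.equal(cov(standardized), cor(x))))
```

With scale = TRUE, the selected estimator therefore targets a correlation matrix rather than a covariance matrix on the original scale.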

parallel

A logical option indicating whether to run the main cross-validation loop with future_lapply(). This is passed directly to cross_validate().
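
Since future_lapply() relies on the future framework, a parallel backend would typically be registered with future::plan() before calling cvCovEst(). The sketch below assumes the cvCovEst and future packages are installed and that two workers are appropriate; the worker count and the choice of estimators are illustrative.

```r
# Runs only if both packages are available (an assumption of this sketch).
if (requireNamespace("cvCovEst", quietly = TRUE) &&
    requireNamespace("future", quietly = TRUE)) {
  library(cvCovEst)

  future::plan(future::multisession, workers = 2)  # two background R sessions

  res <- cvCovEst(
    dat = mtcars,
    estimators = c(linearShrinkLWEst, sampleCovEst),
    parallel = TRUE
  )

  future::plan(future::sequential)  # restore sequential evaluation
}
```

Without a registered plan, futures are evaluated sequentially, so parallel = TRUE alone would not be expected to speed anything up.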

Value

A list of results containing the following elements:

Examples

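library(cvCovEst)

# Select among three candidate estimators on the standardized mtcars data,
# trying three threshold values for thresholdingEst.
cvCovEst(
  dat = mtcars,
  estimators = c(
    linearShrinkLWEst, thresholdingEst, sampleCovEst
  ),
  estimator_params = list(
    thresholdingEst = list(gamma = seq(0.1, 0.3, 0.1))
  ),
  center = TRUE,
  scale = TRUE
)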

[Package cvCovEst version 1.1.0 Index]