CBDA_Validation {CBDA}    R Documentation

CBDA Validation function for Compressive Big Data Analytics

Description

This function generates *max_covs - min_covs* nested models from the feature ranking returned by the *Consolidation* function. It then merges the *max_covs - min_covs* resulting workspaces into a single one.
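The idea of nested models built from a ranking can be sketched in base R. This is only an illustration, not the package's implementation: `ranked_features` is a made-up stand-in for the *Consolidation* output, and including both the `min_covs` and `max_covs` endpoints is an assumption, so the number of models here may differ by one from what the package produces.

```r
# Hypothetical ranking of features, best first
# (a stand-in for the Consolidation output, not real results).
ranked_features <- paste0("X", c(12, 3, 47, 8, 21, 5, 30, 16, 2, 9))
min_covs <- 3
max_covs <- 6

# One model per size: each keeps the top k ranked features.
# Assumption: both endpoints are included.
model_sizes <- seq(min_covs, max_covs)
nested_models <- lapply(model_sizes, function(k) head(ranked_features, k))
names(nested_models) <- paste0("top_", model_sizes)

# Check nesting: every model contains all features of the previous, smaller one.
is_nested <- all(mapply(function(small, large) all(small %in% large),
                        head(nested_models, -1), tail(nested_models, -1)))

print(nested_models)
```

Because each model only adds the next-ranked feature to the previous one, comparing their performance shows how many top features are worth keeping.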

Usage

CBDA_Validation(label = "CBDA_package_test", alpha = 0.2, Kcol_min = 5,
  Kcol_max = 15, Nrow_min = 30, Nrow_max = 50, misValperc = 0,
  M = 3000, N_cores = 1, top = 1000, workspace_directory = tempdir(),
  max_covs = 100, min_covs = 5)

Arguments

label

Label appended to the RData workspaces generated by the CBDA calls

alpha

Proportion of the Big Data to hold out for validation (default 0.2)

Kcol_min

Lower bound for the percentage of features (columns) to sample; defines the Feature Sampling Range (FSR)

Kcol_max

Upper bound for the percentage of features (columns) to sample; defines the Feature Sampling Range (FSR)

Nrow_min

Lower bound for the percentage of cases (rows) to sample; defines the Case Sampling Range (CSR)

Nrow_max

Upper bound for the percentage of cases (rows) to sample; defines the Case Sampling Range (CSR)

misValperc

Percentage of missing values to introduce into the Big Data (used only for testing, to mimic real-world data)

M

Number of Big Data subsets on which to perform Knockoff Filtering and SuperLearner feature mining

N_cores

Number of Cores to use in the parallel implementation

top

Number of top predictions to select out of the M subsets

workspace_directory

Directory where the results and workspaces are saved

max_covs

Maximum number of top features to display and include in the Validation Step, where nested models are tested

min_covs

Minimum number of top features to include in the initial model for the Validation Step
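
How the sampling arguments could interact can be sketched in base R. This is a rough illustration under stated assumptions, not the package's code: `big_data` is a toy matrix, `draw_subset` is a hypothetical helper, `alpha` is treated as a proportion, and the row percentages are assumed to apply to the cases left after the hold-out.

```r
set.seed(1)  # reproducible illustration only

# Toy stand-in for the Big Data: 200 cases, 40 features (made-up sizes).
big_data <- matrix(rnorm(200 * 40), nrow = 200, ncol = 40)

alpha <- 0.2
Kcol_min <- 5;  Kcol_max <- 15   # Feature Sampling Range, in percent
Nrow_min <- 30; Nrow_max <- 50   # Case Sampling Range, in percent

# Hold out a proportion alpha of the rows for validation.
n_cases <- nrow(big_data)
validation_rows <- sample(n_cases, size = round(alpha * n_cases))
training_rows <- setdiff(seq_len(n_cases), validation_rows)

# Hypothetical helper: draw one subset by picking a percentage inside
# each range, then sampling rows and columns without replacement.
draw_subset <- function(rows, n_features) {
  row_pct <- runif(1, Nrow_min, Nrow_max)
  col_pct <- runif(1, Kcol_min, Kcol_max)
  list(
    rows = sample(rows, size = max(1, round(length(rows) * row_pct / 100))),
    cols = sample(n_features, size = max(1, round(n_features * col_pct / 100)))
  )
}

# Three subsets here; in the function signature, M controls this count.
subsets <- replicate(3, draw_subset(training_rows, ncol(big_data)),
                     simplify = FALSE)
str(subsets[[1]])
```

Each element of `subsets` holds the row and column indices of one small sample, which is where a feature-mining step would then be applied.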

Value

value


[Package CBDA version 1.0.0 Index]