phecap_run_feature_extraction {PheCAP} | R Documentation |
Run Surrogate-Assisted Feature Extraction (SAFE)
Description
Run surrogate-assisted feature extraction (SAFE) using unlabeled data and subsampling.
Usage
phecap_run_feature_extraction(
data, surrogates,
subsample_size = 1000L, num_subsamples = 200L,
dropout_proportion = 0, frequency_cutoff = 0.5,
start_seed = 45600L, verbose = 0L)
Arguments
data |
An object of class PhecapData, obtained by calling PhecapData(...) |
surrogates |
A list of objects of class PhecapSurrogate, e.g. list(PhecapSurrogate(...), PhecapSurrogate(...)) |
subsample_size |
An integer scalar giving the size of each subsample |
num_subsamples |
The number of subsamples drawn for each surrogate |
dropout_proportion |
A scalar between 0 and 1. If it is positive, for each predictor a random subset of observations will be set to zero |
frequency_cutoff |
A scalar between 0 and 1. Variables selected in at least this proportion of the subsamples are the variables finally selected |
start_seed |
An integer scalar. In the i-th subsample, the random seed is set to start_seed + i |
verbose |
print progress every |
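To illustrate what dropout_proportion describes, here is a minimal base R sketch. The helper apply_dropout is hypothetical and not part of PheCAP. It assumes a fixed-size random subset is zeroed independently for each predictor, and the package may implement this step differently.

```r
# Hypothetical helper (not a PheCAP function): for each predictor column,
# set a random subset of observations to zero.
apply_dropout <- function(x, dropout_proportion, seed) {
  stopifnot(dropout_proportion >= 0, dropout_proportion <= 1)
  set.seed(seed)
  # Assumption: the subset size is a fixed fraction of the rows.
  n_drop <- floor(nrow(x) * dropout_proportion)
  if (n_drop == 0L) {
    return(x)
  }
  for (j in seq_len(ncol(x))) {
    # A fresh random subset is drawn for every column.
    x[sample.int(nrow(x), n_drop), j] <- 0
  }
  x
}

x <- matrix(1, nrow = 10, ncol = 3)
x_dropped <- apply_dropout(x, dropout_proportion = 0.2, seed = 45601L)
colSums(x_dropped == 0)  # two zeros per column
```

With dropout_proportion = 0, the default in the usage above, the predictors are left unchanged.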
Details
Because no labels are used, the extremes of each surrogate define the cases and controls. A variable is retained if it is selected in at least the proportion frequency_cutoff of the subsamples (one half by default).
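The procedure described above can be sketched in base R on simulated data. This is an illustration under stated assumptions, not the package's implementation: the cutoffs lower_cutoff and upper_cutoff, the simulated features, and the correlation screen used as the per-subsample selector are all stand-ins chosen for the example. Drawing each subsample from the extreme observations only is likewise an assumption.

```r
set.seed(1)
n <- 5000
surrogate <- rpois(n, lambda = 3)            # simulated surrogate counts
features <- cbind(
  signal = surrogate + rnorm(n),             # feature related to the surrogate
  noise  = rnorm(n)                          # unrelated feature
)

# Assumed cutoffs: low surrogate values act as controls, high values as cases.
lower_cutoff <- 1
upper_cutoff <- 6
pseudo_label <- ifelse(surrogate >= upper_cutoff, 1L,
                       ifelse(surrogate <= lower_cutoff, 0L, NA_integer_))
extreme <- which(!is.na(pseudo_label))

start_seed <- 45600L
num_subsamples <- 50L
subsample_size <- 300L
frequency_cutoff <- 0.5

hits <- matrix(FALSE, nrow = num_subsamples, ncol = ncol(features),
               dimnames = list(NULL, colnames(features)))
for (i in seq_len(num_subsamples)) {
  set.seed(start_seed + i)                   # seed scheme from start_seed
  idx <- sample(extreme, subsample_size)
  # Stand-in selector: keep features strongly correlated with the pseudo-label.
  r <- cor(features[idx, , drop = FALSE], pseudo_label[idx])[, 1]
  hits[i, ] <- abs(r) > 0.3
}

frequency <- colMeans(hits)                  # selection proportion per feature
selected <- names(frequency)[frequency >= frequency_cutoff]
selected
```

Because the seed depends only on start_seed and the subsample index, rerunning the loop reproduces the same subsamples and therefore the same selection frequencies.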
Value
An object of class PhecapFeatureExtraction, with components
selected |
the names of selected features |
frequency |
the proportion of subsamples in which each feature was selected |
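A hedged usage sketch follows. It assumes the PheCAP package is installed and that its bundled ehr_data set contains the columns "healthcare_utilization", "label", "main_ICD" and "main_NLP". The PhecapData and PhecapSurrogate argument values, and the reduced num_subsamples and subsample_size, are illustrative choices rather than recommendations.

```r
# Runs only if PheCAP is available; otherwise the block is skipped.
if (requireNamespace("PheCAP", quietly = TRUE)) {
  library(PheCAP)
  data(ehr_data)

  # Assumed arguments: data frame, healthcare utilization column,
  # label column, and the proportion of labels held out for validation.
  data <- PhecapData(ehr_data, "healthcare_utilization", "label", 0.4)

  # Assumed surrogate definitions based on the main ICD and NLP counts.
  surrogates <- list(
    PhecapSurrogate(variable_names = "main_ICD",
                    lower_cutoff = 1, upper_cutoff = 10),
    PhecapSurrogate(variable_names = "main_NLP",
                    lower_cutoff = 1, upper_cutoff = 10)
  )

  # Fewer and smaller subsamples than the defaults, to keep the run short.
  fit <- phecap_run_feature_extraction(
    data, surrogates,
    subsample_size = 200L, num_subsamples = 50L,
    frequency_cutoff = 0.5, start_seed = 45600L
  )

  fit$selected         # names of the selected features
  head(fit$frequency)  # selection proportion for each feature
}
```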
See Also
See PheCAP-package for code examples.