trex {TRexSelector} | R Documentation |
Run the T-Rex selector (doi:10.48550/arXiv.2110.06048)
Description
The T-Rex selector (doi:10.48550/arXiv.2110.06048) performs fast variable selection in high-dimensional settings while controlling the false discovery rate (FDR) at a user-defined target level.
Usage
trex(
  X,
  y,
  tFDR = 0.2,
  K = 20,
  max_num_dummies = 10,
  max_T_stop = TRUE,
  method = "trex",
  GVS_type = "IEN",
  cor_coef = NA,
  type = "lar",
  corr_max = 0.5,
  lambda_2_lars = NULL,
  rho_thr_DA = 0.02,
  hc_dist = "single",
  hc_grid_length = min(20, ncol(X)),
  parallel_process = FALSE,
  parallel_max_cores = min(K, max(1, parallel::detectCores(logical = FALSE))),
  seed = NULL,
  eps = .Machine$double.eps,
  verbose = TRUE
)
Arguments
X |
Real valued predictor matrix. |
y |
Response vector. |
tFDR |
Target FDR level (between 0 and 1, i.e., 0% and 100%). |
K |
Number of random experiments. |
max_num_dummies |
Integer factor determining the maximum number of dummies as a multiple of the number of original variables p (i.e., num_dummies = max_num_dummies * p). |
max_T_stop |
If TRUE, the maximum number of dummies that can be included before stopping is set to ceiling(n / 2), where n is the number of observations. |
method |
'trex' for the T-Rex selector (doi:10.48550/arXiv.2110.06048), 'trex+GVS' for the T-Rex+GVS selector (doi:10.23919/EUSIPCO55093.2022.9909883), 'trex+DA+AR1' for the T-Rex+DA+AR1 selector, 'trex+DA+equi' for the T-Rex+DA+equi selector, 'trex+DA+BT' for the T-Rex+DA+BT selector (doi:10.48550/arXiv.2401.15796), 'trex+DA+NN' for the T-Rex+DA+NN selector (doi:10.48550/arXiv.2401.15139). |
GVS_type |
'IEN' for the Informed Elastic Net (doi:10.1109/CAMSAP58249.2023.10403489), 'EN' for the ordinary Elastic Net (doi:10.1111/j.1467-9868.2005.00503.x). |
cor_coef |
AR(1) autocorrelation coefficient for the T-Rex+DA+AR1 selector or equicorrelation coefficient for the T-Rex+DA+equi selector. |
type |
'lar' for LARS and 'lasso' for the Lasso. |
corr_max |
Maximum allowed correlation between any two predictors from different clusters (for method = 'trex+GVS'). |
lambda_2_lars |
lambda_2-value for LARS-based Elastic Net. |
rho_thr_DA |
Correlation threshold for the T-Rex+DA+AR1 selector and the T-Rex+DA+equi selector (i.e., method = 'trex+DA+AR1' or 'trex+DA+equi'). |
hc_dist |
Distance measure of the hierarchical clustering/dendrogram (only for method = 'trex+DA+BT'): 'single' for single linkage, 'complete' for complete linkage, 'average' for average linkage (see hclust for more options). |
hc_grid_length |
Length of the height-cutoff-grid for the dendrogram (integer between 1 and the number of original variables p). |
parallel_process |
Logical. If TRUE, the random experiments are executed in parallel. |
parallel_max_cores |
Maximum number of cores to be used for parallel processing. |
seed |
Seed for random number generator (ignored if parallel_process = FALSE). |
eps |
Numerical zero. |
verbose |
Logical. If TRUE, the progress of the computations is shown. |
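Several of the arguments above are defined by simple arithmetic on the problem dimensions. The following base R sketch evaluates those expressions for a hypothetical problem size (n = 50 and p = 100 are chosen here only for illustration, and the variable names max_dummies and T_stop_cap are not part of the package). It does not call trex() and only mirrors the formulas stated in the argument descriptions and the Usage defaults.

```r
# Hypothetical problem size, chosen only for illustration
n <- 50   # number of observations (rows of X)
p <- 100  # number of original variables (columns of X)

# Defaults copied from the Usage section
K <- 20
max_num_dummies <- 10

# Maximum number of dummies: max_num_dummies * p
max_dummies <- max_num_dummies * p        # 1000

# With max_T_stop = TRUE, the cap on included dummies before stopping
T_stop_cap <- ceiling(n / 2)              # 25

# Default length of the height-cutoff grid: min(20, ncol(X))
hc_grid_length <- min(20, p)              # 20

# Default core limit; detectCores() may return NA on some platforms,
# in which case this expression is NA as well
parallel_max_cores <- min(K, max(1, parallel::detectCores(logical = FALSE)))
```

For these values, at most 1000 dummies would be generated and at most 25 dummies could be included before stopping.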
Value
A list containing the estimated support vector and additional information, including the number of used dummies and the number of included dummies before stopping.
Examples
data("Gauss_data")
X <- Gauss_data$X
y <- c(Gauss_data$y)
set.seed(1234)
res <- trex(X = X, y = y)
selected_var <- res$selected_var
selected_var
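As a further sketch, the call below repeats the example with a stricter target FDR and without progress output. It assumes that the TRexSelector package and its Gauss_data set are installed; the object name res_strict and the choice tFDR = 0.1 are illustrative only, and the selected variables may differ from those of the default call.

```r
# Sketch only: runs if the TRexSelector package is available
if (requireNamespace("TRexSelector", quietly = TRUE)) {
  library(TRexSelector)

  data("Gauss_data")
  X <- Gauss_data$X
  y <- c(Gauss_data$y)

  # seed is documented as ignored when parallel_process = FALSE,
  # so reproducibility is handled with set.seed() as in the example above
  set.seed(1234)
  res_strict <- trex(X = X, y = y, tFDR = 0.1, verbose = FALSE)

  # Indices of the variables marked as selected in the support vector
  print(which(res_strict$selected_var > 0))

  # Names of the additional list elements returned alongside selected_var
  print(names(res_strict))
}
```

Inspecting names(res_strict) is a simple way to see which additional information the returned list carries without relying on element names that are not listed on this page.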