ForwardSelection.Model.Res {FRESA.CAD} | R Documentation |
NeRI-based feature selection procedure for linear, logistic, or Cox proportional hazards regression models
Description
This function uses bootstrap sampling to rank candidate variables by how frequently they significantly improve the model by reducing its residuals. After the frequency ranking, the function runs a forward selection procedure to create a final model in which every term contributes significantly to the net residual improvement (NeRI).
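The frequency-ranking idea can be illustrated with a small base R sketch. This is a simplified illustration under stated assumptions, not the FRESA.CAD implementation: the data frame toy, the columns x1 to x3 and y, and the use of a plain F-test on a single added term are all hypothetical choices made for the example.

```r
# Simplified illustration only; NOT the FRESA.CAD algorithm.
# Toy data (hypothetical): only x1 is related to the outcome y.
set.seed(1)
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
toy$y <- 2 * toy$x1 + rnorm(n)

candidates <- c("x1", "x2", "x3")
loops <- 50      # number of bootstrap loops
pvalue <- 0.05   # significance threshold for a residual improvement
counts <- setNames(numeric(length(candidates)), candidates)

for (b in seq_len(loops)) {
  # Bootstrap sample: rows drawn with replacement
  boot <- toy[sample(nrow(toy), replace = TRUE), ]
  base <- lm(y ~ 1, data = boot)
  for (v in candidates) {
    full <- lm(reformulate(v, response = "y"), data = boot)
    # F-test on the reduction in the residual sum of squares
    p <- anova(base, full)[2, "Pr(>F)"]
    if (p < pvalue) counts[v] <- counts[v] + 1
  }
}

# Selection frequency per variable, highest first
ranked <- sort(counts / loops, decreasing = TRUE)
print(ranked)
```

In this sketch a variable is counted whenever adding it alone to an intercept-only linear model significantly reduces the residuals in a bootstrap sample. The real function additionally builds multi-term models and supports other fit and test types.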
Usage
ForwardSelection.Model.Res(size = 100,
fraction = 1,
pvalue = 0.05,
loops = 100,
covariates = "1",
Outcome,
variableList,
data,
maxTrainModelSize = 20,
type = c("LM", "LOGIT", "COX"),
testType = c("Binomial", "Wilcox", "tStudent", "Ftest"),
timeOutcome = "Time",
cores = 6,
randsize = 0,
featureSize = 0)
Arguments
size |
The number of candidate variables to be tested (the first |
fraction |
The fraction of data (sampled with replacement) to be used as the training set |
pvalue |
The maximum p-value, associated with the NeRI, allowed for a term in the model (controls the false selection rate) |
loops |
The number of bootstrap loops |
covariates |
A string of the type "1 + var1 + var2" that defines which variables will always be included in the models (as covariates) |
Outcome |
The name of the column in |
variableList |
A data frame with two columns. The first one must have the names of the candidate variables and the other one the description of such variables |
data |
A data frame where all variables are stored in different columns |
maxTrainModelSize |
Maximum number of terms that can be included in the model |
type |
Fit type: Logistic ("LOGIT"), linear ("LM"), or Cox proportional hazards ("COX") |
testType |
Type of non-parametric test to be evaluated by the |
timeOutcome |
The name of the column in |
cores |
Number of cores to be used for parallel processing |
randsize |
The model size of a random outcome. If randsize is less than zero, the function will estimate the size |
featureSize |
The original number of features to be explored in the data frame. |
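As a rough sketch of how the main inputs described above might be prepared, the following base R code builds a data frame, a two-column variableList, and a covariates string. The data, the column names Name and Description, and the chosen argument values are assumptions made for illustration; only the argument names come from the Usage section.

```r
# Toy data (hypothetical): one outcome column and three predictors
set.seed(42)
toy <- data.frame(
  y   = rnorm(50),
  age = rnorm(50, mean = 60, sd = 8),
  bmi = rnorm(50, mean = 27, sd = 4),
  sbp = rnorm(50, mean = 130, sd = 15)
)

# variableList: first column holds candidate names, second their descriptions.
# The column names "Name" and "Description" are assumed for this example.
varList <- data.frame(
  Name = c("bmi", "sbp"),
  Description = c("Body mass index", "Systolic blood pressure"),
  stringsAsFactors = FALSE
)

# covariates: terms always kept in the model, as a string like "1 + var1"
covariates <- paste(c("1", "age"), collapse = " + ")

# Every candidate and the outcome must exist as columns of the data
stopifnot(all(varList$Name %in% colnames(toy)), "y" %in% colnames(toy))

# Arguments collected in a list; names follow the Usage section
args <- list(
  size = nrow(varList),
  loops = 20,
  covariates = covariates,
  Outcome = "y",
  variableList = varList,
  data = toy,
  type = "LM",
  testType = "Ftest",
  cores = 1
)

# With FRESA.CAD installed, the call would then look like:
# result <- do.call(FRESA.CAD::ForwardSelection.Model.Res, args)
```

Keeping the arguments in a list makes it easy to inspect them before running the (potentially slow) bootstrap procedure.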
Value
final.model |
An object of class |
var.names |
A vector with the names of the features that were included in the final model |
formula |
An object of class |
ranked.var |
An array with the ranked frequencies of the features |
formula.list |
A list containing objects of class |
variableList |
A list of variables used in the forward selection |
Author(s)
Jose G. Tamez-Pena and Antonio Martinez-Torteya