AutoScore_rank {AutoScore} | R Documentation |
AutoScore STEP(i): Rank variables with machine learning (AutoScore Module 1)
Description
AutoScore STEP(i): Rank variables with machine learning (AutoScore Module 1)
Usage
AutoScore_rank(train_set, validation_set = NULL, method = "rf", ntree = 100)
Arguments
train_set |
A processed |
validation_set |
A processed |
method |
Method used for ranking. Options: 1. 'rf' - random forest (default), 2. 'auc' - AUC-based (requires a validation set). For "auc", univariate models are built on the train set, and the variables are ranked by the AUC that each univariate model achieves on the validation set ('validation_set'). |
ntree |
Number of trees in the random forest (Default: 100). |
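To make the "auc" option of 'method' more concrete, the sketch below ranks predictors by the validation AUC of one-variable models. It is only an illustration in base R, not the package's internal code: the helpers `auc_from_scores` and `rank_by_univariate_auc`, the choice of logistic regression for the univariate models, and the simulated data are all assumptions made for this example.

```r
# Hypothetical helper: AUC computed with the rank-sum (Mann-Whitney) formula.
auc_from_scores <- function(scores, labels) {
  r <- rank(scores)                      # average ranks are used for ties
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical helper: fit one logistic model per predictor on the train set,
# score it on the validation set, and sort predictors by validation AUC.
rank_by_univariate_auc <- function(train_set, validation_set, outcome = "label") {
  predictors <- setdiff(names(train_set), outcome)
  aucs <- vapply(predictors, function(v) {
    fit <- glm(reformulate(v, response = outcome),
               data = train_set, family = binomial())
    p <- predict(fit, newdata = validation_set, type = "response")
    auc_from_scores(p, validation_set[[outcome]])
  }, numeric(1))
  sort(aucs, decreasing = TRUE)
}

# Simulated data (assumption): 'strong' drives the outcome, 'noise' does not.
set.seed(1)
n <- 400
toy <- data.frame(strong = rnorm(n), weak = rnorm(n), noise = rnorm(n))
toy$label <- rbinom(n, 1, plogis(2 * toy$strong + 0.5 * toy$weak))

train_idx <- sample(n, 300)
auc_ranking <- rank_by_univariate_auc(toy[train_idx, ], toy[-train_idx, ])
print(auc_ranking)
```

Because each model sees a single predictor, this kind of ranking ignores interactions between variables, which is one reason a random forest ranking may order the same predictors differently.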
Details
The first step in the AutoScore framework is variable ranking. We use random forest (RF), an ensemble machine learning algorithm, to identify the top-ranking predictors for subsequent score generation. This step corresponds to Module 1 in the AutoScore paper.
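As a rough picture of what random forest ranking involves, the sketch below fits a forest with the randomForest package and sorts predictors by importance. It is an assumption-laden illustration rather than the function's actual implementation: the use of the mean decrease in Gini impurity as the importance measure and the simulated data are choices made for this example only.

```r
library(randomForest)  # assumes the randomForest package is installed

# Simulated data (assumption): 'strong' drives the outcome, 'noise' does not.
set.seed(1)
n <- 300
toy <- data.frame(strong = rnorm(n), weak = rnorm(n), noise = rnorm(n))
# A factor outcome makes randomForest fit a classification forest.
toy$label <- factor(rbinom(n, 1, plogis(2 * toy$strong + 0.5 * toy$weak)))

# Fit the forest; ntree plays the same role as the 'ntree' argument above.
rf_fit <- randomForest(label ~ ., data = toy, ntree = 100)

# Extract per-variable importance and sort from most to least important.
imp <- importance(rf_fit)[, "MeanDecreaseGini"]
rf_ranking <- sort(imp, decreasing = TRUE)
print(rf_ranking)
```

The result is a named numeric vector, with names giving the variables and values giving their importance, so the top-ranking predictors can be read off the first few names.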
Value
Returns a vector containing the variables and their ranking, as generated by machine learning (random forest).
References
Breiman, L. (2001), Random Forests, Machine Learning 45(1), 5-32
Xie F, Chakraborty B, Ong MEH, Goldstein BA, Liu N. AutoScore: A Machine Learning-Based Automatic Clinical Score Generator and Its Application to Mortality Prediction Using Electronic Health Records. JMIR Medical Informatics 2020;8(10):e21798
See Also
AutoScore_parsimony, AutoScore_weighting, AutoScore_fine_tuning, AutoScore_testing. Run vignette("Guide_book", package = "AutoScore") to see the guidebook or vignette.
Examples
# See the AutoScore Guidebook for the whole 5-step workflow
data("sample_data")
names(sample_data)[names(sample_data) == "Mortality_inpatient"] <- "label"
ranking <- AutoScore_rank(sample_data, ntree = 50)