AutoScore_rank {AutoScore}    R Documentation

AutoScore STEP(i): Rank variables with machine learning (AutoScore Module 1)

Description

AutoScore STEP(i): Rank variables with machine learning (AutoScore Module 1)

Usage

AutoScore_rank(train_set, validation_set = NULL, method = "rf", ntree = 100)

Arguments

train_set

A processed data.frame that contains data to be analyzed, for training.

validation_set

A processed data.frame that contains data for validation. Needed only for AUC-based ranking (method = "auc").

method

Method for ranking. Options: 1. 'rf' - random forest (default), 2. 'auc' - AUC-based (requires a validation set). For "auc", univariate models are built on the training set, and variables are ranked by the AUC of the corresponding univariate models on the validation set ('validation_set').

ntree

Number of trees in the random forest (Default: 100).
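
The "auc" option described under method can be illustrated with a minimal base-R sketch. This is not the package's internal code: the helper auc_rank(), the synthetic data from make_data(), and the variable names x1, x2, x3 are all hypothetical, and univariate logistic regression is assumed as the univariate model.

```r
# Hypothetical helper: AUC computed from the rank-sum (Mann-Whitney) statistic.
auc_rank <- function(score, label) {
  r <- rank(score)                      # average ranks handle ties
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Synthetic data for illustration only: x1 is a strong predictor, x3 is noise.
set.seed(1)
make_data <- function(n) {
  x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
  label <- rbinom(n, 1, plogis(2 * x1 + 0.5 * x2))
  data.frame(x1, x2, x3, label)
}
train <- make_data(500)
valid <- make_data(500)

# Fit one univariate model per variable on the training set,
# then score it by its AUC on the validation set.
vars <- setdiff(names(train), "label")
auc <- sapply(vars, function(v) {
  fit <- glm(reformulate(v, response = "label"),
             data = train, family = binomial())
  auc_rank(predict(fit, newdata = valid, type = "response"), valid$label)
})

# Variables sorted from highest to lowest validation AUC.
ranking_auc <- sort(auc, decreasing = TRUE)
```

Ranking on a separate validation set, rather than on the training set, keeps a variable from being rewarded for fitting noise in the data its model was trained on.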

Details

The first step in the AutoScore framework is variable ranking. By default, we use random forest (RF), an ensemble machine learning algorithm, to identify the top-ranking predictors for subsequent score generation. This step corresponds to Module 1 in the AutoScore paper.
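
As a rough illustration of random-forest ranking in general, the sketch below fits a forest with the randomForest package (assumed to be installed) and sorts variables by their importance. The synthetic data, the variable names, and the choice of the Gini-based importance measure are assumptions made for this example; the internals of AutoScore_rank may differ.

```r
library(randomForest)  # assumed to be installed

# Synthetic data for illustration only: x1 is a strong predictor, x3 is noise.
set.seed(1)
n <- 500
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
train <- data.frame(
  x1, x2, x3,
  label = factor(rbinom(n, 1, plogis(2 * x1 + 0.5 * x2)))
)

# Fit a classification forest; ntree plays the same role as the ntree argument.
model <- randomForest(label ~ ., data = train, ntree = 100)

# importance() returns a matrix with one row per predictor; for a
# classification forest the default column is "MeanDecreaseGini".
imp <- importance(model)

# Named vector of importances, most important variable first.
ranking_rf <- sort(imp[, "MeanDecreaseGini"], decreasing = TRUE)
```

A larger ntree generally gives more stable importance estimates at the cost of longer run time.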

Value

Returns a vector containing the variables and their ranking generated by the selected method (random forest by default).
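
The sketch below shows one way such a result might be inspected, assuming the returned vector is a named numeric vector sorted from most to least important. The variable names and values here are made up for illustration and are not output from the package.

```r
# Hypothetical ranking vector; names and values are invented for illustration.
ranking <- c(var_a = 152.3, var_b = 98.1, var_c = 75.4, var_d = 12.0)

# Names of the three top-ranked variables.
top_vars <- names(ranking)[1:3]

# Position of each variable in the ranking (1 = most important).
positions <- setNames(seq_along(ranking), names(ranking))
```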

References

See Also

AutoScore_parsimony, AutoScore_weighting, AutoScore_fine_tuning, AutoScore_testing. Run vignette("Guide_book", package = "AutoScore") to see the guidebook or vignette.

Examples

# See the AutoScore Guidebook for the whole 5-step workflow
data("sample_data")
# Rename the outcome column to "label"
names(sample_data)[names(sample_data) == "Mortality_inpatient"] <- "label"
# Rank variables using a random forest with 50 trees
ranking <- AutoScore_rank(sample_data, ntree = 50)

[Package AutoScore version 1.0.0 Index]