AutoScore_parsimony_Ordinal {AutoScore} | R Documentation |
AutoScore STEP(ii) for ordinal outcomes: Select the best model with parsimony plot (AutoScore Modules 2+3+4)
Description
AutoScore STEP(ii) for ordinal outcomes: Select the best model with parsimony plot (AutoScore Modules 2+3+4)
Usage
AutoScore_parsimony_Ordinal(
  train_set,
  validation_set,
  rank,
  link = "logit",
  max_score = 100,
  n_min = 1,
  n_max = 20,
  cross_validation = FALSE,
  fold = 10,
  categorize = "quantile",
  quantiles = c(0, 0.05, 0.2, 0.8, 0.95, 1),
  max_cluster = 5,
  do_trace = FALSE,
  auc_lim_min = 0.5,
  auc_lim_max = "adaptive"
)
Arguments
train_set |
A processed |
validation_set |
A processed |
rank |
The ranking result generated from AutoScore STEP(i) for ordinal
outcomes ( |
link |
The link function used to model ordinal outcomes. Default is
|
max_score |
Maximum total score (Default: 100). |
n_min |
Minimum number of selected variables (Default: 1). |
n_max |
Maximum number of selected variables (Default: 20). |
cross_validation |
If set to |
fold |
The number of folds used in cross-validation (Default: 10). Available if |
categorize |
Method for categorizing continuous variables. Options include "quantile" or "kmeans" (Default: "quantile"). |
quantiles |
Predefined quantiles to convert continuous variables to categorical ones. (Default: c(0, 0.05, 0.2, 0.8, 0.95, 1)) Available if |
max_cluster |
The maximum number of clusters (Default: 5). Available if |
do_trace |
If set to TRUE, the results for each fold of cross-validation are printed and plotted (Default: FALSE). Available if |
auc_lim_min |
Minimum y-axis limit in the parsimony plot (Default: 0.5). |
auc_lim_max |
Maximum y-axis limit in the parsimony plot (Default: "adaptive"). |
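The default quantiles argument can be read as cut points placed on each continuous variable. The following is a minimal base R sketch of that idea, not the package's internal code: the variable x is made-up illustration data, and the handling of interval boundaries (include.lowest, right-closed intervals) is an assumption.

```r
# Illustration only: made-up continuous variable (not from sample_data_ordinal)
x <- 1:100

# Default quantiles from Usage
probs <- c(0, 0.05, 0.2, 0.8, 0.95, 1)

# Cut points at the requested quantiles; unique() guards against
# duplicated breaks when a variable has many tied values
breaks <- unique(quantile(x, probs = probs))

# Convert the continuous variable into ordered intervals
# (boundary handling here is an assumption, not the package's rule)
x_cat <- cut(x, breaks = breaks, include.lowest = TRUE)

# Number of observations falling in each interval
table(x_cat)
```

With six quantile points, a variable without heavy ties is split into five categories, with the two outer categories each covering 5% of observations.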
Details
This is the second step of the general AutoScore workflow for
ordinal outcomes. It generates the parsimony plot to help select a
parsimonious model. In this step, AutoScore runs Modules 2, 3 and 4
multiple times to evaluate performance under different variable
lists. The resulting parsimony plot gives researchers an intuitive
figure for choosing the best model. If the data size is small (e.g., <5000),
an independent validation set may not be a wise choice. In that case, we
suggest using cross-validation to maximize the utility of the data, by
setting cross_validation=TRUE.
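For small data sets, the cross-validation setting can be sketched as follows. The helper run_cv_parsimony is hypothetical (it is not part of AutoScore) and only wraps the call shown in Usage with cross_validation = TRUE. The validation set is still passed because Usage lists it without a default. Whether it is used in cross-validation mode is not stated on this page.

```r
# Hypothetical helper (not part of AutoScore): wraps the documented call
# with cross-validation switched on. Requires the AutoScore package
# to be installed when the function is actually called.
run_cv_parsimony <- function(train_set, validation_set, ranking, fold = 10) {
  AutoScore::AutoScore_parsimony_Ordinal(
    train_set = train_set,
    validation_set = validation_set,  # passed because it has no default
    rank = ranking,
    max_score = 100, n_min = 1, n_max = 20,
    cross_validation = TRUE,          # use k-fold cross-validation
    fold = fold,                      # number of folds (Default: 10)
    categorize = "quantile",
    do_trace = FALSE                  # set TRUE to print and plot each fold
  )
}
```

With train_set, validation_set and ranking prepared as in Examples, a call would look like mAUC_cv <- run_cv_parsimony(train_set, validation_set, ranking).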
Value
List of mAUC (i.e., the average AUC of dichotomous classifications) values for different numbers of variables
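To make "average AUC of dichotomous classifications" concrete, here is a minimal base R sketch. It assumes the outcome is coded as ordered integers and that the dichotomous classifications are the splits "label > j" for each category j except the highest. The function names auc_binary and mean_auc and the toy data are illustrative. The package's own computation may differ in detail.

```r
# AUC for a binary outcome via the rank-sum (Mann-Whitney) identity.
# `positive` is a logical vector marking the positive class.
auc_binary <- function(score, positive) {
  n_pos <- sum(positive)
  n_neg <- sum(!positive)
  r <- rank(score)  # average ranks, so ties count as half
  (sum(r[positive]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Mean AUC over the J - 1 dichotomizations of an ordered outcome
# (assumed splits: label > j for every category j except the highest).
mean_auc <- function(score, label) {
  cuts <- sort(unique(label))
  cuts <- cuts[-length(cuts)]
  aucs <- vapply(cuts, function(j) auc_binary(score, label > j), numeric(1))
  mean(aucs)
}

# Toy data: three ordered categories and a made-up score
label <- c(1, 1, 2, 2, 3, 3)
score <- c(10, 20, 15, 30, 25, 40)
mean_auc(score, label)
```

In this toy example both dichotomizations rank 7 of 8 positive-negative pairs correctly, so the result is 0.875.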
References
Saffari SE, Ning Y, Feng X, Chakraborty B, Volovici V, Vaughan R, Ong ME, Liu N. AutoScore-Ordinal: An interpretable machine learning framework for generating scoring models for ordinal outcomes. arXiv:2202.08407
See Also
AutoScore_rank_Ordinal, AutoScore_weighting_Ordinal, AutoScore_fine_tuning_Ordinal, AutoScore_testing_Ordinal.
Examples
## Not run:
# see AutoScore-Ordinal Guidebook for the whole 5-step workflow
data("sample_data_ordinal") # Output is named `label`
out_split <- split_data(data = sample_data_ordinal, ratio = c(0.7, 0.1, 0.2))
train_set <- out_split$train_set
validation_set <- out_split$validation_set
ranking <- AutoScore_rank_Ordinal(train_set, ntree = 100)
mAUC <- AutoScore_parsimony_Ordinal(
  train_set = train_set, validation_set = validation_set,
  rank = ranking, max_score = 100, n_min = 1, n_max = 20,
  categorize = "quantile", quantiles = c(0, 0.05, 0.2, 0.8, 0.95, 1)
)
## End(Not run)