perff {ordinalForest} | R Documentation |
Performance functions based on Youden's J statistic
Description
In ordfor, so-called performance functions are used to measure the performance of the
smaller regression forests constructed prior to the approximation of the optimal score set.
Except for one, which uses the ranked probability score (enabling class probability estimation), all of these performance functions are based on Youden's J statistic.
These functions may, however, also be used to measure the precision of
predictions on new data or the precision of out-of-bag (OOB) predictions. Note that the performance function using the
ranked probability score is not covered in this help page. The function rps from the package verification (version 1.42)
can be used to calculate the ranked probability score.
Usage
perff_equal(ytest, ytestpred, categ, classweights)
perff_proportional(ytest, ytestpred, categ, classweights)
perff_oneclass(ytest, ytestpred, categ, classweights)
perff_custom(ytest, ytestpred, categ, classweights)
Arguments
ytest | factor. True values of the target variable. |
ytestpred | factor. Predicted values of the target variable. |
categ | character. Needed in the case of perff_oneclass. |
classweights | numeric. Needed in the case of perff_custom. |
Details
perff_equal should be used if it is of interest to classify observations from each class with the same accuracy, independent of the class sizes.
Youden's J statistic is calculated with respect to each class ("observation/prediction in class j" vs. "observation/prediction NOT in class j" (j=1,...,J)),
and the simple average of the J results is taken.
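As a reading aid, the calculation described above can be written out as follows. This is a sketch based on the standard definition of Youden's J statistic (sensitivity plus specificity minus one), not a formula taken from the package source. The symbols are assumptions introduced here: TP_j, FN_j, TN_j and FP_j are the counts from the "class j" vs. "NOT class j" comparison, and Y_j stands for the statistic of class j, so that it does not clash with J, the number of classes.

```latex
\[
Y_j \;=\; \underbrace{\frac{TP_j}{TP_j + FN_j}}_{\text{sensitivity}}
      \;+\; \underbrace{\frac{TN_j}{TN_j + FP_j}}_{\text{specificity}} \;-\; 1,
\qquad j = 1, \dots, J,
\]
\[
\text{perff\_equal} \;=\; \frac{1}{J} \sum_{j=1}^{J} Y_j .
\]
```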
perff_proportional should be used if the main goal is to correctly classify
as many observations as possible. This goal is associated with a preference for larger classes at the
expense of a lower classification accuracy with respect to smaller classes.
Youden's J statistic is calculated with respect to each class and subsequently a weighted average of these values is taken, with weights
proportional to the number of observations representing the respective classes in the training data.
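The weighted average described above might be written as shown below. The notation is an assumption made for illustration: Y_j denotes Youden's J statistic for class j (sensitivity plus specificity minus one in the "class j" vs. "NOT class j" comparison), and n_j denotes the number of observations representing class j. Dividing by the sum of the weights is consistent with the statement in the Examples that perff_proportional is a special case of perff_custom.

```latex
\[
\text{perff\_proportional} \;=\; \frac{\sum_{j=1}^{J} n_j \, Y_j}{\sum_{j=1}^{J} n_j} .
\]
```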
perff_oneclass should be used if it is merely relevant that observations
in class categ can be distinguished as reliably as possible from observations not in class categ.
Class categ must be passed to perff_oneclass via the argument categ.
Youden's J statistic is calculated with respect to class categ.
perff_custom should be used if there is a particular ranking of the classes with respect to their importance.
Youden's J statistic is calculated with respect to each class. Subsequently, a weighted average
with user-specified weights (provided via the argument classweights) is taken. In this way, classes with
higher weights are prioritized by the OF algorithm over classes with smaller weights.
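The four variants described above can be illustrated with a small base R sketch. It is not the package's implementation: the helpers youden_j and weighted_j are hypothetical names introduced here, the toy data are made up, and the sketch assumes that every class occurs at least once in ytest and that the weighted average is normalized by the sum of the weights.

```r
# Hypothetical helper (not part of ordinalForest): Youden's J statistic for
# class `cl` in the comparison "class cl" vs. "NOT class cl".
# Assumes class `cl` and at least one other class occur in ytest.
youden_j <- function(ytest, ytestpred, cl) {
  is_cl   <- ytest == cl
  pred_cl <- ytestpred == cl
  sens <- sum(is_cl & pred_cl) / sum(is_cl)      # sensitivity
  spec <- sum(!is_cl & !pred_cl) / sum(!is_cl)   # specificity
  sens + spec - 1
}

# Hypothetical helper: weighted average of the per-class statistics,
# normalized by the sum of the weights (an assumption of this sketch).
weighted_j <- function(ytest, ytestpred, weights) {
  j <- sapply(levels(ytest), function(cl) youden_j(ytest, ytestpred, cl))
  sum(weights * j) / sum(weights)
}

# Made-up toy data with three ordered classes.
ytest     <- factor(c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3), levels = 1:3)
ytestpred <- factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3, 1), levels = 1:3)

# Equal weights: simple average over the classes.
j_equal <- weighted_j(ytest, ytestpred, weights = c(1, 1, 1))

# Weights proportional to the class sizes.
j_prop <- weighted_j(ytest, ytestpred, weights = as.numeric(table(ytest)))

# All weight on one class: reduces to the statistic for that class.
j_class2 <- weighted_j(ytest, ytestpred, weights = c(0, 1, 0))

# User-specified weights: class 2 counts twice as much as the others.
j_custom <- weighted_j(ytest, ytestpred, weights = c(1, 2, 1))

print(c(equal = j_equal, proportional = j_prop,
        oneclass2 = j_class2, custom = j_custom))
```

Expressing all four variants through a single weighted average mirrors the remark in the Examples that perff_equal, perff_proportional, and perff_oneclass are special cases of perff_custom.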
References
Hornung R. (2020) Ordinal Forests. Journal of Classification 37, 4–17. <doi: 10.1007/s00357-018-9302-x>.
Examples
## Not run:
library(ordinalForest)
data(hearth)
set.seed(123)
# Use half of the observations for training and 20 of the remaining ones for testing
trainind <- sort(sample(1:nrow(hearth), size=floor(nrow(hearth)*(1/2))))
testind <- sort(sample(setdiff(1:nrow(hearth), trainind), size=20))
datatrain <- hearth[trainind,]
datatest <- hearth[testind,]
ordforres <- ordfor(depvar="Class", data=datatrain, nsets=50, nbest=5, ntreeperdiv=100,
ntreefinal=1000)
# NOTE: nsets=50 is not enough, because the prediction performance of the resulting
# ordinal forest will be suboptimal. In practice, nsets=1000 (default value) or a larger
# number should be used.
# Predict the classes of the test observations and cross-tabulate them with the true classes
preds <- predict(ordforres, newdata=datatest)
table('true'=datatest$Class, 'predicted'=preds$ypred)
perff_equal(ytest=datatest$Class, ytestpred=preds$ypred)
perff_proportional(ytest=datatest$Class, ytestpred=preds$ypred)
perff_oneclass(ytest=datatest$Class, ytestpred=preds$ypred, categ="1")
perff_oneclass(ytest=datatest$Class, ytestpred=preds$ypred, categ="2")
perff_oneclass(ytest=datatest$Class, ytestpred=preds$ypred, categ="3")
perff_oneclass(ytest=datatest$Class, ytestpred=preds$ypred, categ="4")
perff_oneclass(ytest=datatest$Class, ytestpred=preds$ypred, categ="5")
perff_custom(ytest=datatest$Class, ytestpred=preds$ypred, classweights=c(1,2,1,1,1))
# perff_equal, perff_proportional, and perff_oneclass are special cases of perff_custom:
perff_custom(ytest=datatest$Class, ytestpred=preds$ypred, classweights=c(1,1,1,1,1))
perff_equal(ytest=datatest$Class, ytestpred=preds$ypred)
perff_custom(ytest=datatest$Class, ytestpred=preds$ypred, classweights=table(datatest$Class))
perff_proportional(ytest=datatest$Class, ytestpred=preds$ypred)
perff_custom(ytest=datatest$Class, ytestpred=preds$ypred, classweights=c(0,0,0,1,0))
perff_oneclass(ytest=datatest$Class, ytestpred=preds$ypred, categ="4")
## End(Not run)