baseline {utiml} | R Documentation |
Baseline reference for multilabel classification
Description
Create a baseline model for multilabel classification.
Usage
baseline(
mdata,
metric = c("general", "F1", "hamming-loss", "subset-accuracy", "ranking-loss"),
...
)
Arguments
mdata |
An mldr dataset used to train the baseline model. |
metric |
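The strategy used to predict the labels. The possible values are: "general", "F1", "hamming-loss", "subset-accuracy" or "ranking-loss". See the Details section for what each one does. (Default: "general") |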
... |
not used |
Details
Baseline is a naive multilabel classifier that maximizes or minimizes a specific measure without inducing a learning model. It uses general information about the labels in the training dataset to estimate the labels in a test dataset.
The following strategies are available:
general
Predict the k most frequent labels, where k is the integer closest to the label cardinality.
F1
Predict the most frequent labels that obtain the best F1 measure on the training data. In the original paper, the authors use the less frequent labels.
hamming-loss
Predict the labels that are associated with more than 50% of instances.
subset-accuracy
Predict the most common labelset.
ranking-loss
Predict a ranking based on the most frequent labels.
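The strategies above can be sketched in base R on a small binary label matrix. This is only an illustration under stated assumptions, not utiml's internal code: the matrix `Y` and every variable name are hypothetical, and the F1 strategy is assumed to use an example-based F1 averaged over instances, which may differ from what the package actually computes.

```r
# Hypothetical label matrix: 7 instances (rows) x 4 labels (columns).
Y <- matrix(c(1, 1, 0, 0,
              1, 1, 0, 0,
              1, 1, 0, 0,
              1, 0, 0, 0,
              1, 0, 1, 0,
              0, 0, 1, 1,
              1, 0, 0, 0),
            ncol = 4, byrow = TRUE,
            dimnames = list(NULL, c("y1", "y2", "y3", "y4")))

freq <- colSums(Y)                        # how often each label occurs

# ranking-loss: labels ordered from most to least frequent
ranking <- names(freq)[order(-freq)]

# general: the k most frequent labels, k = rounded label cardinality
cardinality <- sum(Y) / nrow(Y)           # average number of labels per instance
k <- round(cardinality)
general <- ranking[seq_len(k)]

# hamming-loss: labels present in more than 50% of the instances
hamming <- names(freq)[freq / nrow(Y) > 0.5]

# subset-accuracy: the single most common labelset
labelsets <- apply(Y, 1, paste, collapse = "")
top <- names(which.max(table(labelsets)))
subset_acc <- colnames(Y)[strsplit(top, "")[[1]] == "1"]

# F1 (assumption: example-based F1): try the top-k labels for each k
# and keep the k with the best average F1 on the training data
f1_for_k <- function(k) {
  top_k <- ranking[seq_len(k)]
  hits <- rowSums(Y[, top_k, drop = FALSE])   # correctly predicted labels per row
  mean(2 * hits / (rowSums(Y) + k))
}
scores <- vapply(seq_len(ncol(Y)), f1_for_k, numeric(1))
f1_labels <- ranking[seq_len(which.max(scores))]

general      # "y1" "y2"
hamming      # "y1"
subset_acc   # "y1" "y2"
```

In this toy data the label cardinality is 12/7, so k rounds to 2, while only y1 occurs in more than half of the instances. The general and hamming-loss strategies therefore predict different label sets from the same training information.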
Value
An object of class BASELINEmodel containing the set of fitted models, including:
- labels
A vector with the label names.
- predict
A list with the labels that will be predicted.
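A quick way to see what the returned object holds is to inspect the two components listed above. The sketch below assumes the utiml package is installed (it provides the `toyml` dataset) and that the components can be read with `$`, as on an ordinary list.

```r
library(utiml)

# Fit a baseline with the hamming-loss strategy on the bundled toy dataset
model <- baseline(toyml, "hamming-loss")

class(model)     # expected to include "BASELINEmodel"
model$labels     # the label names
model$predict    # the labels this baseline will predict
```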
References
Metz, J., Abreu, L. F. de, Cherman, E. A., & Monard, M. C. (2012). On the Estimation of Predictive Evaluation Measure Baselines for Multi-label Learning. In 13th Ibero-American Conference on AI (pp. 189-198). Cartagena de Indias, Colombia.
Examples
model <- baseline(toyml)
pred <- predict(model, toyml)
## Change the metric
model <- baseline(toyml, "F1")
model <- baseline(toyml, "subset-accuracy")