Ensemble_ranking_IW {adabag} | R Documentation |
Ensemble methods for ranking data: Item-Weighted Boosting and Bagging Algorithms
Description
The Ensemble_ranking_IW function applies item-weighted Boosting and Bagging algorithms to ranking data (Albano et al., 2023). Both algorithms use classification trees as base classifiers.
Usage
Ensemble_ranking_IW(formula, data, iw, algo = "boosting",
                    mfinal = 100, coeflearn = "Breiman", control, bin = FALSE,
                    trace = TRUE, ...)
Arguments
formula |
a formula specifying the response ranking variable and predictors, similar to the |
data |
an N by (K+1) data frame containing the prepared item-weighted ranking data. The column "Label" must contain the transformed ranking responses, and the remaining K columns must contain the predictors. Continuous variables are allowed; categorical variables must be dummy coded. The data frame must be the output of the |
iw |
a vector or matrix representing the item weights or dissimilarities for the ranking data. For a vector, it should be a row vector of length M, where M is the number of items. For a matrix, it should be a symmetric M by M matrix representing item dissimilarities. For coherence, |
algo |
the ensemble method to use. Possible values are "bagging" or "boosting". Defaults to "boosting". |
mfinal |
the number of trees to use for boosting or bagging. Defaults to 100 iterations. |
coeflearn |
the coefficient learning method to use. Possible values are "Breiman", "Freund", or "Zhu". Defaults to "Breiman". |
control |
an optional argument to control details of the classification tree algorithm. See |
bin |
a logical value indicating whether to use the binary logarithm function for updating weights at each iteration. Defaults to |
trace |
a logical value controlling the display of additional information (the number of trees and the average weighted tau_x) during execution. Defaults to |
... |
additional arguments passed to or from other methods. |
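The two accepted shapes of iw can be sketched in base R as follows. This is a minimal illustration, assuming M = 4 items (as in the example on this page); the matrix values, including the zero diagonal, are made up for illustration and are not taken from the package.

```r
M <- 4  # number of items (assumed; matches the four-item example on this page)

# Vector form: one weight per item, length M
iw_vec <- c(2, 5, 5, 2)
stopifnot(is.numeric(iw_vec), length(iw_vec) == M)

# Matrix form: symmetric M x M item dissimilarities (illustrative values)
iw_mat <- matrix(c(0, 1, 2, 3,
                   1, 0, 1, 2,
                   2, 1, 0, 1,
                   3, 2, 1, 0), nrow = M, byrow = TRUE)
stopifnot(isSymmetric(iw_mat), all(dim(iw_mat) == M))
```

Checking length and symmetry before fitting is a cheap way to catch a mismatch between iw and the number of items.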
Details
The Ensemble_ranking_IW function extends the Boosting and Bagging algorithms to handle item-weighted ranking data. It applies these ensemble methods, with classification trees as base classifiers, to improve ranking prediction performance.
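For orientation, the three coeflearn options can be sketched with the tree-weight formulas documented for adabag's boosting function: alpha = 1/2 ln((1 - err)/err) for "Breiman", ln((1 - err)/err) for "Freund", and ln((1 - err)/err) + ln(nclasses - 1) for "Zhu". The helper tree_weight below is hypothetical and not part of adabag. It is an assumption that Ensemble_ranking_IW applies the same expressions to its ranking-based error, and that bin = TRUE simply replaces the natural logarithm with log2.

```r
# Hypothetical helper (not part of adabag): weight of one tree given its
# weighted error `err`, using the formulas documented for adabag::boosting.
tree_weight <- function(err, coeflearn = c("Breiman", "Freund", "Zhu"),
                        nclasses = 2, bin = FALSE) {
  coeflearn <- match.arg(coeflearn)
  stopifnot(err > 0, err < 1)
  # Assumption: bin = TRUE swaps the natural log for the binary log
  lg <- if (bin) log2 else log
  base <- lg((1 - err) / err)
  switch(coeflearn,
    Breiman = 0.5 * base,
    Freund  = base,
    Zhu     = base + lg(nclasses - 1)
  )
}

tree_weight(0.2, "Breiman")            # 0.5 * log(4)
tree_weight(0.2, "Freund")             # log(4)
tree_weight(0.2, "Zhu", nclasses = 4)  # log(4) + log(3)
```

Under these formulas a tree with a lower error receives a larger weight, and "Breiman" halves the weight that "Freund" would assign for the same error.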
Value
An object of class boosting or bagging, which is a list with the following components:
formula |
the formula used. |
trees |
the trees grown during the iterations. |
weights |
a vector of weights for each tree in all iterations. |
importance |
a measure of the relative importance of each predictor in the ranking task, taking into account the weighted gain of the variable's contribution in each tree. |
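The components listed above are accessed like those of any R list. The sketch below uses a stand-in object with made-up values purely to show the access pattern; the predictor names x1 to x3 are hypothetical, and a real object would come from a call to Ensemble_ranking_IW and hold fitted trees.

```r
# Stand-in object with illustrative values only; a real fit is returned by
# Ensemble_ranking_IW() and stores the fitted trees in `trees`.
fit <- list(
  formula    = Label ~ .,
  trees      = vector("list", 3),
  weights    = c(0.9, 0.6, 0.3),
  importance = c(x1 = 55, x2 = 30, x3 = 15)
)
class(fit) <- "boosting"

length(fit$trees)                        # number of trees grown
fit$weights / sum(fit$weights)           # relative weight of each tree
sort(fit$importance, decreasing = TRUE)  # predictors ordered by importance
```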
Author(s)
Alessandro Albano alessandro.albano@unipa.it, Mariangela Sciandra mariangela.sciandra@unipa.it, and Antonella Plaia antonella.plaia@unipa.it
References
Albano, A., Sciandra, M., and Plaia, A. (2023): "A weighted distance-based approach with boosted decision trees for label ranking." Expert Systems with Applications.
Alfaro, E., Gamez, M., and Garcia, N. (2013): "adabag: An R Package for Classification with Boosting and Bagging." Journal of Statistical Software, Vol. 54, 2, pp. 1–35.
Breiman, L. (1998): "Arcing classifiers." The Annals of Statistics, Vol. 26, 3, pp. 801–849.
D'Ambrosio, A. [aut, cre], Amodio, S. [ctb], Mazzeo, G. [ctb], Albano, A. [ctb], Plaia, A. [ctb] (2023). ConsRank: Compute the Median Ranking(s) According to the Kemeny's Axiomatic Approach. R package version 2.1.3, https://cran.r-project.org/package=ConsRank.
Freund, Y., and Schapire, R.E. (1996): "Experiments with a new boosting algorithm." In Proceedings of the Thirteenth International Conference on Machine Learning, pp. 148–156, Morgan Kaufmann.
Plaia, A., Buscemi, S., Fürnkranz, J., and Mencía, E.L. (2021): "Comparing boosting and bagging for decision trees of rankings." Journal of Classification, pp. 1–22.
Zhu, J., Zou, H., Rosset, S., and Hastie, T. (2009): "Multi-class AdaBoost." Statistics and Its Interface, 2, pp. 349–360.
Examples
## Not run:
# Load simulated ranking data
data(simulatedRankingData)
x <- simulatedRankingData$x
y <- simulatedRankingData$y
# Prepare the data with item weights
dati <- prep_data(y, x, iw = c(2, 5, 5, 2))
# Divide the data into training and test sets
set.seed(12345)
samp <- sample(nrow(dati))
l <- length(dati[, 1])
sub <- sample(1:l, 2 * l / 3)
data_sub1 <- dati[sub, ]
data_test1 <- dati[-sub, ]
# Apply ensemble ranking with AdaBoost.M1
boosting_1 <- Ensemble_ranking_IW(
  Label ~ .,
  data = data_sub1,
  iw = c(2, 5, 5, 2),
  mfinal = 3,
  coeflearn = "Breiman",
  control = rpart.control(maxdepth = 4, cp = -1),
  algo = "boosting",
  bin = FALSE
)
# Evaluate the performance
test_boosting1 <- errorevol_ranking_vector_IW(boosting_1,
  newdata = data_test1, iw = c(2, 5, 5, 2), squared = FALSE)
test_boosting1.1 <- errorevol_ranking_vector_IW(boosting_1,
  newdata = data_sub1, iw = c(2, 5, 5, 2), squared = FALSE)
# Plot the error evolution
plot.errorevol(test_boosting1, test_boosting1.1)
## End(Not run)