imptree-package {imptree} | R Documentation |
imptree: Classification Trees with Imprecise Probabilities
Description
The imptree package implements the creation of
imprecise classification trees based on the algorithm developed by
Abellán and Moral.
The credal sets of the classification variable within each node
are estimated by either the imprecise Dirichlet model (IDM) or the
nonparametric predictive inference (NPI).
The available split criteria are the 'information gain',
based on the maximal entropy distribution, and the adaptable
entropy-range based criterion proposed by Fink and Crossman.
It also implements different correction terms for the entropy.
The performance of the tree can be evaluated with respect to the criteria commonly used for imprecise classification trees.
The package also provides functionality for estimating credal sets via IDM or NPI and for obtaining their minimal/maximal entropy (distribution), for use outside the tree-growing process.
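As a rough illustration of the IDM estimation described above, the following base R sketch computes the IDM probability intervals for a vector of class counts and the maximum-entropy distribution within the resulting credal set. The helper names idm_interval, idm_maxent and shannon are hypothetical and are not part of imptree; the package's own implementation (see probInterval) may differ in details such as the logarithm base or the entropy correction terms.

```r
## IDM probability intervals for class counts n and hyperparameter s:
## lower_i = n_i / (N + s), upper_i = (n_i + s) / (N + s), with N = sum(n)
idm_interval <- function(n, s = 1) {
  N <- sum(n)
  rbind(lower = n / (N + s), upper = (n + s) / (N + s))
}

## Shannon entropy (natural logarithm), treating 0 * log(0) as 0
shannon <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}

## Maximum-entropy distribution in the IDM credal set.
## Assumption: the extra mass s is spread over the smallest counts
## ("water-filling"), levelling them up as far as s allows.
idm_maxent <- function(n, s = 1) {
  ord <- order(n)
  x <- as.numeric(n)[ord]          # counts in ascending order
  K <- length(x)
  rest <- s                        # mass still to be distributed
  for (k in seq_len(K)) {
    ## mass needed to lift the k smallest counts to the next level
    need <- if (k < K) k * (x[k + 1] - x[k]) else Inf
    if (need >= rest) {
      x[1:k] <- x[k] + rest / k    # share the remainder equally
      break
    }
    x[1:k] <- x[k + 1]
    rest <- rest - need
  }
  out <- numeric(K)
  out[ord] <- x                    # restore the original order
  names(out) <- names(n)
  out / (sum(n) + s)
}

## usage with illustrative counts for three classes
counts <- c(4, 1, 1)
iv <- idm_interval(counts, s = 1)
p_max <- idm_maxent(counts, s = 1)
h_max <- shannon(p_max)
```

With s at most 1 and integer counts, the extra mass only ever reaches the classes with the minimal count, so the loop stops as soon as those classes have been levelled.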
References
Abellán, J. and Moral, S. (2005), Upper entropy of credal sets. Applications to credal classification, International Journal of Approximate Reasoning 39, pp. 235–255.
Baker, R. M. (2010), Multinomial Nonparametric Predictive Inference: Selection, Classification and Subcategory Data, PhD thesis. Durham University, GB.
Strobl, C. (2005), Variable Selection in Classification Trees Based on Imprecise Probabilities, ISIPTA '05: Proceedings of the Fourth International Symposium on Imprecise Probabilities and Their Applications, pp. 339–348.
Fink, P. and Crossman, R.J. (2013), Entropy based classification trees, ISIPTA '13: Proceedings of the Eighth International Symposium on Imprecise Probability: Theories and Applications, pp. 139–147.
See Also
imptree for tree creation, probInterval for the credal set and entropy estimation functionality
Examples
library(imptree)
data("carEvaluation")
## grow a full-size tree with IDM (s = 1) on carEvaluation,
## leaving out the first 10 observations
ip <- imptree(acceptance ~ ., data = carEvaluation[-(1:10), ],
  method = "IDM", method.param = list(splitmetric = "globalmax", s = 1),
  control = list(depth = NULL, minbucket = 1))
## summarize the tree and show performance on training data
summary(ip)
## predict the first 10 observations
## Note: The result of the prediction is returned invisibly
pp <- predict(ip, dominance = "max", data = carEvaluation[1:10, ])
## print the general evaluation statistics
print(pp)
## display the predicted class labels
pp$classes