mlRpart {mlearning} | R Documentation |
Supervised classification and regression using recursive partitioning
Description
Unified (formula-based) interface version of the recursive partitioning
algorithm as implemented in rpart::rpart().
Usage
mlRpart(train, ...)
ml_rpart(train, ...)
## S3 method for class 'formula'
mlRpart(formula, data, ..., subset, na.action)
## Default S3 method:
mlRpart(train, response, ..., .args. = NULL)
## S3 method for class 'mlRpart'
predict(
object,
newdata,
type = c("class", "membership", "both"),
method = c("direct", "cv"),
...
)
Arguments
train |
a matrix or data frame with predictors. |
... |
further arguments passed to |
formula |
a formula with the left term being the factor variable to predict
(for supervised classification) or a vector of numbers (for regression), and
the right term listing the independent, predictive variables, separated with
a plus sign. If the data frame provided contains only the dependent and
independent variables, one can use the |
data |
a data.frame to use as a training set. |
subset |
index vector with the cases to define the training set in use (this argument must be named, if provided). |
na.action |
function to specify the action to be taken if |
response |
a factor vector (classification) or a numeric vector (regression). |
.args. |
used internally, do not provide anything here. |
object |
an mlRpart object |
newdata |
a new dataset with the same conformation as the training set (same variables, except possibly the class for classification or the dependent variable for regression). Usually a test set, or a new dataset to be predicted. |
type |
the type of prediction to return. |
method |
|
Value
ml_rpart()/mlRpart() creates an mlRpart, mlearning object
containing the classifier and a lot of additional metadata used by the
functions and methods you can apply to it like predict() or cvpredict().
In case you want to program new functions or extract
specific components, inspect the "unclassed" object using unclass().
See Also
mlearning(), cvpredict(), confusion(), also rpart::rpart()
that actually does the classification.
Examples
# Prepare data: split into training set (2/3) and test set (1/3)
data("iris", package = "datasets")
train <- c(1:34, 51:83, 101:133)
iris_train <- iris[train, ]
iris_test <- iris[-train, ]
# One case with missing data in train set, and another case in test set
iris_train[1, 1] <- NA
iris_test[25, 2] <- NA
iris_rpart <- ml_rpart(data = iris_train, Species ~ .)
summary(iris_rpart)
# Plot the decision tree for this classifier
plot(iris_rpart, margin = 0.03, uniform = TRUE)
text(iris_rpart, use.n = FALSE)
# Predictions
predict(iris_rpart) # Default type is class
predict(iris_rpart, type = "membership")
predict(iris_rpart, type = "both")
# Self-consistency, do not use for assessing classifier performance!
confusion(iris_rpart)
# Cross-validation prediction is a good choice when there is no test set
predict(iris_rpart, method = "cv") # Idem: cvpredict(iris_rpart)
confusion(iris_rpart, method = "cv")
# Evaluation of performance using a separate test set
confusion(predict(iris_rpart, newdata = iris_test), iris_test$Species)
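# A minimal sketch of the default (non-formula) interface listed in Usage:
# predictors are passed in `train` and the factor to predict in `response`.
# Assumptions: this reuses iris_train and iris_test from above, is expected to
# behave like the formula call, and iris_rpart2 is only an illustrative name.
iris_rpart2 <- ml_rpart(train = iris_train[, -5], response = iris_train$Species)
# Predict on the test set predictors only (Species column dropped)
predict(iris_rpart2, newdata = iris_test[, -5])
# Look at the component names of the "unclassed" object, as suggested in Value
names(unclass(iris_rpart2))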