PredictionClassif {mlr3}    R Documentation

Prediction Object for Classification

Description

This object wraps the predictions returned by a learner of class LearnerClassif, i.e. the predicted response and class probabilities.

If the response is not provided during construction, but class probabilities are, the response is calculated from the probabilities: the class label with the highest probability is chosen. In case of ties, a label is selected randomly.
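As a minimal sketch of this rule in base R, the snippet below derives class labels from a probability matrix with max.col(), the function named in the description of the prob argument. The matrix and the class labels "a", "b" and "c" are made up for illustration.

```r
# Hypothetical probability matrix: one row per observation, one column per class.
prob = matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.3, 0.6,
    0.4, 0.4, 0.2),  # row 3 has a tie between "a" and "b"
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("a", "b", "c"))
)
lvls = colnames(prob)

# Column index of the row-wise maximum; ties are broken at random.
idx = max.col(prob, ties.method = "random")
response = factor(lvls[idx], levels = lvls)

# Rows 1 and 2 are unambiguous; row 3 is either "a" or "b".
response
```

Because of the tie in the third row, repeated runs may return different labels for that observation.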

Thresholding

If probabilities are stored, the threshold that determines the predicted class label can be changed. By default, the label of the class with the highest predicted probability is selected; for binary classification problems, this corresponds to a threshold of 0.5. For cost-sensitive or imbalanced classification problems, manually adjusting the threshold can improve predictive performance.
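The effect of per-class thresholds can be sketched in base R. The rule used here, dividing each probability by its class threshold and picking the largest ratio, is an assumption made for illustration and is not taken from this page. Under that rule, a two-class problem with thresholds t and 1 - t reduces to comparing the positive-class probability with t. The probabilities, labels and thresholds are made up.

```r
# Hypothetical probabilities for a three-class problem.
prob = matrix(
  c(0.50, 0.45, 0.05,
    0.20, 0.70, 0.10),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("a", "b", "c"))
)
lvls = colnames(prob)

# Without thresholds: plain arg-max over each row.
default_response = lvls[max.col(prob, ties.method = "first")]

# Assumed rule: compare prob / threshold across classes.
# A small threshold makes a class easier to predict, a large one harder.
th = c(a = 0.05, b = 0.90, c = 0.05)
ratio = sweep(prob, 2, th[lvls], "/")
thresholded_response = lvls[max.col(ratio, ties.method = "first")]

default_response
thresholded_response
```

With these numbers, the second observation switches from "b" to "a" because the high threshold on "b" outweighs its larger raw probability.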

Super class

mlr3::Prediction -> PredictionClassif

Active bindings

response

(factor())
Access to the stored predicted class labels.

prob

(matrix())
Access to the stored probabilities.

confusion

(matrix())
Confusion matrix resulting from the comparison of truth and response. Truth is in columns, predicted response is in rows.
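The layout of the confusion matrix can be reproduced with base R's table(). This is only a sketch of the orientation (response in rows, truth in columns) on made-up labels, not a claim about how the binding is computed internally.

```r
# Hypothetical true and predicted labels sharing the same factor levels.
lvls = c("a", "b")
truth    = factor(c("a", "a", "b", "b", "b"), levels = lvls)
response = factor(c("a", "b", "b", "b", "a"), levels = lvls)

# The first argument of table() indexes rows, the second indexes columns.
confusion = table(response = response, truth = truth)
confusion

# Share of correct predictions: the diagonal over the total count.
accuracy = sum(diag(confusion)) / sum(confusion)
```

Reading down a column shows how the observations of one true class were distributed over the predicted classes.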

Methods

Public methods

Method new()

Creates a new instance of this R6 class.

Usage
PredictionClassif$new(
  task = NULL,
  row_ids = task$row_ids,
  truth = task$truth(),
  response = NULL,
  prob = NULL,
  check = TRUE
)
Arguments
task

(TaskClassif)
Task, used to extract defaults for row_ids and truth.

row_ids

(integer())
Row ids of the predicted observations, i.e. the row ids of the test set.

truth

(factor())
True (observed) labels. See the note on manual construction.

response

(character() | factor())
Vector of predicted class labels. One element for each observation in the test set. Character vectors are automatically converted to factors. See the note on manual construction.

prob

(matrix())
Numeric matrix of posterior class probabilities with one column for each class and one row for each observation in the test set. Columns must be named with class labels, row names are automatically removed. If prob is provided, but response is not, the class labels are calculated from the probabilities using max.col() with ties.method set to "random".

check

(logical(1))
If TRUE, performs some argument checks and predict type conversions.
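A minimal sketch of manual construction without a task, assuming mlr3 is installed. The row ids, labels and probabilities are made up. Because only prob is supplied, the response is derived from the probabilities as described for the prob argument.

```r
library(mlr3)

# Hypothetical binary problem; the positive class is the first level.
lvls = c("pos", "neg")
truth = factor(c("pos", "neg", "pos"), levels = lvls)

# One row per observation, one named column per class.
prob = matrix(
  c(0.8, 0.2,
    0.3, 0.7,
    0.6, 0.4),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, lvls)
)

# row_ids and truth are given explicitly, so no task is needed.
p = PredictionClassif$new(row_ids = 1:3, truth = truth, prob = prob)
p$response
p$prob
```

The probabilities here contain no ties, so the derived response does not depend on random tie breaking.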


Method set_threshold()

Sets the prediction response based on the provided threshold. See the section on thresholding for more information.

Usage
PredictionClassif$set_threshold(threshold, ties_method = "random")
Arguments
threshold

(numeric()).

ties_method

(character(1))
One of "random", "first" or "last" (cf. max.col()) to determine how to deal with tied probabilities.

Returns

Returns the object itself, but modified by reference. You need to explicitly $clone() the object beforehand if you want to keep the object in its previous state.
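The following sketch, assuming mlr3 is installed, illustrates the by-reference behaviour on a small, manually constructed binary prediction. All values are made up, and the object name p_old is only illustrative.

```r
library(mlr3)

# Hypothetical binary prediction; the positive class "pos" is the first level.
lvls = c("pos", "neg")
p = PredictionClassif$new(
  row_ids = 1:2,
  truth = factor(c("pos", "neg"), levels = lvls),
  prob = matrix(
    c(0.6, 0.4,
      0.2, 0.8),
    ncol = 2, byrow = TRUE,
    dimnames = list(NULL, lvls)
  )
)

# Keep a copy of the current state before modifying p in place.
p_old = p$clone()

# Raise the threshold for the positive class; p itself is changed.
p$set_threshold(0.7)

p_old$response  # state before thresholding
p$response      # state after thresholding
```

Without the clone, the original response would no longer be available after the call to set_threshold().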


Method clone()

The objects of this class are cloneable with this method.

Usage
PredictionClassif$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Note

If this object is constructed manually, make sure that the factor truth has the same levels as the task, in the same order. For binary classification tasks, the positive class label must be the first level.
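The level order matters because factor() sorts levels alphabetically unless told otherwise. A small base R sketch with made-up labels follows; class_names stands in for the task's class labels with the positive class first.

```r
# Hypothetical class labels in the task's level order, positive class first.
class_names = c("yes", "no")
observed = c("no", "yes", "no", "no")

# Without `levels`, factor() sorts alphabetically, so "no" becomes the first level.
wrong = factor(observed)

# Passing the levels explicitly preserves the intended order.
truth = factor(observed, levels = class_names)

levels(wrong)
levels(truth)
```

Only the second factor has the positive class "yes" as its first level.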

See Also

Other Prediction: Prediction, PredictionRegr

Examples

library(mlr3)

# train a probabilistic classifier and predict on the training task
task = tsk("penguins")
learner = lrn("classif.rpart", predict_type = "prob")
learner$train(task)
p = learner$predict(task)
p$predict_types
head(as.data.table(p))

# confusion matrix
p$confusion

# change threshold: one named value per class
th = c(0.05, 0.9, 0.05)
names(th) = task$class_names

# new predictions (p is modified by reference)
p$set_threshold(th)$response
p$score(measures = msr("classif.ce"))

[Package mlr3 version 0.20.2 Index]