| Predictor {iml} | R Documentation |
Predictor object
Description
A Predictor object holds any machine learning model (mlr, caret,
randomForest, ...) and the data to be used for analyzing the model. The
interpretation methods in the iml package need the machine learning model
to be wrapped in a Predictor object.
Details
A Predictor object is a container for the prediction model and the data. This ensures that the machine learning model can be analyzed in a robust way.
Note: For classification, the model should return one column per class, containing the class probabilities.
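As an illustration of the note above, the following minimal sketch shows a prediction with one probability column per class. It assumes the rpart package and the built-in iris data, neither of which is prescribed by this page. The Predictor call is guarded because it additionally assumes that iml is installed.

```r
library("rpart")

# Fit a small classification tree on the built-in iris data.
fit <- rpart(Species ~ ., data = iris)

# type = "prob" yields a matrix with one probability column per class.
probs <- as.data.frame(predict(fit, newdata = iris[1:5, ], type = "prob"))
colnames(probs)

# Assumption: iml is installed. The same type is forwarded to predict().
if (requireNamespace("iml", quietly = TRUE)) {
  mod <- iml::Predictor$new(fit, data = iris, y = "Species", type = "prob")
  print(mod$predict(iris[1:5, ]))
}
```

Each row of the result holds the probabilities for one observation, so the row sums are 1.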
Public fields
data (data.frame)
Data object with the data for the model interpretation.

model (any)
The machine learning model.

batch.size (numeric(1))
The number of rows passed to the model for prediction at once.

class (character(1))
The class column to be returned.

prediction.colnames (character)
The column names of the predictions.

prediction.function (function)
The function to predict newdata.

task (character(1))
The inferred prediction task: "classification" or "regression".
Methods
Public methods
Method new()
Create a Predictor object
Usage
Predictor$new( model = NULL, data = NULL, predict.function = NULL, y = NULL, class = NULL, type = NULL, batch.size = 1000 )
Arguments
model (any)
The machine learning model. Models from mlr and caret are recommended. Other models with an S3 predict function (e.g. randomForest) work as well, but less robustly.

data (data.frame)
The data to be used for analyzing the prediction model. Allowed column classes are numeric, factor, integer, ordered and character. For some models the data can be extracted automatically. Predictor$new() throws an error when it can't extract the data automatically.

predict.function (function)
The function to predict newdata. Only needed if model is not a model from the mlr or caret package. The first argument of predict.function has to be the model, the second the newdata: function(model, newdata)

y (character(1) | numeric | factor)
The target vector or (preferably) the name of the target column in the data argument. Predictor tries to infer the target automatically from the model.

class (character(1))
The class column to be returned. You should use the column name of the predicted class, e.g. class = "setosa".

type (character(1))
This argument is passed to the prediction function of the model. For regression models you usually don't have to provide the type argument. The classic use case is to set type = "prob" for classification models. Consult the documentation of the machine learning package you use to find out which type options are available. If both predict.function and type are used, then type is passed as an argument to predict.function.

batch.size (numeric(1))
The maximum number of rows passed to the model for prediction at once. Currently only respected for FeatureImp, Partial and Interaction.
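The function(model, newdata) contract for predict.function can be sketched as follows. This is an illustrative example only: the linear model on mtcars and the wrapper name predict_lm are assumptions, and the Predictor call is guarded because it assumes that iml is installed.

```r
# Fit a plain linear model on the built-in mtcars data.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Hypothetical wrapper: first argument is the model, second is the new data.
predict_lm <- function(model, newdata) {
  predict(model, newdata = newdata)
}

# The wrapper can be called directly, independent of iml.
preds <- predict_lm(fit, mtcars[1:3, ])

# Assumption: iml is installed. y names the target column in data.
if (requireNamespace("iml", quietly = TRUE)) {
  mod <- iml::Predictor$new(
    model = fit, data = mtcars, y = "mpg",
    predict.function = predict_lm, batch.size = 500
  )
  print(mod$predict(mtcars[1:3, ]))
}
```

Passing the name of the target column as y, rather than the vector itself, follows the preference stated for the y argument.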
Method predict()
Predict new data with the machine learning model.
Usage
Predictor$predict(newdata)
Arguments
newdata (data.frame)
Data to predict on.
Method print()
Print the Predictor object.
Usage
Predictor$print()
Method clone()
The objects of this class are cloneable with this method.
Usage
Predictor$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
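The effect of the deep argument can be illustrated with a small sketch. It assumes that Predictor follows the standard cloning semantics of the R6 package; the classes Inner and Outer are hypothetical stand-ins and not part of iml.

```r
library("R6")

# Hypothetical classes: Outer holds another R6 object in a field.
Inner <- R6Class("Inner", public = list(value = 1))
Outer <- R6Class("Outer", public = list(
  inner = NULL,
  initialize = function() {
    self$inner <- Inner$new()
  }
))

a <- Outer$new()
shallow <- a$clone()          # nested R6 object is shared with a
deep <- a$clone(deep = TRUE)  # nested R6 object is copied

# Modify the original's nested object after cloning.
a$inner$value <- 2

shallow$inner$value  # 2: the shallow clone sees the change
deep$inner$value     # 1: the deep clone is unaffected
```

A deep clone is the safer choice when the copy should be modified without affecting the original object.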
Examples
library("mlr")
task <- makeClassifTask(data = iris, target = "Species")
learner <- makeLearner("classif.rpart", minsplit = 7, predict.type = "prob")
mod.mlr <- train(learner, task)
mod <- Predictor$new(mod.mlr, data = iris)
mod$predict(iris[1:5, ])
mod <- Predictor$new(mod.mlr, data = iris, class = "setosa")
mod$predict(iris[1:5, ])
library("randomForest")
rf <- randomForest(Species ~ ., data = iris, ntree = 20)
mod <- Predictor$new(rf, data = iris, type = "prob")
mod$predict(iris[50:55, ])
# Feature importance needs the target vector, which needs to be supplied:
mod <- Predictor$new(rf, data = iris, y = "Species", type = "prob")