check_drift {drifter} | R Documentation |
This function executes all tests for drift between two datasets / models
Description
Three checks are currently implemented: covariate drift, residual drift and model drift.
Usage
check_drift(model_old, model_new, data_old, data_new, y_old, y_new,
predict_function = predict, max_obs = 100, bins = 20,
scale = sd(y_new, na.rm = TRUE))
Arguments
model_old |
model created on historical / 'old' data |
model_new |
model created on current / 'new' data |
data_old |
data frame with historical / 'old' data |
data_new |
data frame with current / 'new' data |
y_old |
true values of target variable for historical / 'old' data |
y_new |
true values of target variable for current / 'new' data |
predict_function |
function that takes two arguments, a model and new data, and returns a numeric vector of predictions; by default it's 'predict' |
max_obs |
if negative, then all observations are used for calculation of PDP; if positive, then only 'max_obs' observations are used for calculation of PDP |
bins |
continuous variables are discretized into 'bins' intervals of equal size |
scale |
scale parameter for calculation of scaled drift |
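
The 'bins' argument can be illustrated with a small base R sketch. This is not drifter's internal code, and the page does not say whether "equal size" means equal width or equal number of observations, so both readings are shown. The vectors 'x_old' and 'x_new' are hypothetical stand-ins for one continuous covariate.

```r
# Illustrative only: a sketch of discretizing one continuous covariate,
# not the implementation used inside drifter.
set.seed(1)
x_old <- rnorm(500, mean = 0)    # hypothetical 'old' values
x_new <- rnorm(500, mean = 0.5)  # hypothetical 'new' values, shifted
bins <- 20

# Pool both samples so that old and new data share the same breakpoints
pooled <- c(x_old, x_new)
origin <- rep(c("old", "new"), times = c(length(x_old), length(x_new)))

# Reading A: 'bins' intervals of equal width
binned_width <- cut(pooled, breaks = bins)

# Reading B: 'bins' intervals holding roughly equal numbers of observations
breaks_freq <- unique(quantile(pooled, probs = (0:bins) / bins))
binned_freq <- cut(pooled, breaks = breaks_freq, include.lowest = TRUE)

# Share of old / new observations falling into each equal-width interval
counts_width <- table(origin, binned_width)
round(prop.table(counts_width, margin = 1), 3)
```

Comparing the two rows of the resulting table shows where the distribution of the covariate has shifted between the datasets.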
Value
This function is executed for its side effects: all checks are printed to the screen. Additionally, it returns a list with the particular checks.
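
Because a list is returned in addition to the printed output, the result can be stored and inspected later. A minimal sketch, assuming the DALEX and drifter packages are installed; the names and layout of the list elements are not documented on this page, so the sketch only inspects the top level rather than relying on specific element names.

```r
library("DALEX")
library("drifter")

model_old <- lm(m2.price ~ ., data = apartments)
model_new <- lm(m2.price ~ ., data = apartments_test[1:1000,])

# The checks are still printed; the returned list is kept in 'checks'
checks <- check_drift(model_old, model_new,
                      apartments, apartments_test,
                      apartments$m2.price, apartments_test$m2.price)

# Inspect the top-level structure without assuming element names
is.list(checks)
str(checks, max.level = 1)
```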
Examples
library("DALEX")
library("drifter")
# linear models fitted on the 'old' data and on the first 1000 rows of the 'new' data
model_old <- lm(m2.price ~ ., data = apartments)
model_new <- lm(m2.price ~ ., data = apartments_test[1:1000,])
check_drift(model_old, model_new,
            apartments, apartments_test,
            apartments$m2.price, apartments_test$m2.price)
library("ranger")
# predict() for a ranger model returns an object; the numeric predictions
# are stored in its 'predictions' element, hence the custom wrapper
predict_function <- function(m,x,...) predict(m, x, ...)$predictions
model_old <- ranger(m2.price ~ ., data = apartments)
model_new <- ranger(m2.price ~ ., data = apartments_test)
check_drift(model_old, model_new,
            apartments, apartments_test,
            apartments$m2.price, apartments_test$m2.price,
            predict_function = predict_function)
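
The ranger wrapper above follows the 'predict_function' contract: it takes a model and a data frame and returns a plain numeric vector. The same pattern may be useful for other model classes. Below is a minimal sketch for a logistic regression fitted with glm(); the name 'predict_glm' and the use of the built-in mtcars data are illustrative assumptions, not part of drifter.

```r
# Hypothetical wrapper: returns predicted probabilities as a numeric vector
predict_glm <- function(m, x, ...) {
  as.numeric(predict(m, newdata = x, type = "response", ...))
}

# Small logistic regression on a built-in dataset, used only to test the wrapper
fit <- glm(am ~ wt, data = mtcars, family = binomial())
p <- predict_glm(fit, mtcars)

# The wrapper should return one numeric prediction per row of the data
is.numeric(p)
length(p) == nrow(mtcars)
```

Checking a wrapper this way before passing it as 'predict_function' helps catch cases where predict() returns a list, matrix or factor instead of a numeric vector.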