pipeline {pipeliner} | R Documentation |
Build machine learning pipelines - functional API
Description
Building machine learning models often requires transforming the input and/or response variables before training (or fitting) the model, and inverting those transformations afterwards. For example, a model may need to be trained on the logarithm of the response and input variables. Consequently, fitting such a model and then generating predictions from it requires repeated application of transformation and inverse-transformation functions to get from the original input variables to the original output variables (via the model).
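To make the problem concrete, here is a minimal base R sketch of the manual workflow described above, using the log-transform example on the built-in faithful data set. It does not use pipeliner; the variable names (train, new_input, pred_log and so on) are illustrative assumptions rather than anything defined by the package.

```r
# Manual workflow: each transformation has to be re-applied by hand
# whenever the model is fitted or used for prediction.
data <- faithful

# Pre-transformation: take logs of both the input and the response.
train <- data.frame(
  log_waiting   = log(data$waiting),
  log_eruptions = log(data$eruptions)
)

# Fit the model on the transformed scale.
fit <- lm(log_eruptions ~ log_waiting, data = train)

# Scoring new data: repeat the input transformation ...
new_data  <- data.frame(waiting = c(50, 80))
new_input <- data.frame(log_waiting = log(new_data$waiting))
pred_log  <- predict(fit, newdata = new_input)

# ... then apply the inverse transformation to return to the original scale.
pred_eruptions <- exp(pred_log)
```

Every new batch of data needs the same transform, predict and inverse-transform steps in the same order, which is the repetition a pipeline is meant to remove.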
Usage
pipeline(.data, ...)
Arguments
.data |
A data.frame containing the input variables required to fit the pipeline. |
... |
Functions of class "ml_pipeline_section". |
Details
This function takes individual pipeline sections - functions with class "ml_pipeline_section" - together with the data required to estimate the inner models, and returns a machine learning pipeline capable of predicting (scoring) data end-to-end, without having to repeatedly apply variable (feature and response) transformations and their inverses.
Value
An "ml_pipeline" object containing the pipeline prediction function ml_pipeline$predict() and the estimated machine learning model nested within it, ml_pipeline$inner_model().
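As a rough illustration of the shape of such an object, the sketch below builds a hand-rolled equivalent in base R. It is not pipeliner's implementation: make_toy_pipeline and the "toy_pipeline" class are hypothetical names, and exposing inner_model as a function only mirrors the notation used above.

```r
# Hypothetical helper, NOT part of pipeliner: bundles the transformations and
# a fitted model into one object exposing predict() and inner_model().
make_toy_pipeline <- function(.data) {
  # Estimate the scaling parameters once, on the training data.
  x_mean <- mean(.data$waiting)
  x_sd   <- sd(.data$waiting)
  y_mean <- mean(.data$eruptions)
  y_sd   <- sd(.data$eruptions)

  transform_x <- function(df) data.frame(x1 = (df$waiting - x_mean) / x_sd)
  transform_y <- function(df) data.frame(y = (df$eruptions - y_mean) / y_sd)

  # Fit the inner model on the transformed variables.
  model <- lm(y ~ 1 + x1, data = cbind(transform_x(.data), transform_y(.data)))

  structure(
    list(
      # End-to-end scoring: transform inputs, predict, invert the response scaling.
      predict = function(new_data) {
        pred_model <- stats::predict(model, newdata = transform_x(new_data))
        data.frame(pred_eruptions = unname(pred_model) * y_sd + y_mean)
      },
      # Accessor for the fitted model nested inside the pipeline.
      inner_model = function() model
    ),
    class = "toy_pipeline"
  )
}

toy   <- make_toy_pipeline(faithful)
preds <- toy$predict(data.frame(waiting = c(50, 80)))
fit   <- toy$inner_model()
```

The caller passes data on the original scale and receives predictions on the original scale. The scaling parameters and the fitted model stay enclosed in the returned object.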
Examples
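library(pipeliner)

data <- faithful
lm_pipeline <-
  pipeline(
    data,
    # Standardise the input variable (waiting) to give the feature x1.
    transform_features(function(df) {
      data.frame(x1 = (df$waiting - mean(df$waiting)) / sd(df$waiting))
    }),
    # Standardise the response variable (eruptions) to give y.
    transform_response(function(df) {
      data.frame(y = (df$eruptions - mean(df$eruptions)) / sd(df$eruptions))
    }),
    # Fit a linear model of y on x1.
    estimate_model(function(df) {
      lm(y ~ 1 + x1, df)
    }),
    # Map the model output (pred_model) back to the original eruptions scale.
    inv_transform_response(function(df) {
      data.frame(pred_eruptions = df$pred_model * sd(df$eruptions) + mean(df$eruptions))
    })
  )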