R: Build machine learning pipelines

pipeline {pipeliner}

R Documentation

Build machine learning pipelines - functional API

Description

Building machine learning models often requires pre- and post-transformation of the input and/or response variables, prior to training (or fitting) the models. For example, a model may require training on the logarithm of the response and input variables. As a consequence, fitting these models and then generating predictions from them requires repeated application of transformation and inverse-transformation functions to go from the original input variables to the original output variables (via the model).
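To make the problem concrete, the following is a minimal base R sketch of the manual workflow described above, using the log-transform case on the `faithful` data set. It does not use pipeliner at all; the variable names (`train`, `new_data`, `pred_eruptions`) are illustrative assumptions, not part of the package.

```r
# Manual workflow without pipeliner: fit on log-transformed variables.
data <- faithful

# Transform the feature and the response before fitting.
train <- data.frame(
  x1 = log(data$waiting),
  y  = log(data$eruptions)
)
model <- lm(y ~ 1 + x1, data = train)

# Scoring new data: the feature transformation has to be repeated by hand ...
new_data <- data.frame(waiting = c(50, 80))
new_features <- data.frame(x1 = log(new_data$waiting))

# ... and the inverse of the response transformation applied to the predictions.
pred_log <- predict(model, newdata = new_features)
pred_eruptions <- exp(pred_log)
```

Every time new data is scored, both the forward and the inverse transformation must be applied again in the right order, which is the repetition a pipeline is meant to remove.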

Usage

pipeline(.data, ...)

Arguments

`.data`	A data.frame containing the input variables required to fit the pipeline.
`...`	Functions of class `"ml_pipeline_section"` - e.g. `transform_features()`, `transform_response()`, `inv_transform_response()` or `estimate_model()`.

Details

This function takes individual pipeline sections (functions with class "ml_pipeline_section") together with the data required to estimate the inner models, and returns a machine learning pipeline capable of predicting (scoring) data end-to-end, without having to repeatedly apply the variable (feature and response) transformations and their inverses.
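The idea of bundling the sections into a single prediction function can be sketched in base R with closures. This is only an illustration of the concept under stated assumptions: `make_pipeline_sketch()` and its argument names are hypothetical, it is not how pipeliner is implemented, and it uses stateless log transformations to keep the example short.

```r
# Hypothetical helper, NOT part of pipeliner: composes the four steps into one
# prediction function using closures.
make_pipeline_sketch <- function(data, features_fn, response_fn, model_fn,
                                 inv_response_fn) {
  # Estimate the inner model once, on the transformed training data.
  train <- cbind(features_fn(data), response_fn(data))
  inner_model <- model_fn(train)

  predict_fn <- function(new_data) {
    # Re-apply the feature transformation, score with the inner model,
    # then invert the response transformation.
    pred_model <- predict(inner_model, newdata = features_fn(new_data))
    inv_response_fn(cbind(new_data, pred_model = pred_model))
  }

  list(predict = predict_fn, inner_model = inner_model)
}

sketch <- make_pipeline_sketch(
  faithful,
  features_fn     = function(df) data.frame(x1 = log(df$waiting)),
  response_fn     = function(df) data.frame(y = log(df$eruptions)),
  model_fn        = function(df) lm(y ~ 1 + x1, data = df),
  inv_response_fn = function(df) data.frame(pred_eruptions = exp(df$pred_model))
)

# Score new data end-to-end with a single call.
sketch_predictions <- sketch$predict(data.frame(waiting = c(50, 80)))
```

The caller supplies untransformed data and receives predictions on the original response scale; the transformations are applied inside the closure.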

Value

An "ml_pipeline" object containing the pipeline prediction function, ml_pipeline$predict(), and the estimated machine learning model nested within it, ml_pipeline$inner_model().

Examples

library(pipeliner)

data <- faithful

lm_pipeline <-
  pipeline(
    data,
    # standardise the input variable
    transform_features(function(df) {
      data.frame(x1 = (df$waiting - mean(df$waiting)) / sd(df$waiting))
    }),

    # standardise the response variable
    transform_response(function(df) {
      data.frame(y = (df$eruptions - mean(df$eruptions)) / sd(df$eruptions))
    }),

    # fit a linear model to the transformed variables
    estimate_model(function(df) {
      lm(y ~ 1 + x1, df)
    }),

    # map the model predictions (pred_model) back to the original response scale
    inv_transform_response(function(df) {
      data.frame(pred_eruptions = df$pred_model * sd(df$eruptions) + mean(df$eruptions))
    })
  )

[Package pipeliner version 0.1.1 Index]