R: Chain multiple transformations

Pipeline {rearrr}

R Documentation

Chain multiple transformations

Description

Build a pipeline of transformations to be applied sequentially.

Uses the same arguments for all groups in `data`.

Groupings are reset between transformations. See `group_cols`.

Standard workflow: Instantiate pipeline -> Add transformations -> Apply to data

To apply different argument values to each group, see GeneratedPipeline, which generates argument values for an arbitrary number of groups, and FixedGroupsPipeline, which takes specific values for a fixed set of groups.

Public fields

transformations: List of transformations to apply.
names: Names of the transformations.

Methods

Method `add_transformation()`

Add a transformation to the pipeline.

Usage

Pipeline$add_transformation(fn, args, name, group_cols = NULL)

Arguments

fn

Function that performs the transformation.

args

Named list with arguments for the `fn` function.

name

Name of the transformation step. Must be unique.

group_cols

Names of the columns to group the input data by before applying the transformation.

Note that the transformation function is applied separately to each group (subset). If the `fn` function requires access to the entire data.frame, the grouping columns should be specified as part of `args` and handled by the `fn` function.
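
The difference between per-group and whole-data application can be illustrated in base R. This is a minimal sketch of the concept only, not rearrr's implementation; the `center()` function is hypothetical and exists purely for illustration.

```r
# Toy data resembling the data frame in the Examples section
df <- data.frame(
  "A" = c(1:4, 9:12, 15:18),
  "G" = rep(1:3, each = 4)
)

# Hypothetical transformation: subtract the mean of `col`
center <- function(data, col) {
  data[[col]] <- data[[col]] - mean(data[[col]])
  data
}

# Per-group application (conceptually what `group_cols = "G"` asks for):
# each subset is centered on its own mean
per_group <- do.call(
  rbind,
  lapply(split(df, df[["G"]]), center, col = "A")
)

# Whole-data application: every row is centered on the overall mean
whole <- center(df, col = "A")

per_group[["A"]]
whole[["A"]]
```

A function that needs information from all groups at once (such as the overall mean here) cannot recover it from a single subset, which is why such functions should receive the grouping columns through `args` instead.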

Returns

The pipeline, to allow chaining of methods.
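
Because the method returns the pipeline, calls can be chained. The sketch below reuses the two transformations from the Examples section and assumes the rearrr package is installed; it only builds the pipeline and does not apply it.

```r
library(rearrr)

pipe <- Pipeline$new()

# Chain two `add_transformation()` calls on the same pipeline
pipe$add_transformation(
  fn = rotate_2d,
  args = list(
    x_col = "Index",
    y_col = "A",
    origin = c(0, 0),
    degrees = 45,
    suffix = "",
    overwrite = TRUE
  ),
  name = "Rotate",
  group_cols = "G"
)$add_transformation(
  fn = cluster_groups,
  args = list(
    cols = c("Index", "A"),
    suffix = "",
    overwrite = TRUE,
    multiplier = 0.05,
    group_cols = "G"
  ),
  name = "Cluster"
)

# Inspect the registered step names
pipe$names
```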

Method `apply()`

Apply the pipeline to a data.frame.

Usage

Pipeline$apply(data, verbose = FALSE)

Arguments

data

data.frame.

A grouped data.frame will raise a warning and the grouping will be ignored. Use the `group_cols` argument in the `add_transformation` method to specify how `data` should be grouped for each transformation.

verbose

Whether to print the progress.

Returns

Transformed version of `data`.

Method `print()`

Print an overview of the pipeline.

Usage

Pipeline$print(...)

Arguments

...: Further arguments passed to or from other methods.

Returns

The pipeline, to allow chaining of methods.

Method `clone()`

The objects of this class are cloneable with this method.

Usage

Pipeline$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.
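
One possible use of `clone()` is to derive a variant of an existing pipeline without adding steps to the original. The following is a sketch assuming the rearrr package is installed and that the `transformations` field is an ordinary list, as described under Public fields; the variable names are illustrative.

```r
library(rearrr)

# Base pipeline with a single rotation step
base_pipe <- Pipeline$new()
base_pipe$add_transformation(
  fn = rotate_2d,
  args = list(
    x_col = "Index",
    y_col = "A",
    origin = c(0, 0),
    degrees = 45,
    suffix = "",
    overwrite = TRUE
  ),
  name = "Rotate",
  group_cols = "G"
)

# Clone the pipeline and add a second step to the clone only
variant <- base_pipe$clone()
variant$add_transformation(
  fn = cluster_groups,
  args = list(
    cols = c("Index", "A"),
    suffix = "",
    overwrite = TRUE,
    multiplier = 0.05,
    group_cols = "G"
  ),
  name = "Cluster"
)

# Compare the number of steps in each pipeline
length(base_pipe$transformations)
length(variant$transformations)
```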

Author(s)

Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk

Examples

# Attach package
library(rearrr)

# Create a data frame
df <- data.frame(
  "Index" = 1:12,
  "A" = c(1:4, 9:12, 15:18),
  "G" = rep(1:3, each = 4)
)

# Create new pipeline
pipe <- Pipeline$new()

# Add 2D rotation transformation
# Note that we specify the grouping via `group_cols`
pipe$add_transformation(
  fn = rotate_2d,
  args = list(
    x_col = "Index",
    y_col = "A",
    origin = c(0, 0),
    degrees = 45,
    suffix = "",
    overwrite = TRUE
  ),
  name = "Rotate",
  group_cols = "G"
)

# Add the `cluster_groups` transformation
# Note that this function requires the entire input data
# to properly scale the groups. We therefore specify `group_cols`
# as part of `args`. This works as `cluster_groups()` accepts that
# argument.
pipe$add_transformation(
  fn = cluster_groups,
  args = list(
    cols = c("Index", "A"),
    suffix = "",
    overwrite = TRUE,
    multiplier = 0.05,
    group_cols = "G"
  ),
  name = "Cluster"
)

# Check pipeline object
pipe

# Apply pipeline to data.frame
# Enable `verbose` to print progress
pipe$apply(df, verbose = TRUE)

[Package rearrr version 0.3.4 Index]
