GeneratedPipeline {rearrr}    R Documentation

Chain multiple transformations and generate argument values per group

Description

[Experimental]

Build a pipeline of transformations to be applied sequentially.

Generate values for selected arguments with a given set of generators, e.g. to randomly generate argument values for each group in a data.frame.

Groupings are reset between transformations. See `group_cols`.

Standard workflow: Instantiate pipeline -> Add transformations -> Apply to data

To apply the same arguments to all groups, see Pipeline.

To apply different but specified argument values to a fixed set of groups, see FixedGroupsPipeline.

Super class

rearrr::Pipeline -> GeneratedPipeline

Public fields

transformations

List of transformations to apply.

names

Names of the transformations.

Methods

Public methods

Inherited methods

rearrr::Pipeline$apply()

Method add_transformation()

Add a transformation to the pipeline.

Usage
GeneratedPipeline$add_transformation(
  fn,
  args,
  generators,
  name,
  group_cols = NULL
)
Arguments
fn

Function that performs the transformation.

args

Named list with arguments for the `fn` function.

generators

Named list of functions for generating argument values for a single call of `fn`.

It is possible to include an apply generator for deciding whether the transformation should be applied to the current group or not. This is done by adding a function with the name `.apply` to the `generators` list. E.g. ".apply" = function(){sample(c(TRUE, FALSE), 1)}.

name

Name of the transformation step. Must be unique.

group_cols

Names of the columns to group the input data by before applying the transformation.

Note that the transformation function is applied separately to each group (subset). If the `fn` function requires access to the entire data.frame, the grouping columns should be specified as part of `args` and handled by the `fn` function.
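
The difference between per-group application and whole-data application can be illustrated with a minimal base R sketch. It does not use rearrr and is not how the package is implemented; `center_a()` is a hypothetical transformation introduced only for this illustration.

```r
# Toy data with three groups
df <- data.frame(
  A = c(1:4, 9:12, 15:18),
  G = rep(1:3, each = 4)
)

# Hypothetical transformation: center column `A` around its mean
center_a <- function(data) {
  data$A <- data$A - mean(data$A)
  data
}

# Applied per group: each call only sees the rows of one subset,
# so every group is centered around its own mean
per_group <- do.call(rbind, lapply(split(df, df$G), center_a))

# Applied to the whole data frame: the overall mean is used,
# so the groups keep their positions relative to each other
whole <- center_a(df)

# Compare the group means after each approach
tapply(per_group$A, per_group$G, mean)
tapply(whole$A, whole$G, mean)
```

A function that needs information across groups, such as the overall mean here, cannot recover it when it only receives one subset at a time. That is the case where the grouping columns are better passed via `args`.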

Returns

The pipeline, to allow chaining of methods.

Examples
# `generators` is a list of functions for generating
# argument values for a chosen set of arguments
# `.apply` can be used to disable the transformation
generators <- list(degrees = function(){sample.int(360, 1)},
                   origin = function(){rnorm(2)},
                   .apply = function(){sample(c(TRUE, FALSE), 1)})
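
To make the role of the generators more concrete, the following base R sketch calls each generator once per group and separates the `.apply` value from the generated argument values. It is an illustration under assumptions, not the package's implementation: `draw_for_group()` is a hypothetical helper, and treating a missing `.apply` generator as "always apply" is an assumption made here.

```r
set.seed(1)

# Generators as in the example above
generators <- list(
  degrees = function() sample.int(360, 1),
  origin = function() rnorm(2),
  .apply = function() sample(c(TRUE, FALSE), 1)
)

# Hypothetical helper (not part of rearrr):
# call every generator once and split the result into
# the apply decision and the generated argument values
draw_for_group <- function(generators) {
  generated <- lapply(generators, function(gen) gen())
  apply_step <- if (".apply" %in% names(generated)) {
    isTRUE(generated[[".apply"]])
  } else {
    TRUE # Assumption: apply the step when no `.apply` generator is given
  }
  list(
    apply = apply_step,
    args = generated[names(generated) != ".apply"]
  )
}

# One independent draw per group (three groups here)
draws <- lapply(1:3, function(i) draw_for_group(generators))

# Inspect the draw for the first group
str(draws[[1]])
```

Because each group gets its own draw, two groups will generally receive different `degrees` and `origin` values, and some groups may be skipped entirely.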

Method print()

Print an overview of the pipeline.

Usage
GeneratedPipeline$print(...)
Arguments
...

Further arguments passed to or from other methods.

Returns

The pipeline, to allow chaining of methods.


Method clone()

The objects of this class are cloneable with this method.

Usage
GeneratedPipeline$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Author(s)

Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk

See Also

Other pipelines: FixedGroupsPipeline, Pipeline

Examples

# Attach package
library(rearrr)

# Create a data frame
df <- data.frame(
  "Index" = 1:12,
  "A" = c(1:4, 9:12, 15:18),
  "G" = rep(1:3, each = 4)
)

# Create new pipeline
pipe <- GeneratedPipeline$new()

# Add 2D rotation transformation
# Note that we specify the grouping via `group_cols`
pipe$add_transformation(
  fn = rotate_2d,
  args = list(
    x_col = "Index",
    y_col = "A",
    suffix = "",
    overwrite = TRUE
  ),
  generators = list(degrees = function(){sample.int(360, 1)},
                    origin = function(){rnorm(2)}),
  name = "Rotate",
  group_cols = "G"
)

# Add the `cluster_groups` transformation
# Note that this function requires the entire input data
# to properly scale the groups. We therefore specify `group_cols`
# as part of `args`. This works as `cluster_groups()` accepts that
# argument.
# Also note the `.apply` generator which generates a TRUE/FALSE scalar
# for whether the transformation should be applied to the current group
pipe$add_transformation(
  fn = cluster_groups,
  args = list(
    cols = c("Index", "A"),
    suffix = "",
    overwrite = TRUE,
    group_cols = "G"
  ),
  generators = list(
    multiplier = function() {
      0.1 * runif(1) * 3 ^ sample.int(5, 1)
    },
    .apply = function(){sample(c(TRUE, FALSE), 1)}
  ),
  name = "Cluster"
)

# Check pipeline object
pipe

# Apply pipeline to data.frame
# Enable `verbose` to print progress
pipe$apply(df, verbose = TRUE)



[Package rearrr version 0.3.4 Index]