GeneratedPipeline {rearrr}

R Documentation

Chain multiple transformations and generate argument values per group

Description

Build a pipeline of transformations to be applied sequentially.

Generate argument values for selected arguments with a given set of generators. E.g. randomly generate argument values for each group in a data.frame.

Groupings are reset between each transformation. See group_cols.

Standard workflow: Instantiate pipeline -> Add transformations -> Apply to data

To apply the same arguments to all groups, see Pipeline.

To apply different but specified argument values to a fixed set of groups, see FixedGroupsPipeline.

Super class

rearrr::Pipeline -> GeneratedPipeline

Public fields

transformations: List of transformations to apply.
names: Names of the transformations.

Methods

Inherited methods

rearrr::Pipeline$apply()

Method `add_transformation()`

Add a transformation to the pipeline.

Usage

GeneratedPipeline$add_transformation(
  fn,
  args,
  generators,
  name,
  group_cols = NULL
)

Arguments

fn

Function that performs the transformation.

args

Named list with arguments for the `fn` function.

generators

Named list of functions for generating argument values for a single call of `fn`.

An apply generator can be included to decide whether the transformation should be applied to the current group. This is done by adding a function named `.apply` to the `generators` list, e.g. `".apply" = function(){sample(c(TRUE, FALSE), 1)}`.

name

Name of the transformation step. Must be unique.

group_cols

Names of the columns to group the input data by before applying the transformation.

Note that the transformation function is applied separately to each group (subset). If the `fn` function requires access to the entire data.frame, the grouping columns should be specified as part of `args` and handled by the `fn` function.
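
The interplay of `args`, `generators`, `.apply`, and `group_cols` can be sketched in base R. This is only a conceptual illustration under simplifying assumptions (a single grouping column, a `fn` that takes the data as its first argument); `apply_generated()` and `shift_col()` are hypothetical helpers and not part of rearrr, whose actual implementation may differ.

```r
# Conceptual sketch in base R only.
# `apply_generated()` is a hypothetical helper, not a rearrr function.
apply_generated <- function(data, fn, args, generators, group_col) {
  # Separate the optional `.apply` generator from the argument generators
  apply_gen <- generators[[".apply"]]
  arg_gens <- generators[setdiff(names(generators), ".apply")]

  # Process each group (subset) separately
  pieces <- lapply(split(data, data[[group_col]]), function(subset) {
    # Leave the group untouched when the `.apply` generator returns FALSE
    if (!is.null(apply_gen) && !isTRUE(apply_gen())) {
      return(subset)
    }
    # Call every generator once to get this group's argument values
    generated <- lapply(arg_gens, function(gen) gen())
    # Combine the subset, the fixed `args`, and the generated values
    do.call(fn, c(list(subset), args, generated))
  })

  # Stack the groups back together
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

# Hypothetical transformation: shift a column by `offset`
shift_col <- function(data, col, offset) {
  data[[col]] <- data[[col]] + offset
  data
}

df <- data.frame(A = c(1, 2, 3, 4), G = c(1, 1, 2, 2))

set.seed(1)
shifted <- apply_generated(
  data = df,
  fn = shift_col,
  args = list(col = "A"),
  generators = list(offset = function() sample.int(100, 1)),
  group_col = "G"
)
shifted
```

Each group receives its own generated `offset`, while `col` stays fixed across groups. Because `shift_col()` only ever sees one subset, it cannot use information from other groups, which is why a function that needs the whole data.frame should receive the grouping columns through `args` instead.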

Returns

The pipeline, to allow chaining of methods.
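
Since the method returns the pipeline, calls can be chained directly. The following is a minimal sketch that reuses the functions and argument values from this page's own examples; it assumes rearrr is installed and that `apply()` is called without `verbose`.

```r
library(rearrr)

df <- data.frame(
  "Index" = 1:12,
  "A" = c(1:4, 9:12, 15:18),
  "G" = rep(1:3, each = 4)
)

pipe <- GeneratedPipeline$new()

# Each call returns the pipeline, so the next call can follow directly
pipe$add_transformation(
  fn = rotate_2d,
  args = list(x_col = "Index", y_col = "A", suffix = "", overwrite = TRUE),
  generators = list(
    degrees = function() sample.int(360, 1),
    origin = function() rnorm(2)
  ),
  name = "Rotate",
  group_cols = "G"
)$add_transformation(
  fn = cluster_groups,
  args = list(
    cols = c("Index", "A"),
    suffix = "",
    overwrite = TRUE,
    group_cols = "G"
  ),
  generators = list(
    multiplier = function() 0.1 * runif(1) * 3 ^ sample.int(5, 1)
  ),
  name = "Cluster"
)

# Apply both steps in the order they were added
out <- pipe$apply(df)
```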

Examples

# `generators` is a list of functions for generating
# argument values for a chosen set of arguments
# `.apply` can be used to disable the transformation
generators <- list(degrees = function(){sample.int(360, 1)},
                   origin = function(){rnorm(2)},
                   .apply = function(){sample(c(TRUE, FALSE), 1)})

Method `print()`

Print an overview of the pipeline.

Usage

GeneratedPipeline$print(...)

Arguments

...: Further arguments passed to or from other methods.

Returns

The pipeline, to allow chaining of methods.

Method `clone()`

The objects of this class are cloneable with this method.

Usage

GeneratedPipeline$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Author(s)

Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk

Examples

# Attach package
library(rearrr)

# Create a data frame
df <- data.frame(
  "Index" = 1:12,
  "A" = c(1:4, 9:12, 15:18),
  "G" = rep(1:3, each = 4)
)

# Create new pipeline
pipe <- GeneratedPipeline$new()

# Add 2D rotation transformation
# Note that we specify the grouping via `group_cols`
pipe$add_transformation(
  fn = rotate_2d,
  args = list(
    x_col = "Index",
    y_col = "A",
    suffix = "",
    overwrite = TRUE
  ),
  generators = list(degrees = function(){sample.int(360, 1)},
                    origin = function(){rnorm(2)}),
  name = "Rotate",
  group_cols = "G"
)

# Add the `cluster_groups` transformation
# Note that this function requires the entire input data
# to properly scale the groups. We therefore specify `group_cols`
# as part of `args`. This works as `cluster_groups()` accepts that
# argument.
# Also note the `.apply` generator which generates a TRUE/FALSE scalar
# for whether the transformation should be applied to the current group
pipe$add_transformation(
  fn = cluster_groups,
  args = list(
    cols = c("Index", "A"),
    suffix = "",
    overwrite = TRUE,
    group_cols = "G"
  ),
  generators = list(
    multiplier = function() {
      0.1 * runif(1) * 3 ^ sample.int(5, 1)
    },
    .apply = function(){sample(c(TRUE, FALSE), 1)}
  ),
  name = "Cluster"
)

# Check pipeline object
pipe

# Apply pipeline to data.frame
# Enable `verbose` to print progress
pipe$apply(df, verbose = TRUE)


[Package rearrr version 0.3.4 Index]
