FixedGroupsPipeline {rearrr} | R Documentation
Chain multiple transformations with different argument values per group
Description
Build a pipeline of transformations to be applied sequentially.
Specify different argument values for each group in a fixed set of groups.
E.g. if your data.frame contains 5 groups, you provide 5 argument values for each of the non-constant arguments (see `var_args`).
The number of expected groups is specified during initialization and the input `data` must be grouped such that it contains that exact number of groups.
Transformations are applied to each group separately, so the given transformation function only receives the subset of `data` belonging to the current group.
Standard workflow: Instantiate pipeline -> Add transformations -> Apply to data
To apply the same arguments to all groups, see Pipeline.
To apply generated argument values to an arbitrary number of groups, see GeneratedPipeline.
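The per-group behaviour described above can be illustrated with a small base R sketch. It is not rearrr's internal implementation: `apply_per_group()` and `shift_column()` are hypothetical helpers, written only to show how constant arguments are shared while each varying argument supplies one value per group.

```r
# Conceptual sketch in base R; NOT how rearrr implements the pipeline.
# `shift_column()` and `apply_per_group()` are hypothetical helpers.

# Toy transformation: add `amount` to the column named `col`
shift_column <- function(data, col, amount) {
  data[[col]] <- data[[col]] + amount
  data
}

apply_per_group <- function(data, group_col, fn, args, var_args) {
  # One subset of rows per group
  subsets <- split(data, data[[group_col]])
  # Every varying argument needs exactly one value per group
  stopifnot(all(lengths(var_args) == length(subsets)))
  out <- lapply(seq_along(subsets), function(i) {
    # Pick the i-th value of each varying argument
    group_args <- lapply(var_args, function(values) values[[i]])
    # The function only sees the rows of the current group
    do.call(fn, c(list(data = subsets[[i]]), args, group_args))
  })
  result <- do.call(rbind, out)
  rownames(result) <- NULL
  result
}

df <- data.frame(x = c(1, 2, 3, 4, 5, 6), g = rep(1:3, each = 2))
res <- apply_per_group(
  df,
  group_col = "g",
  fn = shift_column,
  args = list(col = "x"),
  var_args = list(amount = list(10, 20, 30))
)
res$x
```

Here `col` plays the role of a constant argument (`args`) and `amount` the role of a varying argument (`var_args`), with one value for each of the 3 groups.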
Super class
rearrr::Pipeline -> FixedGroupsPipeline
Public fields
transformations
list of transformations to apply.
names
Names of the transformations.
num_groups
Number of groups the pipeline will be applied to.
Methods
Public methods
Method new()
Initialize the pipeline with the number of groups the pipeline will be applied to.
Usage
FixedGroupsPipeline$new(num_groups)
Arguments
num_groups
Number of groups the pipeline will be applied to.
Method add_transformation()
Add a transformation to the pipeline.
Usage
FixedGroupsPipeline$add_transformation(fn, args, var_args, name)
Arguments
fn
Function that performs the transformation.
args
Named list with arguments for the `fn` function.
var_args
Named list of arguments with a list of differing values for each group.
E.g. `list("a" = list(1, 2, 3), "b" = list("a", "b", "c"))` given 3 groups.
By adding ".apply" with a list of TRUE/FALSE flags, the transformation can be disabled for a specific group.
E.g. `list(".apply" = list(TRUE, FALSE, TRUE), ...`.
name
Name of the transformation step. Must be unique.
Returns
The pipeline, to allow chaining of methods.
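To make the effect of the ".apply" flags concrete, here is a minimal base R sketch. It only mimics the documented behaviour and is not taken from rearrr's source; `scale_column()` is a hypothetical transformation used for illustration.

```r
# Conceptual sketch in base R; not rearrr's implementation.
# `scale_column()` is a hypothetical transformation.
scale_column <- function(data, col, multiplier) {
  data[[col]] <- data[[col]] * multiplier
  data
}

# One value per group for each varying argument, plus the `.apply` flags
var_args <- list(
  multiplier = list(0.5, 1, 5),
  .apply = list(TRUE, FALSE, TRUE)
)

df <- data.frame(x = c(2, 4, 6, 8, 10, 12), g = rep(1:3, each = 2))
subsets <- split(df, df$g)

transformed <- lapply(seq_along(subsets), function(i) {
  # A FALSE flag leaves the group's rows untouched
  if (!isTRUE(var_args[[".apply"]][[i]])) {
    return(subsets[[i]])
  }
  scale_column(subsets[[i]], col = "x", multiplier = var_args$multiplier[[i]])
})

res <- do.call(rbind, transformed)
res$x
```

The second group keeps its original values because its flag is FALSE, while the first and third groups are scaled by their own multipliers.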
Method apply()
Apply the pipeline to a data.frame.
Usage
FixedGroupsPipeline$apply(data, verbose = FALSE)
Arguments
data
data.frame with the same number of groups as pre-registered in the pipeline.
You can find the number of groups in `data` with `dplyr::n_groups(data)`. The number of groups expected by the pipeline can be accessed with `pipe$num_groups`.
verbose
Whether to print the progress.
Returns
Transformed version of `data`.
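Because the group count must match exactly, it can help to check it before calling the method. The following is a small sketch assuming dplyr is installed; `expected_groups` is a stand-in for the value you would read from `pipe$num_groups`.

```r
# Sketch of a pre-flight check on the number of groups.
# `expected_groups` is a stand-in for `pipe$num_groups`.
df <- data.frame(x = 1:6, g = rep(c("a", "b", "c"), each = 2))
grouped <- dplyr::group_by(df, g)

expected_groups <- 3L
actual_groups <- dplyr::n_groups(grouped)

# Fail early with a readable message if the counts differ
if (actual_groups != expected_groups) {
  stop(
    "Expected ", expected_groups, " groups but `data` has ",
    actual_groups, "."
  )
}
```

If the counts differ, regrouping `data` or creating a pipeline with a matching `num_groups` are the two obvious remedies.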
Method print()
Print an overview of the pipeline.
Usage
FixedGroupsPipeline$print(...)
Arguments
...
Further arguments passed to or from other methods.
Returns
The pipeline, to allow chaining of methods.
Method clone()
The objects of this class are cloneable with this method.
Usage
FixedGroupsPipeline$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other pipelines: GeneratedPipeline, Pipeline
Examples
# Attach package
library(rearrr)
library(dplyr)
# Create a data frame
# We group it by G so we have 3 groups
df <- data.frame(
  "Index" = 1:12,
  "A" = c(1:4, 9:12, 15:18),
  "G" = rep(1:3, each = 4)
) %>%
  dplyr::group_by(G)
# Create new pipeline
pipe <- FixedGroupsPipeline$new(num_groups = 3)
# Add 2D rotation transformation
pipe$add_transformation(
  fn = rotate_2d,
  args = list(
    x_col = "Index",
    y_col = "A",
    suffix = "",
    overwrite = TRUE
  ),
  var_args = list(
    degrees = list(45, 90, 180),
    origin = list(c(0, 0), c(1, 2), c(-1, 0))
  ),
  name = "Rotate"
)
# Add the `cluster_groups` transformation
# As the function is fed an ungrouped subset of `data`,
# i.e. the rows of that group, we need to specify `group_cols` in `args`
# That is specific to `cluster_groups()` though
# Also note `.apply` in `var_args` which tells the pipeline *not*
# to apply this transformation to the second group
pipe$add_transformation(
  fn = cluster_groups,
  args = list(
    cols = c("Index", "A"),
    suffix = "",
    overwrite = TRUE,
    group_cols = "G"
  ),
  var_args = list(
    multiplier = list(0.5, 1, 5),
    .apply = list(TRUE, FALSE, TRUE)
  ),
  name = "Cluster"
)
# Check pipeline object
pipe
# Apply pipeline to already grouped data.frame
# Enable `verbose` to print progress
pipe$apply(df, verbose = TRUE)