process_transform_throw_error {pipeliner}    R Documentation

Validate and clean transform function output

Description

Helper function that ensures the output of a transform function is a data.frame and that this data.frame does not duplicate variables from the original (input) data.frame. Any duplicated variables are dropped automatically from the data.frame that this function returns.

Usage

process_transform_throw_error(input_df, output_df, func_name)

Arguments

input_df

The original (input) data.frame that is passed to the transform function as its argument.

output_df

The transform function's output.

func_name

The name of the ml_pipeline_builder transform method.

Value

If the transform function is not NULL, a copy of the transform function's output data.frame with any columns duplicated from the input removed.
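
The behaviour described above can be sketched in base R. This is a hypothetical stand-in and not the pipeliner source: the name drop_duplicated_inputs, the use of stop() for a non-data.frame output, and the use of message() to report dropped columns are all assumptions made for illustration.

```r
# Hypothetical sketch of the documented behaviour; NOT the pipeliner implementation.
drop_duplicated_inputs <- function(input_df, output_df, func_name) {
  # Assumption: a non-data.frame output is reported with stop().
  if (!is.data.frame(output_df)) {
    stop(func_name, " does not produce a data.frame", call. = FALSE)
  }

  # Columns of the output that also appear in the input.
  duplicated_cols <- intersect(names(output_df), names(input_df))

  # Assumption: dropped columns are reported with message().
  if (length(duplicated_cols) > 0) {
    message(func_name, " duplicates input vars - dropping: ",
            paste0("'", duplicated_cols, "'", collapse = ", "))
  }

  # Keep only the new columns; drop = FALSE preserves the data.frame
  # class even when a single column remains.
  new_cols <- setdiff(names(output_df), names(input_df))
  output_df[, new_cols, drop = FALSE]
}

data <- data.frame(y = c(1, 2), x = c(0.1, 0.2))
result <- drop_duplicated_inputs(data, cbind(data, q = data$y^2), "transform_method")
```

Here result holds only the new column q, because y and x are already present in the input.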

Examples

## Not run: 
transform_method <- function(df) cbind_fast(df, q = df$y * df$y)
data <- data.frame(y = c(1, 2), x = c(0.1, 0.2))
data_transformed <- transform_method(data)
process_transform_throw_error(data, data_transformed, "transform_method")
# transform_method yields data.frame that duplicates input vars - dropping the following columns: 'y', 'x'
#   q
# 1 1
# 2 4

## End(Not run)

[Package pipeliner version 0.1.1 Index]