p_status {dtrackr}    R Documentation

Add a summary to the dtrackr history graph

Description

In the middle of a pipeline you may wish to document something about the data that is more complex than simple counts. status is essentially a dplyr summarisation step whose results are passed to a glue specification, and the resulting text is recorded in the data frame history. This means you can perform an arbitrary interim summarisation and put the result into the flowchart without disrupting the pipeline flow.
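
As a rough illustration of the idea only, and not dtrackr's actual implementation, the two ingredients can be sketched with plain dplyr and glue: a grouped summarise produces one row per group, and a glue template is filled in once per row. The names summary_df and messages are hypothetical and exist only for this sketch.

```r
library(dplyr)
library(glue)

# Step 1: a grouped summary, equivalent in spirit to the ... arguments
summary_df <- iris %>%
  group_by(Species) %>%
  summarise(
    long = sum(Petal.Length > 5),
    short = sum(Petal.Length < 2),
    .groups = "drop"
  )

# Step 2: fill the glue template once per summary row
messages <- glue_data(
  summary_df,
  "{Species}: {long} long ones & {short} short ones"
)
print(messages)
```

The difference with status is that the resulting text is stored in the history of the tracked data frame rather than returned, so the pipeline continues with the original, unsummarised data.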

Usage

p_status(
  .data,
  ...,
  .messages = .defaultMessage(),
  .headline = .defaultHeadline(),
  .type = "info",
  .asOffshoot = FALSE,
  .tag = NULL
)

Arguments

.data

a dataframe which may be grouped

...

any normal dplyr::summarise specification, e.g. count=n() or av=mean(x), etc.

.messages

a character vector of glue specifications. A glue specification can refer to the summary outputs, any grouping variables of .data, the {.strata}, or any variables defined in the calling environment

.headline

a glue specification which can refer to grouping variables of .data, or any variables defined in the calling environment

.type

one of "info" or "exclusion"; used to define formatting

.asOffshoot

whether this comment should be an offshoot of the main flow (default = FALSE).

.tag

a name under which the summary data from this step is stored; supply it if you want to retrieve that data later.

Details

Because the summary specification is supplied through ..., the remaining parameters (.messages, .headline, .type, .asOffshoot and .tag) MUST BE NAMED.
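
The naming requirement follows from how R matches arguments: anything that comes after ... in a function signature is matched only by its full name, so an unnamed string would be collected into ... along with the summary expressions. A minimal sketch, assuming dtrackr is installed; the names tracked and result, and the message and headline text, are chosen purely for illustration.

```r
library(dplyr)
library(dtrackr)

tracked <- iris %>% track() %>% group_by(Species)

# Summary expressions go into ... and are named so that the glue
# specification can refer to them. .messages and .headline follow ...
# in the signature, so they are only matched when written out by name.
result <- tracked %>% status(
  mean_length = mean(Petal.Length),
  .messages = "{Species}: mean petal length {round(mean_length, 2)}",
  .headline = "Petal length by species"
)

# The data is returned unchanged; only the history gains a new stage
print(history(result))
```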

Value

the same .data dataframe with the history metadata updated with the status inserted as a new stage

Examples

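library(dplyr)
library(dtrackr)

# Start tracking, then group so the summary is computed per species
tmp <- iris %>% track() %>% group_by(Species)

# Count long and short petals in each group and record one message per group
tmp %>% status(
  long = p_count_if(Petal.Length > 5),
  short = p_count_if(Petal.Length < 2),
  .messages = "{Species}: {long} long ones & {short} short ones"
) %>% history()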
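
A further sketch shows how the .tag argument might be used to get the summary back later. It assumes that dtrackr's tagged() function returns the data stored under a given tag; check the tagged help page for the exact return format in your installed version. The tag name "petal_counts" and the variable names are hypothetical.

```r
library(dplyr)
library(dtrackr)

tracked <- iris %>%
  track() %>%
  group_by(Species) %>%
  status(
    long = p_count_if(Petal.Length > 5),
    short = p_count_if(Petal.Length < 2),
    .messages = "{Species}: {long} long ones & {short} short ones",
    .tag = "petal_counts"  # hypothetical tag name for this example
  )

# Assumption: tagged() retrieves the summary stored under the tag
petal_counts <- tracked %>% tagged(.tag = "petal_counts")
print(petal_counts)
```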

[Package dtrackr version 0.4.4 Index]