p_status {dtrackr} | R Documentation |
Add a summary to the dtrackr history graph
Description
In the middle of a pipeline you may wish to document something about the data
that is more complex than simple counts. status is essentially a dplyr
summarisation step connected to a glue specification; the formatted output is
recorded in the data frame history. This means you can perform an arbitrary
interim summarisation and put the result into the flowchart without disrupting
the pipeline flow.
Usage
p_status(
  .data,
  ...,
  .messages = .defaultMessage(),
  .headline = .defaultHeadline(),
  .type = "info",
  .asOffshoot = FALSE,
  .tag = NULL
)
Arguments
.data | a dataframe, which may be grouped
... | any normal dplyr::summarise specification, e.g.
.messages | a character vector of glue specifications. A glue specification can refer to the summary outputs, any grouping variables of .data, the {.strata}, or any variables defined in the calling environment
.headline | a glue specification which can refer to grouping variables of .data, or any variables defined in the calling environment
.type | one of "info" or "exclusion"; used to define formatting
.asOffshoot | whether this comment should be an offshoot of the main flow (default = FALSE)
.tag | if you want to retrieve the summary data from this step later, give it a name with .tag
Details
Because the summary specifications are passed through ..., every other parameter (.messages, .headline, .type, .asOffshoot, .tag) MUST BE NAMED in the call.
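The naming requirement can be illustrated with a minimal sketch. It assumes the dplyr and dtrackr packages are installed; the names tracked, named and total are chosen here for illustration only and do not come from the package.

```r
library(dplyr)
library(dtrackr)

# Illustrative object name; any tracked data frame would do.
tracked <- iris %>% track()

# The glue string is passed by name, so it is bound to .messages.
# An unnamed string in the same position would instead be captured
# by ... and handled as one more summary specification.
named <- tracked %>% status(
  total = n(),
  .messages = "{total} flowers in total"
)
```

Because R never matches arguments that follow ... by position, naming them is the only way to reach .messages and the other options.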
Value
the same .data dataframe, with its history metadata updated so that the status is inserted as a new stage
Examples
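library(dplyr)
library(dtrackr)

# start tracking, then group so the status is computed per species
tmp = iris %>% track() %>% group_by(Species)

# summarise each group and record the formatted message in the history
tmp %>% status(
  long = p_count_if(Petal.Length > 5),
  short = p_count_if(Petal.Length < 2),
  .messages = "{Species}: {long} long ones & {short} short ones"
) %>% history()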
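
A further sketch, on ungrouped data, shows a custom .headline alongside ordinary dplyr summary functions. It assumes the same packages as above; the names out, n_rows and mean_petal and the headline text are illustrative choices, not package defaults.

```r
library(dplyr)
library(dtrackr)

# Illustrative names throughout; only status(), track() and history()
# come from dtrackr.
out <- iris %>%
  track() %>%
  status(
    n_rows = n(),
    mean_petal = mean(Petal.Length),
    .headline = "Petal summary",
    .messages = "{n_rows} rows; mean petal length {round(mean_petal, 2)}"
  )

# Inspect the recorded stages, including the new status.
out %>% history()
```

The data itself passes through unchanged, so the pipeline can continue from out as if the status step were not there.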