drake_plan {drake}    R Documentation

Create a drake plan for the plan argument of make(). [Stable]

Description

A drake plan is a data frame with columns "target" and "command". Each target is an R object produced in your workflow, and each command is the R code to produce it.

Usage

drake_plan(
  ...,
  list = NULL,
  file_targets = NULL,
  strings_in_dots = NULL,
  tidy_evaluation = NULL,
  transform = TRUE,
  trace = FALSE,
  envir = parent.frame(),
  tidy_eval = TRUE,
  max_expand = NULL
)

Arguments

...

A collection of symbols/targets with commands assigned to them. See the examples for details.

list

Deprecated.

file_targets

Deprecated.

strings_in_dots

Deprecated.

tidy_evaluation

Deprecated. Use tidy_eval instead.

transform

Logical, whether to transform the plan into a larger plan with more targets. Requires the transform field in target(). See the examples for details.
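As a rough sketch of what transform = FALSE does, the snippet below defers the expansion and then applies transform_plan() (listed under See Also) by hand. The target name y and the function f are placeholders chosen for illustration; no command is ever run here.

```r
library(drake)

# Keep the plan unexpanded: one row, with the transform stored as a column.
deferred <- drake_plan(
  y = target(f(x), transform = map(x = c(1, 2, 3))),
  transform = FALSE
)
nrow(deferred)

# Expand later, once you want the full set of targets.
expanded <- transform_plan(deferred)
nrow(expanded)
```

Deferring the expansion can be handy if you want to inspect or edit the compact plan before it grows.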

trace

Logical, whether to add columns to show what happens during target transformations.

envir

Environment for tidy evaluation.

tidy_eval

Logical, whether to use tidy evaluation (e.g. unquoting with !!) when resolving commands. Tidy evaluation in transformations is always turned on, regardless of the value you supply to this argument.
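The effect of tidy_eval on an ordinary command can be sketched as follows. The variable n_rows is a hypothetical value defined in the calling environment; only the plan is built and nothing is executed.

```r
library(drake)

n_rows <- 10

# Default (tidy_eval = TRUE): !!n_rows is replaced by its value.
plan_on <- drake_plan(data = head(mtcars, !!n_rows))
plan_on$command[[1]]

# tidy_eval = FALSE: the command is stored exactly as written.
plan_off <- drake_plan(data = head(mtcars, !!n_rows), tidy_eval = FALSE)
plan_off$command[[1]]
```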

max_expand

Positive integer, optional. max_expand is the maximum number of targets to generate in each map(), split(), or cross() transform. Useful if you have a massive plan and you want to test and visualize a strategic subset of targets before scaling up. Note: the max_expand argument of drake_plan() and transform_plan() is for static branching only. The dynamic branching max_expand is an argument of make() and drake_config().
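A small sketch of max_expand with static branching, assuming a placeholder function model() that is never called. The same transform is declared twice, once in full and once capped.

```r
library(drake)

# Full plan: one target per value of n.
full <- drake_plan(
  fit = target(model(n), transform = map(n = c(10, 20, 30, 40, 50, 60)))
)

# Capped plan: at most 2 targets from the same map().
preview <- drake_plan(
  fit = target(model(n), transform = map(n = c(10, 20, 30, 40, 50, 60))),
  max_expand = 2
)

nrow(full)
nrow(preview)
```

Once the small plan behaves as expected, dropping max_expand restores the full set of targets.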

Details

Besides "target" and "command", drake_plan() understands a special set of optional columns. For details, visit https://books.ropensci.org/drake/plans.html#special-custom-columns-in-your-plan.

Value

A data frame of targets, commands, and optional custom columns.

Columns

drake_plan() creates a special data frame. At minimum, that data frame must have columns target and command with the target names and the R code chunks to build them, respectively.

You can add custom columns yourself, either with target() (e.g. drake_plan(y = target(f(x), transform = map(x = c(1, 2)), format = "fst"))) or by appending columns post-hoc (e.g. plan$col <- vals).

Some of these custom columns are special. They are optional, but drake looks for them at various points in the workflow.
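Both ways of adding columns can be sketched as below. The column names custom_column and note are arbitrary illustrations, not columns that drake treats specially.

```r
library(drake)

plan <- drake_plan(
  data = target(head(mtcars), custom_column = 5),  # column added via target()
  stats = summary(data)
)

# Append another column after the plan exists; one value per target.
plan$note <- c("raw input", "derived")

colnames(plan)
plan$target
```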

Formats

Specialized target formats increase efficiency and flexibility. Some allow you to save specialized objects like keras models, while others increase speed while conserving storage and memory. You can declare target-specific formats in the plan (e.g. drake_plan(x = target(big_data_frame, format = "fst"))) or supply a global default format for all targets in make(). Either way, most formats have their own installation requirements (e.g. R packages) that are not installed with drake by default, so you will need to install them separately. Available formats:

Keywords

drake_plan() understands special keyword functions for your commands. With the exception of target(), each one is a proper function with its own help file.

Transformations

drake has special syntax for generating large plans. Your code will look something like drake_plan(y = target(f(x), transform = map(x = c(1, 2, 3)))). You can read about this interface at https://books.ropensci.org/drake/plans.html#large-plans.

Static branching

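In static branching, you define batches of targets based on information you know in advance. Overall usage looks like drake_plan(<x> = target(<...>, transform = <call>)), where <x> is the name of the target or group of targets, <...> stands for the remaining arguments to target(), and <call> is a call to one of the transformation functions, such as map(), split(), cross(), or combine().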

Transformation function usage:
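As an illustration of two of the transformation functions, the sketch below contrasts map() with cross() on grouping variables of the same lengths. The functions f and g and the variable names are placeholders; the commands are only recorded, never run.

```r
library(drake)

plan <- drake_plan(
  # map() pairs values element by element: (1, 3) and (2, 4).
  a = target(f(x, y), transform = map(x = c(1, 2), y = c(3, 4))),
  # cross() takes every combination of u and v: 2 x 2 = 4 targets.
  b = target(g(u, v), transform = cross(u = c(1, 2), v = c(3, 4)))
)

plan$target
nrow(plan)
```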

Dynamic branching

map() and cross() create dynamic sub-targets from the variables supplied to the dots. As with static branching, the variables supplied to map() must all have equal length.

group(f(data), .by = x) makes new dynamic sub-targets from data. Here, data can be either static or dynamic. If data is dynamic, group() aggregates existing sub-targets. If data is static, group() splits data into multiple subsets based on the groupings from .by.
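A minimal sketch of declaring a dynamic map(), with hypothetical target names values and roots. Only the plan is built here; the sub-targets themselves would be created when make() runs.

```r
library(drake)

plan <- drake_plan(
  values = c(1, 4, 9),
  # One dynamic sub-target per element of `values`, created at make() time.
  roots = target(sqrt(values), dynamic = map(values))
)

# The plan keeps one row per declared target; the dynamic call is stored
# unevaluated in its own column.
nrow(plan)
colnames(plan)
```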

Differences from static branching:

See Also

make, drake_config, transform_plan, map, split, cross, combine

Examples

## Not run: 
isolate_example("contain side effects", {
# For more examples, visit
# https://books.ropensci.org/drake/plans.html.

# Create drake plans:
mtcars_plan <- drake_plan(
  write.csv(mtcars[, c("mpg", "cyl")], file_out("mtcars.csv")),
  value = read.csv(file_in("mtcars.csv"))
)
if (requireNamespace("visNetwork", quietly = TRUE)) {
  plot(mtcars_plan) # fast simplified call to vis_drake_graph()
}
mtcars_plan
make(mtcars_plan) # Makes `mtcars.csv` and then `value`
head(readd(value))
# You can use knitr inputs too. See the top command below.

load_mtcars_example()
head(my_plan)
if (requireNamespace("knitr", quietly = TRUE)) {
  plot(my_plan)
}
# The `knitr_in("report.Rmd")` tells `drake` to dive into the active
# code chunks to find dependencies.
# There, `drake` sees that `small`, `large`, and `coef_regression2_small`
# are loaded in with calls to `loadd()` and `readd()`.
deps_code("report.Rmd")

# Formats are great for big data: https://github.com/ropensci/drake/pull/977
# Below, each target is 1.6 GB in memory.
# Run make() on this plan to see how much faster fst is!
n <- 1e8
plan <- drake_plan(
  data_fst = target(
    data.frame(x = runif(n), y = runif(n)),
    format = "fst"
  ),
  data_old = data.frame(x = runif(n), y = runif(n))
)

# Use transformations to generate large plans.
# Read more at
# `https://books.ropensci.org/drake/plans.html#create-large-plans-the-easy-way`. # nolint
drake_plan(
  data = target(
    simulate(nrows),
    transform = map(nrows = c(48, 64)),
    custom_column = 123
  ),
  reg = target(
    reg_fun(data),
    transform = cross(reg_fun = c(reg1, reg2), data)
  ),
  summ = target(
    sum_fun(data, reg),
    transform = cross(sum_fun = c(coef, residuals), reg)
  ),
  winners = target(
    min(summ),
    transform = combine(summ, .by = c(data, sum_fun))
  )
)

# Split data among multiple targets.
drake_plan(
  large_data = get_data(),
  slice_analysis = target(
    analyze(large_data),
    transform = split(large_data, slices = 4)
  ),
  results = target(
    rbind(slice_analysis),
    transform = combine(slice_analysis)
  )
)

# Set trace = TRUE to show what happened during the transformation process.
drake_plan(
  data = target(
    simulate(nrows),
    transform = map(nrows = c(48, 64)),
    custom_column = 123
  ),
  reg = target(
    reg_fun(data),
    transform = cross(reg_fun = c(reg1, reg2), data)
  ),
  summ = target(
    sum_fun(data, reg),
    transform = cross(sum_fun = c(coef, residuals), reg)
  ),
  winners = target(
    min(summ),
    transform = combine(summ, .by = c(data, sum_fun))
  ),
  trace = TRUE
)

# You can create your own custom columns too.
# See ?triggers for more on triggers.
drake_plan(
  website_data = target(
    command = download_data("www.your_url.com"),
    trigger = "always",
    custom_column = 5
  ),
  analysis = analyze(website_data)
)

# Tidy evaluation can help generate super large plans.
sms <- rlang::syms(letters) # To sub in character args, skip this.
drake_plan(x = target(f(char), transform = map(char = !!sms)))

# Dynamic branching
# Get the mean mpg for each cyl in the mtcars dataset.
plan <- drake_plan(
  raw = mtcars,
  group_index = raw$cyl,
  munged = target(raw[, c("mpg", "cyl")], dynamic = map(raw)),
  mean_mpg_by_cyl = target(
    data.frame(mpg = mean(munged$mpg), cyl = munged$cyl[1]),
    dynamic = group(munged, .by = group_index)
  )
)
make(plan)
readd(mean_mpg_by_cyl)
})

## End(Not run)

[Package drake version 7.13.10 Index]