tar_files_input_raw {tarchetypes} | R Documentation |
Dynamic branching over input files or URLs (raw version).
Description
Dynamic branching over input files or URLs.
Usage
tar_files_input_raw(
name,
files,
batches = length(files),
format = c("file", "file_fast", "url", "aws_file"),
repository = targets::tar_option_get("repository"),
iteration = targets::tar_option_get("iteration"),
error = targets::tar_option_get("error"),
memory = targets::tar_option_get("memory"),
garbage_collection = targets::tar_option_get("garbage_collection"),
priority = targets::tar_option_get("priority"),
resources = targets::tar_option_get("resources"),
cue = targets::tar_option_get("cue"),
description = targets::tar_option_get("description")
)
Arguments
name |
Character string, name of the target. A target
name must be a valid name for a symbol in R, and it
must not start with a dot. Subsequent targets
can refer to this name symbolically to induce a dependency relationship:
e.g. |
files |
Nonempty character vector of known existing input files to track for changes. |
batches |
Positive integer of length 1, number of batches to partition the files into. The default is one file per batch (the maximum number of batches), which is simplest to handle but can add a lot of overhead and consume a lot of computing resources. For heavy workloads, consider setting the number of batches below the number of files. |
format |
Character, either |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
iteration |
Character, iteration method. Must be a method
supported by the |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy.
If |
garbage_collection |
Logical, whether to run |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
Details
tar_files_input_raw() is similar to tar_files_input() except the name argument must be a character string.
tar_files_input_raw() creates a pair of targets, one upstream
and one downstream. The upstream target does some work
and returns some file paths, and the downstream
target is a pattern that applies format = "file" or format = "url".
This is the correct way to dynamically
iterate over file/URL targets. It ensures that when files or URLs change,
downstream patterns rerun only the branches that depend on them.
For more information, visit
https://github.com/ropensci/targets/issues/136 and
https://github.com/ropensci/drake/issues/1302.
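As a rough sketch of the pattern described above, the pipeline below adds a downstream target that maps over the branches of x. The file names a.txt and b.txt, the target name n_lines, and the variable line_counts are illustrative assumptions, not part of the package. The sketch keeps the default of one file per batch, so each branch of x is assumed to hold a single path.

```r
# Minimal sketch; requires the targets and tarchetypes packages.
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
  # Create two small input files (illustrative names).
  writeLines(c("a", "b"), "a.txt")
  writeLines(c("c", "d", "e"), "b.txt")
  targets::tar_script({
    list(
      # Defines the pattern target "x": one branch per file by default.
      tarchetypes::tar_files_input_raw("x", c("a.txt", "b.txt")),
      # Hypothetical downstream pattern: count the lines of each file.
      targets::tar_target(n_lines, length(readLines(x)), pattern = map(x))
    )
  })
  targets::tar_make()
  line_counts <- targets::tar_read(n_lines)
})
```

If only one of the two files later changes, the expectation is that only the matching branches of x and n_lines rerun on the next tar_make().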
Value
A list of two targets, one upstream and one downstream.
The upstream one does some work and returns some file paths,
and the downstream target is a pattern that applies format = "file" or format = "url".
See the "Target objects" section for background.
Target objects
Most tarchetypes functions are target factories,
which means they return target objects or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets), and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
See Also
Other Dynamic branching over files: tar_files(), tar_files_input(), tar_files_raw()
Examples
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
  targets::tar_dir({ # tar_dir() runs code from a temporary directory.
    targets::tar_script({
      # Do not use temp files in real projects
      # or else your targets will always rerun.
      paths <- unlist(replicate(4, tempfile()))
      file.create(paths)
      list(
        tarchetypes::tar_files_input_raw(
          "x",
          paths,
          batches = 2
        )
      )
    })
    targets::tar_make()
    targets::tar_read(x)
    targets::tar_read(x, branches = 1)
  })
}
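
The example above partitions four files into two batches. The base R sketch below shows one possible way to split a vector of paths into a fixed number of contiguous batches. The helper partition_files() and the file names are hypothetical, and tarchetypes may partition files differently internally.

```r
# Hypothetical helper, not part of tarchetypes: split a character vector
# of paths into `batches` contiguous groups of roughly equal size.
partition_files <- function(files, batches) {
  stopifnot(
    length(files) > 0L,
    batches >= 1L,
    batches <= length(files)
  )
  # Assign each file a batch index from 1 to `batches`.
  index <- ceiling(seq_along(files) * batches / length(files))
  unname(split(files, index))
}

groups <- partition_files(
  c("a.csv", "b.csv", "c.csv", "d.csv"), # illustrative file names
  batches = 2
)
lengths(groups) # number of files in each batch
```

Fewer batches mean fewer branches for the pipeline to manage, at the cost of coarser change tracking, since a change to any file in a batch affects that whole batch.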