to_flowdef {flowr}R Documentation

Flow Definition defines how to stitch steps into a (work)flow.

Description

This function enables creation of a skeleton flow definition with several default values, using a flowmat. To customize the flowdef, one may supply parameters such as sub_type and dep_type upfront. As such, these params must be of the same length as number of unique jobs using in the flowmat.

Each row in this table refers to one step of the pipeline. It describes the resources used by the step and also its relationship with other steps, especially, the step immediately prior to it. <br><br>

Submission types: This refers to the sub_type column in flow definition.<br>

Consider an example with three steps A, B and C. A has 10 commands from A1 to A10, similarly B has 10 commands B1 through B10 and C has a single command, C1. Consider another step D (with D1-D3), which comes after C.

step (number of sub-processes) A (10) —-> B (10) —–> C (1) —–> D (3)

Dependency types

This refers to the dep_type column in flow definition.

Usage

to_flowdef(x, ...)

## S3 method for class 'flowmat'
to_flowdef(
  x,
  sub_type,
  dep_type,
  prev_jobs,
  queue = "short",
  platform = "torque",
  memory_reserved = "2000",
  cpu_reserved = "1",
  nodes = "1",
  walltime = "1:00",
  guess = FALSE,
  verbose = opts_flow$get("verbose"),
  ...
)

## S3 method for class 'flow'
to_flowdef(x, ...)

## S3 method for class 'character'
to_flowdef(x, ...)

as.flowdef(x, ...)

is.flowdef(x)

Arguments

x

can a path to a flowmat, flowmat or flow object.

...

not used

sub_type

submission type, one of: scatter, serial. Character, of length one or same as the number of jobnames

dep_type

dependency type, one of: gather, serial or burst. Character, of length one or same as the number of jobnames

prev_jobs

previous job name

queue

Cluster queue to be used

platform

platform of the cluster: lsf, sge, moab, torque, slurm etc.

memory_reserved

amount of memory required.

cpu_reserved

number of cpu's required. [1]

nodes

if you tool can use multiple nodes, you may reserve multiple nodes for it. [1]

walltime

amount of walltime required

guess

should the function, guess submission and dependency types. See details.

verbose

A numeric value indicating the amount of messages to produce. Values are integers varying from 0, 1, 2, 3, .... Please refer to the verbose page for more details. opts_flow$get("verbose")

Format

This is a tab separated file, with a minimum of 4 columns:<br>

required columns:<br>

resource columns (recommended):<br>

Additionally, one may customize resource requirements used by each step. The format used varies and depends to the computing platform. Thus its best to refer to your institutions guide to specify these.

Details

NOTE: Guessing is an experimental feature, please check the definition carefully. it is provided to help but not replace your best judgement. <br>

Optionally, one may provide the previous jobs and flowr can try guessing the appropriate submission and dependency types. If there are multiple commands, default is submitting them as scatter, else as serial. Further, if previous job has multiple commands and current job has single; its assumed that all of the previous need to complete, suggesting a gather type dependency.

Examples

# see ?to_flow for more examples

# read in a tsv; check and confirm format
ex = file.path(system.file(package = "flowr"), "pipelines")

# read in a flowdef from file
flowdef = as.flowdef(file.path(ex, "sleep_pipe.def"))

# check if this a flowdef
is.flowdef(flowdef)

# use a flowmat, to create a sample flowdef
flowmat = as.flowmat(file.path(ex, "sleep_pipe.tsv"))
to_flowdef(flowmat)

# change the platform
to_flowdef(flowmat, platform = "lsf")

# change the queue name
def = to_flowdef(flowmat, 
 platform = "lsf", 
 queue = "long")
plot_flow(def)

# guess submission and dependency types
def2 = to_flowdef(flowmat, 
 platform = "lsf", 
 queue = "long", 
 guess = TRUE)
plot_flow(def2)




[Package flowr version 0.9.11 Index]