slurm_map {rslurm} | R Documentation |
Parallel execution of a function over a list on the Slurm cluster
Description
Use slurm_map to apply a function to each element of a list in parallel, spread across multiple nodes of a Slurm cluster, with syntax similar to lapply.
Usage
slurm_map(
x,
f,
...,
jobname = NA,
nodes = 2,
cpus_per_node = 2,
processes_per_node = cpus_per_node,
preschedule_cores = TRUE,
job_array_task_limit = NULL,
global_objects = NULL,
pkgs = rev(.packages()),
libPaths = NULL,
rscript_path = NULL,
r_template = NULL,
sh_template = NULL,
slurm_options = list(),
submit = TRUE
)
Arguments
x
A list to apply f to.

f
A function that accepts one element of x as its first argument.

...
Additional arguments passed to f.

jobname
The name of the Slurm job; if NA, it is assigned a random name of the form "slr####".

nodes
The (maximum) number of cluster nodes to spread the calculation over.

cpus_per_node
The number of CPUs requested per node. This argument is mapped to the Slurm parameter cpus-per-task.

processes_per_node
The number of logical CPUs to utilize per node, i.e. how many processes to run in parallel per node. This can exceed cpus_per_node, e.g. for nodes that support hyperthreading. Defaults to cpus_per_node.

preschedule_cores
Corresponds to the mc.preschedule argument of the parallel functions used on each node. Defaults to TRUE.

job_array_task_limit
The maximum number of job array tasks to run at the same time. Defaults to NULL (no limit).

global_objects
A character vector containing the names of R objects to be saved in a .RData file and loaded on each cluster node prior to calling f.

pkgs
A character vector containing the names of packages that must be loaded on each cluster node. By default, it includes all packages loaded by the user when slurm_map is called.

libPaths
A character vector describing the location of additional R library trees to search through, or NULL. The default value of NULL corresponds to libraries returned by .libPaths() on the cluster node.

rscript_path
The location of the Rscript command. If not specified, defaults to the location of Rscript within the R installation being run.

r_template
The path to the template file for the R script run on each node. If NULL, uses the default template "rslurm/templates/slurm_run_R.txt".

sh_template
The path to the template file for the sbatch submission script. If NULL, uses the default template "rslurm/templates/submit_sh.txt".

slurm_options
A named list of options recognized by sbatch; see Details below for more information.

submit
Whether or not to submit the job to the cluster with sbatch; see Details below for more information.
Details
This function creates a temporary folder ("_rslurm_[jobname]") in the current directory, holding .RData and .RDS data files, the R script to run and the Bash submission script generated for the Slurm job.
The input list is divided into equal chunks sent to each node, and f is evaluated in parallel within each node using functions from the parallel R package. The names of any other R objects (besides x) that f needs to access should be included in global_objects or passed as additional arguments through ....
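As an illustration (a hedged sketch; the object and function names here are invented for this example, and the call can only run on a machine with access to a Slurm cluster), an object defined in the global environment and used inside f can be exported with global_objects:

```r
# Illustrative sketch: 'scale_factor' lives in the global environment and is
# used inside f, so its name is listed in global_objects to make it available
# on each cluster node.
library(rslurm)

scale_factor <- 10
f <- function(x) x * scale_factor

sjob <- slurm_map(
  as.list(1:100), f,
  nodes = 2, cpus_per_node = 2,
  global_objects = c("scale_factor")
)
```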
Use slurm_options to set any option recognized by sbatch, e.g. slurm_options = list(time = "1:00:00", share = TRUE). See http://slurm.schedmd.com/sbatch.html for details on possible options. Note that full names must be used (e.g. "time" rather than "t") and that flags (such as "share") must be specified as TRUE. The "array", "job-name", "nodes", "cpus-per-task" and "output" options are already determined by slurm_map and should not be manually set.
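For example (a sketch with illustrative values; "time" and "mem" are standard sbatch options, and the call requires access to a Slurm cluster), a wall-time limit and memory request can be passed through slurm_options:

```r
# Request a 1-hour wall time and 4 GB of memory per node (illustrative values).
library(rslurm)

sjob <- slurm_map(
  as.list(1:100), sqrt,
  slurm_options = list(time = "1:00:00", mem = "4G")
)
```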
When processing the computation job, the Slurm cluster will output two types of files in the temporary folder: those containing the return values of the function for each subset of parameters ("results_[node_id].RDS") and those containing any console or error output produced by R on each node ("slurm_[node_id].out").
If submit = TRUE, the job is sent to the cluster and a confirmation message (or error) is output to the console. If submit = FALSE, a message indicates the location of the saved data and script files; the job can be submitted manually by running the shell command sbatch submit.sh from that directory.
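A sketch of the submit = FALSE workflow (the job name here is illustrative):

```r
# Generate the data files and scripts without submitting the job.
library(rslurm)

sjob <- slurm_map(as.list(1:100), sqrt, jobname = "sqrt_job", submit = FALSE)
# Files are written to _rslurm_sqrt_job/ in the working directory;
# submit manually from a shell with:
#   cd _rslurm_sqrt_job && sbatch submit.sh
```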
After sending the job to the Slurm cluster, slurm_map returns a slurm_job object which can be used to cancel the job, get the job status or output, and delete the temporary files associated with it. See the description of the related functions for more details.
Value
A slurm_job object containing the jobname and the number of nodes effectively used.
See Also
slurm_call to evaluate a single function call.

slurm_apply to evaluate a function row-wise over a data frame of parameters.

cancel_slurm, cleanup_files, get_slurm_out and get_job_status which use the output of this function.
Examples
## Not run:
sjob <- slurm_map(x_list, func)
get_job_status(sjob) # Prints console/error output once job is completed.
func_result <- get_slurm_out(sjob, "table") # Loads output data into R.
cleanup_files(sjob)
## End(Not run)