reduceResults {BatchJobs}    R Documentation

Reduce results from result directory.

Description

The following functions provide ways to reduce result files into either specific R objects (like vectors, lists, matrices or data.frames) or to arbitrarily aggregate them, which is a more general operation.

Usage

reduceResults(reg, ids, part = NA_character_, fun, init, impute.val,
  progressbar = TRUE, ...)

reduceResultsList(reg, ids, part = NA_character_, fun, ...,
  use.names = "ids", impute.val, progressbar = TRUE)

reduceResultsVector(reg, ids, part = NA_character_, fun, ...,
  use.names = "ids", impute.val)

reduceResultsMatrix(reg, ids, part = NA_character_, fun, ...,
  rows = TRUE, use.names = "ids", impute.val)

reduceResultsDataFrame(reg, ids, part = NA_character_, fun, ...,
  use.names = "ids", impute.val,
  strings.as.factors = default.stringsAsFactors())

reduceResultsDataTable(reg, ids, part = NA_character_, fun, ...,
  use.names = "ids", impute.val)

Arguments

reg

[Registry]
Registry.

ids

[integer]
Ids of selected jobs. Default is all jobs for which results are available.

part

[character]
Only relevant when jobs store multiple result files; defines which result file part(s) should be loaded. NA, the default, loads all parts.

fun

[function]
For reduceResults, a function function(aggr, job, res, ...) that folds each result into the aggregate; for all other functions, a function function(job, res, ...) that selects what to keep from each result. Here, job is the current job descriptor (see Job), res is the current result object and aggr is the aggregate of the results processed so far. When using reduceResults, your function should add whatever you need from job and res to aggr and return the updated aggr. When using the other reductions, your function should select whatever you need from job and res and return something that can be coerced to an element of the selected return data structure (a reasonable conversion is attempted internally). The default for this argument returns res, except for reduceResults, where no default is available.
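
The two callback shapes described for fun can be illustrated in base R without a registry. This is only an analogy under stated assumptions: the lists jobs and results and the helpers add_c and pick_a are hypothetical stand-ins, and the loops are not BatchJobs' internal implementation.

```r
# Hypothetical stand-ins for job descriptors and their stored results.
jobs <- list(list(id = 1L), list(id = 2L), list(id = 3L))
results <- list(list(a = 1, c = 1), list(a = 2, c = 4), list(a = 3, c = 9))

# Aggregating callback, shaped like the fun expected by reduceResults:
# it receives the running aggregate and returns the updated aggregate.
add_c <- function(aggr, job, res) aggr + res$c

aggr <- 0  # plays the role of init
for (i in seq_along(results)) {
  aggr <- add_c(aggr, jobs[[i]], results[[i]])
}
total_c <- aggr

# Selecting callback, shaped like the fun expected by the other reductions:
# it sees one job and one result at a time and returns the piece to keep.
pick_a <- function(job, res) res$a
selected_a <- vapply(seq_along(results),
                     function(i) pick_a(jobs[[i]], results[[i]]),
                     numeric(1))
```

The aggregating form carries state from one result to the next, while the selecting form is stateless and leaves assembling the output container to the reduction function.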

init

[any]
Initial element, as used in Reduce. Default is first result.
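
Because init behaves as in Reduce, base R's Reduce can be used to see why an explicit initial element is sometimes needed. The sketch below assumes hypothetical list-valued results and a two-argument helper step (base Reduce passes no job argument); it mirrors the situation in which each result is a list and only one component should be summed.

```r
# Hypothetical results, each a list with a numeric component a.
results <- list(list(a = 1), list(a = 2), list(a = 3))
step <- function(aggr, res) aggr + res$a

# With an explicit initial element, the aggregate starts out numeric.
with_init <- Reduce(step, results, init = 0)

# Without init, the first result (a list) becomes the starting aggregate,
# so the first addition fails; the error is caught here for illustration.
without_init <- tryCatch(Reduce(step, results),
                         error = function(e) "error")
```

When the results themselves are plain numbers, the first result is a valid starting aggregate and init can be omitted.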

impute.val

[any]
For reduceResults: If not missing, the value of impute.val is passed to function fun as argument res for jobs with missing results.
For the specialized reduction functions reduceResults[Type]: If not missing, impute.val is used as a replacement for the return value of fun on missing results.
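
The two meanings of impute.val can be contrasted with a small base R sketch. It is an analogy only: the results list (with NULL marking a job without a result) and the helpers count_missing and square_root are hypothetical, and nothing here calls BatchJobs.

```r
# Hypothetical results for three jobs; NULL marks a job without a result.
results <- list(4, NULL, 16)
impute.val <- NA_real_

# reduceResults-style: the imputed value is handed to the callback as res,
# so the callback itself decides how to treat it.
count_missing <- function(aggr, res) aggr + is.na(res)
aggr <- 0
for (res in results) {
  if (is.null(res)) res <- impute.val
  aggr <- count_missing(aggr, res)
}
n_missing <- aggr

# reduceResults[Type]-style: the imputed value replaces the callback's
# return value, so the callback never sees the missing result.
square_root <- function(res) sqrt(res)
values <- vapply(results, function(res) {
  if (is.null(res)) impute.val else square_root(res)
}, numeric(1))
```

In the first form the callback must be able to handle the imputed value; in the second form the imputed value must already have the shape the callback would have returned.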

progressbar

[logical(1)]
Set to FALSE to disable the progress bar. To disable all progress bars, see makeProgressBar.

...

[any]
Additional arguments to fun.

use.names

[character(1)]
Name the results with job ids (“ids”), stored job names (“names”) or return an unnamed result (“none”). Default is “ids”.

rows

[logical(1)]
Should the selected vectors be used as rows (or columns) in the result matrix? Default is TRUE.
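
The effect of rows can be pictured with base R's rbind and cbind. This is a sketch of the resulting layout only, assuming a hypothetical list selected of named vectors keyed by job id; it makes no claim about how reduceResultsMatrix builds the matrix internally.

```r
# Hypothetical vectors selected from three jobs, named by job id.
selected <- list(c(foo = 1, bar = 1), c(foo = 2, bar = 4), c(foo = 3, bar = 9))
names(selected) <- c("1", "2", "3")

# Layout analogous to rows = TRUE: one row per job.
by_rows <- do.call(rbind, selected)

# Layout analogous to rows = FALSE: one column per job.
by_cols <- do.call(cbind, selected)
```

Here by_rows has one row per job and one column per selected element, and by_cols is its transpose.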

strings.as.factors

[logical(1)]
Should all character columns in the result be converted to factors? Default is default.stringsAsFactors().

Value

Aggregated results; the return type depends on the function called. If ids is empty: reduceResults returns init (if available) or NULL, reduceResultsVector returns c(), reduceResultsList returns list(), reduceResultsMatrix returns matrix(0,0,0), reduceResultsDataFrame returns data.frame().

Examples

# generate results:
reg = makeRegistry(id = "BatchJobsExample", file.dir = tempfile(), seed = 123)
f = function(x) x^2
batchMap(reg, f, 1:5)
submitJobs(reg)
waitForJobs(reg)

# reduce results to a vector
reduceResultsVector(reg)
# reduce results to sum
reduceResults(reg, fun = function(aggr, job, res) aggr + res)

reg = makeRegistry(id = "BatchJobsExample", file.dir = tempfile(), seed = 123)
f = function(x) list(a = x, b = as.character(2*x), c = x^2)
batchMap(reg, f, 1:5)
submitJobs(reg)
waitForJobs(reg)

# reduce results to a vector
reduceResultsVector(reg, fun = function(job, res) res$a)
reduceResultsVector(reg, fun = function(job, res) res$b)
# reduce results to a list
reduceResultsList(reg)
# reduce results to a matrix
reduceResultsMatrix(reg, fun = function(job, res) res[c(1,3)])
reduceResultsMatrix(reg, fun = function(job, res) c(foo = res$a, bar = res$c), rows = TRUE)
reduceResultsMatrix(reg, fun = function(job, res) c(foo = res$a, bar = res$c), rows = FALSE)
# reduce results to a data.frame
# str() prints the structure itself; wrapping it in print() would only add a trailing NULL
str(reduceResultsDataFrame(reg))
# reduce results to a sum
reduceResults(reg, fun = function(aggr, job, res) aggr + res$a, init = 0)

[Package BatchJobs version 1.8 Index]