| MALDIquant-parallel {MALDIquant} | R Documentation |
Parallel Support in Package MALDIquant
Description
MALDIquant offers multi-core support using
mclapply and mcmapply from the parallel package. Because both rely on
forking, this approach is limited to unix-based platforms.
Please note that not all functions benefit from parallelisation. Often the
overhead of creating and copying objects outweighs the time saved by parallel
runs. This is true for functions that are very fast to compute (e.g.
sqrt-transformation). That's why the default value for the
mc.cores argument in all functions is 1L.
Which steps benefit from parallelisation depends on the size of the dataset;
often only removeBaseline and
detectPeaks do.
In general it is faster to encapsulate the complete workflow in a function
and parallelise it using mclapply instead of using the
mc.cores argument of each method. The reason is the reduced overhead
for object management (only one split/combine is needed instead of repeating
these operations in every function).
Details
The following functions/methods support the mc.cores argument:
See Also
Examples
## Not run:
## load package
library("MALDIquant")
## load example data
data("fiedler2009subset", package="MALDIquant")
## run single-core baseline correction
print(system.time(
  b1 <- removeBaseline(fiedler2009subset, method="SNIP")
))
if(.Platform$OS.type == "unix") {
  ## run multi-core baseline correction
  print(system.time(
    b2 <- removeBaseline(fiedler2009subset, method="SNIP", mc.cores=2)
  ))
  ## all.equal returns TRUE or a character vector, so wrap it in isTRUE
  stopifnot(isTRUE(all.equal(b1, b2)))
}
## parallelise complete workflow
workflow <- function(spectra, cores) {
  s <- transformIntensity(spectra, method="sqrt", mc.cores=cores)
  s <- smoothIntensity(s, method="SavitzkyGolay", halfWindowSize=10,
                       mc.cores=cores)
  s <- removeBaseline(s, method="SNIP", iterations=100, mc.cores=cores)
  s <- calibrateIntensity(s, method="TIC", mc.cores=cores)
  detectPeaks(s, method="MAD", halfWindowSize=20, SNR=2, mc.cores=cores)
}
if(.Platform$OS.type == "unix") {
  ## parallelising the complete workflow is often faster because the overhead
  ## is reduced
  print(system.time(
    p1 <- unlist(parallel::mclapply(fiedler2009subset,
                                    function(x)workflow(list(x), cores=1),
                                    mc.cores=2), use.names=FALSE)
  ))
  ## for comparison: use the mc.cores argument of each method
  print(system.time(
    p2 <- workflow(fiedler2009subset, cores=2)
  ))
  ## all.equal returns TRUE or a character vector, so wrap it in isTRUE
  stopifnot(isTRUE(all.equal(p1, p2)))
}
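## A minimal sketch (not part of the original examples) of the overhead
## mentioned in the Description, using only base R and the parallel package.
## The object names x, cores, r1 and r2 are illustrative assumptions.
## For a very fast operation such as sqrt the cost of forking and collecting
## the results can exceed the time saved, so the multi-core call is not
## necessarily faster; the timings depend on the machine.
x <- replicate(100, runif(1000), simplify=FALSE)
## mclapply supports mc.cores > 1 only on unix-based platforms
cores <- if(.Platform$OS.type == "unix") 2L else 1L
print(system.time(
  r1 <- lapply(x, sqrt)
))
print(system.time(
  r2 <- parallel::mclapply(x, sqrt, mc.cores=cores)
))
## both approaches have to give the same result
stopifnot(isTRUE(all.equal(r1, r2)))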
## End(Not run)