batch.pulsar {pulsar} | R Documentation |
pulsar: batch mode
Description
Run pulsar using stability selection, or another criterion, to select an undirected graphical model over a lambda-path.
Usage
batch.pulsar(
data,
fun = huge::huge,
fargs = list(),
criterion = c("stars"),
thresh = 0.1,
subsample.ratio = NULL,
lb.stars = FALSE,
ub.stars = FALSE,
rep.num = 20,
seed = NULL,
wkdir = getwd(),
regdir = NA,
init = "init",
conffile = "",
job.res = list(),
cleanup = FALSE,
refit = TRUE
)
Arguments
data |
A |
fun |
pass in a function that returns a list representing |
fargs |
arguments to argument |
criterion |
A character vector of selection statistics. Multiple criteria can be supplied. Only StARS can be used to automatically select an optimal index for the lambda path. See details for additional statistics. |
thresh |
threshold (referred to as scalar |
subsample.ratio |
determine the size of the subsamples (referred to as |
lb.stars |
Should the lower bound be computed after the first |
ub.stars |
Should the upper bound be computed after the first |
rep.num |
number of random subsamples |
seed |
A numeric seed to force predictable subsampling. Default is NULL. Use for testing purposes only. |
wkdir |
set the working directory if different than |
regdir |
directory to store intermediate batch job files. Default will be a temporary directory.
init |
text string appended to basename of the regdir path to store the batch jobs for the initial StARS variability estimate (ignored if 'regdir' is NA) |
conffile |
path to or string that identifies a |
job.res |
named list of resources needed for each job (e.g. for a PBS submission script). The format and members depend on the configuration and template. See the examples section for a Torque example.
cleanup |
Flag for removing batchtools registry files. Recommended FALSE unless you're sure intermediate data shouldn't be saved. |
refit |
Boolean flag to refit on the full dataset after pulsar is run. (see also |
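The subsampling arguments (rep.num, subsample.ratio, seed) can be pictured with a small base-R sketch. This is not pulsar's internal code: the toy matrix, the ratio of 0.8, and the helper name draw_subsamples are assumptions made for illustration, and rows are drawn without replacement as described for StARS in Liu et al. (2010).

```r
# Hypothetical helper (not part of pulsar): draw `rep.num` sets of row indices,
# each of size floor(subsample.ratio * n), sampled without replacement.
draw_subsamples <- function(n, rep.num = 20, subsample.ratio = 0.8, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)   # a fixed seed makes the draws repeatable
  b <- floor(subsample.ratio * n)      # subsample size
  replicate(rep.num, sample.int(n, b), simplify = FALSE)
}

toy <- matrix(rnorm(50 * 4), nrow = 50)   # toy data: 50 samples, 4 variables
idx <- draw_subsamples(nrow(toy), rep.num = 20, subsample.ratio = 0.8, seed = 1)

length(idx)        # one index set per subsample
length(idx[[1]])   # rows in each subsample
sub1 <- toy[idx[[1]], , drop = FALSE]   # the data one estimation job would see
```

Calling the helper twice with the same seed returns identical index sets, which is the behaviour the seed argument is documented to provide for testing.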
Value
An S3 object of class batch.pulsar with a named member for each stability criterion/metric. Within each of these are:
summary: the summary criterion over rep.num graphs at each value of lambda
criterion: the stability metric
merge: the raw criterion merged over the rep.num graphs (constructed from rep.num subsamples), prior to summarization
opt.ind: index (along the path) of optimal lambda selected by the criterion at the desired threshold. Will return if no optimum is found or NULL if selection for the criterion is not implemented.
If stars is included as a criterion then additional arguments include
lb.index: the lambda index of the lower bound at samples if lb.stars flag is set to TRUE
ub.index: the lambda index of the upper bound at samples if ub.stars flag is set to TRUE
reg: Registry object. See batchtools::makeRegistry
id: Identifier for mapping graph estimation function. See batchtools::batchMap
call: the original function call
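To make the merge, summary, and opt.ind members concrete for the stars criterion, here is a minimal base-R sketch that follows the edge-instability definition in Liu et al. (2010). It is an illustration, not pulsar's implementation: the toy graphs, the helper names, the threshold of 0.2, and the selection rule (last index whose monotonized summary is still at or below the threshold, on a path ordered from sparse to dense) are all assumptions.

```r
# Toy setup (hypothetical): 3 nodes, 4 subsamples, 3 lambda values ordered
# from sparse to dense. graphs[[k]][[r]] is the adjacency matrix estimated
# at lambda k on subsample r.
adj <- function(e12, e13, e23) {
  m <- matrix(0, 3, 3)
  m[1, 2] <- m[2, 1] <- e12
  m[1, 3] <- m[3, 1] <- e13
  m[2, 3] <- m[3, 2] <- e23
  m
}
graphs <- list(
  list(adj(0, 0, 0), adj(0, 0, 0), adj(0, 0, 0), adj(0, 0, 0)),  # empty graphs
  list(adj(1, 0, 0), adj(1, 0, 0), adj(1, 0, 0), adj(1, 1, 0)),  # one edge wavers
  list(adj(1, 1, 0), adj(1, 0, 1), adj(1, 1, 0), adj(1, 0, 1))   # two edges at 50%
)

# merge: edge selection frequency across the subsample graphs at each lambda
stars_merge <- lapply(graphs, function(g) Reduce(`+`, g) / length(g))

# summary: mean over edges of the instability 2 * theta * (1 - theta)
edge_instability <- function(theta) {
  ut <- theta[upper.tri(theta)]
  mean(2 * ut * (1 - ut))
}
stars_summary <- vapply(stars_merge, edge_instability, numeric(1))

# Monotonize along the path, then keep the last index at or below the threshold.
thresh  <- 0.2                        # looser than the default 0.1, for the toy data
mono    <- cummax(stars_summary)
opt.ind <- max(sum(mono <= thresh), 1)

stars_summary   # 0, 0.125, 0.333...
opt.ind         # 2
```

With only four subsamples the smallest non-zero summary is 0.125, which is why the sketch uses a threshold above the documented default of 0.1.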
References
Müller, C. L., Bonneau, R., & Kurtz, Z. (2016). Generalized Stability Approach for Regularized Graphical Models. arXiv https://arxiv.org/abs/1605.07072
Liu, H., Roeder, K., & Wasserman, L. (2010). Stability approach to regularization selection (stars) for high dimensional graphical models. Proceedings of the Twenty-Third Annual Conference on Neural Information Processing Systems (NIPS).
Zhao, T., Liu, H., Roeder, K., Lafferty, J., & Wasserman, L. (2012). The huge Package for High-dimensional Undirected Graph Estimation in R. The Journal of Machine Learning Research, 13, 1059–1062.
Lang, M., Bischl, B., & Surmann, D. (2017). batchtools: Tools for R to work on batch systems. The Journal of Open Source Software, 2(10). https://doi.org/10.21105/joss.00135
Examples
## Not run:
## Generate the data with huge:
library(huge)
library(pulsar)  # provides batch.pulsar, getLamPath, findConfFile, findTemplateFile
set.seed(10010)
p <- 400 ; n <- 1200
dat <- huge.generator(n, p, "hub", verbose=FALSE, v=.1, u=.3)
lams <- getLamPath(.2, .01, len=40)
hugeargs <- list(lambda=lams, verbose=FALSE)
## Run batch.pulsar using snow on 5 cores, and show progress.
options(mc.cores=5)
options(batchtools.progress=TRUE, batchtools.verbose=FALSE)
out <- batch.pulsar(dat$data, fun=huge::huge, fargs=hugeargs,
                    rep.num=20, criterion='stars', conffile='snow')
## Run batch.pulsar on a Torque cluster
## Give each job 1 GB of memory and a limit of 30 minutes
resources <- list(mem="1GB", nodes="1", walltime="00:30:00")
out.p <- batch.pulsar(dat$data, fun=huge::huge, fargs=hugeargs,
                      rep.num=100, criterion=c('stars', 'gcd'), conffile='torque',
                      job.res=resources, regdir=file.path(getwd(), "testtorq"))
plot(out.p)
## take a look at the default torque config and template files we just used
file.show(findConfFile('torque'))
file.show(findTemplateFile('simpletorque'))
## End(Not run)
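Once a run finishes, the selected regularization value can be read from the returned object using the members listed under Value. The sketch below uses a hand-built stand-in list rather than a real batch.pulsar result, so it runs without a cluster. The stand-in's numbers, the linearly spaced path, and the helper name selected_lambda are invented for illustration; only the member names (summary, opt.ind) and the criterion names (stars, gcd) come from this page.

```r
# Stand-in for a batch.pulsar result (hypothetical values; a real object also
# carries criterion, merge, reg, id and call members).
lams <- seq(0.2, 0.01, length.out = 40)   # assumed decreasing lambda path
mock <- list(
  stars = list(summary = seq(0, 0.3, length.out = 40), opt.ind = 12),
  gcd   = list(summary = runif(40), opt.ind = NULL)   # no automatic selection
)

# Hypothetical helper: map a criterion's opt.ind back to a lambda value.
selected_lambda <- function(res, criterion, path) {
  ind <- res[[criterion]]$opt.ind
  if (is.null(ind)) return(NA_real_)   # selection not implemented for this criterion
  path[ind]
}

selected_lambda(mock, "stars", lams)   # lambda at index 12 of the path
selected_lambda(mock, "gcd", lams)     # NA: gcd has no opt.ind
```

With a real result the same lookup would use the lambda vector passed in fargs (lams in the example above).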