eval_tibbles {simTool} | R Documentation |
Workhorse for simulation studies
Description
Generates data according to all provided constellations in data_grid
and applies all provided constellations in proc_grid to them.
Usage
eval_tibbles(
  data_grid,
  proc_grid = expand_tibble(proc = "length"),
  replications = 1,
  discard_generated_data = FALSE,
  post_analyze = identity,
  summary_fun = NULL,
  group_for_summary = NULL,
  ncpus = 1L,
  cluster = NULL,
  cluster_seed = rep(12345, 6),
  cluster_libraries = NULL,
  cluster_global_objects = NULL,
  envir = globalenv(),
  simplify = TRUE
)
Arguments
data_grid |
a |
proc_grid |
similar as |
replications |
number of replications for the simulation |
discard_generated_data |
if |
post_analyze |
a convenience function that is applied
directly after the data-analyzing function. If this function has an
argument |
summary_fun |
named list of univariate functions used to summarize the results (numeric or logical) over the replications, e.g. list(mean = mean, sd = sd). |
group_for_summary |
if the result returned by the data analyzing
function or |
ncpus |
a cluster of |
cluster |
a cluster generated by the |
cluster_seed |
if the simulation is run in parallel,
then the combined multiple-recursive generator from L'Ecuyer (1999)
is used to generate random numbers. Thus |
cluster_libraries |
a character vector specifying the packages that should be loaded by the workers. |
cluster_global_objects |
a character vector specifying the names of R objects in the global environment that should be exported to the global environment of every worker. |
envir |
must be provided if the functions specified
in |
simplify |
the result column is usually nested; by default, an attempt is made to unnest it. |
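As background for cluster_seed, the following is a minimal sketch of how L'Ecuyer-CMRG random number streams make parallel draws reproducible. It uses only the base parallel package; whether eval_tibbles relies on exactly these calls internally is an assumption, and the single-integer seed and the names cl, first and second are illustrative only.

```r
library(parallel)

# Hypothetical two-worker cluster, used only for this illustration.
cl <- makeCluster(2L)

# Give every worker its own L'Ecuyer-CMRG stream, then draw some numbers.
clusterSetRNGStream(cl, iseed = 12345L)
first <- parLapply(cl, 1:2, function(i) rnorm(3))

# Re-seeding the same cluster in the same way restarts the streams.
clusterSetRNGStream(cl, iseed = 12345L)
second <- parLapply(cl, 1:2, function(i) rnorm(3))

stopCluster(cl)

# Compare the two sets of draws.
identical(first, second)
```

Fixing the seed before a parallel run is what allows a simulation to be repeated with the same random numbers on the same number of workers.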
Value
The returned object is a list of class eval_tibbles, where the element
simulation contains the results of the simulation.
Note
If cluster is provided by the user, the function eval_tibbles
will NOT stop the cluster. This has to be done by the user.
Conducting parallel simulations by specifying ncpus will internally
create a cluster and stop it after the simulation is done.
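The note above can be sketched as follows, assuming the cluster is created with parallel::makeCluster(). The variable names cl and res and the chosen grids are illustrative only; the point is that stopping a user-supplied cluster remains the caller's responsibility.

```r
library(simTool)

# Create the cluster by hand; eval_tibbles() will not stop it.
cl <- parallel::makeCluster(2L)

dg <- expand_tibble(fun = "rnorm", n = c(5L, 10L))
pg <- expand_tibble(proc = c("median", "length"))

res <- tryCatch(
  eval_tibbles(dg, pg, replications = 2, cluster = cl),
  # Stop the cluster even if the simulation fails.
  finally = parallel::stopCluster(cl)
)
res
```

Wrapping the call in tryCatch(..., finally = ...) is one possible way to make sure the workers are shut down; calling parallel::stopCluster(cl) after the simulation works as well.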
Author(s)
Marsel Scheer
Examples
rng <- function(data, ...) {
  ret <- range(data)
  names(ret) <- c("min", "max")
  ret
}
### The following line is only necessary
### if the examples are not executed in the global
### environment, which for instance is the case when
### the online documentation
### http://marselscheer.github.io/simTool/reference/eval_tibbles.html
### is built. In such a case eval_tibbles() would search for the
### function rng() defined above in the global environment, where
### it does not exist!
eval_tibbles <- purrr::partial(eval_tibbles, envir = environment())
dg <- expand_tibble(fun = "rnorm", n = c(5L, 10L))
pg <- expand_tibble(proc = c("rng", "median", "length"))
eval_tibbles(dg, pg, rep = 2, simplify = FALSE)
eval_tibbles(dg, pg, rep = 2)
eval_tibbles(dg, pg,
  rep = 2,
  post_analyze = purrr::compose(as.data.frame, t)
)
eval_tibbles(dg, pg, rep = 2, summary_fun = list(mean = mean, sd = sd))
regData <- function(n, SD) {
  data.frame(
    x = seq(0, 1, length = n),
    y = rnorm(n, sd = SD)
  )
}
eg <- eval_tibbles(
  expand_tibble(fun = "regData", n = 5L, SD = 1:2),
  expand_tibble(proc = "lm", formula = c("y~x", "y~I(x^2)")),
  replications = 3
)
eg
# Keep the row names of a coefficient matrix as a regular column
preserve_rownames <- function(mat) {
  rn <- rownames(mat)
  ret <- tibble::as_tibble(mat)
  ret$term <- rn
  ret
}
eg <- eval_tibbles(
  expand_tibble(fun = "regData", n = 5L, SD = 1:2),
  expand_tibble(proc = "lm", formula = c("y~x", "y~I(x^2)")),
  post_analyze = purrr::compose(preserve_rownames, coef, summary),
  # post_analyze = broom::tidy, # is a nice out of the box alternative
  summary_fun = list(mean = mean, sd = sd),
  group_for_summary = "term",
  replications = 3
)
eg$simulation
dg <- expand_tibble(fun = "rexp", rate = c(10, 100), n = c(50L, 100L))
pg <- expand_tibble(proc = c("t.test"), conf.level = c(0.8, 0.9, 0.95))
et <- eval_tibbles(dg, pg,
  ncpus = 1,
  replications = 10^1,
  post_analyze = function(ttest, .truth) {
    mu <- 1 / .truth$rate
    ttest$conf.int[1] <= mu && mu <= ttest$conf.int[2]
  },
  summary_fun = list(mean = mean, sd = sd)
)
et
dg <- dplyr::bind_rows(
  expand_tibble(fun = "rexp", rate = 10, .truth = 1 / 10, n = c(50L, 100L)),
  expand_tibble(fun = "rnorm", .truth = 0, n = c(50L, 100L))
)
pg <- expand_tibble(proc = c("t.test"), conf.level = c(0.8, 0.9, 0.95))
et <- eval_tibbles(dg, pg,
  ncpus = 1,
  replications = 10^1,
  post_analyze = function(ttest, .truth) {
    ttest$conf.int[1] <= .truth && .truth <= ttest$conf.int[2]
  },
  summary_fun = list(mean = mean, sd = sd)
)
et
### Remove the locally adapted eval_tibbles();
### otherwise it would keep masking
### eval_tibbles() from the simTool namespace.
rm(eval_tibbles)