futuremice {mice} | R Documentation |
Wrapper function that runs MICE in parallel
Description
This is a wrapper function for mice that uses multiple cores to execute
mice in parallel, which can speed up the imputation procedure. By default,
futuremice distributes the number of imputations m about equally over the
cores.
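To make "about equally" concrete, the following base R sketch shows one way to split m imputations over a number of workers. The helper split_imputations is hypothetical and is not part of mice; the internal allocation used by futuremice may differ in detail (for example, in which worker receives the remainder). Capping the number of workers at m is likewise an assumption of this sketch.

```r
# Hypothetical helper (not part of mice): split m imputations over n_core
# workers so that worker loads differ by at most one imputation.
split_imputations <- function(m, n_core) {
  n_core <- min(n_core, m)   # assumption: never use more workers than imputations
  base <- m %/% n_core       # imputations every worker gets
  extra <- m %% n_core       # remainder, handed out one each to the first workers
  rep(base, n_core) + (seq_len(n_core) <= extra)
}

split_imputations(150, 3)  # 50 50 50
split_imputations(5, 3)    # 2 2 1
```

The per-worker counts always sum to m, so no imputation is lost or duplicated by the split.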
Usage
futuremice(
  data,
  m = 5,
  parallelseed = NA,
  n.core = NULL,
  seed = NA,
  use.logical = TRUE,
  future.plan = "multisession",
  packages = NULL,
  globals = NULL,
  ...
)
Arguments
data |
A data frame or matrix containing the incomplete data. Similar to
the first argument of mice. |
m |
The number of desired imputed datasets. By default m = 5, as with
mice. |
parallelseed |
A scalar to be used to obtain reproducible results over
the futures. With the default, parallelseed = NA, a seed value is drawn at random. |
n.core |
A scalar indicating the number of cores that should be used. |
seed |
A scalar to be used as the seed value for the mice algorithm
within each parallel stream. Please note that the imputations will be the
same for all streams and, hence, this should be used if and only if
n.core = 1 and if it is desired to obtain the same output as under mice. |
use.logical |
A logical indicating whether logical (TRUE) or physical (FALSE) CPUs on the machine should be used. |
future.plan |
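A character indicating how futures are resolved. The default, "multisession", resolves futures in parallel in separate background R sessions. See plan for more information on future plans. |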
packages |
A character vector with additional packages to be used in
mice (e.g., for using external imputation functions). |
globals |
A character string with additional functions to be exported to each future (e.g., user-written imputation functions). |
... |
Named arguments that are passed down to function mice. |
Details
This function relies on package furrr, which is a package for R versions
3.2.0 and later. We have chosen to use furrr function future_map to allow
the use of futuremice on Mac, Linux and Windows systems.
This wrapper function combines the output of future_map with function
ibind from the mice package. A mids object is returned and can be used for
further analyses.
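The pattern described above (run mice once per worker with future_map, then merge the results with ibind) can be sketched by hand as follows. This is an illustration under stated assumptions, not the source of futuremice: it assumes the packages mice, furrr and future are installed, and the chunk sizes, the worker count and the use of Reduce are choices made for this example only.

```r
library(mice)
library(furrr)
library(future)

# Assumption for this sketch: two background R sessions.
plan(multisession, workers = 2)

# Assumption: m = 5 imputations split by hand into chunks of 3 and 2.
chunk_sizes <- c(3, 2)

# One mice() call per chunk; each call returns its own mids object.
imps <- future_map(
  chunk_sizes,
  function(k) mice::mice(mice::nhanes, m = k, printFlag = FALSE),
  .options = furrr_options(seed = TRUE, packages = "mice")
)

# Merge the per-chunk mids objects pairwise into a single mids object.
imp <- Reduce(ibind, imps)

plan(sequential)  # return to sequential processing

class(imp)
imp$m
```

The merged object can be passed to with() and pool() in the same way as any other mids object.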
A seed value can be specified in the global environment, which will yield
reproducible results. A seed value can also be specified within the
futuremice call, by setting the argument parallelseed. If parallelseed is
not specified, a seed value is drawn randomly by default and is accessible
through $parallelseed in the output object. Hence, results will always be
reproducible, regardless of whether the seed is specified in the global
environment, or by setting the same seed within the function (potentially
by extracting the seed from the futuremice output object).
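The two routes to reproducibility described above can be checked along the following lines. This is a sketch that assumes mice and furrr are installed and that at least two cores are available; the values m = 4, n.core = 2 and the seed 123 are arbitrary choices for illustration. According to the description above, both comparisons should return TRUE.

```r
library(mice)

# Route 1: pass the same parallelseed to two separate calls.
imp_a <- futuremice(nhanes, m = 4, n.core = 2, parallelseed = 123)
imp_b <- futuremice(nhanes, m = 4, n.core = 2, parallelseed = 123)
identical(complete(imp_a, 1), complete(imp_b, 1))

# Route 2: let futuremice draw a seed, then reuse the value stored in the
# output object to reproduce that run.
imp_c <- futuremice(nhanes, m = 4, n.core = 2)
imp_d <- futuremice(nhanes, m = 4, n.core = 2,
                    parallelseed = imp_c$parallelseed)
identical(complete(imp_c, 1), complete(imp_d, 1))
```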
Value
A mids object as defined by mids-class
Author(s)
Thom Benjamin Volker, Gerko Vink
References
Volker, T.B. and Vink, G. (2022). futuremice: The future starts today. https://www.gerkovink.com/miceVignettes/futuremice/Vignette_futuremice.html
Van Buuren, S. (2018). Flexible Imputation of Missing Data. Second Edition. Chapman & Hall/CRC. Boca Raton, FL.
See Also
future, furrr, future_map, plan, mice, mids-class
Examples
# 150 imputations in dataset nhanes, performed by 3 cores
## Not run:
imp1 <- futuremice(data = nhanes, m = 150, n.core = 3)
# Making use of arguments in mice.
imp2 <- futuremice(data = nhanes, m = 100, method = "norm.nob")
imp2$method
fit <- with(imp2, lm(bmi ~ hyp))
pool(fit)
## End(Not run)