mice.impute.rfpred.emp {RfEmpImp} | R Documentation |
Univariate sampler function for continuous variables using the empirical error distributions
Description
Please note that functions with names starting with "mice.impute" are exported so that they are visible to the mice sampler functions. Please do not call these functions directly unless you know exactly what you are doing.
For continuous variables only.
This function implements the RfPred.Emp multiple imputation method as an adapter for mice samplers. In the mice() function, set method = "rfpred.emp" to call it.
The function performs multiple imputation based on the empirical distribution of out-of-bag prediction errors of random forests.
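The idea in the sentence above can be illustrated with a short sketch. This is not the package's implementation: it assumes the ranger package as the random forest backend, the helper name impute_oob_emp_sketch is hypothetical, and the options controlled by sym.dist, alpha.emp, and pre.boot are left out. It fits a forest on the observed cases, computes the out-of-bag prediction errors, and adds randomly drawn errors to the predictions for the missing cases.

```r
library(ranger)

# Hypothetical helper (not part of RfEmpImp): impute the missing entries of a
# continuous y by adding resampled out-of-bag (OOB) errors to forest predictions.
impute_oob_emp_sketch <- function(y, ry, x, num.trees = 10) {
  x <- as.data.frame(x)
  obs <- data.frame(y_obs = y[ry], x[ry, , drop = FALSE])
  mis <- x[!ry, , drop = FALSE]

  fit <- ranger(y_obs ~ ., data = obs, num.trees = num.trees)

  # For a regression forest, fit$predictions holds the OOB predictions.
  oob_err <- obs$y_obs - fit$predictions
  # With few trees, a row may never be out-of-bag; drop non-finite errors.
  oob_err <- oob_err[is.finite(oob_err)]

  # Prediction for each missing case plus a randomly drawn empirical error.
  pred_mis <- predict(fit, data = mis)$predictions
  pred_mis + sample(oob_err, size = length(pred_mis), replace = TRUE)
}

set.seed(1)
aq <- airquality[, c("Ozone", "Wind", "Temp")]
ry <- !is.na(aq$Ozone)
imp <- impute_oob_emp_sketch(aq$Ozone, ry, aq[, c("Wind", "Temp")])
length(imp) == sum(!ry)
```

Drawing errors with replacement is what makes repeated calls produce different imputations, which multiple imputation relies on.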
Usage
mice.impute.rfpred.emp(
  y,
  ry,
  x,
  wy = NULL,
  num.trees.cont = 10,
  sym.dist = TRUE,
  alpha.emp = 0,
  pre.boot = TRUE,
  num.threads = NULL,
  ...
)
Arguments
y |
Vector to be imputed. |
ry |
Logical vector of length |
x |
Numeric design matrix with |
wy |
Logical vector of length |
num.trees.cont |
Number of trees to build for continuous variables.
The default is |
sym.dist |
If |
alpha.emp |
The "significance level" for the empirical distribution of
out-of-bag prediction errors. It can be used to guard against outliers
(useful for highly skewed variables).
For example, set alpha.emp = 0.05 to use a 95% confidence level.
The default is |
pre.boot |
If |
num.threads |
Number of threads for parallel computing. The default is
|
... |
Other arguments to pass down. |
num.trees |
Number of trees to build. The default is
|
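One plausible reading of alpha.emp, sketched below in base R, is that only the central (1 - alpha.emp) share of the out-of-bag errors is kept before sampling. This is an assumption for illustration and may differ from the package's actual code; the helper name trim_errors_sketch is hypothetical.

```r
# Hypothetical helper (not part of RfEmpImp): keep the central
# (1 - alpha.emp) share of the empirical errors, dropping both tails.
trim_errors_sketch <- function(oob_err, alpha.emp = 0) {
  if (alpha.emp > 0 && alpha.emp < 1) {
    lim <- quantile(oob_err,
                    probs = c(alpha.emp / 2, 1 - alpha.emp / 2),
                    names = FALSE)
    oob_err <- oob_err[oob_err >= lim[1] & oob_err <= lim[2]]
  }
  oob_err
}

set.seed(1)
err <- rexp(200) - 1                     # skewed toy errors
pool_all  <- trim_errors_sketch(err)     # alpha.emp = 0 keeps everything
pool_trim <- trim_errors_sketch(err, alpha.emp = 0.05)
range(pool_all)
range(pool_trim)
```

Under this reading, a larger alpha.emp narrows the range of errors that can be added to a prediction, which limits extreme imputed values for skewed variables.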
Details
RfPred.Emp imputation sampler.
Value
Vector with imputed data, same type as y, and of length sum(wy).
Author(s)
Shangzhi Hong
References
Hong, Shangzhi, et al. "Multiple imputation using chained random forests." Preprint, submitted April 30, 2020. https://arxiv.org/abs/2004.14823.
Zhang, Haozhe, et al. "Random Forest Prediction Intervals." The American Statistician (2019): 1-20.
Shah, Anoop D., et al. "Comparison of random forest and parametric imputation models for imputing missing data using MICE: a CALIBER study." American journal of epidemiology 179.6 (2014): 764-774.
Malley, James D., et al. "Probability machines." Methods of information in medicine 51.01 (2012): 74-81.
Examples
# Users can set method = "rfpred.emp" in the call to mice to use this method
library(mice)
library(RfEmpImp)
data("airquality")
impObj <- mice(airquality, method = "rfpred.emp", m = 5,
               maxit = 5, maxcor = 1.0, eps = 0,
               remove.collinear = FALSE, remove.constant = FALSE,
               printFlag = FALSE)
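As a possible follow-up, the imputed data can be inspected and analysed with the usual mice workflow. The sketch below uses the mice functions complete(), with(), and pool(); the seed and the analysis model Ozone ~ Wind + Temp are arbitrary choices made for illustration only.

```r
library(mice)
library(RfEmpImp)

set.seed(2020)
impObj <- mice(airquality, method = "rfpred.emp", m = 5,
               maxit = 5, printFlag = FALSE)

# First completed data set, with the imputed values filled in
completed1 <- complete(impObj, 1)
anyNA(completed1)

# Fit the same model on each imputed data set, then pool the estimates
fits <- with(impObj, lm(Ozone ~ Wind + Temp))
summary(pool(fits))
```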