mlim.preimpute {mlim}    R Documentation

Carries out preimputation

Description

Instead of replacing missing data with the mean and mode, a smarter starting point is to use a fast imputation algorithm and then optimize the imputed dataset with mlim. This procedure usually requires fewer iterations and saves considerable computational resources.

Usage

mlim.preimpute(data, preimpute = "RF", seed = NULL)

Arguments

data

data.frame with missing values

preimpute

character. Specifies the algorithm for preimputation. The supported options are "RF" (random forest), "mm" (mean/mode replacement), and "random" (random sampling from the available data). The default is "RF", which carries out a parallel random forest imputation using all available CPUs.

seed

integer. Specifies the seed for the random number generator.
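
To make the two simpler preimpute options more concrete, the following base-R sketch shows what mean/mode replacement ("mm") and random sampling ("random") do conceptually. It is an illustration only and is not the code mlim uses internally; the helper impute_column() and the toy data frame are hypothetical names introduced here.

```r
# Illustrative sketch only: impute_column() is a hypothetical helper,
# not a function exported by mlim.
impute_column <- function(x, method = c("mm", "random")) {
  method <- match.arg(method)
  miss <- is.na(x)
  # Nothing to impute, or no observed values to learn from
  if (!any(miss) || all(miss)) return(x)
  observed <- x[!miss]
  if (method == "random") {
    # Draw replacements (with replacement) from the observed values
    idx <- sample.int(length(observed), sum(miss), replace = TRUE)
    x[miss] <- observed[idx]
  } else if (is.numeric(x)) {
    # Numeric column: replace with the mean of the observed values
    x[miss] <- mean(observed)
  } else {
    # Other columns: replace with the most frequent observed value (mode)
    ux <- unique(observed)
    x[miss] <- ux[which.max(tabulate(match(observed, ux)))]
  }
  x
}

# Toy data with one missing value per column
df <- data.frame(
  age   = c(21, NA, 35, 40),
  group = factor(c("a", "b", NA, "b"))
)

# Mean/mode replacement
mm <- df
mm[] <- lapply(df, impute_column, method = "mm")

# Random sampling from the available data, made reproducible with a seed
set.seed(2022)
rnd <- df
rnd[] <- lapply(df, impute_column, method = "random")
```

Both strategies are cheap but ignore the relationships between variables, which is why the random forest default is usually the better starting point for mlim.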

Value

imputed data.frame

Author(s)

E. F. Haghish

Examples

## Not run: 
data(iris)

# add 10% stratified missing values to one factor variable
irisNA <- iris
irisNA$Species <- mlim.na(irisNA$Species, p = 0.1, stratify = TRUE, seed = 2022)

# run the default random forest preimputation
MLIM <- mlim.preimpute(irisNA)
mlim.error(MLIM, irisNA, iris)

## End(Not run)

[Package mlim version 0.3.0 Index]