R: General Imputation Framework in R

impute {imputeR}

R Documentation

General Imputation Framework in R

Description

Impute missing values under the general framework in R

Usage

impute(missdata, lmFun = NULL, cFun = NULL, ini = NULL,
  maxiter = 100, verbose = TRUE, conv = TRUE)

Arguments

`missdata`	data matrix with missing values encoded as NA.
`lmFun`	the variable selection method for continuous data.
`cFun`	the variable selection method for categorical data.
`ini`	the method for initialisation. It is a length-one character vector if missdata contains only one type of variable. For continuous-only data, ini can be "mean" (mean imputation), "median" (median imputation) or "random" (random guess); the default is "mean". For categorical-only data, it can be either "majority" or "random"; the default is "majority". If missdata mixes continuous and categorical data, ini must be a character vector of length two, with the first element giving the method for continuous variables and the second the method for categorical variables; the default is c("mean", "majority").
`maxiter`	the maximum number of iterations.
`verbose`	logical; if TRUE, detailed information is printed to the console while running.
`conv`	logical, if TRUE, the convergence details will be returned
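
To make the `ini` options concrete, the following base R sketch shows what a mean, median, majority or random starting fill could look like for a single column. The helper `init_fill` is hypothetical and is not part of imputeR; it only illustrates the idea of an initial guess, and the package's own initialisation may differ in detail.

```r
# Hypothetical helper (not part of imputeR): replace the NAs in one
# column with a simple starting guess.
init_fill <- function(x, method = c("mean", "median", "majority", "random")) {
  method <- match.arg(method)
  miss <- is.na(x)
  if (!any(miss)) return(x)
  obs <- x[!miss]
  x[miss] <- switch(method,
    mean     = mean(obs),                      # continuous data
    median   = stats::median(obs),             # continuous data
    majority = names(which.max(table(obs))),   # categorical data: most frequent level
    random   = obs[sample.int(length(obs), sum(miss), replace = TRUE)]
  )
  x
}

x <- c(2, NA, 4, 6)
init_fill(x, "mean")        # the NA is replaced by the observed mean, 4

g <- factor(c("a", "b", "a", NA))
init_fill(g, "majority")    # the NA is replaced by the most frequent level, "a"
```

For mixed data, the two-element form of `ini` corresponds to applying one such rule to the continuous columns and another to the categorical columns.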

Details

This function can impute several kinds of data, including continuous-only data, categorical-only data and mixed-type data. Many methods can be used, including regularisation methods such as LASSO and ridge regression, tree-based models, and dimensionality reduction methods such as PCA and PLS.
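
This page does not spell out the algorithm, but the `ini`, `maxiter` and `conv` arguments suggest an iterative scheme: start from an initial guess, then repeatedly re-predict the missing cells of each variable from the other variables. The base R sketch below illustrates that general idea for numeric data only, with ordinary least squares (`lm`) standing in for LASSO or any other method. The function name `iter_impute`, the `tol` argument and the stopping rule are assumptions made for illustration; this is not the package's implementation.

```r
# Hypothetical illustration (not imputeR's code): iterative regression
# imputation for a numeric matrix, using lm() as the regression method.
iter_impute <- function(X, maxiter = 100, tol = 1e-6) {
  X <- as.matrix(X)
  miss <- is.na(X)
  # Initialise every missing cell with its column mean
  for (j in seq_len(ncol(X))) {
    X[miss[, j], j] <- mean(X[, j], na.rm = TRUE)
  }
  iter <- 0
  for (iter in seq_len(maxiter)) {
    X_old <- X
    # Sweep over the columns that had missing values
    for (j in which(colSums(miss) > 0)) {
      rows <- miss[, j]
      df <- data.frame(y = X[, j], X[, -j, drop = FALSE])
      # Fit on the rows where column j was observed ...
      fit <- stats::lm(y ~ ., data = df[!rows, , drop = FALSE])
      # ... and re-predict the rows where it was missing
      X[rows, j] <- stats::predict(fit, newdata = df[rows, , drop = FALSE])
    }
    # Stop when the imputed cells barely change between sweeps
    if (sum((X[miss] - X_old[miss])^2) < tol) break
  }
  list(imp = X, iterations = iter)
}

set.seed(1)
a <- rnorm(50)
X <- cbind(a = a, b = 2 * a + rnorm(50, sd = 0.1), c = rnorm(50))
X[sample(50, 5), "b"] <- NA
res <- iter_impute(X)
anyNA(res$imp)   # FALSE: every missing cell has been filled
```

Swapping `lm` for a penalised or tree-based learner changes only the fit and predict steps, which is presumably the role the `lmFun` and `cFun` arguments play in `impute`.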

Value

If conv = FALSE, it returns a completed data matrix with no missing values; if TRUE, it returns a list with the following components:

`imp`	the imputed data matrix with no missing values
`conv`	the convergence status during the imputation
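
Because the shape of the return value depends on `conv`, downstream code may want to handle both cases uniformly. The small helper below is a sketch of one way to do that; `get_imputed` is a hypothetical name and is not part of imputeR.

```r
# Hypothetical helper (not part of imputeR): return the completed matrix
# whether the result is a list (conv = TRUE) or a bare matrix (conv = FALSE).
get_imputed <- function(res) {
  if (is.list(res) && !is.null(res$imp)) res$imp else res
}

# Mock results standing in for the two return shapes described above
m <- matrix(c(1, 2, 3, 4), nrow = 2)
as_list   <- list(imp = m, conv = c(0.5, 0.1))
as_matrix <- m

identical(get_imputed(as_list), get_imputed(as_matrix))   # TRUE
```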

Examples

library(imputeR)
data(parkinson)
# introduce 10% random missing values into the parkinson data
missdata <- SimIm(parkinson, 0.1)
# impute the missing values by LASSO
impdata <- impute(missdata, lmFun = "lassoR")
# calculate the normalised RMSE for the imputation
Rmse(impdata$imp, missdata, parkinson, norm = TRUE)
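
For readers who want to see what a normalised RMSE measures, here is a base R sketch of one common definition: the mean squared error over the originally missing cells, divided by the variance of the true values in those cells, then square-rooted. The function `nrmse_sketch` is hypothetical, and whether `Rmse(..., norm = TRUE)` uses exactly this normalisation is an assumption here, not something this page states.

```r
# Hypothetical sketch of a normalised RMSE (not imputeR's Rmse):
#   imp  - imputed matrix
#   mis  - matrix with NAs marking the cells that were imputed
#   true - original complete matrix
nrmse_sketch <- function(imp, mis, true) {
  idx <- is.na(mis)   # only the imputed cells are scored
  sqrt(mean((imp[idx] - true[idx])^2) / stats::var(true[idx]))
}

true <- matrix(c(1, 2, 3, 4), nrow = 2)
mis  <- true
mis[c(1, 4)] <- NA
nrmse_sketch(true, mis, true)   # 0 when the imputation is perfect
```

Values near 0 indicate imputations close to the truth, while values near 1 indicate errors comparable to the spread of the true values.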

[Package imputeR version 2.2 Index]