seqKNNimp {multiUS} | R Documentation
Sequential KNN imputation method
Description
This function imputes missing values sequentially, starting with the units that have the lowest missing rate, using the weighted mean of the k nearest neighbours.
Usage
seqKNNimp(data, k = 10)
Arguments
data | A data frame with the data set.
k | The number of nearest neighbours to use (defaults to 10).
Details
The function separates the data set into an incomplete set (units with missing values) and a complete set (units without missing values). The units in the incomplete set are imputed in increasing order of their number of missing values. Each missing value is filled with the weighted mean of the corresponding column over the nearest neighbour units in the complete set. Once all missing values of a given unit have been imputed, the unit is moved into the complete set and is used to impute the remaining units in the incomplete set. In this process, all missing values of one unit are imputed simultaneously from the same selected neighbour units in the complete set. This reduces execution time compared with the previously developed KNN method, which selects nearest neighbours separately for each imputed value.
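The procedure described above can be sketched in a few lines of R. This is an illustrative sketch only, not the multiUS or SeqKNN source code: the function name seq_knn_sketch is hypothetical, and the choice of Euclidean distance over the observed columns and of inverse-distance weights is an assumption, since this page does not state which distance or weights are used. The sketch also assumes all-numeric data and at least one complete unit.

```r
# Illustrative sketch of sequential KNN imputation; NOT the package source.
# Assumptions (not stated on this page): numeric data, Euclidean distance
# on the columns observed in the incomplete unit, inverse-distance weights.
seq_knn_sketch <- function(data, k = 10) {
  x <- as.matrix(data)
  n_missing <- rowSums(is.na(x))
  has_na <- n_missing > 0

  complete <- x[!has_na, , drop = FALSE]   # complete set
  todo <- which(has_na)                    # incomplete set (row indices)
  todo <- todo[order(n_missing[todo])]     # fewest missing values first

  for (i in todo) {
    miss <- is.na(x[i, ])

    # Distance from unit i to every complete unit, observed columns only
    diffs <- sweep(complete[, !miss, drop = FALSE], 2, x[i, !miss])
    d <- sqrt(rowSums(diffs^2))

    # Select the k nearest complete units and normalise their weights
    nn <- order(d)[seq_len(min(k, nrow(complete)))]
    w <- 1 / (d[nn] + 1e-15)               # small constant avoids division by zero
    w <- w / sum(w)

    # Impute all missing columns of unit i at once from the same neighbours
    x[i, miss] <- colSums(complete[nn, miss, drop = FALSE] * w)

    # Move the now-complete unit into the complete set
    complete <- rbind(complete, x[i, ])
  }
  as.data.frame(x)
}

# Usage on a copy of mtcars with five mpg values removed
set.seed(1)
d <- mtcars
d$mpg[sample(nrow(d), size = 5)] <- NA
out <- seq_knn_sketch(d, k = 10)
```

Because neighbours are selected once per unit rather than once per missing value, the cost of the neighbour search grows with the number of incomplete units, not with the number of missing cells.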
Value
A data frame with imputed values.
Note
This function is taken from the package SeqKNN by Ki-Yeol Kim and Gwan-Su Yi.
Author(s)
Ki-Yeol Kim and Gwan-Su Yi
References
Ki-Yeol Kim, Byoung-Jin Kim, Gwan-Su Yi (26 October 2004) "Reuse of imputed data in microarray analysis increases imputation efficiency", BMC Bioinformatics 5:160.
See Also
KNNimp
Examples
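set.seed(1)  # make the random choice of rows reproducible
mtcars$mpg[sample(1:nrow(mtcars), size = 5, replace = FALSE)] <- NA  # set 5 random mpg values to NA
seqKNNimp(data = mtcars)  # impute them using the default k = 10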