impute_LS_gene {missMethods}    R Documentation

LSimpute_gene

Description

Perform LSimpute_gene as described by Bo et al. (2004).

Usage

impute_LS_gene(
  ds,
  k = 10,
  eps = 1e-06,
  min_common_obs = 5,
  return_r_max = FALSE,
  verbose = FALSE
)

Arguments

ds

A data frame or matrix with missing values.

k

Number of most correlated genes used for the imputation of a gene.

eps

Used in the calculation of the weights (Bo et al. (2004) used eps = 1e-6).

min_common_obs

A row can only take part in the imputation of another row if both rows share at least min_common_obs columns with no missing values.

return_r_max

Logical; normally, this should be FALSE. TRUE is used inside of impute_LS_adaptive() to speed up some computations.

verbose

Should messages be given for special cases (see details)?
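
To make the role of eps more concrete, the following sketch computes weights from correlations. It assumes the weight takes the form (r^2 / (1 - r^2 + eps))^2, which is how the weighting in Bo et al. (2004) is usually stated; the helper ls_weight() is hypothetical and not part of missMethods, and the internal computation of impute_LS_gene() may differ in detail.

```r
# Hypothetical helper (not exported by missMethods): weight of a candidate
# gene whose correlation with the target gene is r.
# Assumed form: (r^2 / (1 - r^2 + eps))^2
ls_weight <- function(r, eps = 1e-6) {
  (r^2 / (1 - r^2 + eps))^2
}

r <- c(0.5, 0.9, 1)  # illustrative correlations
w <- ls_weight(r)

# Normalized weights; eps keeps the weight finite even for r = 1
w / sum(w)
```

Under this assumption, eps only matters for (nearly) perfectly correlated rows, where the denominator would otherwise be zero.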

Details

This function performs LSimpute_gene as described by Bo et al. (2004). The function assumes that the genes are the rows of ds.
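
Because genes are expected in the rows, a dataset that stores genes in columns may need to be transposed first. A minimal sketch, assuming a hypothetical matrix expr with 20 samples in rows and 50 genes in columns:

```r
library(missMethods)

set.seed(1)
# Hypothetical layout: 20 samples (rows) x 50 genes (columns)
expr <- matrix(rnorm(20 * 50), nrow = 20, ncol = 50)
expr_mis <- delete_MCAR(expr, 0.1)

# Transpose so that genes become rows, impute, then transpose back
expr_imp <- t(impute_LS_gene(t(expr_mis)))
dim(expr_imp)
```

Transposing back restores the original layout, so downstream code that expects samples in rows should not need to change.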

Bo et al. (2004) seem to have chosen min_common_obs = 5. However, they did not document this behavior. This value emerged from inspecting imputation results from the original JAR file provided by Bo et al. (2004).

If a row contains at least one observed value but fewer than min_common_obs observed values, the mean of the observed row values is imputed. If no value is observed in a row, the observed column means are imputed for the missing row values. This is the only known difference between this function and the original one from Bo et al. (2004): the original function would not impute such a row and would return a dataset with missing values in this row.

One more case needs special treatment: if no suitable row can be found to impute a row, the mean of the observed values is imputed, too. If verbose = TRUE, a message will be given for the encountered instances of the described special cases. If verbose = FALSE, the function will deal with these cases silently.
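
The special cases can be provoked on purpose. The sketch below uses a made-up 30 x 10 matrix in which row 1 keeps only two observed values and row 2 keeps none; the check on row 1 simply restates the rule described above and is not taken from the package's own tests.

```r
library(missMethods)

set.seed(42)
ds <- matrix(rnorm(30 * 10), nrow = 30, ncol = 10)
ds[1, 3:10] <- NA  # 2 observed values, fewer than min_common_obs = 5
ds[2, ] <- NA      # no observed value in this row

# verbose = TRUE should report the special cases that were encountered
ds_imp <- impute_LS_gene(ds, verbose = TRUE)

# Row 1: the gaps are expected to hold the mean of its two observed values
all.equal(unname(ds_imp[1, 3:10]), rep(mean(ds[1, 1:2]), 8))

# No missing values should remain, including in the fully missing row 2
anyNA(ds_imp)
```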

Value

An object of the same class as ds with imputed missing values.

If return_r_max = TRUE, a list with the imputed dataset and r_max.
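
Both return forms can be inspected as follows. This is a sketch on simulated data; the names of the list elements are not stated here, so the example uses str() to look at the structure instead of assuming them.

```r
library(missMethods)

set.seed(123)
# Simulated data frame: 100 rows (genes), 10 columns, 10 % missing values
df_mis <- delete_MCAR(as.data.frame(matrix(rnorm(1000), nrow = 100)), 0.1)

# Default: the result has the same class as the input
df_imp <- impute_LS_gene(df_mis)
class(df_imp)

# return_r_max = TRUE: a list holding the imputed dataset and r_max
res <- impute_LS_gene(df_mis, return_r_max = TRUE)
str(res, max.level = 1)
```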

References

Bo, T. H., Dysvik, B., & Jonassen, I. (2004). LSimpute: accurate estimation of missing values in microarray data with least squares methods. Nucleic Acids Research, 32(3), e34.

See Also

Other LSimpute functions: impute_LS_adaptive(), impute_LS_array(), impute_LS_combined()

Examples

set.seed(123)
ds_mis <- delete_MCAR(mvtnorm::rmvnorm(100, rep(0, 10)), 0.1)
ds_imp <- impute_LS_gene(ds_mis)

[Package missMethods version 0.4.0 Index]