impute.svd {bcv} R Documentation

## Missing value imputation via the SVDImpute algorithm

### Description

Given a matrix with missing values, impute the missing entries using a low-rank SVD approximation estimated by the EM algorithm.

### Usage

```r
impute.svd(x, k = min(n, p), tol = max(n, p) * 1e-10, maxiter = 100)
```


### Arguments

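| Argument | Description |
|----------|-------------|
| x | a matrix whose missing entries are to be imputed. |
| k | the rank of the SVD approximation. |
| tol | the convergence tolerance for the EM algorithm. |
| maxiter | the maximum number of EM steps to take. |

In the default values, n and p denote the number of rows and columns of x.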

### Details

The missing values of x are imputed as follows. First, initialize each NA value to its column mean, or to 0 if every entry in the column is missing. Then repeat until convergence: compute the first k terms of the SVD of the completed matrix, replace the originally missing values with their approximations from the SVD, and compute the residual sum of squares (RSS) between the non-missing values and the rank-k SVD approximation.
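The steps above, together with the relative-RSS stopping rule given in this section, can be sketched in base R as follows. This is a minimal illustration written for this page, not the bcv source code: the function name impute_svd_sketch, the internal variable names, and the wording of the warning are all assumptions, and the real impute.svd may differ in details such as input checking.

```r
# Hypothetical helper (not part of bcv): base-R sketch of the EM-style loop.
impute_svd_sketch <- function(x, k = min(dim(x)),
                              tol = max(dim(x)) * 1e-10, maxiter = 100) {
  x <- as.matrix(x)
  miss <- is.na(x)

  # Initialise missing entries with column means (0 for all-missing columns).
  col_means <- colMeans(x, na.rm = TRUE)
  col_means[is.nan(col_means)] <- 0
  xhat <- x
  xhat[miss] <- col_means[col(x)[miss]]

  rss0 <- Inf
  rss1 <- NA_real_
  iter <- 0
  converged <- FALSE
  while (iter < maxiter) {
    iter <- iter + 1

    # Rank-k approximation of the completed matrix.
    s <- svd(xhat, nu = k, nv = k)
    approx <- s$u %*% (s$d[seq_len(k)] * t(s$v))

    # Refresh only the originally missing entries.
    xhat[miss] <- approx[miss]

    # RSS between the observed entries and the approximation.
    rss1 <- sum((x[!miss] - approx[!miss])^2)

    # Relative change in RSS between successive iterations.
    if (abs(rss0 - rss1) / (.Machine$double.eps + rss1) < tol) {
      converged <- TRUE
      break
    }
    rss0 <- rss1
  }
  if (!converged) warning("no convergence within maxiter iterations")

  list(x = xhat, rss = rss1, iter = iter)
}
```

Calling impute_svd_sketch(x, 1) on a matrix with NA entries returns a list with the same three components described under Value, which makes it easy to compare against impute.svd on small inputs.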

Declare convergence if abs(rss0 - rss1) / (.Machine$double.eps + rss1) < tol, where rss0 and rss1 are the RSS values computed from successive iterations. If convergence has not been reached after maxiter iterations, stop and issue a warning.

### Value

| Component | Description |
|-----------|-------------|
| x | the completed version of the matrix. |
| rss | the sum of squares between the SVD approximation and the non-missing values in x. |
| iter | the number of EM iterations performed before the algorithm stopped. |

### Author(s)

Patrick O. Perry

### References

Troyanskaya, O., Cantor, M., Sherlock, G., Brown, P., Hastie, T., Tibshirani, R., Botstein, D. and Altman, R.B. (2001). Missing value estimation methods for DNA microarrays. Bioinformatics 17(6), 520–525.

### See Also

cv.svd.wold

### Examples

```r
library(bcv)  # provides impute.svd

# Generate a matrix with missing entries
n <- 20
p <- 10
u <- rnorm(n)
v <- rnorm(p)
xfull <- u %*% rbind(v) + rnorm(n * p)
miss <- sample(seq_len(n * p), n)
x <- xfull
x[miss] <- NA

# Impute the missing entries with a rank-1 SVD approximation
xhat <- impute.svd(x, 1)$x

# Compute the prediction error for the missing entries
# (observed entries are unchanged, so only imputed entries contribute)
sum((xfull - xhat)^2)
```
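To put the prediction error in context, it can help to compare it with the error of the starting point of the algorithm, where each missing entry is simply set to its column mean. The sketch below is self-contained base R and does not call bcv. The seed and the names col_means, xbase, and baseline_err are assumptions made for illustration, and because the data are regenerated here, the value is only comparable with the impute.svd error if both are computed on the same x and xfull.

```r
# Column-mean baseline: the fill used before any SVD step is taken.
set.seed(1)  # assumed seed, for reproducibility only
n <- 20
p <- 10
xfull <- rnorm(n) %*% rbind(rnorm(p)) + rnorm(n * p)
miss <- sample(seq_len(n * p), n)
x <- xfull
x[miss] <- NA

# Column means over the observed entries; an all-missing column falls back to 0.
col_means <- colMeans(x, na.rm = TRUE)
col_means[is.nan(col_means)] <- 0

# Fill each missing entry with the mean of its column.
xbase <- x
xbase[miss] <- col_means[col(x)[miss]]

# Squared prediction error restricted to the held-out entries.
baseline_err <- sum((xfull[miss] - xbase[miss])^2)
baseline_err
```

A rank-1 imputation that is working well on this kind of data would be expected to give a smaller error than this baseline, although that is not guaranteed for every random draw.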



[Package bcv version 1.0.2 Index]