iprcomp {statVisual} | R Documentation |
Improved Function for Obtaining Principal Components
Description
Calculate principal components when data contains missing values.
Usage
iprcomp(dat, center = TRUE, scale. = FALSE)
Arguments
dat |
n by p matrix. Rows are subjects and columns are variables. May contain missing values (NA) |
center |
logical. Indicates whether the variables of dat are mean-centered before the analysis, as for the center argument of prcomp |
scale. |
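logical. Indicates whether the variables of dat are scaled to unit variance before the analysis, as for the scale. argument of prcomp |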
Details
Each missing value is first replaced by the median of the corresponding variable (column); the function prcomp is then called on the imputed data.
This is a very simple solution. Users who need something more refined can apply their own imputation method and then call prcomp directly.
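The two steps described above can be sketched in base R as follows. This is only an illustration of the idea, not the package's source code: the helper name impute_median_prcomp is hypothetical, and the sketch assumes that every column has at least one observed value (otherwise the median itself would be NA).

```r
# Hypothetical helper (not part of statVisual): median imputation, then prcomp.
impute_median_prcomp <- function(dat, center = TRUE, scale. = FALSE) {
  dat <- as.matrix(dat)
  # replace each NA with the median of the observed values in its column
  for (j in seq_len(ncol(dat))) {
    miss <- is.na(dat[, j])
    if (any(miss)) {
      dat[miss, j] <- median(dat[, j], na.rm = TRUE)
    }
  }
  # run the standard PCA on the completed matrix
  stats::prcomp(dat, center = center, scale. = scale.)
}

# small usage example with two artificial missing values
set.seed(1)
m <- matrix(rnorm(40), nrow = 10, ncol = 4)
m[2, 3] <- NA
m[7, 1] <- NA
fit <- impute_median_prcomp(m)
dim(fit$x)
```

Median imputation pulls the imputed entries toward the center of each variable, so it tends to shrink the variance somewhat; with few missing values the effect on the leading components is usually small.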
Value
A list with 3 elements:
sdev |
square roots of the eigenvalues, i.e., the standard deviations of the principal components |
rotation |
a matrix whose columns are the eigenvectors, i.e., the projection directions |
x |
a matrix whose columns are the principal components (the projected data) |
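How the three elements relate to each other can be checked on complete data with prcomp itself. The sketch below assumes that iprcomp returns these elements unchanged from prcomp, as the Details section suggests; the simulated data are arbitrary.

```r
set.seed(2)
dat <- matrix(rnorm(60), nrow = 20, ncol = 3)
fit <- prcomp(dat, center = TRUE, scale. = FALSE)

# sdev^2 matches the eigenvalues of the sample covariance matrix
ev <- eigen(cov(dat), symmetric = TRUE)$values
all.equal(fit$sdev^2, ev)

# x is the centered data projected onto the columns of rotation
centered <- scale(dat, center = TRUE, scale = FALSE)
all.equal(unname(fit$x), unname(centered %*% fit$rotation))

# the columns of rotation are orthonormal
all.equal(unname(crossprod(fit$rotation)), diag(3))
```

The sign of each column of rotation (and of the matching column of x) is arbitrary, so two runs on slightly different data may produce mirrored plots.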
Author(s)
Wenfei Zhang <Wenfei.Zhang@sanofi.com>, Weiliang Qiu <Weiliang.Qiu@sanofi.com>, Xuan Lin <Xuan.Lin@sanofi.com>, Donghui Zhang <Donghui.Zhang@sanofi.com>
Examples
library(statVisual)
# generate simulated data: two groups of 100 subjects, 5 variables each
set.seed(1234567)
dat.x = matrix(rnorm(500), nrow = 100, ncol = 5)
dat.y = matrix(rnorm(500, mean = 2), nrow = 100, ncol = 5)
dat = rbind(dat.x, dat.y)
grp = c(rep(0, 100), rep(1, 100))
print(dim(dat))
res = iprcomp(dat, center = TRUE, scale. = FALSE)
# for each row, set one artificial missing value
dat.na = dat
nr = nrow(dat.na)
nc = ncol(dat.na)
for (i in seq_len(nr)) {
  # pick one column at random and blank it out in row i
  posi = sample(x = seq_len(nc), size = 1)
  dat.na[i, posi] = NA
}
res.na = iprcomp(dat.na, center = TRUE, scale. = FALSE)
##
# pca plot
##
# two panels, one per data set
par(mfrow = c(2,1))
# original data without missing values
plot(x = res$x[,1], y = res$x[,2], xlab = "PC1", ylab = "PC2", main = "without missing values")
# perturbed data with one NA per row
# the pattern of the original data is still captured
plot(x = res.na$x[,1], y = res.na$x[,2], xlab = "PC1", ylab = "PC2", main = "with missing values")
# restore the default single-panel layout
par(mfrow = c(1,1))