do.rndproj {Rdimtools} | R Documentation |
Random Projection
Description
do.rndproj
is a linear dimensionality reduction method based on
the random projection technique, which is justified by the celebrated Johnson–Lindenstrauss lemma.
Usage
do.rndproj(
X,
ndim = 2,
preprocess = c("null", "center", "scale", "cscale", "whiten", "decorrelate"),
type = c("gaussian", "achlioptas", "sparse"),
s = max(sqrt(ncol(X)), 3)
)
Arguments
X |
an (n-by-p) data matrix whose rows are observations and whose columns are variables. |
ndim |
an integer-valued target dimension. |
preprocess |
an additional option for preprocessing the data.
Default is "null". See also |
type |
the type of random projection; one of "gaussian", "achlioptas", or "sparse". |
s |
a tuning parameter for determining values in the projection matrix. The default
is max(sqrt(ncol(X)), 3), as shown in Usage. |
Details
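The Johnson-Lindenstrauss (JL) lemma states that given $0 < \epsilon < 1$, for a set $X$
of $m$ points in $\mathbb{R}^N$ and a number $n > 8\ln(m)/\epsilon^2$,
there is a linear map $f: \mathbb{R}^N \to \mathbb{R}^n$ such that
$$(1-\epsilon)\,\|u-v\|^2 \le \|f(u)-f(v)\|^2 \le (1+\epsilon)\,\|u-v\|^2$$
for all $u, v$ in $X$.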
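As a rough illustration of the dimension requirement, the sketch below computes the smallest integer n satisfying n > 8 ln(m) / eps^2, the form of the bound commonly quoted for the JL lemma. The helper name jl_min_dim is hypothetical and is not part of Rdimtools.

```r
# Hypothetical helper (not an Rdimtools function): smallest integer n
# with n > 8 * ln(m) / eps^2 for m points and distortion level eps.
jl_min_dim <- function(m, eps) {
  stopifnot(m >= 1, eps > 0, eps < 1)
  # floor(x) + 1 is the smallest integer strictly greater than x
  floor(8 * log(m) / eps^2) + 1
}

jl_min_dim(m = 1000, eps = 0.5)   # 222

# The bound grows quickly as eps shrinks (here for m = 150 points)
sapply(c(0.1, 0.25, 0.5), function(e) jl_min_dim(150, e))
```

Note that the bound depends only on the number of points and the tolerated distortion, not on the original dimension.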
Three types of random projections are supported for a (p-by-ndim)
projection matrix $R$.
The conventional approach is to use normalized Gaussian random vectors sampled from the unit sphere
$S^{p-1}$.
Achlioptas suggested a sparse approach using samples from
$\sqrt{3}\cdot(1, 0, -1)$ with probability
$(1/6, 4/6, 1/6)$.
Li et al. proposed to sample from
$\sqrt{s}\cdot(1, 0, -1)$ with probability
$(1/(2s),\ 1-1/s,\ 1/(2s))$ for $s \ge 3$
to incorporate sparsity while attaining speedup with little loss in accuracy. While the original suggestion from the authors is to use
$\sqrt{p}$ or $p/\log(p)$ for $s$, any user-supplied
$s \ge 3$ is allowed.
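The three constructions can be sketched in base R as follows. This is only an illustration under stated assumptions: make_projection is a hypothetical helper, not the Rdimtools implementation. It uses entries sqrt(s) * (1, 0, -1) drawn with probabilities (1/(2s), 1 - 1/s, 1/(2s)) for the sparse case, treats the Achlioptas scheme as the special case s = 3, and applies no further rescaling by ndim.

```r
# Hypothetical helper (not part of Rdimtools): build a p-by-ndim matrix R.
make_projection <- function(p, ndim,
                            type = c("gaussian", "achlioptas", "sparse"),
                            s = max(sqrt(p), 3)) {
  type <- match.arg(type)
  if (type == "gaussian") {
    R <- matrix(rnorm(p * ndim), nrow = p, ncol = ndim)
    # normalize each column to unit length so that it lies on the unit sphere
    R <- sweep(R, 2, sqrt(colSums(R^2)), "/")
  } else {
    if (type == "achlioptas") s <- 3   # probabilities (1/6, 4/6, 1/6)
    stopifnot(s >= 3)
    vals <- sample(c(1, 0, -1), size = p * ndim, replace = TRUE,
                   prob = c(1 / (2 * s), 1 - 1 / s, 1 / (2 * s)))
    R <- matrix(sqrt(s) * vals, nrow = p, ncol = ndim)
  }
  R
}

# Usage: project the four iris measurements onto two random directions
set.seed(100)
X <- as.matrix(iris[, 1:4])
R <- make_projection(ncol(X), ndim = 2, type = "sparse")
Y <- X %*% R
dim(Y)
```

Larger values of s make more entries of R exactly zero, which is where the computational saving of the sparse variant comes from.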
Value
a named list containing
- Y
an (n-by-ndim)
matrix whose rows are embedded observations.
- projection
a (p-by-ndim) matrix
whose columns form a basis for the projection.
- epsilon
an estimated error $\epsilon$
in accordance with the JL lemma.
- trfinfo
a list containing information for out-of-sample prediction.
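To give a feel for what an error in the JL sense measures, the sketch below computes the largest relative change in squared pairwise distance between a data matrix and its embedding, that is, the smallest eps for which (1 - eps)|u - v|^2 <= |f(u) - f(v)|^2 <= (1 + eps)|u - v|^2 holds for every pair of rows. The helper empirical_distortion is hypothetical, and do.rndproj may compute its reported epsilon differently.

```r
# Hypothetical helper (not part of Rdimtools): worst-case relative change
# of squared pairwise distances between rows of X and rows of Y.
empirical_distortion <- function(X, Y) {
  stopifnot(nrow(X) == nrow(Y))
  dX <- as.vector(dist(X))^2
  dY <- as.vector(dist(Y))^2
  keep <- dX > 0                      # skip pairs of identical rows
  max(abs(dY[keep] / dX[keep] - 1))
}

# Usage with a scaled Gaussian projection from 50 to 10 dimensions
set.seed(1)
X <- matrix(rnorm(20 * 50), nrow = 20)
R <- matrix(rnorm(50 * 10), nrow = 50) / sqrt(10)
empirical_distortion(X, X %*% R)
```

Dividing the Gaussian matrix by sqrt(10) makes squared distances unbiased after projection, so the returned value reflects random fluctuation rather than a systematic rescaling.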
References
Johnson WB, Lindenstrauss J (1984). “Extensions of Lipschitz Mappings into a Hilbert Space.” In Beals R, Beck A, Bellow A, Hajian A (eds.), Contemporary Mathematics, volume 26, 189–206. American Mathematical Society, Providence, Rhode Island. ISBN 978-0-8218-5030-5 978-0-8218-7611-4.
Achlioptas D (2003). “Database-Friendly Random Projections: Johnson-Lindenstrauss with Binary Coins.” Journal of Computer and System Sciences, 66(4), 671–687.
Li P, Hastie TJ, Church KW (2006). “Very Sparse Random Projections.” In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '06, 287–296.
Examples
## load the package and use iris data
library(Rdimtools)
data(iris)
set.seed(100)
subid <- sample(1:150, 50)
X     <- as.matrix(iris[subid, 1:4])
label <- as.factor(iris[subid, 5])
## 1. Gaussian projection
output1 <- do.rndproj(X,ndim=2)
## 2. Achlioptas projection
output2 <- do.rndproj(X,ndim=2,type="achlioptas")
## 3. Sparse projection
output3 <- do.rndproj(X,type="sparse")
## Visualize three different projections
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,3))
plot(output1$Y, pch=19, col=label, main="RNDPROJ::Gaussian")
plot(output2$Y, pch=19, col=label, main="RNDPROJ::Achlioptas")
plot(output3$Y, pch=19, col=label, main="RNDPROJ::Sparse")
par(opar)