do.sda {Rdimtools} | R Documentation |
Semi-Supervised Discriminant Analysis
Description
Semi-Supervised Discriminant Analysis (SDA) is a linear dimension reduction method
for data whose labels are partially missing, i.e., the semi-supervised setting. The labeled data
points are used to maximize the separability between classes, while
the unlabeled ones are used to estimate the intrinsic structure of the data.
Regularization for the rank-deficient case is also supported via an \ell_2
scheme controlled by beta.
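As a rough sketch of what the description refers to, the criterion of Cai et al. (2007) can be written as below, using this page's convention that rows of X are observations. Here S_b and S_t denote the between-class and total scatter matrices computed from the labeled points, W is the weight matrix of the neighborhood graph built from all points according to type, L is its graph Laplacian, and a is a projection direction. How exactly beta enters in do.sda is an assumption in this sketch: it is shown as a ridge term added to the denominator, and constant factors are absorbed into alpha.

```latex
% Graph smoothness penalty over all (labeled and unlabeled) points
J(a) = \sum_{i,j} W_{ij} \left( a^\top x_i - a^\top x_j \right)^2 = 2\, a^\top X^\top L X a

% Regularized discriminant criterion (beta term is an assumed ridge penalty)
a^{\ast} = \arg\max_{a} \frac{a^\top S_b\, a}{a^\top \left( S_t + \alpha X^\top L X + \beta I \right) a}
```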
Usage
do.sda(X, label, ndim = 2, type = c("proportion", 0.1), alpha = 1, beta = 1)
Arguments
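X |
an (n\times p) matrix whose rows are observations. |
label |
a length-n vector of class labels, with NA for unlabeled observations. |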
ndim |
an integer-valued target dimension. |
type |
a vector specifying how the neighborhood graph is constructed; the default is c("proportion", 0.1). |
alpha |
balancing parameter between model complexity and empirical loss. |
beta |
Tikhonov regularization parameter. |
Value
a named list containing
- Y
an (n\times ndim) matrix whose rows are embedded observations.
- trfinfo
a list containing information for out-of-sample prediction.
- projection
a (p\times ndim) matrix whose columns are basis for projection.
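The shapes listed above can be checked directly on a fitted object. The following is a minimal sketch, assuming Rdimtools is installed and using the iris data with all labels observed; the variable names are illustrative only, and the contents of trfinfo are inspected rather than assumed.

```r
library(Rdimtools)

# iris has n = 150 observations and p = 4 numeric columns
X     <- as.matrix(iris[, 1:4])
label <- as.integer(iris$Species)

out <- do.sda(X, label, ndim = 2)

dim(out$Y)           # embedded observations: n rows, ndim columns
dim(out$projection)  # projection basis: p rows, ndim columns
str(out$trfinfo)     # list the fields stored for out-of-sample prediction
```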
Author(s)
Kisung You
References
Cai D, He X, Han J (2007). “Semi-Supervised Discriminant Analysis.” In 2007 IEEE 11th International Conference on Computer Vision, 1–7.
Examples
## load the package and use iris data
library(Rdimtools)
data(iris)
X = as.matrix(iris[,1:4])
label = as.integer(iris$Species)
## copy the labels and set 20% of them to NA (missing)
set.seed(1)  # fix the random draw so the example is reproducible
nlabel = length(label)
nmissing = round(nlabel*0.20)
label_missing = label
label_missing[sample(seq_len(nlabel), nmissing)] = NA
## compare the fully labeled case with the missing-label case
out1 = do.sda(X, label)
out2 = do.sda(X, label_missing)
## visualize
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,2))
plot(out1$Y, col=label, main="true projection")
plot(out2$Y, col=label, main="20% missing labels")
par(opar)