SelfLearning {RSSL} | R Documentation |
Self-Learning approach to Semi-supervised Learning
Description
Use self-learning (also known as Yarowsky's algorithm or pseudo-labeling) to turn any supervised classifier into a semi-supervised method by iteratively labeling the unlabeled objects and adding these predictions to the set of labeled objects until the classifier converges.
Usage
SelfLearning(X, y, X_u = NULL, method, prob = FALSE, cautious = FALSE,
max_iter = 100, ...)
Arguments
X |
matrix; Design matrix for labeled data |
y |
factor or integer vector; Label vector |
X_u |
matrix; Design matrix for unlabeled data |
method |
Supervised classifier to use. Any function that accepts as its first argument a design matrix X and as its second argument a vector of labels y and that has a predict method. |
prob |
Not used |
cautious |
Not used |
max_iter |
integer; Maximum number of iterations |
... |
additional arguments to be passed to method |
References
McLachlan, G.J., 1975. Iterative Reclassification Procedure for Constructing an Asymptotically Optimal Rule of Allocation in Discriminant Analysis. Journal of the American Statistical Association, 70(350), pp.365-369.
Yarowsky, D., 1995. Unsupervised word sense disambiguation rivaling supervised methods. Proceedings of the 33rd annual meeting on Association for Computational Linguistics, pp.189-196.
See Also
Other RSSL classifiers:
EMLeastSquaresClassifier
,
EMLinearDiscriminantClassifier
,
GRFClassifier
,
ICLeastSquaresClassifier
,
ICLinearDiscriminantClassifier
,
KernelLeastSquaresClassifier
,
LaplacianKernelLeastSquaresClassifier()
,
LaplacianSVM
,
LeastSquaresClassifier
,
LinearDiscriminantClassifier
,
LinearSVM
,
LinearTSVM()
,
LogisticLossClassifier
,
LogisticRegression
,
MCLinearDiscriminantClassifier
,
MCNearestMeanClassifier
,
MCPLDA
,
MajorityClassClassifier
,
NearestMeanClassifier
,
QuadraticDiscriminantClassifier
,
S4VM
,
SVM
,
TSVM
,
USMLeastSquaresClassifier
,
WellSVM
,
svmlin()
Examples
data(testdata)
t_self <- SelfLearning(testdata$X,testdata$y,testdata$X_u,method=NearestMeanClassifier)
t_sup <- NearestMeanClassifier(testdata$X,testdata$y)
# Classification Error
1-mean(predict(t_self, testdata$X_test)==testdata$y_test)
1-mean(predict(t_sup, testdata$X_test)==testdata$y_test)
loss(t_self, testdata$X_test, testdata$y_test)