pcalda {mt} | R Documentation |
Classification with PCALDA
Description
Classification using a combination of principal component analysis (PCA) and linear discriminant analysis (LDA).
Usage
pcalda(x, ...)
## Default S3 method:
pcalda(x, y, center = TRUE, scale. = FALSE, ncomp = NULL,
       tune = FALSE, ...)
## S3 method for class 'formula'
pcalda(formula, data = NULL, ..., subset, na.action = na.omit)
Arguments
formula |
A formula of the form |
data |
Data frame from which variables specified in |
x |
A matrix or data frame containing the explanatory variables if no formula is given as the principal argument. |
y |
A factor specifying the class for each observation if no formula principal argument is given. |
center |
A logical value indicating whether |
scale. |
A logical value indicating whether |
ncomp |
The number of principal components to be used in the classification. If
|
tune |
A logical value indicating whether the best number of components should be tuned. |
... |
Arguments passed to or from other methods. |
subset |
An index vector specifying the cases to be used in the training sample. |
na.action |
A function to specify the action to be taken if |
Details
A critical issue in applying linear discriminant analysis (LDA) is the
singularity and instability of the within-class scatter matrix. In practice,
a large number of features is often available, while the number of training
samples is limited and commonly smaller than the dimension of the feature
space. To tackle this issue, pcalda combines PCA and LDA for classification:
PCA is used for dimension reduction, and the rotated data resulting from PCA
are then used as the input variables to LDA.
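
The two-stage idea described above can be sketched with standard tools. The example below is only an illustration under stated assumptions, not necessarily how pcalda is implemented internally: it uses prcomp from base R and lda from the MASS package, the built-in iris data instead of abr1, and an arbitrary choice of two components.

```r
library(MASS)  # provides lda(); assumed here, not confirmed as pcalda's backend

set.seed(1)
idx     <- sample(nrow(iris), 100)       # rows used for training
x_train <- iris[idx, 1:4]
y_train <- iris$Species[idx]
x_test  <- iris[-idx, 1:4]
y_test  <- iris$Species[-idx]

ncomp <- 2                               # illustrative choice, not a package default

## Step 1: PCA on the training data only; keep the first ncomp score columns
pca          <- prcomp(x_train, center = TRUE, scale. = FALSE)
scores_train <- pca$x[, 1:ncomp, drop = FALSE]

## Step 2: LDA on the retained principal component scores
fit <- lda(scores_train, grouping = y_train)

## Project the test data with the training rotation, then classify
scores_test <- predict(pca, newdata = x_test)[, 1:ncomp, drop = FALSE]
pred        <- predict(fit, scores_test)$class

## Confusion matrix and accuracy on the held-out samples
conf <- table(observed = y_test, predicted = pred)
acc  <- sum(diag(conf)) / sum(conf)
```

Fitting the PCA on the training rows alone and reusing its rotation for the test rows keeps the test data out of the dimension-reduction step.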
Value
An object of class pcalda containing the following components:
x |
The rotated data on discriminant variables. |
cl |
The observed class labels of training data. |
pred |
The predicted class labels of training data. |
posterior |
The posterior probabilities for the predicted classes. |
conf |
The confusion matrix based on training data. |
acc |
The classification accuracy on the training data. |
ncomp |
The number of principal components used for classification. |
pca.out |
The output of PCA. |
lda.out |
The output of LDA. |
call |
The (matched) function call. |
Note
This function may be called with either a formula and an optional data frame, or a matrix and a grouping factor, as the first two arguments.
Author(s)
Wanchang Lin
See Also
predict.pcalda, plot.pcalda, tune.func
Examples
library(mt)
data(abr1)
cl <- factor(abr1$fact$class)
dat <- abr1$pos
## randomly assign two thirds of the samples to the training set
idx <- sample(seq_len(nrow(dat)), round((2/3) * nrow(dat)), replace = FALSE)
## construct train and test data
train.dat <- dat[idx,]
train.t <- cl[idx]
test.dat <- dat[-idx,]
test.t <- cl[-idx]
## apply pcalda
model <- pcalda(train.dat,train.t)
model
summary(model)
## plot
plot(model,dimen=c(1,2),main = "Training data",abbrev = TRUE)
plot(model,main = "Training data",abbrev = TRUE)
## predict the test data and build the confusion matrix
pred.te <- predict(model, test.dat)$class
table(test.t,pred.te)