ordASDA {accSDA}    R Documentation
Ordinal Accelerated Sparse Discriminant Analysis
Description
Applies an accelerated proximal gradient algorithm to
the optimal scoring formulation of sparse discriminant analysis proposed
by Clemmensen et al. (2011). To handle the ordinal labels, the problem is further
cast to a binary classification problem as described in "Learning to Classify Ordinal Data:
The Data Replication Method" by Cardoso and da Costa.
This function serves as a wrapper for the ASDA
function, where the
appropriate data augmentation is performed. Since the problem is cast to
a binary classification problem, the result contains only a single discriminant vector.
The first *p* entries correspond to the coefficients for the
predictors, while the following K-1 entries correspond to the biases of the
found hyperplane, which separate the classes. The resulting object is of class ordASDA
and has an accompanying predict function. The paper by Cardoso and da Costa can
be found here: http://www.jmlr.org/papers/volume8/cardoso07a/cardoso07a.pdf
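As a minimal sketch of how the resulting discriminant vector might be inspected, the code below assumes the fitted object stores it in a component named beta; the component name and the default handling of Om are assumptions, so check str() of the returned object for the actual names.
# Minimal sketch, assuming the discriminant vector is stored in fit$beta.
library(accSDA)
dat <- genDat(3, 10, c(0, 0), diag(2))   # 3 ordinal classes, 10 samples each
fit <- ordASDA(dat$X, dat$Y, s = 1, gam = 1e-3, lam = 1e-4)
p <- ncol(dat$X)                         # number of predictors
K <- length(unique(dat$Y))               # number of classes
dv <- fit$beta                           # assumed component holding the single vector
coefs <- dv[1:p]                         # first p entries: predictor coefficients
biases <- dv[(p + 1):(p + K - 1)]        # remaining K-1 entries: hyperplane biases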
Usage
ordASDA(Xt, ...)
## Default S3 method:
ordASDA(
  Xt,
  Yt,
  s = 1,
  Om,
  gam = 0.001,
  lam = 1e-06,
  method = "SDAAP",
  control,
  ...
)
Arguments
Xt
n by p data matrix (can also be a data.frame that can be coerced to a matrix).
...
Additional arguments passed to ASDA.
Yt
vector of length n, where n is the number of samples. The classes should be 1, 2, ..., K, where K is the number of classes. Yt needs to be a numeric vector.
s
We need to find a hyperplane that separates all classes with different biases. For each new bias we define a binary classification problem, where a maximum of s ordinal classes are contained in each of the two classes. A higher value of s means that more data will be copied in the data augmentation step. By default s is 1; a conceptual sketch of this replication is given after the argument list.
Om
p by p parameter matrix Omega in the generalized elastic net penalty, where p is the number of variables.
gam
Regularization parameter for the elastic net penalty; must be greater than zero.
lam
Regularization parameter for the l1 penalty; must be greater than zero.
method
String to select the method, currently either "SDAD" or "SDAAP"; see ?ASDA for more information.
control
List of control arguments further passed to ASDA. See ASDA for details.
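The following is a conceptual sketch of the data replication controlled by s; it is not the package's internal implementation, only an illustration under the assumption that, for each of the K-1 class boundaries, only samples whose class lies within s of the boundary are copied into the corresponding binary subproblem.
# Conceptual sketch only, not the internal implementation of ordASDA.
K <- 5                                 # number of ordinal classes
s <- 2                                 # replication parameter
Y <- rep(1:K, each = 10)               # toy label vector, 10 samples per class
for (q in 1:(K - 1)) {                 # one binary subproblem per class boundary
  keep <- Y > q - s & Y <= q + s       # classes within s of boundary q
  side <- ifelse(Y[keep] <= q, 0, 1)   # binary label within this subproblem
  cat(sprintf("boundary %d: %d samples replicated (%d vs %d)\n",
              q, sum(keep), sum(side == 0), sum(side == 1)))
}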
Value
ordASDA returns an object of class "ordASDA", including a list with the same components as an ASDA object, and additionally:
h
Scalar value for biases.
K
Number of classes.
Note
Remember to normalize the data.
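A minimal sketch of one way to do this, using base R's scale() on the train and test objects created in the Examples below; centering and scaling this way is a suggestion, not a requirement of the package.
# Center and scale the training data, then apply the same transformation
# to the test data before predicting.
Xs <- scale(train$X)
Xts <- scale(test$X,
             center = attr(Xs, "scaled:center"),
             scale = attr(Xs, "scaled:scale"))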
See Also
ASDA.
Examples
set.seed(123)
# You can play around with these values to generate some 2D data to test on
numClasses <- 5
sigma <- matrix(c(1,-0.2,-0.2,1),2,2)
mu <- c(0,0)
numObsPerClass <- 5
# Generate the data, can access with train$X and train$Y
train <- accSDA::genDat(numClasses,numObsPerClass,mu,sigma)
test <- accSDA::genDat(numClasses,numObsPerClass*2,mu,sigma)
# Visualize it; using only the first variable already gives very good separation
plot(train$X[,1],train$X[,2],col = factor(train$Y),asp=1,main="Training Data")
# Train the ordinal based model
res <- accSDA::ordASDA(train$X,train$Y,s=2,h=1, gam=1e-6, lam=1e-3)
vals <- predict(object = res,newdata = test$X) # Takes a while to run ~ 10 seconds
sum(vals==test$Y)/length(vals) # Get accuracy on test set
#plot(test$X[,1],test$X[,2],col = factor(test$Y),asp=1,
# main="Test Data with correct labels")
#plot(test$X[,1],test$X[,2],col = factor(vals),asp=1,
# main="Test Data with predictions from ordinal classifier")