LDA {biClassify}

R Documentation

Linear Discriminant Analysis (LDA)

Description

A wrapper function for the various LDA implementations available in this package.

Generates class predictions for TestData.

Usage

LDA(TrainData, TrainCat, TestData, Method = "Full", Mode = "Automatic",
  m1 = NULL, m2 = NULL, m = NULL, s = NULL, gamma = 1e-05,
  type = "Rademacher")

Arguments

`TrainData`	A (n x p) numeric matrix without missing values consisting of n training samples each with p features.
`TrainCat`	A vector of length n consisting of group labels of the n training samples in `TrainData`. Must consist of 1s and 2s.
`TestData`	A (m x p) numeric matrix without missing values consisting of m test samples each with p features. The number of features must equal the number of features in `TrainData`.
`Method`	A string of characters which determines which version of LDA to use. Must be either "Full", "Compressed", "Subsampled", "Projected", or "fastRandomFisher". Default is "Full".
`Mode`	A string of characters which determines how the reduced-sample parameters are supplied for each method. Must be either "Research", "Interactive", or "Automatic". Default is "Automatic".
`m1`	The number of class 1 compressed samples to be generated. Must be a positive integer.
`m2`	The number of class 2 compressed samples to be generated. Must be a positive integer.
`m`	The number of total compressed samples to be generated. Must be a positive integer.
`s`	The sparsity level used in compression. Must satisfy 0 < s < 1.
`gamma`	A numeric value for the stabilization amount gamma * I added to the covariance matrix used in the LDA decision rule. Default is 1E-5. Cannot be negative.
`type`	A string of characters determining the type of compression matrix used. Must be either "Rademacher", "Gaussian", or "Count". Default is "Rademacher".
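
The constraints above can be checked before calling `LDA`. The following is a minimal sketch on simulated data, assuming the raw class labels arrive as character strings; `raw_labels`, `n`, `p`, and `m_test` are hypothetical names used only for illustration and are not part of the package.

```r
# Illustrative data only; names other than TrainData, TrainCat, and TestData
# are hypothetical and not part of biClassify.
set.seed(1)
n <- 200; p <- 5; m_test <- 50
raw_labels <- sample(c("control", "case"), n, replace = TRUE)

TrainData <- matrix(rnorm(n * p), nrow = n, ncol = p)
TestData  <- matrix(rnorm(m_test * p), nrow = m_test, ncol = p)

# Recode arbitrary labels to 1s and 2s. Explicit levels fix the mapping:
# "control" becomes 1 and "case" becomes 2.
TrainCat <- as.integer(factor(raw_labels, levels = c("control", "case")))

# Checks mirroring the documented argument constraints
stopifnot(
  is.matrix(TrainData), is.numeric(TrainData), !anyNA(TrainData),
  is.matrix(TestData),  is.numeric(TestData),  !anyNA(TestData),
  length(TrainCat) == nrow(TrainData),
  all(TrainCat %in% c(1, 2)),
  ncol(TestData) == ncol(TrainData)
)
```

Setting the factor levels explicitly keeps the label-to-class mapping independent of the alphabetical ordering that `factor` would otherwise apply.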

Details

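Dispatches to the LDA implementation selected by `Method`. `Mode` determines how the reduced-sample parameters (`m1`, `m2`, `m`, `s`) are supplied to that implementation.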
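
To illustrate the role of `gamma`, the sketch below implements a textbook two-class LDA rule with the stabilization term gamma * I added to the pooled covariance matrix. It is an assumption-laden illustration in base R, not the code used by this package: the function name `ridge_lda`, the use of class proportions as priors, and the sign convention of the discriminant vector are all choices made here for the example.

```r
# Textbook two-class LDA with a ridge term; an illustrative sketch only,
# not the biClassify implementation. ridge_lda is a hypothetical name.
ridge_lda <- function(TrainData, TrainCat, TestData, gamma = 1e-5) {
  X1 <- TrainData[TrainCat == 1, , drop = FALSE]
  X2 <- TrainData[TrainCat == 2, , drop = FALSE]
  n1 <- nrow(X1); n2 <- nrow(X2)
  mu1 <- colMeans(X1); mu2 <- colMeans(X2)

  # Pooled within-class covariance plus the stabilization term gamma * I
  W <- ((n1 - 1) * cov(X1) + (n2 - 1) * cov(X2)) / (n1 + n2 - 2)
  W <- W + gamma * diag(ncol(TrainData))

  # Discriminant vector: solves W d = mu1 - mu2
  Dvec <- solve(W, mu1 - mu2)

  # Assign class 1 when the projected point exceeds the midpoint of the
  # projected class means, shifted by the log ratio of class proportions
  cutoff <- sum(Dvec * (mu1 + mu2)) / 2 - log(n1 / n2)
  scores <- as.vector(TestData %*% Dvec)
  list(Predictions = ifelse(scores > cutoff, 1, 2), Dvec = Dvec)
}

# Usage on simulated, well-separated classes
set.seed(42)
n1 <- 60; n2 <- 40
TrainData <- rbind(matrix(rnorm(n1 * 2, mean =  3), ncol = 2),
                   matrix(rnorm(n2 * 2, mean = -3), ncol = 2))
TrainCat  <- rep(c(1, 2), times = c(n1, n2))
TestData  <- rbind(c(3, 3), c(-3, -3))
fit <- ridge_lda(TrainData, TrainCat, TestData)
fit$Predictions
```

With gamma > 0 the matrix passed to `solve` is positive definite, so the system has a solution even when the sample covariance is singular, for example when p exceeds n.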

Value

A list containing

`Predictions`	(m x 1) Vector of predicted class labels for the data points in `TestData`.
`Dvec`	(p x 1) Discriminant vector used to predict the class labels.
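
A short sketch of how the returned list might be used follows. The object `fit` is a hand-made stand-in with the same two components, and `TestCat` is a hypothetical vector of true test labels; neither comes from an actual call to `LDA`.

```r
# Stand-in for the list returned by LDA(); the values are made up.
fit <- list(Predictions = c(1, 2, 2, 1, 2), Dvec = c(0.8, -0.3))
TestCat <- c(1, 2, 1, 1, 2)  # hypothetical true labels for the test samples

# Predictions may be returned as an (m x 1) matrix, so flatten it first
pred <- as.vector(fit$Predictions)

# Confusion table and misclassification rate
confusion  <- table(Predicted = pred, Actual = TestCat)
error_rate <- mean(pred != TestCat)

# Features ordered by absolute weight in the discriminant vector
ranking <- order(abs(fit$Dvec), decreasing = TRUE)
```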

References

Lapanowski, Alexander F., and Irina Gaynanova. “Compressing large sample data for discriminant analysis.” arXiv preprint arXiv:2005.03858 (2020).

Ye, Haishan, Yujun Li, Cheng Chen, and Zhihua Zhang. “Fast Fisher discriminant analysis with randomized algorithms.” Pattern Recognition 72 (2017): 82-92.

Examples

TrainData <- LDA_Data$TrainData
TrainCat <- LDA_Data$TrainCat
TestData <- LDA_Data$TestData
plot(TrainData[,2]~TrainData[,1], col = c("blue","orange")[as.factor(TrainCat)])

#----- Full LDA -------
LDA(TrainData = TrainData,
    TrainCat = TrainCat,
    TestData = TestData,
    Method = "Full",
    gamma = 1E-5)
  
#----- Compressed LDA -------
# Research mode: m1, m2, and s are supplied explicitly
m1 <- 700
m2 <- 300
s <- 0.01
LDA(TrainData = TrainData,
    TrainCat = TrainCat,
    TestData = TestData,
    Method = "Compressed",
    Mode = "Research",
    m1 = m1,
    m2 = m2,
    s = s,
    gamma = 1E-5)

# Automatic mode: no compression parameters are passed
LDA(TrainData = TrainData,
    TrainCat = TrainCat,
    TestData = TestData,
    Method = "Compressed",
    Mode = "Automatic",
    gamma = 1E-5)
 
#----- Sub-sampled LDA ------
# Research mode: m1 and m2 are supplied explicitly
m1 <- 700
m2 <- 300
LDA(TrainData = TrainData,
    TrainCat = TrainCat,
    TestData = TestData,
    Method = "Subsampled",
    Mode = "Research",
    m1 = m1,
    m2 = m2,
    gamma = 1E-5)

# Automatic mode: no sub-sampling parameters are passed
LDA(TrainData = TrainData,
    TrainCat = TrainCat,
    TestData = TestData,
    Method = "Subsampled",
    Mode = "Automatic",
    gamma = 1E-5)
     
#----- Projected LDA ------
# Research mode: m1, m2, and s are supplied explicitly
m1 <- 700
m2 <- 300
s <- 0.01
LDA(TrainData = TrainData,
    TrainCat = TrainCat,
    TestData = TestData,
    Method = "Projected",
    Mode = "Research",
    m1 = m1,
    m2 = m2,
    s = s,
    gamma = 1E-5)

# Automatic mode: no compression parameters are passed
LDA(TrainData = TrainData,
    TrainCat = TrainCat,
    TestData = TestData,
    Method = "Projected",
    Mode = "Automatic",
    gamma = 1E-5)
      
#----- Fast Random Fisher ------
# Research mode: the total number of compressed samples m and s are supplied explicitly
m <- 1000
s <- 0.01
LDA(TrainData = TrainData,
    TrainCat = TrainCat,
    TestData = TestData,
    Method = "fastRandomFisher",
    Mode = "Research",
    m = m,
    s = s,
    gamma = 1E-5)

# Automatic mode: no compression parameters are passed
LDA(TrainData = TrainData,
    TrainCat = TrainCat,
    TestData = TestData,
    Method = "fastRandomFisher",
    Mode = "Automatic",
    gamma = 1E-5)

[Package biClassify version 1.3 Index]