classifSample.lda {MorphoTools2} | R Documentation
Classificatory Discriminant Analysis
Description
These functions compute a discriminant function from an independent training set and use it to classify the observations in a sample set. A linear discriminant function (classifSample.lda), a quadratic discriminant function (classifSample.qda), or the nonparametric k-nearest neighbour classification method (classifSample.knn) can be used.
Usage
classifSample.lda(sampleData, trainingData)
classifSample.qda(sampleData, trainingData)
classifSample.knn(sampleData, trainingData, k)
Arguments
sampleData | observations which should be classified. An object of class
trainingData | observations for computing discriminant function. An object of class
k | number of neighbours considered.
Details
classifSample.lda and classifSample.qda perform classification with linear and quadratic discriminant functions, using the lda and qda functions from the MASS package. The nonparametric classification method classifSample.knn (k-nearest neighbours) uses the knn function from the class package. The classifSample functions are designed to classify hybrid populations, type herbarium specimens, atypical samples, entirely new data, etc. The discriminant criterion is developed from the original (training) dataset and applied to the specific sample (set).
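The underlying calls can be sketched outside MorphoTools2 as follows. This is only an illustration of the general train-then-classify pattern, not the package's implementation: the built-in iris data, the random hold-out of 30 rows, and k = 5 are assumptions chosen for the example.

```r
library(MASS)   # provides lda() and qda()
library(class)  # provides knn()

# Toy data standing in for a training set and a sample set
# (assumption: built-in iris data, not MorphoTools2 objects).
set.seed(1)
holdout  <- sample(nrow(iris), 30)
training <- iris[-holdout, ]
newdata  <- iris[holdout, ]

# Linear discriminant function fitted on the training set only
fit_lda  <- lda(Species ~ ., data = training)
pred_lda <- predict(fit_lda, newdata)   # list with $class, $posterior, $x

# Quadratic discriminant function, same interface
fit_qda  <- qda(Species ~ ., data = training)
pred_qda <- predict(fit_qda, newdata)   # list with $class, $posterior

# k-nearest neighbours: no fitted model, the training rows are used directly
pred_knn <- knn(train = training[, 1:4], test = newdata[, 1:4],
                cl = training$Species, k = 5)

# Cross-tabulate predicted and known group membership of the held-out rows
table(predicted = pred_lda$class, actual = newdata$Species)
```

Note that knn() returns only the predicted classes by default, whereas predict() on an lda or qda fit also returns posterior probabilities.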
LDA and QDA have several requirements: (1) no character can be a linear combination of other characters; (2) no pair of characters can be highly correlated; (3) no character can be invariant in any taxon (group); (4) the number of taxa (g), the number of characters (p) and the total number of samples (n) must satisfy 0 < p < (n - g); and (5) there must be at least two groups (taxa), each containing at least two objects. Violating some of these assumptions may result in warnings or error messages (rank deficiency).
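The requirements above can be screened before running an analysis. The following base R sketch is a hypothetical helper, not a MorphoTools2 function; the name checkDaRequirements and the correlation limit of 0.95 are assumptions, since the text does not define "highly correlated" numerically.

```r
# Hypothetical helper (not part of MorphoTools2): screens a numeric character
# table `x` and a grouping vector `taxon` against requirements (1) to (5).
checkDaRequirements <- function(x, taxon, corLimit = 0.95) {
  x     <- as.matrix(x)
  taxon <- droplevels(as.factor(taxon))
  n <- nrow(x); p <- ncol(x); g <- nlevels(taxon)

  # (1) a character that is a linear combination of others lowers the rank
  fullRank <- qr(x)$rank == p

  # (2) largest absolute pairwise correlation (corLimit is an arbitrary choice)
  r <- suppressWarnings(abs(cor(x)))
  diag(r) <- 0
  lowCor <- max(r, na.rm = TRUE) < corLimit

  # (3) every character must vary within every group
  varies <- sapply(split(as.data.frame(x), taxon),
                   function(d) sapply(d, function(col) diff(range(col)) > 0))

  # (4) 0 < p < n - g
  dimsOk <- p > 0 && p < n - g

  # (5) at least two groups, each with at least two objects
  groupsOk <- g >= 2 && all(table(taxon) >= 2)

  c(fullRank = fullRank, lowCorrelation = lowCor, noInvariant = all(varies),
    dimensionsOk = dimsOk, groupsOk = groupsOk)
}

# Example call on the built-in iris data (an assumption for illustration)
checkDaRequirements(iris[, 1:4], iris$Species)
```

A FALSE entry points to the requirement that is likely to trigger a warning or error in lda or qda.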
Value
An object of class classifdata with the following elements:
ID | IDs of each row.
Population | population membership of each row.
Taxon | taxon membership of each row.
classif | classification from discriminant analysis.
prob | posterior probabilities of classification into each taxon (if calculated by
correct | logical, correctness of classification.
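As a rough illustration of how these elements relate, the sketch below builds a toy stand-in with the same element names and derives the correct flags and classification rates from it. The values are invented for the example, and treating the result as a plain list is an assumption.

```r
# Toy stand-in for a classification result (assumption: invented values;
# only the element names follow the list above).
res <- list(
  ID         = c("a1", "a2", "a3", "b1", "b2"),
  Population = c("A", "A", "A", "B", "B"),
  Taxon      = c("ps", "ps", "ps", "st", "st"),
  classif    = c("ps", "st", "ps", "st", "st")
)

# 'correct' flags rows whose predicted taxon matches the known one
res$correct <- res$Taxon == res$classif

# Proportion of correctly classified rows, overall and per taxon
mean(res$correct)
tapply(res$correct, res$Taxon, mean)
```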
See Also
classif.lda, classif.matrix, knn.select
Examples
data(centaurea)
# remove NAs and linearly dependent characters (characters with unique contributions
# can be identified by stepwise discriminant analysis.)
centaurea = naMeanSubst(centaurea)
centaurea = removePopulation(centaurea, populationName = c("LIP", "PREL"))
centaurea = keepCharacter(centaurea, c("MLW", "ML", "IW", "LS", "IV", "MW", "MF",
                                       "AP", "IS", "LBA", "LW", "AL", "ILW", "LBS",
                                       "SFT", "CG", "IL", "LM", "ALW", "AW", "SF"))
# add a small constant to characters which are invariant within taxa
centaurea$data[ centaurea$Taxon == "hybr", "LM" ][1] =
    centaurea$data[ centaurea$Taxon == "hybr", "LM" ][1] + 0.000001
centaurea$data[ centaurea$Taxon == "ph", "IV" ][1] =
    centaurea$data[ centaurea$Taxon == "ph", "IV" ][1] + 0.000001
centaurea$data[ centaurea$Taxon == "st", "LBS"][1] =
    centaurea$data[ centaurea$Taxon == "st", "LBS"][1] + 0.000001
trainingSet = removePopulation(centaurea, populationName = "LES")
LES = keepPopulation(centaurea, populationName = "LES")
# classification by linear discriminant function
classifSample.lda(LES, trainingSet)
# classification by quadratic discriminant function
classifSample.qda(LES, trainingSet)
# classification by nonparametric k-nearest neighbour method
# use knn.select to find the optimal k
knn.select(trainingSet)
classifSample.knn(LES, trainingSet, k = 12)