loclda {klaR} | R Documentation
Localized Linear Discriminant Analysis (LocLDA)
Description
A localized version of Linear Discriminant Analysis.
Usage
loclda(x, ...)
## S3 method for class 'formula'
loclda(formula, data, ..., subset, na.action)
## Default S3 method:
loclda(x, grouping, weight.func = function(x) 1/exp(x),
k = nrow(x), weighted.apriori = TRUE, ...)
## S3 method for class 'data.frame'
loclda(x, ...)
## S3 method for class 'matrix'
loclda(x, grouping, ..., subset, na.action)
Arguments
formula |
Formula of the form ‘ |
data |
Data frame from which variables specified in |
x |
Matrix or data frame containing the explanatory variables
(required, if |
grouping |
(required if no |
weight.func |
Function used to compute local weights. Must be finite over the interval [0,1]. See Details below. |
k |
Number of nearest neighbours used to construct localized classification rules. See Details below. |
weighted.apriori |
Logical: if |
subset |
An index vector specifying the cases to be used in the training sample. |
na.action |
A function to specify the action to be taken if |
... |
Further arguments to be passed to |
Details
This function applies the concept of localization described by Tutz and Binder (2005)
to Linear Discriminant Analysis. The function loclda generates an object of class loclda
(see Value below). Because localization requires an individual decision rule for each
test observation, the rule is constructed by predict.loclda rather than by loclda itself.
For convenience, the rule-building procedure is nevertheless described here.
To classify a test observation x_s, only the k nearest neighbours of x_s within the
training data are used. Each of these k training observations x_i, i = 1,\dots,k,
is assigned a weight w_i according to
w_i = K\left(\frac{\|x_i-x_s\|}{d_k}\right), \quad i=1,\dots,k
where K is the weighting function given by weight.func, \|x_i-x_s\| is the Euclidean
distance between x_i and x_s, and d_k is the Euclidean distance from x_s to its
k-th nearest neighbour.
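The weighting step can be sketched in base R as follows. This is only an illustration of the formula above, not the code used inside klaR; the helper name local_weights and the toy data are hypothetical, and the sketch assumes d_k > 0 (that is, the k nearest neighbours are not all identical to x_s).

```r
# Hypothetical helper (not part of klaR): local weights for one test point x_s.
local_weights <- function(train, x_s, k,
                          weight.func = function(x) 1 / exp(x)) {
  train <- as.matrix(train)
  # Euclidean distance from every training row to x_s
  d <- sqrt(rowSums(sweep(train, 2, x_s)^2))
  # Indices of the k nearest neighbours, closest first
  nn <- order(d)[seq_len(k)]
  # Distance to the k-th nearest neighbour (assumed to be > 0)
  d_k <- d[nn[k]]
  # Scaled distances lie in [0, 1], where weight.func must be finite
  w <- weight.func(d[nn] / d_k)
  list(index = nn, weights = w, d_k = d_k)
}

# Toy training data: four points in the plane
train <- rbind(c(0, 0), c(1, 0), c(0, 2), c(3, 4))
lw <- local_weights(train, x_s = c(0, 0), k = 3)
lw$weights
```

With the default weight.func, the nearest neighbour receives the largest weight and the k-th nearest neighbour receives weight exp(-1).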
With these weights, the weighted empirical mean \hat{\mu}_g and the weighted empirical
covariance matrix are computed for each class A_g, g=1,\dots,G. The estimated pooled
(weighted) covariance matrix \hat{\Sigma} is then calculated from the individual weighted
empirical class covariance matrices. If weighted.apriori is TRUE (the default),
prior class probabilities are estimated according to
prior_g := \frac{\sum_{i=1}^k w_i \cdot I(x_i \in A_g)}{\sum_{i=1}^k w_i}
where I is the indicator function. If weighted.apriori is FALSE, equal priors for all
classes are used.
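A rough base R sketch of these estimates is given below. The text does not state how the class covariance matrices are normalised or pooled, so the sketch assumes the simplest variant: the weighted within-class scatter matrices are summed and divided by the total weight. The helper name local_estimates and the toy data are hypothetical, and klaR may compute these quantities differently.

```r
# Hypothetical helper (not part of klaR): weighted class means, pooled
# covariance and priors for the k neighbours X with weights w.
local_estimates <- function(X, grouping, w, weighted.apriori = TRUE) {
  X <- as.matrix(X)
  grouping <- droplevels(as.factor(grouping))
  classes <- levels(grouping)
  p <- ncol(X)
  means <- matrix(0, length(classes), p,
                  dimnames = list(classes, colnames(X)))
  scatter <- matrix(0, p, p)
  for (g in classes) {
    idx <- grouping == g
    wg <- w[idx]
    Xg <- X[idx, , drop = FALSE]
    # Weighted empirical mean of class g
    means[g, ] <- colSums(Xg * wg) / sum(wg)
    # Weighted within-class scatter: sum_i w_i (x_i - mu_g)(x_i - mu_g)^T
    centred <- sweep(Xg, 2, means[g, ])
    scatter <- scatter + crossprod(centred * sqrt(wg))
  }
  # Assumption: pool by dividing the summed scatter by the total weight
  sigma <- scatter / sum(w)
  prior <- if (weighted.apriori) {
    # Share of the total weight that falls into each class
    vapply(classes, function(g) sum(w[grouping == g]), numeric(1)) / sum(w)
  } else {
    setNames(rep(1 / length(classes), length(classes)), classes)
  }
  list(means = means, sigma = sigma, prior = prior)
}

# Toy neighbourhood: two classes with two observations each
X <- rbind(c(0, 0), c(2, 0), c(0, 3), c(0, 5))
grouping <- c("a", "a", "b", "b")
w <- c(1, 1, 1, 3)
est <- local_estimates(X, grouping, w)
est$prior
```

Setting weighted.apriori = FALSE in this sketch replaces the weight-based priors with 1/G for every class, mirroring the behaviour described above.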
In analogy to Linear Discriminant Analysis, the decision rule for x_s is
\hat{A} := \arg\max_{g \in \{1,\dots,G\}} posterior_g
where
posterior_g := prior_g \cdot \exp\left(-\frac{1}{2} (x_s-\hat{\mu}_g)^\top \hat{\Sigma}^{-1} (x_s-\hat{\mu}_g)\right)
If posterior_g < 10^{-150} for all g \in \{1,\dots,G\}, posterior_g is set to \frac{1}{G}
for all g \in \{1,\dots,G\}, and the test observation x_s is simply assigned to the class
whose weighted mean has the lowest Euclidean distance to x_s.
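The decision rule, including the fallback for numerically vanishing posteriors, might look roughly like this in base R. The helper name local_decide, its return fields and the example values are hypothetical; the sketch takes already estimated class means (one row per class, with row names), a pooled covariance matrix and priors as given, and is not the implementation used by predict.loclda.

```r
# Hypothetical helper (not part of klaR): classify x_s from local estimates.
local_decide <- function(x_s, means, sigma, prior, eps = 1e-150) {
  sigma_inv <- solve(sigma)
  # Squared Mahalanobis distance of x_s to each class mean
  maha <- apply(means, 1, function(m) {
    d <- x_s - m
    drop(t(d) %*% sigma_inv %*% d)
  })
  # Unnormalised posterior, as in the formula above
  posterior <- prior * exp(-0.5 * maha)
  fallback <- all(posterior < eps)
  if (fallback) {
    # All posteriors vanish: use 1/G and the nearest weighted class mean
    posterior[] <- 1 / length(posterior)
    eucl <- sqrt(rowSums(sweep(means, 2, x_s)^2))
    cls <- names(which.min(eucl))
  } else {
    cls <- names(which.max(posterior))
  }
  list(class = cls, posterior = posterior, fallback = fallback)
}

# Example values chosen for illustration only
means <- rbind(a = c(0, 0), b = c(4, 0))
sigma <- diag(2)
prior <- c(a = 0.5, b = 0.5)

near <- local_decide(c(1, 0), means, sigma, prior)    # regular case
far  <- local_decide(c(100, 0), means, sigma, prior)  # posteriors underflow
```

In the second call both posteriors underflow to zero, so the sketch falls back to the Euclidean distance to the class means.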
Value
A list of class loclda containing the following components:
call |
The (matched) function call. |
learn |
Matrix containing the values of the explanatory variables for all train observations. |
grouping |
Factor specifying the class for each train observation. |
weight.func |
Value of the argument |
k |
Value of the argument |
weighted.apriori |
Value of the argument |
Author(s)
Marc Zentgraf (marc-zentgraf@gmx.de) and Karsten Luebke (karsten.luebke@fom.de)
References
Tutz, G. and Binder, H. (2005): Localized classification. Statistics and Computing 15, 155-166.
See Also
Examples
benchB3("lda")$l1co.error
benchB3("loclda")$l1co.error