knnda {rchemo}    R Documentation

KNN-DA

Description

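Weighted KNN discrimination. For each new observation to predict, the k nearest neighbors are selected in the training data, and the predicted class is the most frequent class of y in this neighborhood, the neighbors being weighted according to their dissimilarity to the observation (see argument h).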
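As a rough illustration of the rule described above, the following base-R sketch predicts one observation by a weighted vote among its k nearest neighbors. The helper name knn_vote_sketch and the weight function exp(-d / h) are assumptions made for illustration only; they are not taken from rchemo, where the weights are defined by wdist.

```r
# Hypothetical helper (not part of rchemo): weighted k-NN vote for one observation.
knn_vote_sketch <- function(Xtrain, ytrain, xnew, k, h) {
    # Euclidean distances between xnew and every training row
    d <- sqrt(rowSums(sweep(Xtrain, 2, xnew)^2))
    # Indices of the k nearest training observations
    nn <- order(d)[seq_len(k)]
    # Assumed weight function: decreases with distance, sharper when h is small
    w <- exp(-d[nn] / h)
    # Sum the weights per class and return the class with the largest total
    votes <- tapply(w, as.character(ytrain[nn]), sum)
    names(votes)[which.max(votes)]
}

# Two well-separated toy classes
Xtrain <- rbind(c(0, 0), c(0, 1), c(1, 0), c(5, 5), c(5, 6), c(6, 5))
ytrain <- c("a", "a", "a", "b", "b", "b")
pred_a <- knn_vote_sketch(Xtrain, ytrain, xnew = c(0.2, 0.1), k = 3, h = 2)
pred_b <- knn_vote_sketch(Xtrain, ytrain, xnew = c(5.5, 5.2), k = 3, h = 2)
```

With weights in place, a close neighbor counts more than a distant one, so the vote is not a plain majority count when the neighborhood mixes classes.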

Usage


knnda(X, y,
    nlvdis, diss = c("eucl", "mahal"),
    h, k)

## S3 method for class 'Knnda'
predict(object, X, ...)  

Arguments

X

For the main function: Training X-data (n, p). For the auxiliary functions: New X-data (m, p) to consider.

y

Training class membership (n). Note: If y is a factor, it is replaced by a character vector.

nlvdis

The number of latent variables (LVs) to consider in the global PLS used for the dimension reduction before calculating the dissimilarities. If nlvdis = 0, there is no dimension reduction. See Details.

diss

The type of dissimilarity used for defining the neighbors. Possible values are "eucl" (default; Euclidean distance), "mahal" (Mahalanobis distance), or "correlation". Correlation dissimilarities are calculated as sqrt(0.5 * (1 - rho)), where rho is the correlation between two observations.

h

A scalar defining the shape of the weight function. The lower h, the sharper the function. See wdist.

k

The number of nearest neighbors to select for each observation to predict.

object

For the auxiliary functions: A fitted model, output of a call to the main function.

...

For the auxiliary functions: Optional arguments. Not used.
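The correlation dissimilarity formula given for argument diss, sqrt(0.5 * (1 - rho)), can be checked with a small base-R sketch. The helper name corr_diss_sketch is hypothetical and only illustrates the formula; it is not a function of rchemo.

```r
# Hypothetical helper (not part of rchemo): correlation dissimilarity
# between two observation vectors, following sqrt(0.5 * (1 - rho)).
corr_diss_sketch <- function(a, b) {
    rho <- cor(a, b)
    # pmax guards against tiny negative values caused by rounding
    sqrt(pmax(0, 0.5 * (1 - rho)))
}

x1 <- c(1, 2, 3, 4)
x2 <- c(2, 4, 6, 8)   # perfectly correlated with x1
x3 <- c(4, 3, 2, 1)   # perfectly anti-correlated with x1

d12 <- corr_diss_sketch(x1, x2)
d13 <- corr_diss_sketch(x1, x3)
```

The dissimilarity ranges from 0 (rho = 1) to 1 (rho = -1), so two observations with the same profile up to a positive scale factor are treated as identical.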

Details

In function knnda, the dissimilarities used for computing the neighborhood and the weights can be calculated either from the original X-data or after a dimension reduction (argument nlvdis). In the latter case, global PLS scores are computed from (X, Y) and the dissimilarities are calculated on these scores. For high-dimensional X-data, the dimension reduction is in general required for using the Mahalanobis distance.
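The need for dimension reduction before a Mahalanobis distance can be seen in a small base-R sketch. It is only an illustration under stated assumptions: principal component scores from prcomp are used as a stand-in for the PLS scores, and the variable names are hypothetical. This is not how knnda computes its scores.

```r
set.seed(1)
n <- 20 ; p <- 50
X <- matrix(rnorm(n * p), ncol = p)

# With p > n the covariance matrix of X is rank-deficient,
# so it cannot be inverted for a Mahalanobis distance
rank_full <- qr(cov(X))$rank

# Stand-in for PLS scores (assumption): first nlv principal component scores
nlv <- 5
scores <- prcomp(X)$x[, seq_len(nlv)]

# Mahalanobis distances from the first observation to all observations,
# computed on the scores; mahalanobis() returns squared distances
d <- sqrt(mahalanobis(scores, center = scores[1, ], cov = cov(scores)))
```

On the reduced scores the covariance matrix is small and invertible, so the distances are well defined even though the original data have more variables than observations.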

Value

For knnda: list with input arguments.

For predict.Knnda:

pred

prediction calculated for each observation by the most frequent class in y in its neighborhood.

listnn

list with the neighbors used for each observation to be predicted

listd

list with the distances to the neighbors used for each observation to be predicted

listw

list with the weights attributed to the neighbors used for each observation to be predicted

References

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.

Examples


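# Simulated training data: n observations, p variables, 3 classes
n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE)

# Test set: the first m training observations
m <- 5
Xtest <- Xtrain[1:m, ] ; ytest <- ytrain[1:m]

# Mahalanobis distance on 5 PLS scores; k = 10 neighbors, weight shape h = 2
nlvdis <- 5 ; diss <- "mahal"
h <- 2 ; k <- 10
fm <- knnda(
    Xtrain, ytrain,
    nlvdis = nlvdis, diss = diss,
    h = h, k = k
    )
res <- predict(fm, Xtest)
names(res)
res$pred
# Classification error rate on the test set
err(res$pred, ytest)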


[Package rchemo version 0.1-1 Index]