getknn {rchemo}    R Documentation

KNN selection

Description

Function getknn selects, for each row (observation) of a new data set (the query), the k nearest neighbors within a training data set, based on a dissimilarity measure.

getknn uses function get.knnx of package FNN (Beygelzimer et al.) available on CRAN.
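
To make the selection step concrete, here is a minimal brute-force sketch in base R using the Euclidean distance. The helper knn_brute is hypothetical and written only for illustration; it is not the code used by getknn or by FNN, and it makes no attempt at efficiency.

```r
# Brute-force k-nearest-neighbor search with the Euclidean distance (base R only).
# knn_brute is a hypothetical helper for illustration, not part of rchemo or FNN.
knn_brute <- function(Xtrain, X, k) {
  Xtrain <- as.matrix(Xtrain)
  X <- as.matrix(X)
  m <- nrow(X)
  nn <- matrix(NA_integer_, nrow = m, ncol = k)
  d <- matrix(NA_real_, nrow = m, ncol = k)
  for (i in seq_len(m)) {
    # Euclidean distances from query row i to every training row
    delta <- sweep(Xtrain, 2, X[i, ])
    dist_i <- sqrt(rowSums(delta^2))
    # Indexes of the k smallest distances, nearest first
    idx <- order(dist_i)[seq_len(k)]
    nn[i, ] <- idx
    d[i, ] <- dist_i[idx]
  }
  list(nn = nn, d = d)
}

set.seed(1)
Xtrain <- matrix(rnorm(10 * 4), ncol = 4)
Xtest <- Xtrain[c(1, 3), , drop = FALSE]
res <- knn_brute(Xtrain, Xtest, k = 3)
res$nn  # row i holds the training-row indexes of the neighbors of query i
res$d   # matching distances, sorted in increasing order within each row
```

Because the two query rows are taken from the training set, each one is its own nearest neighbor, at distance 0.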

Usage


getknn(Xtrain, X, k = NULL, diss = c("eucl", "mahal"), 
  algorithm = "brute", list = TRUE)

Arguments

Xtrain

Training X-data (n, p).

X

New X-data (m, p) to consider.

k

The number of nearest neighbors to select in Xtrain for each observation of X.

diss

The type of dissimilarity used. Possible values are "eucl" (default; Euclidean distance) or "mahal" (Mahalanobis distance).

algorithm

Search algorithm used for Euclidean and Mahalanobis distances. Defaults to "brute". See get.knnx in package FNN.

list

If TRUE (default), the outputs are also returned in a list format (see listnn and listd in Value).
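
As background for the "mahal" option of diss, the sketch below checks in base R that the Mahalanobis distance equals the Euclidean distance computed on whitened data. This is an illustration under stated assumptions: the covariance is estimated with cov() on simulated data, and nothing here describes how getknn computes the distance internally.

```r
# Mahalanobis distance seen as a Euclidean distance on whitened data (base R only).
# Illustration only: this is not a description of getknn's internal computation.
set.seed(1)
Xtrain <- matrix(rnorm(50 * 3), ncol = 3)
x <- Xtrain[1, ]          # one query observation
S <- cov(Xtrain)          # covariance estimate (assumption of this sketch)

# Squared Mahalanobis distances from every training row to x
d2_ref <- mahalanobis(Xtrain, center = x, cov = S)

# Same quantity via whitening: with S = U'U (Cholesky), map each row a to a U^{-1}
U <- chol(S)
Uinv <- solve(U)
Z <- Xtrain %*% Uinv
z <- as.vector(x %*% Uinv)
d2_white <- rowSums(sweep(Z, 2, z)^2)

all.equal(d2_ref, d2_white)
```

Note that mahalanobis() returns squared distances, so a square root is needed to compare them with ordinary Euclidean distances.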

Value

A list of outputs, including:

nn

A data frame (m x k) with the row indexes, in Xtrain, of the neighbors of each observation of X.

d

A data frame (m x k) with the dissimilarities between the neighbors and the new observations.

listnn

Same as $nn but in a list format.

listd

Same as $d but in a list format.
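
The relationship between the (m x k) table and the list format can be sketched as follows. The index values are made up for illustration and are not getknn output; the exact element type that getknn returns in listnn is not specified here, so treat the conversion as an assumption.

```r
# Converting an (m x k) table of neighbor indexes to a list of length m.
# The values below are made up for illustration; they are not getknn output.
nn <- data.frame(V1 = c(1L, 3L), V2 = c(7L, 5L), V3 = c(4L, 9L))

# One integer vector of length k per query observation
listnn <- lapply(seq_len(nrow(nn)),
                 function(i) unlist(nn[i, ], use.names = FALSE))

listnn[[2]]  # indexes of the neighbors of the second query observation
```

A list is convenient when the neighbors of each query observation are processed one observation at a time, for instance with lapply.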

Examples


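library(rchemo)

## Simulated data: n training observations, p variables
n <- 10
p <- 4
X <- matrix(rnorm(n * p), ncol = p)
Xtrain <- X
## Query: two observations taken from the training set
Xtest <- X[c(1, 3), ]
m <- nrow(Xtest)

## k nearest neighbors with the Euclidean distance (default)
k <- 3
getknn(Xtrain, Xtest, k = k)

## Mahalanobis distance computed on PCA scores
fm <- pcasvd(Xtrain, nlv = 2)
Ttrain <- fm$T
Ttest <- transform(fm, Xtest)
getknn(Ttrain, Ttest, k = k, diss = "mahal")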


[Package rchemo version 0.1-1 Index]