kNN {DMwR2} | R Documentation |
k-Nearest Neighbour Classification
Description
This function provides a formula interface to the existing knn()
function of package class. On top of this convenient interface, the
function also allows standardization of the given data.
Usage
kNN(form, train, test, stand = TRUE, stand.stats = NULL, ...)
Arguments
form |
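An object of the class formula, with the class variable on the left-hand side and the predictors on the right (e.g. Species ~ .). |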
train |
The data to be used as training set. |
test |
The data set for which we want to obtain the k-NN classification, i.e. the test set. |
stand |
A boolean indicating whether the training data should be standardized before obtaining the k-NN predictions (defaults to TRUE). |
stand.stats |
This argument allows the user to supply the centrality and spread
statistics that will drive the standardization. If not supplied they
will default to the statistics used in the function |
... |
Any other parameters that will be forwarded to the knn() function of package class (e.g. k). |
Details
This function is essentially a convenience function that provides a
formula-based interface to the already existing knn() function of
package class. On top of this interface it also offers standardization
of the data before the k-nearest neighbour classification algorithm is
applied. The algorithm is based on the distances between observations,
which are very sensitive to differences in the scales of the variables;
hence the usefulness of standardization.
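The sensitivity to scale can be seen in a small sketch that uses only base R. The four observations below are made up for illustration (they are not from the package or the book), and the names obs, raw_dist and std_dist are hypothetical. Before standardization the income column dominates the Euclidean distance; after standardization both columns contribute comparably, and the nearest neighbour of observation "a" changes.

```r
# Hypothetical data: income is in the tens of thousands, age in the tens
obs <- data.frame(income = c(30000, 31000, 35000, 90000),
                  age    = c(25, 60, 26, 40))
rownames(obs) <- c("a", "b", "c", "d")

# Euclidean distances on the raw values (dominated by income)
raw_dist <- as.matrix(dist(obs))

# Euclidean distances after standardizing each column
# (subtract the column mean, divide by the column standard deviation)
std_dist <- as.matrix(dist(scale(obs)))

others <- c("b", "c", "d")
nearest_raw <- names(which.min(raw_dist["a", others]))
nearest_std <- names(which.min(std_dist["a", others]))

nearest_raw  # "b": closest in income, but 35 years older
nearest_std  # "c": close in both income and age
```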
Value
The return value is the same as in the knn() function of package
class. This is a factor of classifications of the test set cases.
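As a rough illustration of what a formula-plus-standardization wrapper around knn() might do, the sketch below calls knn() from package class directly. It is an assumption-laden approximation, not the source code of kNN(): the seed, the variable names (train_x, test_x, preds) and the choice to reuse the training centers and scales on the test set are all choices made for this example. It also shows that the result is a factor with one prediction per test case.

```r
library(class)  # provides knn()

data(iris)
set.seed(1234)  # arbitrary seed, only to make the split reproducible
idxs  <- sample(nrow(iris), as.integer(0.7 * nrow(iris)))
train <- iris[idxs, ]
test  <- iris[-idxs, ]

# Position of the column holding the class labels
tgt <- which(colnames(train) == "Species")

# Standardize the training predictors, then apply the same centers
# and scales to the test predictors (an assumption of this sketch)
train_x <- scale(train[, -tgt])
test_x  <- scale(test[, -tgt],
                 center = attr(train_x, "scaled:center"),
                 scale  = attr(train_x, "scaled:scale"))

# 5-nearest neighbours classification of the test cases
preds <- knn(train_x, test_x, cl = train[, tgt], k = 5)

is.factor(preds)   # the predictions come back as a factor
length(preds)      # one prediction per row of the test set
levels(preds)      # same levels as the training labels
```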
Author(s)
Luis Torgo ltorgo@dcc.fc.up.pt
References
Torgo, L. (2016) Data Mining with R: learning with case studies, second edition, Chapman & Hall/CRC (ISBN-13: 978-1482234893).
See Also
Examples
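## A small example with the iris data set
library(DMwR2)
data(iris)
## Split into train (70%) and test (30%) sets
set.seed(1234)  # arbitrary seed, only to make the split reproducible
idxs <- sample(1:nrow(iris), as.integer(0.7 * nrow(iris)))
trainIris <- iris[idxs, ]
testIris <- iris[-idxs, ]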
## A 3-nearest neighbours model with no standardization
nn3 <- kNN(Species ~ .,trainIris,testIris,stand=FALSE,k=3)
## The resulting confusion matrix
table(testIris[,'Species'],nn3)
## Now a 5-nearest neighbours model with standardization
nn5 <- kNN(Species ~ .,trainIris,testIris,stand=TRUE,k=5)
## The resulting confusion matrix
table(testIris[,'Species'],nn5)
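A confusion matrix such as the ones above can be summarized as an overall accuracy. The sketch below uses made-up label vectors (truth and pred are hypothetical and are not output from kNN()) so that it runs with base R alone; with real results, the test set labels and the returned factor would take their place.

```r
# Hypothetical true and predicted labels, for illustration only
truth <- factor(c("setosa", "setosa", "versicolor",
                  "versicolor", "virginica", "virginica"))
pred  <- factor(c("setosa", "setosa", "versicolor",
                  "virginica", "virginica", "virginica"),
                levels = levels(truth))

# Rows are the true classes, columns the predicted classes
cm <- table(truth, pred)

# Correct predictions lie on the diagonal of the confusion matrix
accuracy <- sum(diag(cm)) / sum(cm)
accuracy
```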