OKNNE {OkNNE} | R Documentation |
Optimal k-Nearest Neighbours Ensemble
Description
Optimal k-Nearest Neighbours Ensemble ("OkNNE") is an ensemble of base k-NN models, each constructed on a bootstrap sample with a random subset of features. In each base k-NN model, the k observations closest to a test point x are identified, and a stepwise regression is fitted to them to predict the output value of x. The final prediction for x is the mean of the estimates given by all base models. OKNNE takes training and test datasets, trains the model on the training data, and predicts the test data.
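The procedure described above can be sketched in base R. This is only an illustration under stated assumptions, not the package source: the function name oknne_sketch, the Euclidean distance, the fixed forward selection via stats::step, and the fallback to the neighbour mean when a local fit fails are all choices made for this example.

```r
# Illustrative sketch only; oknne_sketch is a hypothetical name and this is
# not the code used inside the OkNNE package.
oknne_sketch <- function(xtrain, ytrain, xtest, k = 10, B = 20,
                         q = trunc(sqrt(ncol(xtrain)))) {
  xtrain <- as.data.frame(xtrain)
  xtest <- as.data.frame(xtest)
  preds <- matrix(NA_real_, nrow = nrow(xtest), ncol = B)
  for (b in seq_len(B)) {
    rows <- sample(nrow(xtrain), replace = TRUE)   # bootstrap sample
    feats <- sample(names(xtrain), q)              # random feature subset
    xb <- xtrain[rows, feats, drop = FALSE]
    yb <- ytrain[rows]
    for (i in seq_len(nrow(xtest))) {
      x0 <- unlist(xtest[i, feats, drop = FALSE])
      # Euclidean distance from the test point to every bootstrap row
      d <- sqrt(rowSums(sweep(as.matrix(xb), 2, x0)^2))
      nn <- order(d)[seq_len(k)]                   # k nearest neighbours
      nbhd <- data.frame(y = yb[nn], xb[nn, , drop = FALSE])
      preds[i, b] <- tryCatch({
        # Forward stepwise regression on the neighbourhood only
        null_fit <- lm(y ~ 1, data = nbhd)
        fit <- step(null_fit, scope = reformulate(feats),
                    direction = "forward", trace = 0)
        unname(predict(fit, newdata = xtest[i, feats, drop = FALSE]))
      }, error = function(e) mean(nbhd$y))         # assumed fallback
    }
  }
  rowMeans(preds)                                  # average over base models
}

# Small simulated example (data invented for illustration)
set.seed(1)
n <- 120; p <- 9
x <- as.data.frame(matrix(rnorm(n * p), n, p))
names(x) <- paste0("x", seq_len(p))
y <- 2 * x$x1 - x$x2 + rnorm(n, sd = 0.5)
train <- 1:100; test <- 101:120
pred <- oknne_sketch(x[train, ], y[train], x[test, ], k = 10, B = 5)
```

The sketch loops over test points for readability; a practical implementation would normally use a dedicated nearest-neighbour search, as the algorithm argument suggests.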
Usage
OKNNE(xtrain, ytrain, xtest = NULL, ytest = NULL, k = 10, B = 100,
direction = "forward", q = trunc(sqrt(ncol(xtrain))), algorithm =
c("kd_tree", "cover_tree", "CR", "brute"))
Arguments
xtrain |
The feature space of the training dataset. |
ytrain |
The response variable of the training dataset. |
xtest |
The feature space of the test dataset to be predicted. |
ytest |
The response variable of the test dataset. |
k |
The maximum number of nearest neighbors to search. The default value is set to 10. |
B |
The number of bootstrap samples. |
direction |
Method used to fit stepwise models. By default direction = "forward". |
q |
The number of features to be selected for each base k-NN model. |
algorithm |
Method used for searching nearest neighbors: "kd_tree", "cover_tree", "CR" or "brute". |
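As a small illustration of the default q = trunc(sqrt(ncol(xtrain))), the snippet below evaluates it for a few feature counts chosen for this example. With 14 features, as in the SMSA example once the response column is removed from its 15 columns, each base model would use 3 features.

```r
# Default number of features per base model for several feature counts
p <- c(4, 9, 14, 50, 100)          # illustrative values of ncol(xtrain)
q_default <- trunc(sqrt(p))
data.frame(features = p, q = q_default)
```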
Value
PREDICTIONS |
Predicted values for test data response variable |
RMSE |
Root mean square error estimate based on test data |
R.SQUARE |
Coefficient of determination estimate based on test data |
CORRELATION |
Correlation estimate based on test data |
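The three accuracy measures can be computed from test responses and predictions roughly as follows. This is a sketch under assumed definitions (R-squared as one minus the ratio of residual to total sum of squares, and Pearson correlation); the package may define them differently, and eval_metrics is a hypothetical helper, not part of OkNNE.

```r
# Hypothetical helper; the definitions here are assumptions, not package code.
eval_metrics <- function(ytest, pred) {
  resid <- ytest - pred
  list(
    RMSE = sqrt(mean(resid^2)),
    R.SQUARE = 1 - sum(resid^2) / sum((ytest - mean(ytest))^2),
    CORRELATION = cor(ytest, pred)
  )
}

# Toy values invented for illustration
m <- eval_metrics(ytest = c(1, 2, 3, 4), pred = c(1, 2, 3, 5))
m$RMSE      # 0.5
m$R.SQUARE  # 0.8
```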
Author(s)
Amjad Ali, Muhammad Hamraz, Zardad Khan
Maintainer: Amjad Ali <aalistat1@gmail.com>
References
A. Ali et al., "A k-Nearest Neighbours Based Ensemble Via Optimal Model Selection For Regression," in IEEE Access, doi: 10.1109/ACCESS.2020.3010099.
Li, S. (2009). Random KNN modeling and variable selection for high dimensional data.
Shengqiao Li, E James Harner and Donald A Adjeroh. (2011). Random KNN feature selection - a fast and stable alternative to Random Forests. BMC Bioinformatics, 12:450.
Alina Beygelzimer, Sham Kakadet, John Langford, Sunil Arya, David Mount and Shengqiao Li (2019). FNN: Fast Nearest Neighbor Search Algorithms and Applications. R package version 1.1.3.
Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S. New York: Springer (4th ed).
Examples
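library(OkNNE)

# Load the SMSA data and check for missing values
data(SMSA)
anyNA(SMSA)
#[1] FALSE
dim(SMSA)
#[1] 59 15

# Separate the features from the response variable "NOx"
n <- nrow(SMSA)
X <- SMSA[names(SMSA) != "NOx"]
Y <- SMSA[names(SMSA) == "NOx"]

# Split the observations 70/30 into training and test sets
set.seed(25)
train.obs <- sample(1:n, 0.7 * n, replace = FALSE)
test.obs <- (1:n)[-train.obs]
xtrain <- X[train.obs, ]; ytrain <- Y[train.obs, ]
xtest <- X[test.obs, ]; ytest <- Y[test.obs, ]

# Fit OkNNE on B = 5 bootstrap samples with stepwise selection in both directions
OkNNE.MODEL <- OKNNE(xtrain = xtrain, ytrain = ytrain, xtest = xtest,
                     ytest = ytest, k = 10, B = 5,
                     q = trunc(sqrt(ncol(xtrain))), direction = "both",
                     algorithm = c("kd_tree", "cover_tree", "CR", "brute"))
OkNNE.MODEL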