SL.kernelKnn {SuperLearner} | R Documentation |
SL wrapper for KernelKNN
Description
Wrapper for a configurable implementation of k-nearest neighbors. Supports both binomial and gaussian outcome distributions.
Usage
SL.kernelKnn(Y, X, newX, family, k = 10, method = "euclidean",
  weights_function = NULL, extrema = F, h = 1, ...)
Arguments
Y |
Outcome variable |
X |
Training dataframe |
newX |
Test dataframe |
family |
Gaussian or binomial |
k |
Number of nearest neighbors to use |
method |
Distance method. Can be 'euclidean' (default), 'manhattan', 'chebyshev', 'canberra', 'braycurtis', 'pearson_correlation', 'simple_matching_coefficient', 'minkowski' (by default the Minkowski order 'p' equals k), 'hamming', 'mahalanobis', 'jaccard_coefficient', 'Rao_coefficient'. |
weights_function |
Weighting method for combining the nearest neighbors. NULL (the default) weights all neighbors equally; otherwise can be 'uniform', 'triangular', 'epanechnikov', 'biweight', 'triweight', 'tricube', 'gaussian', 'cosine', 'logistic', 'gaussianSimple', 'silverman', 'inverse', 'exponential'. |
extrema |
If TRUE, the minimum and maximum values among the k nearest neighbors are removed (can be thought of as outlier removal). |
h |
The bandwidth, applicable if weights_function is not NULL. Defaults to 1.0. |
... |
Any additional parameters, not currently passed through. |
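To make the roles of k, weights_function and h concrete, the following base-R sketch predicts the outcome for a single test point with kernel-weighted k-nearest neighbors. It is an illustration under stated assumptions, not KernelKnn's actual implementation: the helpers knn_predict_sketch and gaussian_kernel are hypothetical, distances are Euclidean, and applying the kernel to distance divided by h is an assumed weighting scheme.

```r
# Hypothetical helper, for illustration only (not part of SuperLearner or KernelKnn).
knn_predict_sketch = function(X, Y, x_new, k = 10, kernel = NULL, h = 1) {
  X = as.matrix(X)
  # Euclidean distance from x_new to every training row.
  d = sqrt(rowSums(sweep(X, 2, x_new)^2))
  # Indices of the k closest training rows.
  nn = order(d)[seq_len(k)]
  if (is.null(kernel)) {
    # No kernel: every neighbor counts equally.
    w = rep(1, k)
  } else {
    # Assumed scheme: kernel applied to distance scaled by the bandwidth h.
    w = kernel(d[nn] / h)
  }
  # Weighted mean of the neighbors' outcomes. For a 0/1 outcome this
  # estimates the probability of class 1.
  sum(w * Y[nn]) / sum(w)
}

# Assumed Gaussian kernel shape.
gaussian_kernel = function(u) exp(-u^2 / 2)

# Tiny toy data set: the closest training point to x_new has outcome 0.
X_toy = matrix(c(0, 0,
                 1, 0,
                 0, 1,
                 5, 5), ncol = 2, byrow = TRUE)
Y_toy = c(0, 1, 1, 0)
x_new = c(0, 0)

p_uniform = knn_predict_sketch(X_toy, Y_toy, x_new, k = 3)
p_weighted = knn_predict_sketch(X_toy, Y_toy, x_new, k = 3,
                                kernel = gaussian_kernel, h = 1)
```

In this toy data the nearest neighbor has outcome 0, so the kernel-weighted prediction is pulled below the unweighted one. A larger h flattens the weights and moves the result back toward the unweighted mean.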
Value
List with predictions and the original training data and hyperparameters.
Examples
library(SuperLearner)
# Load a test dataset.
data(PimaIndiansDiabetes2, package = "mlbench")
data = PimaIndiansDiabetes2
# Omit observations with missing data.
data = na.omit(data)
# Code the outcome as 0/1; as.numeric() on the factor itself would give 1/2.
Y_bin = as.numeric(data$diabetes == "pos")
X = subset(data, select = -diabetes)
set.seed(1)
sl = SuperLearner(Y_bin, X, family = binomial(),
                  SL.library = c("SL.mean", "SL.kernelKnn"))
sl
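Because the hyperparameters are ordinary function arguments, one common way to try non-default settings is to define small wrapper functions and list them in SL.library. The sketch below assumes this pattern; the wrapper names SL.kernelKnn_k5 and SL.kernelKnn_manhattan and the simulated data are hypothetical, and the fit is only run when the SuperLearner and KernelKnn packages are installed.

```r
# Hypothetical wrapper names; each one fixes a different hyperparameter
# and forwards all remaining arguments to SL.kernelKnn.
SL.kernelKnn_k5 = function(...) SL.kernelKnn(..., k = 5)
SL.kernelKnn_manhattan = function(...) SL.kernelKnn(..., method = "manhattan")

# Simulated continuous outcome, so the sketch does not depend on mlbench.
set.seed(1)
n = 200
X_sim = data.frame(x1 = rnorm(n), x2 = rnorm(n))
Y_sim = X_sim$x1 - 0.5 * X_sim$x2 + rnorm(n)

# Run the ensemble only if both packages are available.
if (requireNamespace("SuperLearner", quietly = TRUE) &&
    requireNamespace("KernelKnn", quietly = TRUE)) {
  library(SuperLearner)
  sl_sim = SuperLearner(Y_sim, X_sim, family = gaussian(),
                        SL.library = c("SL.mean", "SL.kernelKnn",
                                       "SL.kernelKnn_k5",
                                       "SL.kernelKnn_manhattan"))
  # Prints the cross-validated risk and ensemble weight of each learner.
  print(sl_sim)
}
```

Comparing the risk estimates across the three k-nearest-neighbor variants gives a rough sense of how sensitive the fit is to k and to the distance method on a given data set.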