ADASYN {SMOTEWB} | R Documentation
Adaptive Synthetic Sampling
Description
Generates synthetic data for the minority class to balance imbalanced datasets using ADASYN.
Usage
ADASYN(x, y, k = 5)
Arguments
x | Feature matrix or data.frame.
y | A factor class variable with two classes.
k | Number of neighbors. Default is 5.
Details
Adaptive Synthetic Sampling (ADASYN) is an extension of the Synthetic Minority Over-sampling Technique (SMOTE) that generates synthetic examples for the minority class (He et al., 2008). Unlike SMOTE, which treats all minority examples alike, ADASYN generates more synthetic examples for the minority examples that are harder to learn, that is, those with a larger share of majority-class points among their k nearest neighbors, which tend to lie closer to the decision boundary.
Note: Much faster than smotefamily::ADAS().
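The adaptive allocation described above can be sketched in base R. This is a minimal illustration of the weighting step from He et al. (2008), not the SMOTEWB source code. The variable names (`r`, `r_hat`, `G`, `g`) and the toy data are assumptions made for the example, and the sketch assumes the goal is full class balance.

```r
# Illustrative sketch only; not the SMOTEWB implementation.
set.seed(1)
x <- rbind(matrix(rnorm(400, 3, 1), ncol = 2),   # 200 majority points
           matrix(rnorm(40, 5, 1), ncol = 2))    # 20 minority points
y <- factor(c(rep("negative", 200), rep("positive", 20)))
k <- 5

is_pos <- y == "positive"
d <- as.matrix(dist(x))   # pairwise Euclidean distances
diag(d) <- Inf            # a point is not its own neighbor

# r[i]: share of majority-class points among the k nearest
# neighbors of the i-th minority point (higher = harder to learn)
r <- sapply(which(is_pos), function(i) {
  nn <- order(d[i, ])[1:k]
  mean(!is_pos[nn])
})

# Total number of synthetic samples needed for full balance
G <- sum(!is_pos) - sum(is_pos)

# Normalize the ratios into weights; fall back to uniform weights
# if no minority point has a majority-class neighbor
r_hat <- if (sum(r) > 0) r / sum(r) else rep(1 / length(r), length(r))

# g[i]: number of synthetic samples allocated to the i-th minority point
g <- round(r_hat * G)
```

Minority points surrounded mostly by majority-class neighbors receive the largest entries in `g`, which is what distinguishes this scheme from SMOTE's uniform allocation.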
Value
A list containing the resampled dataset.
x_new | Resampled feature matrix.
y_new | Resampled target variable.
x_syn | Generated synthetic data.
C | Number of synthetic samples generated for each positive-class sample.
Author(s)
Fatih Saglam, saglamf89@gmail.com
References
He, H., Bai, Y., Garcia, E. A., & Li, S. (2008, June). ADASYN: Adaptive synthetic sampling approach for imbalanced learning. In 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence) (pp. 1322-1328). IEEE.
Examples
set.seed(1)
# 1000 majority ("negative") and 50 minority ("positive") points in 2-D
x <- rbind(matrix(rnorm(2000, 3, 1), ncol = 2, nrow = 1000),
           matrix(rnorm(100, 5, 1), ncol = 2, nrow = 50))
y <- as.factor(c(rep("negative", 1000), rep("positive", 50)))
plot(x, col = y)

# resampling
m <- ADASYN(x = x, y = y, k = 3)
# original and synthetic points together, colored by class
plot(m$x_new, col = m$y_new)
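The components listed under Value can also be inspected numerically, for example to compare class counts before and after resampling. The following is a short sketch that assumes the SMOTEWB package is installed; the call is skipped otherwise. The data are regenerated so the snippet runs on its own.

```r
set.seed(1)
x <- rbind(matrix(rnorm(2000, 3, 1), ncol = 2),
           matrix(rnorm(100, 5, 1), ncol = 2))
y <- factor(c(rep("negative", 1000), rep("positive", 50)))

# Run only when SMOTEWB is available (an assumption of this sketch)
if (requireNamespace("SMOTEWB", quietly = TRUE)) {
  m <- SMOTEWB::ADASYN(x = x, y = y, k = 3)
  print(table(y))         # class counts before resampling
  print(table(m$y_new))   # class counts after resampling
  print(dim(m$x_syn))     # dimensions of the synthetic data alone
  print(sum(m$C))         # total of the per-sample synthetic counts
}
```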