ADASYN {SMOTEWB}    R Documentation

Adaptive Synthetic Sampling

Description

Generates synthetic data for the minority class to balance an imbalanced dataset using ADASYN.

Usage

ADASYN(x, y, k = 5)

Arguments

x

feature matrix or data.frame.

y

a factor class variable with two classes.

k

number of neighbors. Default is 5.
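
As a minimal sketch of preparing inputs in the shapes described above, assume a hypothetical data frame `df` with columns `f1`, `f2` and `label`. These names are illustrative only and are not part of the package.

```r
# Hypothetical data frame; column names are illustrative only.
set.seed(1)
df <- data.frame(
  f1 = rnorm(60),
  f2 = rnorm(60),
  label = c(rep("negative", 50), rep("positive", 10))
)

x <- as.matrix(df[, c("f1", "f2")])  # numeric feature matrix
y <- factor(df$label)                # factor with exactly two classes

# Sanity checks before resampling
stopifnot(is.factor(y), nlevels(y) == 2, nrow(x) == length(y))
table(y)                             # class counts, to confirm the imbalance
```

Checking `nlevels(y)` up front catches unused or extra factor levels, which would conflict with the documented two-class requirement.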

Details

Adaptive Synthetic Sampling (ADASYN) is an extension of the Synthetic Minority Over-sampling Technique (SMOTE) that generates synthetic examples for the minority class (He et al., 2008). Whereas SMOTE generates the same number of synthetic examples for every minority example, ADASYN generates more for the minority examples that are harder to learn, that is, those with a larger share of majority-class examples among their k nearest neighbors, which typically lie closer to the decision boundary.

Note: This implementation is much faster than smotefamily::ADAS().
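
The adaptive allocation from He et al. (2008) can be sketched in base R as follows. The helper `adasyn_counts()` and its `minority` argument are hypothetical and are not part of SMOTEWB. The sketch assumes Euclidean distance, full balancing (beta = 1 in the paper's notation) and simple rounding. The package's own implementation may differ in these details.

```r
# Hypothetical helper (not part of SMOTEWB): number of synthetic samples
# to generate per minority sample, following He et al. (2008).
adasyn_counts <- function(x, y, k = 5, minority = "positive") {
  x <- as.matrix(x)
  is_min <- y == minority
  n_min <- sum(is_min)
  n_maj <- sum(!is_min)
  G <- n_maj - n_min          # total synthetic samples for full balance (assumes beta = 1)
  d <- as.matrix(dist(x))     # pairwise Euclidean distances
  diag(d) <- Inf              # a point is not its own neighbor
  # r_i: share of majority-class points among the k nearest neighbors
  r <- vapply(which(is_min), function(i) {
    nn <- order(d[i, ])[seq_len(k)]
    mean(!is_min[nn])
  }, numeric(1))
  if (sum(r) == 0) return(rep(0, n_min))  # no minority point has a majority neighbor
  round(r / sum(r) * G)       # normalize the ratios and scale to G
}

set.seed(1)
x <- rbind(matrix(rnorm(400, 3, 1), ncol = 2),
           matrix(rnorm(40, 5, 1), ncol = 2))
y <- factor(c(rep("negative", 200), rep("positive", 20)))
g <- adasyn_counts(x, y, k = 3)
length(g)                     # one count per minority sample
```

Minority samples surrounded mostly by majority-class neighbors receive larger counts, while samples deep inside the minority region receive few or none. Because of the rounding, the counts may not sum exactly to `G`.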

Value

A list containing the resampled dataset.

x_new

Resampled feature matrix.

y_new

Resampled target variable.

x_syn

Generated synthetic data.

C

Number of synthetic samples generated for each positive class sample.

Author(s)

Fatih Saglam, saglamf89@gmail.com

References

He, H., Bai, Y., Garcia, E. A., & Li, S. (2008, June). ADASYN: Adaptive synthetic sampling approach for imbalanced learning. In 2008 IEEE international joint conference on neural networks (IEEE world congress on computational intelligence) (pp. 1322-1328). IEEE.

Examples


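library(SMOTEWB)

set.seed(1)
# 1000 majority ("negative") and 50 minority ("positive") points in 2D
x <- rbind(matrix(rnorm(2000, 3, 1), ncol = 2, nrow = 1000),
           matrix(rnorm(100, 5, 1), ncol = 2, nrow = 50))
y <- as.factor(c(rep("negative", 1000), rep("positive", 50)))

# original, imbalanced data
plot(x, col = y)

# resampling
m <- ADASYN(x = x, y = y, k = 3)

# resampled data, including the synthetic minority points
plot(m$x_new, col = m$y_new)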


[Package SMOTEWB version 1.2.0 Index]