GSMOTE {SMOTEWB} | R Documentation |
Geometric Synthetic Minority Oversampling Technique (GSMOTE)
Description
Resampling with GSMOTE.
Usage
GSMOTE(x, y, k = 5, alpha_sel = "combined", alpha_trunc = 0.5, alpha_def = 0.5)
Arguments
x |
feature matrix. |
y |
a factor class variable with two classes. |
k |
number of neighbors. Default is 5. |
alpha_sel |
selection method. Can be "minority", "majority" or "combined". Default is "combined". |
alpha_trunc |
truncation factor. A numeric value in [-1, 1]. Default is 0.5. |
alpha_def |
deformation factor. A numeric value in [0, 1]. Default is 0.5. |
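The three values of alpha_sel correspond to the selection strategies described by Douzas & Bacao (2019). The sketch below is one reading of that paper in base R and is not taken from the SMOTEWB source. The helper name `select_surface` and its arguments `center`, `x_min` and `x_maj` are hypothetical, and `x_min` is assumed not to contain the center itself.

```r
# Hypothetical helper illustrating the three selection strategies;
# a sketch based on the paper, not the SMOTEWB internals.
select_surface <- function(center, x_min, x_maj, k = 5,
                           alpha_sel = c("combined", "minority", "majority")) {
  alpha_sel <- match.arg(alpha_sel)
  # Euclidean distance from every row of m to the center
  dist_to <- function(m) sqrt(rowSums(sweep(m, 2, center)^2))

  # Minority candidate: a random one of the k nearest minority neighbours
  d_min <- dist_to(x_min)
  nn_min <- order(d_min)[seq_len(min(k, nrow(x_min)))]
  i_min <- nn_min[sample.int(length(nn_min), 1)]

  # Majority candidate: the nearest majority sample
  d_maj <- dist_to(x_maj)
  i_maj <- which.min(d_maj)

  switch(alpha_sel,
    minority = x_min[i_min, ],
    majority = x_maj[i_maj, ],
    # combined: whichever of the two candidates is closer to the center
    combined = if (d_min[i_min] <= d_maj[i_maj]) x_min[i_min, ] else x_maj[i_maj, ]
  )
}

center <- c(0, 0)
x_min <- rbind(c(1, 0), c(2, 0), c(3, 0))
x_maj <- rbind(c(0, 0.5), c(0, 4))
s_maj <- select_surface(center, x_min, x_maj, alpha_sel = "majority")
s_comb <- select_surface(center, x_min, x_maj, alpha_sel = "combined")
s_min <- select_surface(center, x_min, x_maj, k = 1, alpha_sel = "minority")
```

In this toy data the nearest majority sample is closer to the center than any minority sample, so "majority" and "combined" pick the same point.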
Details
GSMOTE (Douzas & Bacao, 2019) is an oversampling method that generates synthetic samples inside a geometric region around selected minority samples, rather than on the line segment between two samples as in SMOTE. See the paper for details.
NOTE: Only two-class problems are supported. Only numeric variables are allowed.
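To make the geometric idea concrete, the following base R sketch generates one synthetic point from a minority sample (`center`) and a selected neighbour (`surface`), following the hypersphere, truncation, deformation and translation steps described in the paper. It is an illustration under those assumptions, not the code used by GSMOTE in this package. The function name `geometric_sample` is hypothetical.

```r
# Illustrative sketch only; not the SMOTEWB implementation.
geometric_sample <- function(center, surface, alpha_trunc = 0.5, alpha_def = 0.5) {
  radius <- sqrt(sum((surface - center)^2))
  if (radius == 0) return(center)
  p <- length(center)

  # Uniform random point inside the unit hypersphere
  v <- rnorm(p)
  point <- runif(1)^(1 / p) * v / sqrt(sum(v^2))

  # Unit vector pointing from the center to the surface point
  e_par <- (surface - center) / radius

  # Truncation: reflect points that fall in the cut-off part of the sphere
  proj <- sum(point * e_par)
  if ((alpha_trunc > 0 && proj < alpha_trunc - 1) ||
      (alpha_trunc < 0 && proj > alpha_trunc + 1)) {
    point <- point - 2 * proj * e_par
  }

  # Deformation: shrink the component perpendicular to e_par
  par_part <- sum(point * e_par) * e_par
  perp_part <- point - par_part
  point <- par_part + (1 - alpha_def) * perp_part

  # Translation: rescale by the radius and move to the center
  center + radius * point
}

set.seed(1)
center <- c(5, 5)
surface <- c(6, 5)
# With alpha_trunc = 1 and alpha_def = 1 the region collapses to the
# segment between center and surface
pts <- t(replicate(200, geometric_sample(center, surface,
                                         alpha_trunc = 1, alpha_def = 1)))
```

With alpha_trunc = 0 and alpha_def = 0 the same function samples from the full hypersphere around the center, which is the other extreme of the two parameters.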
Value
a list containing the resampled dataset and the generated synthetic samples.
x_new |
Resampled feature matrix. |
y_new |
Resampled target variable. |
x_syn |
Generated synthetic feature data. |
y_syn |
Generated synthetic label data. |
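A short sketch of how the returned list might be inspected, using the same simulated data as the Examples section. It assumes the SMOTEWB package is installed and relies only on the components listed above. No expected counts are shown because they depend on the resampling result.

```r
library(SMOTEWB)

set.seed(1)
x <- rbind(matrix(rnorm(2000, 3, 1), ncol = 2),
           matrix(rnorm(100, 5, 1), ncol = 2))
y <- as.factor(c(rep("negative", 1000), rep("positive", 50)))

m <- GSMOTE(x = x, y = y, k = 7)

# Class counts before and after resampling
table(y)
table(m$y_new)

# The synthetic samples are also returned on their own
dim(m$x_syn)
table(m$y_syn)
```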
Author(s)
Fatih Saglam, saglamf89@gmail.com
References
Douzas, G., & Bacao, F. (2019). Geometric SMOTE: a geometrically enhanced drop-in replacement for SMOTE. Information Sciences, 501, 118-135.
Examples
library(SMOTEWB)

set.seed(1)
# imbalanced two-class data: 1000 majority and 50 minority samples
x <- rbind(matrix(rnorm(2000, 3, 1), ncol = 2, nrow = 1000),
           matrix(rnorm(100, 5, 1), ncol = 2, nrow = 50))
y <- as.factor(c(rep("negative", 1000), rep("positive", 50)))
plot(x, col = y)

# resampling
m <- GSMOTE(x = x, y = y, k = 7)
plot(m$x_new, col = m$y_new)