GSMOTE {SMOTEWB}    R Documentation

Geometric Synthetic Minority Oversampling Technique (GSMOTE)

Description

Oversamples the minority class of a two-class dataset using GSMOTE.

Usage

GSMOTE(x, y, k = 5, alpha_sel = "combined", alpha_trunc = 0.5, alpha_def = 0.5)

Arguments

x

feature matrix.

y

a factor class variable with two classes.

k

number of neighbors. Default is 5.

alpha_sel

selection method. Can be "minority", "majority" or "combined". Default is "combined".

alpha_trunc

truncation factor. A numeric value in [-1,1]. Default is 0.5.

alpha_def

deformation factor. A numeric value in [0,1]. Default is 0.5.

Details

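GSMOTE (Douzas & Bacao, 2019) is an oversampling method that generates synthetic samples inside a geometric region around each selected minority sample, rather than only on the line segments between neighbors as in SMOTE. The shape of the region is controlled by alpha_trunc and alpha_def. See Douzas & Bacao (2019) for details.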

NOTE: Only two-class problems are supported. Only numeric variables are allowed.
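
To make the roles of the truncation and deformation factors more concrete, the following is a minimal sketch of how a single synthetic point might be drawn, following the general description in Douzas & Bacao (2019). The helper name gsmote_point_sketch and its argument names center and surface are hypothetical and are not part of SMOTEWB; whether the package's internal code follows these exact steps is an assumption.

```r
# Hypothetical helper (not part of SMOTEWB): draws one synthetic point around
# `center`, using `surface` (a selected neighbor) to set radius and direction.
gsmote_point_sketch <- function(center, surface,
                                alpha_trunc = 0.5, alpha_def = 0.5) {
  radius <- sqrt(sum((surface - center)^2))
  if (radius == 0) return(center)  # degenerate case: identical points
  p <- length(center)

  # Uniform random point inside the unit hypersphere
  z <- rnorm(p)
  point <- (z / sqrt(sum(z^2))) * runif(1)^(1 / p)

  # Unit vector pointing from the center towards the surface point
  e <- (surface - center) / radius
  proj <- sum(point * e)

  # Truncation: reflect points that fall in the cut-off part of the sphere
  if ((alpha_trunc > 0 && proj < alpha_trunc - 1) ||
      (alpha_trunc < 0 && proj > alpha_trunc + 1)) {
    point <- point - 2 * proj * e
  }

  # Deformation: shrink the component perpendicular to e
  parallel <- sum(point * e) * e
  perpendicular <- point - parallel
  point <- parallel + (1 - alpha_def) * perpendicular

  # Scale by the radius and translate back to the data space
  center + radius * point
}

set.seed(1)
center <- c(5, 5)
surface <- c(6, 5)
# 500 sketch points, one per row
pts <- t(replicate(500, gsmote_point_sketch(center, surface)))
plot(pts, asp = 1)
points(rbind(center, surface), col = "red", pch = 19)
```

In this sketch, alpha_trunc = 0 and alpha_def = 0 leave the full hypersphere untouched, while alpha_def = 1 collapses the region onto the line through the two points.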

Value

A list containing the resampled dataset, with the following components:

x_new

Resampled feature matrix.

y_new

Resampled target variable.

x_syn

Generated synthetic feature data.

y_syn

Generated synthetic label data.
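
As an illustration of how the returned list might be inspected, the sketch below reuses the simulated data from the Examples section and prints the class counts before and after resampling. It assumes the SMOTEWB package is installed and is skipped otherwise; only the component names documented above are used.

```r
# Assumes the SMOTEWB package is installed; skipped otherwise.
if (requireNamespace("SMOTEWB", quietly = TRUE)) {
  set.seed(1)
  x <- rbind(matrix(rnorm(2000, 3, 1), ncol = 2),
             matrix(rnorm(100, 5, 1), ncol = 2))
  y <- as.factor(c(rep("negative", 1000), rep("positive", 50)))

  m <- SMOTEWB::GSMOTE(x = x, y = y, k = 7)

  str(m, max.level = 1)    # overview of the list components
  print(table(y))          # class counts before resampling
  print(table(m$y_new))    # class counts after resampling
  print(dim(m$x_syn))      # dimensions of the synthetic samples only
}
```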

Author(s)

Fatih Saglam, saglamf89@gmail.com

References

Douzas, G., & Bacao, F. (2019). Geometric SMOTE: a geometrically enhanced drop-in replacement for SMOTE. Information Sciences, 501, 118-135.

Examples


set.seed(1)
x <- rbind(matrix(rnorm(2000, 3, 1), ncol = 2, nrow = 1000),
           matrix(rnorm(100, 5, 1), ncol = 2, nrow = 50))
y <- as.factor(c(rep("negative", 1000), rep("positive", 50)))

plot(x, col = y)

# resampling
m <- GSMOTE(x = x, y = y, k = 7)

plot(m$x_new, col = m$y_new)


[Package SMOTEWB version 1.2.0 Index]