R: Synthetic minority oversampling (SMOTE)

smote {SmartMeterAnalytics}

R Documentation

Synthetic minority oversampling (SMOTE)

Description

Performs oversampling of the smaller classes by creating new synthetic instances.

Usage

smote(
  Variables,
  Classes,
  subset_use = NULL,
  k = 5,
  use_nearest = TRUE,
  proportions = 0.9,
  equalise_with_undersampling = FALSE,
  safe = FALSE
)

Arguments

`Variables`	the data.frame of independent variables that should be used to create new instances
`Classes`	the class labels in the prediction problem
`subset_use`	only this specific subset is used for the oversampling. If NULL, everything is used.
`k`	the number of neighbours used for generating new instances
`use_nearest`	should only the nearest neighbours be used? (very slow)
`proportions`	the proportion (of the biggest class) to which the classes should be equalized
`equalise_with_undersampling`	should additional undersampling be performed?
`safe`	should a safe version of SMOTE be used?
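
The effect of `proportions` can be illustrated with a little arithmetic. The sketch below shows one reading of the argument description: every smaller class is raised to `proportions` times the size of the biggest class. The rounding rule and the toy labels are assumptions made for illustration; the package may compute the targets differently.

```r
# Illustrative class labels: 100 majority, 30 and 12 minority instances.
classes <- factor(rep(c("A", "B", "C"), times = c(100, 30, 12)))
proportions <- 0.9                          # default from the Usage section

counts <- table(classes)                    # current class sizes
target <- round(proportions * max(counts))  # assumed rounding rule
n_new  <- pmax(target - counts, 0)          # synthetic instances needed per class
n_new                                       # A: 0, B: 60, C: 78
```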

Details

SMOTE generates synthetic data points for a smaller class, for example to overcome the problem of imbalanced classes in classification.
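
As a rough illustration of the idea, the classic SMOTE formulation places each synthetic point on the line segment between a minority instance and one of its `k` nearest minority neighbours. The base R sketch below follows that textbook scheme for numeric variables; `smote_sketch` is a hypothetical helper written for this page, not the implementation used by `smote`, and it ignores `use_nearest`, `safe` and `subset_use`.

```r
# Hypothetical helper, not the package's implementation: classic SMOTE
# interpolation for the numeric variables of one minority class.
smote_sketch <- function(X, n_new, k = 5) {
  X <- as.matrix(X)
  n <- nrow(X)
  stopifnot(n >= 2, n_new >= 0)
  k <- min(k, n - 1)                       # at most n - 1 other points exist
  D <- as.matrix(dist(X))                  # pairwise Euclidean distances
  diag(D) <- Inf                           # a point is not its own neighbour
  synth <- matrix(NA_real_, nrow = n_new, ncol = ncol(X),
                  dimnames = list(NULL, colnames(X)))
  for (i in seq_len(n_new)) {
    base <- sample.int(n, 1)               # random minority instance
    nn   <- order(D[base, ])[seq_len(k)]   # its k nearest neighbours
    nb   <- nn[sample.int(length(nn), 1)]  # one neighbour chosen at random
    gap  <- runif(1)                       # position along the connecting segment
    synth[i, ] <- X[base, ] + gap * (X[nb, ] - X[base, ])
  }
  as.data.frame(synth)
}

set.seed(1)
minority   <- data.frame(x = rnorm(10), y = rnorm(10))
new_points <- smote_sketch(minority, n_new = 20, k = 5)
```

Because every synthetic point is a convex combination of two existing points, the new data stay inside the coordinate range of the original minority class.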

Value

a list containing the new data.frame of independent variables and the new class labels

Author(s)

Ilya Kozlovskiy, Konstantin Hopf <konstantin.hopf@uni-bamberg.de>
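
Examples

A minimal, unverified sketch of a call, using only the arguments listed under Usage. The toy data and the variable names `vars` and `classes` are invented for illustration, and the names of the elements in the returned list are not documented here, so the result is inspected with `str()` rather than accessed by name.

```r
# Illustrative, imbalanced toy data (not shipped with the package).
set.seed(42)
vars <- data.frame(
  consumption_mean = c(rnorm(50, mean = 1), rnorm(10, mean = 3)),
  consumption_sd   = c(rnorm(50, mean = 0.5, sd = 0.1),
                       rnorm(10, mean = 1, sd = 0.1))
)
classes <- factor(rep(c("majority", "minority"), times = c(50, 10)))

# Run only when the package is installed; arguments follow the Usage section.
if (requireNamespace("SmartMeterAnalytics", quietly = TRUE)) {
  res <- SmartMeterAnalytics::smote(
    Variables   = vars,
    Classes     = classes,
    k           = 5,
    proportions = 0.9
  )
  str(res)  # inspect the returned list of new variables and class labels
}
```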

[Package SmartMeterAnalytics version 1.0.3 Index]