smote {SmartMeterAnalytics}    R Documentation

Synthetic minority oversampling (SMOTE)

Description

Oversamples the smaller classes of a data set by creating new synthetic instances.

Usage

smote(
  Variables,
  Classes,
  subset_use = NULL,
  k = 5,
  use_nearest = TRUE,
  proportions = 0.9,
  equalise_with_undersampling = FALSE,
  safe = FALSE
)

Arguments

Variables

the data.frame of independent variables that should be used to create new instances

Classes

the class labels in the prediction problem

subset_use

restricts the oversampling to a specific subset. If NULL, everything is used.

k

the number of neighbours used to generate new instances

use_nearest

should only the nearest neighbours be used? (very slow)

proportions

the proportion (of the biggest class) to which the classes should be equalised

equalise_with_undersampling

should additional undersampling be performed?

safe

should a safe version of SMOTE be used?

Details

SMOTE generates synthetic data points for a smaller (minority) class, for example to overcome the problem of imbalanced classes in classification.
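
The general idea behind SMOTE can be illustrated with a minimal base R sketch. This is not the implementation used by this package: the helper smote_sketch() and its arguments are hypothetical, it handles numeric variables of a single minority class only, and it assumes that proportions means "grow the class to this fraction of the biggest class". Each synthetic instance is placed at a random position on the line segment between a minority instance and one of its k nearest neighbours.

```r
# Illustrative sketch only; NOT the SmartMeterAnalytics implementation.
# smote_sketch() is a hypothetical helper for one numeric minority class.
smote_sketch <- function(X, k = 5, n_new = 10) {
  X <- as.matrix(X)
  stopifnot(is.numeric(X), nrow(X) > k)
  D <- as.matrix(dist(X))   # pairwise Euclidean distances
  diag(D) <- Inf            # a point is not its own neighbour
  synthetic <- matrix(NA_real_, nrow = n_new, ncol = ncol(X),
                      dimnames = list(NULL, colnames(X)))
  for (i in seq_len(n_new)) {
    base <- sample(nrow(X), 1)            # random minority instance
    nn <- order(D[base, ])[seq_len(k)]    # indices of its k nearest neighbours
    neighbour <- nn[sample.int(k, 1)]     # pick one neighbour at random
    gap <- runif(1)                       # random position on the segment
    synthetic[i, ] <- X[base, ] + gap * (X[neighbour, ] - X[base, ])
  }
  as.data.frame(synthetic)
}

set.seed(1)
minority <- data.frame(x = rnorm(20), y = rnorm(20))
n_biggest <- 100      # assumed size of the biggest class
proportions <- 0.9
# Assumed reading of `proportions`: target size is 90% of the biggest class.
n_new <- max(0, round(proportions * n_biggest) - nrow(minority))
new_points <- smote_sketch(minority, k = 5, n_new = n_new)
```

Because every synthetic point is a convex combination of two existing minority instances, the new points stay within the range already covered by that class.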

Value

a list containing the new data.frame of independent variables and the new class labels

Author(s)

Ilya Kozlovskiy, Konstantin Hopf <konstantin.hopf@uni-bamberg.de>
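
Examples

The following is a hedged usage sketch rather than an example shipped with the package. The toy data are made up, and the call uses only the arguments listed under Usage. Because the names of the returned list elements are not documented here, the result is inspected with str() instead of being accessed by name.

```r
# Hypothetical toy data: 100 majority and 20 minority observations.
set.seed(42)
Variables <- data.frame(
  x1 = c(rnorm(100, mean = 0), rnorm(20, mean = 3)),
  x2 = c(rnorm(100, mean = 0), rnorm(20, mean = 3))
)
Classes <- factor(c(rep("majority", 100), rep("minority", 20)))

# Run only if the package is installed.
if (requireNamespace("SmartMeterAnalytics", quietly = TRUE)) {
  res <- SmartMeterAnalytics::smote(Variables, Classes,
                                    k = 5, proportions = 0.9)
  str(res)  # inspect the returned list; element names are not assumed
}
```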


[Package SmartMeterAnalytics version 1.0.3 Index]