neater {imbalance}    R Documentation

Filtering of oversampled data based on non-cooperative game theory

Description

Filters oversampled examples from a binary-class dataset, using game theory to decide whether each example is worth keeping.

Usage

neater(
  dataset,
  newSamples,
  k = 3,
  iterations = 100,
  smoothFactor = 1,
  classAttr = "Class"
)

Arguments

dataset

The original data.frame. All columns except the classAttr column must be numeric or coercible to numeric.

newSamples

A data.frame containing the samples to be filtered. Must have the same structure as dataset.

k

Integer. Number of nearest neighbours used by the KNN algorithm to rule out samples. By default, 3.

iterations

Integer. Number of iterations for the algorithm. By default, 100.

smoothFactor

A positive numeric. By default, 1.

classAttr

Character. Name of the class attribute in dataset and newSamples. Must exist in both.
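The constraints listed above can be checked before calling neater. The following is a minimal base-R sketch; check_neater_inputs is a hypothetical helper, not part of the imbalance package, and it assumes that "same structure" means the same column names in the same order.

```r
# Hypothetical helper, NOT part of the imbalance package.
# Assumption: "same structure" means identical column names in the same order.
check_neater_inputs <- function(dataset, newSamples, classAttr = "Class") {
  stopifnot(is.data.frame(dataset), is.data.frame(newSamples))

  # classAttr must exist in both data frames
  if (!classAttr %in% names(dataset) || !classAttr %in% names(newSamples)) {
    stop("classAttr must name a column in both dataset and newSamples")
  }

  if (!identical(names(dataset), names(newSamples))) {
    stop("dataset and newSamples must have the same columns in the same order")
  }

  # Every non-class column must be numeric or coercible to numeric
  features <- setdiff(names(dataset), classAttr)
  for (df in list(dataset, newSamples)) {
    ok <- vapply(df[features], function(col) {
      is.numeric(col) ||
        !anyNA(suppressWarnings(as.numeric(as.character(col))))
    }, logical(1))
    if (!all(ok)) {
      stop("non-numeric column(s): ", paste(features[!ok], collapse = ", "))
    }
  }
  invisible(TRUE)
}

# Usage with toy data: column b is character but coercible to numeric
good <- data.frame(a = c(1, 2), b = c("3", "4"), Class = c("p", "n"))
check_neater_inputs(good, good)
```

A mismatch in the class column name (for instance "class" instead of "Class") makes the helper stop with an error.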

Details

Uses game theory and Nash equilibria to calculate the probability that each oversampled minority example truly belongs to the minority class. It discards the examples that, at the final stage of the algorithm, are more likely to be majority examples than minority ones.
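To make the idea concrete, here is a simplified base-R sketch of an iterative, neighbourhood-based probability update in the style of replicator dynamics. It is an illustration only and not the package's implementation: the function name sketch_filter, the similarity weight 1 / (1 + d^2), the exact update rule, and the way smooth_factor enters it are all assumptions made for this example.

```r
# Simplified illustration only; NOT the imbalance package's implementation.
# Assumed: similarity weight, update rule, and role of smooth_factor.
sketch_filter <- function(orig_x, orig_is_minority, new_x,
                          k = 3, iterations = 100, smooth_factor = 1) {
  all_x  <- rbind(orig_x, new_x)
  n_orig <- nrow(orig_x)
  n_new  <- nrow(new_x)
  d      <- as.matrix(dist(all_x))

  # Column 1: P(minority), column 2: P(majority).
  # Original examples keep fixed labels; new samples start undecided.
  probs <- rbind(
    cbind(as.numeric(orig_is_minority), as.numeric(!orig_is_minority)),
    matrix(0.5, nrow = n_new, ncol = 2)
  )
  new_idx <- n_orig + seq_len(n_new)

  for (it in seq_len(iterations)) {
    updated <- probs
    for (i in new_idx) {
      # k nearest neighbours of sample i, excluding itself
      nn <- setdiff(order(d[i, ]), i)[seq_len(k)]
      w  <- 1 / (1 + d[i, nn]^2)                  # assumed similarity weight
      # Payoff of each class: similarity-weighted neighbour probabilities
      payoff <- colSums(w * probs[nn, , drop = FALSE])
      raw <- probs[i, ] * (smooth_factor + payoff)
      updated[i, ] <- raw / sum(raw)
    }
    probs <- updated
  }

  # Keep samples at least as likely to be minority as majority
  keep <- probs[new_idx, 1] >= probs[new_idx, 2]
  new_x[keep, , drop = FALSE]
}

# Toy data: minority cluster near (0, 0), majority cluster near (10, 10)
orig_x <- data.frame(x = c(0, 0, 1, 10, 10, 11, 9),
                     y = c(0, 1, 0, 10, 11, 10, 10))
orig_is_minority <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)
new_x <- data.frame(x = c(0.5, 10.2), y = c(0.5, 10.2))
kept <- sketch_filter(orig_x, orig_is_minority, new_x, k = 3, iterations = 20)
```

In this toy setup the synthetic point inside the minority cluster is retained, while the one that landed among majority examples is discarded.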

Value

Filtered samples as a data.frame with the same structure as newSamples.
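Because the result shares the structure of the input, it can typically be appended to the original data with rbind. The sketch below uses small stand-in data frames rather than real neater output; the object names and values are made up for illustration.

```r
# Stand-in data for illustration; `filtered` mimics what neater might return.
dataset <- data.frame(
  x = c(0, 1, 10, 11, 12),
  y = c(0, 1, 10, 11, 12),
  Class = factor(c("positive", "positive", "negative", "negative", "negative"))
)
filtered <- data.frame(
  x = 0.5,
  y = 0.5,
  Class = factor("positive", levels = levels(dataset$Class))
)

# Append the retained synthetic samples to the original data
balanced <- rbind(dataset, filtered)
class_counts <- table(balanced$Class)
```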

References

Almogahed, B.A.; Kakadiaris, I.A. Neater: Filtering of Over-Sampled Data Using Non-Cooperative Game Theory. Soft Computing 19 (2014), no. 11, pp. 3301–3322.

Examples

data(iris0)

newSamples <- smotefamily::SMOTE(iris0[,-5], iris0[,5])$syn_data
# SMOTE renames the Class attribute to "class", but
# newSamples must use the same class attribute name as dataset
names(newSamples) <- c(names(newSamples)[-5], "Class")

neater(iris0, newSamples, k = 5, iterations = 100,
       smoothFactor = 1, classAttr = "Class")

[Package imbalance version 1.0.2.1 Index]