R: Cluster Up Sampling using SMOTE for minority cluster

ClusterUpsamplingMinority {FCPS}

R Documentation

Cluster Up Sampling using SMOTE for minority cluster

Description

Wrapper for an internal function by L. Torgo that implements the relevant part of the SMOTE algorithm [Chawla et al., 2002]. It generates synthetic cases for one minority cluster and appends them to the data.

Usage

ClusterUpsamplingMinority(Cls, Data, MinorityCluster,
  Percentage = 200, knn = 5, PlotIt = FALSE)

Arguments

`Cls`	1:n numerical vector of numbers defining the classification as the main output of the clustering algorithm for the n cases of data. It has k unique numbers representing the arbitrary labels of the clustering.
`Data`	[1:n,1:d] data matrix of n cases and d features
`MinorityCluster`	scalar defining the number of the cluster to be upsampled
`Percentage`	percentage above 100 defining how many samples should be taken
`knn`	number k of nearest neighbors used by the SMOTE algorithm
`PlotIt`	TRUE: plots the result using `ClusterPlotMDS`

Details

The number of new items m is defined by the scalar Percentage. The upsampled cases are combined with Data and Cls into DataExt and ClsExt such that the new sample is placed after the original n cases.
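The interpolation idea behind SMOTE can be illustrated with a minimal base R sketch. This is not the internal function wrapped by this package: the name `smote_sketch` is hypothetical, and the rule that each minority case yields `ceiling(Percentage / 100)` synthetic cases is an assumption made for illustration only. Each synthetic case lies on the line segment between a minority case and one of its k nearest minority neighbors, and the synthetic cases are appended after the original ones.

```r
# Illustrative SMOTE-style upsampling in base R (hypothetical helper, not the FCPS internals).
smote_sketch <- function(Data, Cls, MinorityCluster, Percentage = 200, knn = 5) {
  Data <- as.matrix(Data)
  Minority <- Data[Cls == MinorityCluster, , drop = FALSE]
  nMin <- nrow(Minority)
  stopifnot(nMin >= 2)
  k <- min(knn, nMin - 1)               # at most nMin - 1 other minority cases exist
  perCase <- ceiling(Percentage / 100)  # assumption: synthetic cases per minority case
  D <- as.matrix(dist(Minority))        # pairwise Euclidean distances within the cluster
  Synthetic <- matrix(NA_real_, nrow = nMin * perCase, ncol = ncol(Data))
  r <- 1
  for (i in seq_len(nMin)) {
    # indices of the k nearest minority neighbors of case i, excluding i itself
    nn <- order(D[i, ])
    nn <- nn[nn != i][seq_len(k)]
    for (j in seq_len(perCase)) {
      neighbor <- nn[sample.int(length(nn), 1)]
      gap <- runif(1)                   # random position on the connecting segment
      Synthetic[r, ] <- Minority[i, ] + gap * (Minority[neighbor, ] - Minority[i, ])
      r <- r + 1
    }
  }
  # synthetic cases are placed after the original n cases
  list(ClsExt = c(Cls, rep(MinorityCluster, nrow(Synthetic))),
       DataExt = rbind(Data, Synthetic))
}

# Toy data: 20 cases in cluster 1, 5 cases in minority cluster 2
set.seed(1)
Data <- rbind(matrix(rnorm(40), ncol = 2), matrix(rnorm(10, mean = 5), ncol = 2))
Cls <- c(rep(1, 20), rep(2, 5))
V <- smote_sketch(Data, Cls, MinorityCluster = 2, Percentage = 200, knn = 3)
table(V$ClsExt)
```

Because every synthetic case is a convex combination of two existing minority cases, the new cases stay inside the range already covered by the minority cluster.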

Value

List with

`ClsExt`	1:(n+m) numerical vector of numbers defining the classification for the n original cases followed by the m upsampled cases. It has k unique numbers representing the arbitrary labels of the clustering.
`DataExt`	[1:(n+m),1:d] data matrix of n+m cases and d features

Author(s)

L. Torgo

References

[Chawla et al., 2002] Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P.: SMOTE: synthetic minority over-sampling technique, Journal of artificial intelligence research, Vol. 16, pp. 321-357. 2002.

Examples

data(Lsun3D)
Data=Lsun3D$Data
Cls=Lsun3D$Cls
table(Cls)

V=ClusterUpsamplingMinority(Cls,Data,4,1000)
table(V$ClsExt)

[Package FCPS version 1.3.4 Index]