ClusterUpsamplingMinority {FCPS} | R Documentation |
Cluster Up Sampling using SMOTE for minority cluster
Description
Wrapper for an internal function by L. Torgo that implements the relevant part of the SMOTE algorithm [Chawla et al., 2002].
Usage
ClusterUpsamplingMinority(Cls, Data, MinorityCluster,
Percentage = 200, knn = 5, PlotIt = FALSE)
Arguments
Cls |
1:n numerical vector of numbers defining the classification as the main output of the clustering algorithm for the n cases of data. It has k unique numbers representing the arbitrary labels of the clustering. |
Data |
[1:n,1:d] datamatrix of n cases and d features |
MinorityCluster |
scalar defining the number of the cluster to be upsampled |
Percentage |
percentage above 100 defining how many samples should be taken |
knn |
number k of nearest neighbors used by the SMOTE algorithm |
PlotIt |
TRUE: plots the result using |
Details
The number m of synthetic cases is defined by the scalar Percentage. The up-sampled cases are combined with Data and Cls into DataExt and ClsExt such that the new sample is placed after the original cases.
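The interpolation step of SMOTE and the way the result is appended can be sketched in base R. This is only an illustrative sketch, not the wrapped code by L. Torgo: the function name smote_sketch is hypothetical, and it assumes that each minority case yields floor(Percentage/100) synthetic cases, as in the description by Chawla et al. (2002) for multiples of 100. The wrapped function may compute m differently.

```r
# Illustrative SMOTE-style up sampling (hypothetical helper, not the FCPS code)
smote_sketch <- function(Data, Cls, MinorityCluster, Percentage = 200, knn = 5) {
  Data <- as.matrix(Data)
  minority <- Data[Cls == MinorityCluster, , drop = FALSE]
  nMin <- nrow(minority)
  stopifnot(nMin >= 2)
  k <- min(knn, nMin - 1)             # at most nMin - 1 other minority cases exist
  perCase <- floor(Percentage / 100)  # assumption: synthetic cases per minority case
  m <- perCase * nMin                 # total number of synthetic cases

  # Euclidean distances within the minority cluster
  D <- as.matrix(dist(minority))
  diag(D) <- Inf                      # a case is never its own neighbor

  synthetic <- matrix(NA_real_, nrow = m, ncol = ncol(Data))
  row <- 1
  for (i in seq_len(nMin)) {
    neighbors <- order(D[i, ])[seq_len(k)]
    for (j in seq_len(perCase)) {
      nb <- neighbors[sample.int(k, 1)]  # pick one of the k nearest neighbors
      gap <- runif(1)                    # random position on the connecting line
      synthetic[row, ] <- minority[i, ] + gap * (minority[nb, ] - minority[i, ])
      row <- row + 1
    }
  }

  # synthetic cases are placed after the original cases
  list(DataExt = rbind(Data, synthetic),
       ClsExt = c(Cls, rep(MinorityCluster, m)))
}

# toy data: 20 cases in cluster 1, 5 cases in cluster 2
set.seed(1)
Data <- rbind(matrix(rnorm(40), ncol = 2),
              matrix(rnorm(10, mean = 5), ncol = 2))
Cls <- c(rep(1, 20), rep(2, 5))
V <- smote_sketch(Data, Cls, MinorityCluster = 2, Percentage = 200, knn = 3)
table(V$ClsExt)
```

Because every synthetic case lies on the line segment between a minority case and one of its neighbors, the new cases stay inside the region already covered by the minority cluster.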
Value
List with
ClsExt |
1:(n+m) numerical vector of numbers defining the classification for the n+m cases of DataExt, i.e., the n original cases followed by the m up-sampled cases. It has k unique numbers representing the arbitrary labels of the clustering. |
DataExt |
[1:(n+m),1:d] datamatrix of n+m cases and d features |
Author(s)
L. Torgo
References
[Chawla et al., 2002] Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P.: SMOTE: Synthetic Minority Over-sampling Technique, Journal of Artificial Intelligence Research, Vol. 16, pp. 321-357, 2002.
Examples
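# load the package and its example data set
library(FCPS)
data(Lsun3D)
Data=Lsun3D$Data
Cls=Lsun3D$Cls
# cluster sizes before up sampling
table(Cls)
# up sample cluster 4 with Percentage = 1000
V=ClusterUpsamplingMinority(Cls,Data,4,1000)
# cluster sizes after up sampling
table(V$ClsExt)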
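To see how many cases were appended per cluster, the label vectors before and after up sampling can be compared. The helper added_per_cluster below is hypothetical and not part of FCPS. The guarded call at the end assumes that FCPS is installed and that the returned list contains ClsExt as described in the Value section.

```r
# Hypothetical helper: number of appended cases per cluster label
added_per_cluster <- function(Cls, ClsExt) {
  labels <- sort(unique(ClsExt))
  before <- table(factor(Cls, levels = labels))
  after <- table(factor(ClsExt, levels = labels))
  after - before
}

# toy check: two cases are appended to cluster 2, none to cluster 1
toy <- added_per_cluster(c(1, 1, 2), c(1, 1, 2, 2, 2))
print(toy)

# the same comparison for the example above, only if FCPS is available
if (requireNamespace("FCPS", quietly = TRUE)) {
  data("Lsun3D", package = "FCPS")
  V <- FCPS::ClusterUpsamplingMinority(Lsun3D$Cls, Lsun3D$Data, 4, 1000)
  print(added_per_cluster(Lsun3D$Cls, V$ClsExt))
}
```

Only the entry for the up-sampled cluster should be non-zero, since the other clusters are left unchanged.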