Balancing Multiclass Datasets for Classification Tasks


Documentation for package ‘scutr’ version 0.2.0

Help Pages

bullseye An imbalanced dataset with a minority class centered on the origin and a majority class surrounding it.
imbalance An imbalanced dataset with randomly placed normal distributions around the origin. The nth class has n * 10 observations.
oversample_smote Oversample a dataset by SMOTE.
resample_random Randomly resample a dataset.
sample_classes Stratified index sample of different values in a vector.
SCUT SMOTE and cluster-based undersampling technique.
SCUT_parallel SMOTE and cluster-based undersampling technique (parallel version).
undersample_hclust Undersample a dataset by hierarchical clustering.
undersample_kmeans Undersample a dataset by k-means clustering.
undersample_mclust Undersample a dataset by expectation-maximization clustering.
undersample_mindist Undersample a dataset by iteratively removing the observation with the lowest total distance to its neighbors of the same class.
undersample_tomek Undersample a dataset by removing Tomek links.
validate_dataset Validate a dataset for resampling.
wine Type and chemical analysis of three different kinds of wine.
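
Example

The index above describes the resampling functions only in one line each, so a small illustration of the underlying idea may help. The sketch below uses base R only and is not scutr's implementation: it builds a toy dataset in which the nth class has n * 10 observations (mirroring the description of imbalance) and randomly resamples each class to a common size m. The helper name resample_to_m, the column names, and the choice of the mean class size as the target are assumptions made for this illustration.

```r
# Illustrative sketch in base R; this is NOT scutr's own code.
set.seed(42)

# Toy imbalanced dataset: class n has n * 10 observations.
classes <- rep(1:3, times = (1:3) * 10)
toy <- data.frame(
  x = rnorm(length(classes), mean = classes),
  y = rnorm(length(classes), mean = -classes),
  class = factor(classes)
)

# Hypothetical helper: return exactly m rows of class `cls`.
# Rows are drawn with replacement only when the class has fewer than m rows.
resample_to_m <- function(data, cls, cls_col, m) {
  rows <- which(data[[cls_col]] == cls)
  picked <- rows[sample.int(length(rows), m, replace = length(rows) < m)]
  data[picked, , drop = FALSE]
}

# Target size: the mean class size (an assumption for this example).
m <- round(mean(table(toy$class)))

# Resample every class to m rows and stack the results.
balanced <- do.call(rbind, lapply(levels(toy$class), function(cls) {
  resample_to_m(toy, cls, "class", m)
}))

table(toy$class)       # class sizes before resampling
table(balanced$class)  # class sizes after resampling
```

Random resampling is the simplest strategy of this kind. The SMOTE and clustering-based functions listed above choose or synthesize observations in more informed ways, and their exact arguments should be taken from their individual help pages.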