Augmentation Based Synthetic Sampling and Ensemble Techniques for Imbalanced Data Classification
Abstract
The imbalanced data problem arises throughout data mining and has recently attracted considerable attention from researchers. To address it, scholars have proposed various approaches, such as undersampling the majority class, oversampling the minority class, the Synthetic Minority Oversampling Technique (SMOTE), and Proximity Weighted Random Affine Shadowsampling (ProWRAS). This work proposes a new method for imbalanced data classification, called Augmentation Based Synthetic Sampling (ABS), which concatenates data to predict features affected by the imbalance problem. The proposed method integrates sampling and concatenated features to generate synthetic data. Experiments, evaluated by average AUC (area under the curve) and compared against previous work, show that the proposed method generates good data samples. In addition, this study combines the proposed method with boosting to create a technique called ABSBoost. Overall, the experimental outcomes show that the proposed ABS method and ABSBoost are effective on the given datasets.