Framework for the Classification of Imbalanced Structured Data Using Under-sampling and Convolutional Neural Network
Information Systems Frontiers (IF 6.9), Pub Date: 2021-09-17, DOI: 10.1007/s10796-021-10195-9
Yoon Sang Lee, Chulhwan Chris Bang

Among machine learning techniques, classification is useful for many business applications, but classification algorithms perform poorly on imbalanced data. In this study, we propose a classification technique that improves binary classification performance on both the minority and majority classes of imbalanced structured data. The proposed framework consists of three steps. First, a balanced training set is created via under-sampling. Second, each example is converted into an image depicting a line graph. Third, a Convolutional Neural Network (CNN) is trained on the images. In the experiments, we applied the proposed framework to six datasets selected from the UCI Repository. The proposed model achieved the best receiver operating characteristic (ROC) curve on all six datasets and the best Balanced Accuracy (BA) on five of them. This demonstrates that the combination of under-sampling and CNNs is a viable approach for classifying imbalanced structured data.
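The first two steps of the framework, together with the Balanced Accuracy metric, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the under-sampling method, the image size, or how features are scaled, so the sketch assumes random under-sampling, binary labels 0/1, features already scaled to [0, 1], and a small binary grid as the "image". The function names (undersample, row_to_line_image, balanced_accuracy) are hypothetical. The CNN training step is omitted; the resulting grids could be passed to any image classifier.

```python
import random


def undersample(rows, labels, seed=0):
    """Random under-sampling (an assumed variant): keep an equal number of
    examples from each class, limited by the size of the smaller class."""
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for i, y in enumerate(labels):
        by_class[y].append(i)
    n = min(len(by_class[0]), len(by_class[1]))
    keep = []
    for idxs in by_class.values():
        keep.extend(rng.sample(idxs, n))
    keep.sort()  # preserve the original ordering of the kept examples
    return [rows[i] for i in keep], [labels[i] for i in keep]


def row_to_line_image(row, size=16):
    """Draw one example as a line graph on a size x size binary grid.
    Each feature is one vertex; values are assumed to lie in [0, 1]."""
    img = [[0] * size for _ in range(size)]
    n = len(row)
    pts = []
    for j, v in enumerate(row):
        x = round(j * (size - 1) / (n - 1)) if n > 1 else 0
        v = min(max(v, 0.0), 1.0)            # clip out-of-range values
        y = (size - 1) - round(v * (size - 1))  # row 0 is the top of the image
        pts.append((x, y))
    if n == 1:
        img[pts[0][1]][pts[0][0]] = 1
    # connect consecutive vertices with straight segments
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        steps = max(abs(x1 - x0), abs(y1 - y0), 1)
        for s in range(steps + 1):
            x = round(x0 + (x1 - x0) * s / steps)
            y = round(y0 + (y1 - y0) * s / steps)
            img[y][x] = 1
    return img


def balanced_accuracy(y_true, y_pred):
    """Mean of the true positive rate and the true negative rate."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = sum(1 for t in y_true if t == 0)
    return (tp / pos + tn / neg) / 2
```

For example, calling undersample on a dataset with eight majority and two minority examples returns four examples, two per class, and row_to_line_image turns each of them into a grid that a CNN could take as input. Balanced Accuracy weights both classes equally, which is why it is more informative than plain accuracy when the test data are imbalanced.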




Updated: 2021-09-19