Classification of imbalanced oral cancer image data from high-risk population
Journal of Biomedical Optics (IF 3.5), Pub Date: 2021-10-01, DOI: 10.1117/1.jbo.26.10.105001
Bofan Song 1 , Shaobai Li 1 , Sumsum Sunny 2 , Keerthi Gurushanth 3 , Pramila Mendonca 4 , Nirza Mukhia 3 , Sanjana Patrick 5 , Shubha Gurudath 3 , Subhashini Raghavan 3 , Imchen Tsusennaro 6 , Shirley T Leivon 6 , Trupti Kolur 4 , Vivek Shetty 4 , Vidya Bushan 4 , Rohan Ramesh 6 , Tyler Peterson 1 , Vijay Pillai 4 , Petra Wilder-Smith 7 , Alben Sigamani 4 , Amritha Suresh 2, 4 , Moni Abraham Kuriakose 8 , Praveen Birur 3, 5 , Rongguang Liang 1

Significance: Early detection of oral cancer is vital for high-risk patients, and machine learning-based automatic classification is well suited to disease screening. However, current datasets collected from high-risk populations are imbalanced, which often degrades classification performance. Aim: To reduce the class bias caused by data imbalance. Approach: We collected 3851 polarized white light cheek mucosa images using our customized oral cancer screening device. We used weight balancing, data augmentation, undersampling, focal loss, and ensemble methods to improve neural network performance on oral cancer image classification with imbalanced multi-class datasets captured from high-risk populations during oral cancer screening in low-resource settings. Results: Applying both data-level and algorithm-level approaches to the deep learning training process improved the performance of the minority classes, which were initially difficult to distinguish. The accuracy of the “premalignancy” class also increased, which is ideal for screening applications. Conclusions: Experimental results show that the class bias induced by imbalanced oral cancer image datasets can be reduced using both data-level and algorithm-level methods. Our study may provide an important basis for understanding the influence of imbalanced datasets on oral cancer deep learning classifiers and how to mitigate it.
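As a rough illustration of three of the techniques named in the abstract (weight balancing, focal loss, and undersampling), the sketch below uses plain Python. It is not the authors' implementation: the function names, the class labels other than “premalignancy”, and the toy class counts are all assumptions made for the example. The weighting rule is the common inverse-frequency heuristic, n_samples / (n_classes * class_count), and the focal loss is the standard form -alpha * (1 - p_t)^gamma * log(p_t).

```python
import math
import random
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample, given predicted class probabilities.

    The (1 - p_t)^gamma factor shrinks the loss of well-classified samples,
    so hard (often minority-class) samples dominate the gradient.
    With gamma = 0 and alpha = 1 this reduces to cross-entropy.
    """
    p_t = probs[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def undersample_indices(labels, seed=0):
    """Randomly keep as many samples per class as the smallest class has."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_keep = min(len(idx) for idx in by_class.values())
    kept = []
    for idx in by_class.values():
        kept.extend(rng.sample(idx, n_keep))
    return sorted(kept)


# Toy labels with made-up counts (hypothetical, not the paper's class distribution).
labels = ["normal"] * 80 + ["premalignancy"] * 15 + ["malignancy"] * 5

weights = balanced_class_weights(labels)
kept = undersample_indices(labels)

# A confidently correct prediction versus a poorly classified one.
easy = focal_loss({"normal": 0.9, "premalignancy": 0.1}, "normal")
hard = focal_loss({"normal": 0.9, "premalignancy": 0.1}, "premalignancy")

print(weights)
print(Counter(labels[i] for i in kept))
print(easy, hard)
```

In a training loop, weights like these would typically be passed to a weighted loss (algorithm-level), while the undersampled indices would define a balanced training subset (data-level); which combination works best is an empirical question that the paper addresses for its own dataset.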

Updated: 2021-10-24