Two-Stage Resampling for Convolutional Neural Network Training in the Imbalanced Colorectal Cancer Image Classification,arXiv - CS - Computer Vision and Pattern Recognition


arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2020-04-07, DOI: arxiv-2004.03332
Michał Koziarski

Data imbalance remains one of the open challenges in contemporary machine learning. It is especially prevalent in medical data, such as histopathological images. Traditional data-level approaches to data imbalance are ill-suited to image data: oversampling methods such as SMOTE and its derivatives create unrealistic synthetic observations, whereas undersampling reduces the amount of available data, which is critical for successfully training convolutional neural networks. To alleviate the problems associated with over- and undersampling, we propose a novel two-stage resampling methodology: we first apply oversampling techniques in the image space, so that a large amount of data is available for training a convolutional neural network, and afterwards apply undersampling in the feature space to fine-tune the last layers of the network. Experiments conducted on a colorectal cancer image dataset indicate the usefulness of the proposed approach.
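
The two resampling stages can be illustrated with a minimal sketch of the index selection only. It is not the paper's implementation: the abstract does not name the specific oversampling or undersampling techniques, so plain random duplication and random subsampling are assumed here, and the function names and toy labels are hypothetical. The CNN training and feature extraction steps are deliberately left out.

```python
import random
from collections import Counter


def group_by_class(labels):
    """Map each class label to the list of sample indices carrying it."""
    groups = {}
    for i, y in enumerate(labels):
        groups.setdefault(y, []).append(i)
    return groups


def oversample_indices(labels, rng):
    """Random oversampling (assumed technique): duplicate indices of the
    smaller classes until every class matches the largest class."""
    groups = group_by_class(labels)
    target = max(len(idx) for idx in groups.values())
    out = []
    for idx in groups.values():
        out.extend(idx)
        out.extend(rng.choices(idx, k=target - len(idx)))
    rng.shuffle(out)
    return out


def undersample_indices(labels, rng):
    """Random undersampling (assumed technique): keep a random subset of
    each class, sized to the smallest class."""
    groups = group_by_class(labels)
    target = min(len(idx) for idx in groups.values())
    out = []
    for idx in groups.values():
        out.extend(rng.sample(idx, target))
    rng.shuffle(out)
    return out


# Toy imbalanced labels standing in for image labels (hypothetical data).
labels = [0] * 90 + [1] * 10
rng = random.Random(0)

# Stage 1: oversample in image space. These indices would select the
# images used to train the whole CNN.
stage1 = oversample_indices(labels, rng)

# Stage 2: undersample in feature space. These indices would select the
# extracted feature vectors used to fine-tune only the last layers.
stage2 = undersample_indices(labels, rng)

stage1_counts = Counter(labels[i] for i in stage1)
stage2_counts = Counter(labels[i] for i in stage2)
```

In a full pipeline, `stage1` would index the training images and `stage2` would index the feature vectors produced by the trained network, so the first stage sees as much data as possible while the final layers are tuned on a balanced set without synthetic or duplicated samples.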


Updated: 2020-04-08
