Two-Stage Resampling for Convolutional Neural Network Training in the Imbalanced Colorectal Cancer Image Classification
arXiv - CS - Computer Vision and Pattern Recognition Pub Date: 2020-04-07, DOI: arxiv-2004.03332 Michał Koziarski
Data imbalance remains one of the open challenges in contemporary machine
learning. It is especially prevalent in the case of medical data, such as
histopathological images. Traditional data-level approaches for dealing with
data imbalance are ill-suited to image data: oversampling methods such as
SMOTE and its derivatives lead to the creation of unrealistic synthetic
observations, whereas undersampling reduces the amount of available data,
which is critical for successful training of convolutional neural networks. To
alleviate the problems associated with over- and undersampling, we propose a
novel two-stage resampling methodology, in which we first use oversampling
techniques in the image space to leverage a large amount of data for training
a convolutional neural network, and afterwards apply undersampling in the
feature space to fine-tune the last layers of the network. Experiments
conducted on a colorectal cancer image dataset indicate the usefulness of the
proposed approach.
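
The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: plain random oversampling and random undersampling stand in for whichever resampling methods the authors actually evaluate, the toy data is synthetic, CNN training is omitted, and `extract_features` is a hypothetical placeholder for the activations of a trained network's penultimate layer.

```python
import numpy as np


def random_oversample(labels, rng):
    """Return indices that repeat minority samples until every class
    matches the size of the largest class."""
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    picked = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(labels == cls)
        extra = rng.choice(idx, size=target - count, replace=True)
        picked.append(np.concatenate([idx, extra]))
    return np.concatenate(picked)


def random_undersample(labels, rng):
    """Return indices that keep only as many samples per class
    as the smallest class has."""
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.min()
    picked = [
        rng.choice(np.flatnonzero(labels == cls), size=target, replace=False)
        for cls in classes
    ]
    return np.concatenate(picked)


def extract_features(images, rng, dim=16):
    """Hypothetical stand-in for a trained CNN's feature extractor:
    flattens each image and applies a fixed random projection."""
    flat = images.reshape(len(images), -1)
    projection = rng.standard_normal((flat.shape[1], dim))
    return flat @ projection


rng = np.random.default_rng(0)

# Synthetic imbalanced toy data: 90 majority and 10 minority "images".
labels = np.array([0] * 90 + [1] * 10)
images = rng.random((100, 8, 8))

# Stage 1: oversample in image space; the balanced set would be used
# to train the whole CNN (training itself is omitted here).
stage1_idx = random_oversample(labels, rng)
stage1_images, stage1_labels = images[stage1_idx], labels[stage1_idx]

# Stage 2: undersample in feature space; the balanced features would be
# used to fine-tune only the last layers of the network.
features = extract_features(images, rng)
stage2_idx = random_undersample(labels, rng)
stage2_features, stage2_labels = features[stage2_idx], labels[stage2_idx]

print(np.bincount(stage1_labels), np.bincount(stage2_labels))
```

In this sketch, stage 1 gives the network as many training images as possible, while stage 2 works only on real observations in the learned feature space, so no synthetic samples are involved when the final layers are adjusted.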
Updated: 2020-04-08