Streaming Convolutional Neural Networks for End-to-End Learning With Multi-Megapixel Images, IEEE Transactions on Pattern Analysis and Machine Intelligence

Streaming Convolutional Neural Networks for End-to-End Learning With Multi-Megapixel Images
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8), Pub Date: 2020-08-26, DOI: 10.1109/tpami.2020.3019563
Hans Pinckaers, Bram van Ginneken, Geert Litjens

Due to memory constraints on current hardware, most convolutional neural networks (CNNs) are trained on sub-megapixel images. For example, the most popular datasets in computer vision contain images much smaller than a megapixel (0.09MP for ImageNet and 0.001MP for CIFAR-10). In some domains, such as medical imaging, multi-megapixel images are needed to identify the presence of disease accurately. We propose a novel method to train convolutional neural networks directly, end-to-end, on input images of any size. The method exploits the locality of most operations in modern convolutional neural networks by performing the forward and backward pass on smaller tiles of the image. In this work, we show a proof of concept using images of up to 66 megapixels (8192×8192), saving approximately 50GB of memory per image. Using two public challenge datasets, we demonstrate that CNNs can learn to extract relevant information from these large images and benefit from increasing resolution. For metastasis detection in breast cancer (CAMELYON17), the area under the receiver operating characteristic curve improved from 0.580 (4MP) to 0.706 (66MP). On the TUPAC16 dataset, the Spearman correlation improved from 0.485 (1MP) to 0.570 (16MP), approaching state-of-the-art performance. Code to reproduce a subset of the experiments is available at https://github.com/DIAGNijmegen/StreamingCNN.
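The locality argument in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' StreamingCNN code: it covers only the forward pass of a single convolution, and the function names `conv2d_valid` and `conv2d_tiled`, the tile size, and the toy image are assumptions made for illustration. It shows that each output tile depends only on a slightly larger input patch, so computing tile by tile gives the same result as processing the whole image at once.

```python
import random


def conv2d_valid(image, kernel):
    """Plain 'valid' 2D cross-correlation on nested lists (no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def conv2d_tiled(image, kernel, tile=4):
    """Same output as conv2d_valid, computed one output tile at a time.

    An output tile of size tile x tile needs an input patch of size
    (tile + kh - 1) x (tile + kw - 1), so only that patch has to be
    processed at any moment (hypothetical helper, forward pass only).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for y0 in range(0, out_h, tile):
        for x0 in range(0, out_w, tile):
            y1 = min(y0 + tile, out_h)
            x1 = min(x0 + tile, out_w)
            # Input patch, including the overlap the kernel needs at the edges.
            patch = [row[x0:x1 + kw - 1] for row in image[y0:y1 + kh - 1]]
            tile_out = conv2d_valid(patch, kernel)
            for dy, row in enumerate(tile_out):
                out[y0 + dy][x0:x1] = row
    return out


random.seed(0)
image = [[random.randint(0, 9) for _ in range(10)] for _ in range(10)]
kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]

full = conv2d_valid(image, kernel)
tiled = conv2d_tiled(image, kernel, tile=4)
assert full == tiled  # tiling does not change the result
```

In a real network the required overlap grows with every stacked layer, and the backward pass must be tiled as well. Neither is shown here.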

Updated: 2024-08-22