Streaming Convolutional Neural Networks for End-to-End Learning With Multi-Megapixel Images
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 23.6), Pub Date: 2020-08-26, DOI: 10.1109/tpami.2020.3019563
Hans Pinckaers, Bram van Ginneken, Geert Litjens

Due to memory constraints on current hardware, most convolutional neural networks (CNNs) are trained on sub-megapixel images. For example, the most popular datasets in computer vision contain images far smaller than a megapixel (0.09MP for ImageNet and 0.001MP for CIFAR-10). In some domains, such as medical imaging, multi-megapixel images are needed to identify the presence of disease accurately. We propose a novel method to train convolutional neural networks end-to-end on input images of any size. The method exploits the locality of most operations in modern convolutional neural networks by performing the forward and backward pass on smaller tiles of the image. In this work, we show a proof of concept using images of up to 66 megapixels (8192×8192), saving approximately 50GB of memory per image. Using two public challenge datasets, we demonstrate that CNNs can learn to extract relevant information from these large images and benefit from increasing resolution. For metastasis detection in breast cancer (CAMELYON17), the area under the receiver operating characteristic curve improved from 0.580 (4MP) to 0.706 (66MP). On the TUPAC16 dataset, the Spearman correlation improved from 0.485 (1MP) to 0.570 (16MP), approaching state-of-the-art performance. Code to reproduce a subset of the experiments is available at https://github.com/DIAGNijmegen/StreamingCNN.
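The locality argument in the abstract can be illustrated with a small sketch. It is not the authors' implementation (see the linked repository for that) and it covers only the forward pass of a single convolution, whereas the abstract describes tiling both the forward and backward pass. The function names `conv2d_valid` and `conv2d_tiled`, the `tile` parameter, and the toy data are all hypothetical. The point is that each output tile depends only on an input patch enlarged by the kernel size minus one, so computing the output tile by tile gives the same result as processing the whole image at once.

```python
def conv2d_valid(image, kernel):
    """Plain 'valid' 2D cross-correlation on nested lists (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def conv2d_tiled(image, kernel, tile):
    """Same result as conv2d_valid, computed one output tile at a time.

    Hypothetical illustration: each output tile of size (h, w) reads an
    input patch of size (h + kh - 1, w + kw - 1), so neighbouring patches
    overlap by the kernel size minus one.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for top in range(0, out_h, tile):
        for left in range(0, out_w, tile):
            h = min(tile, out_h - top)    # tiles at the border may be smaller
            w = min(tile, out_w - left)
            patch = [row[left:left + w + kw - 1]
                     for row in image[top:top + h + kh - 1]]
            result = conv2d_valid(patch, kernel)  # only this patch is in use
            for i in range(h):
                out[top + i][left:left + w] = result[i]
    return out


# Toy data (assumed for illustration): a 9x10 integer image and a 3x3 kernel.
image = [[(3 * i + 5 * j) % 7 for j in range(10)] for i in range(9)]
kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]

full = conv2d_valid(image, kernel)
tiled = conv2d_tiled(image, kernel, tile=4)
print(tiled == full)  # prints True
```

In this sketch only one input patch and one output tile are being processed at any moment, which is the property that lets peak memory scale with the tile size rather than the image size. A full training setup would also need to handle deeper networks, where the required overlap grows with the receptive field, and the gradients in the backward pass.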

Updated: 2020-08-26