A whole-slide foundation model for digital pathology from real-world data
Nature (IF 64.8), Pub Date: 2024-05-22, DOI: 10.1038/s41586-024-07441-w
Hanwen Xu, Naoto Usuyama, Jaspreet Bagga, Sheng Zhang, Rajesh Rao, Tristan Naumann, Cliff Wong, Zelalem Gero, Javier González, Yu Gu, Yanbo Xu, Mu Wei, Wenhui Wang, Shuming Ma, Furu Wei, Jianwei Yang, Chunyuan Li, Jianfeng Gao, Jaylen Rosemon, Tucker Bower, Soohee Lee, Roshanthi Weerasinghe, Bill J. Wright, Ari Robicsek, Brian Piening, Carlo Bifulco, Sheng Wang, Hoifung Poon

Digital pathology poses unique computational challenges, as a standard gigapixel slide may comprise tens of thousands of image tiles1,2,3. Prior models have often resorted to subsampling a small portion of tiles for each slide, thus missing the important slide-level context4. Here we present Prov-GigaPath, a whole-slide pathology foundation model pretrained on 1.3 billion 256 × 256 pathology image tiles in 171,189 whole slides from Providence, a large US health network comprising 28 cancer centres. The slides originated from more than 30,000 patients covering 31 major tissue types. To pretrain Prov-GigaPath, we propose GigaPath, a novel vision transformer architecture for pretraining gigapixel pathology slides. To scale GigaPath for slide-level learning with tens of thousands of image tiles, GigaPath adapts the newly developed LongNet5 method to digital pathology. To evaluate Prov-GigaPath, we construct a digital pathology benchmark comprising 9 cancer subtyping tasks and 17 pathomics tasks, using both Providence and TCGA data6. With large-scale pretraining and ultra-large-context modelling, Prov-GigaPath attains state-of-the-art performance on 25 out of 26 tasks, with significant improvement over the second-best method on 18 tasks. We further demonstrate the potential of Prov-GigaPath on vision–language pretraining for pathology7,8 by incorporating the pathology reports. In sum, Prov-GigaPath is an open-weight foundation model that achieves state-of-the-art performance on various digital pathology tasks, demonstrating the importance of real-world data and whole-slide modelling.
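The two-stage design described above (a tile-level encoder producing one embedding per 256 × 256 tile, followed by LongNet-style dilated attention that aggregates tens of thousands of tile embeddings into a slide-level representation) can be illustrated with a minimal sketch. This is not the authors' implementation: the real model uses a vision transformer tile encoder and LongNet with learned query/key/value projections and multiple heads; here random vectors stand in for tile embeddings, and a single-head, projection-free dilated attention shows how sparse segment patterns keep the cost sub-quadratic in the number of tiles.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dilated_attention(x, segment_len, dilation):
    """Self-attention restricted to dilated segments (simplified LongNet pattern).

    x: (n_tiles, d) array of tile embeddings.
    Within each segment of `segment_len` tiles, only every `dilation`-th tile
    attends to the others, so cost per segment is O((segment_len/dilation)^2).
    """
    n, d = x.shape
    out = np.zeros_like(x)
    for start in range(0, n, segment_len):
        seg = np.arange(start, min(start + segment_len, n))
        idx = seg[::dilation]          # subsample the segment by the dilation rate
        q = k = v = x[idx]             # no learned projections in this sketch
        att = softmax(q @ k.T / np.sqrt(d))
        out[idx] = att @ v
    return out

# Toy slide: 4,096 tiles with 128-dim embeddings from a (hypothetical) tile encoder.
rng = np.random.default_rng(0)
tiles = rng.standard_normal((4096, 128)).astype(np.float32)

# Mix short dense segments with long sparse ones, as LongNet does, then pool
# across tiles to get one embedding per slide.
mixed = sum(dilated_attention(tiles, w, r) for w, r in [(64, 1), (512, 4), (4096, 16)])
slide_embedding = mixed.mean(axis=0)
print(slide_embedding.shape)  # (128,)
```

The key point is that each (segment length, dilation) pair covers the sequence at a different scale: small dense segments capture local tile context, while long dilated segments let distant regions of the slide interact without paying the full quadratic attention cost.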
