Comparing deep learning and pathologist quantification of cell-level PD-L1 expression in non-small cell lung cancer whole-slide images
Scientific Reports (IF 4.6), Pub Date: 2024-03-26, DOI: 10.1038/s41598-024-57067-1
Leander van Eekelen, Joey Spronck, Monika Looijen-Salamon, Shoko Vos, Enrico Munari, Ilaria Girolami, Albino Eccher, Balazs Acs, Ceren Boyaci, Gabriel Silva de Souza, Muradije Demirel-Andishmand, Luca Dulce Meesters, Daan Zegers, Lieke van der Woude, Willemijn Theelen, Michel van den Heuvel, Katrien Grünberg, Bram van Ginneken, Jeroen van der Laak, Francesco Ciompi

Programmed death-ligand 1 (PD-L1) expression is currently used in the clinic to assess eligibility for immune-checkpoint inhibitors via the tumor proportion score (TPS), but its efficacy is limited by high interobserver variability. Multiple papers have presented systems for the automatic quantification of TPS, but none report on the task of determining cell-level PD-L1 expression, and they often restrict their evaluation to a single PD-L1 monoclonal antibody or clinical center. In this paper, we report on a deep learning algorithm for detecting PD-L1 negative and positive tumor cells at a cellular level and evaluate it on a cell-level reference standard established by six readers on a multi-centric, multi PD-L1 assay dataset. This reference standard also provides, for the first time, a benchmark for computer vision algorithms. In addition, in line with other papers, we evaluate our algorithm at slide level by measuring the agreement between the algorithm and six pathologists on TPS quantification. We find a moderately low interobserver agreement at cell level (mean reader-reader F1 score = 0.68), which our algorithm sits slightly under (mean reader-AI F1 score = 0.55), especially for cases from the clinical center not included in the training set. Despite this, we find good AI-pathologist agreement on quantifying TPS compared to the interobserver agreement (mean reader-reader Cohen's kappa = 0.54, 95% CI 0.26–0.81; mean reader-AI kappa = 0.49, 95% CI 0.27–0.72). In conclusion, our deep learning algorithm demonstrates promise in detecting PD-L1 expression at a cellular level and exhibits favorable agreement with pathologists in quantifying the TPS. We publicly release our models for use via the Grand-Challenge platform.
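
The metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes TPS is the percentage of PD-L1 positive cells among all detected tumor cells, that F1 is computed from already-matched detection counts, and that kappa is unweighted; the abstract does not say whether a weighted kappa was used. The function names and the TPS cut-offs (<1%, 1-49%, >=50%) are illustrative assumptions, not taken from the abstract.

```python
from collections import Counter


def tumor_proportion_score(n_positive: int, n_negative: int) -> float:
    """TPS in percent: PD-L1 positive tumor cells over all tumor cells."""
    total = n_positive + n_negative
    if total == 0:
        raise ValueError("no tumor cells counted")
    return 100.0 * n_positive / total


def tps_category(tps: float) -> str:
    """Bin a TPS value; the cut-offs here are assumed for illustration."""
    if tps < 1.0:
        return "<1%"
    if tps < 50.0:
        return "1-49%"
    return ">=50%"


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 score from matched (tp), spurious (fp) and missed (fn) detections."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        raise ValueError("no detections and no reference cells")
    return 2 * tp / denominator


def cohens_kappa(labels_a: list, labels_b: list) -> float:
    """Unweighted Cohen's kappa between two raters over the same cases."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of cases with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)
```

For example, a slide with 30 positive and 70 negative tumor cells gives a TPS of 30%, which falls in the assumed "1-49%" bin; kappa is then computed over such per-slide bins for each pair of raters.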




Updated: 2024-03-27