MamlFormer: Priori-experience guiding transformer network via manifold adversarial multi-modal learning for laryngeal histopathological grading
Information Fusion (IF 18.6), Pub Date: 2024-03-02, DOI: 10.1016/j.inffus.2024.102333
Pan Huang, Chentao Li, Peng He, Hualiang Xiao, Yifang Ping, Peng Feng, Sukun Tian, Hu Chen, Francesco Mercaldo, Antonella Santone, Hui-yuan Yeh, Jing Qin

Pathologic grading of laryngeal squamous cell carcinoma (LSCC) plays a crucial role in diagnosis, prognosis, and migration. However, intelligent grading models based on low-magnification LSCC images suffer from poor grading performance and interpretability, because such images lack the fine nuclear detail, and the grading-relevant information, contained in the high-magnification images annotated by pathologists. Low-magnification images do, on the other hand, capture information such as tissue texture and contours. We therefore propose MamlFormer, an end-to-end transformer network with manifold adversarial multi-modal learning that effectively fuses and jointly learns the high- and low-magnification pathology image modalities of LSCC. First, we establish, via Hoeffding's inequality and multimodal co-regularization, the feasibility of and sufficient conditions for fusing the high- and low-magnification LSCC image modalities. Second, we design a new manifold block that constructs a manifold subspace according to three principles for the feature matrices of each magnification modality before and after mapping: separability, recoverability, and local-distance preservation. This block alleviates the problems of redundant feature-matrix information and weak cross-modal semantic consistency after multimodal learning. Third, we use an encoder and an adversarial loss function to implement an adversarial block that adaptively learns latent metrics of the distributions of the two magnification modalities, thereby enhancing their complementarity. Extensive experiments show that MamlFormer outperforms other state-of-the-art (SOTA) models in both grading performance and interpretability. Finally, generalization experiments on the highly prevalent cervical squamous cell carcinoma show that MamlFormer again surpasses other SOTA models in grading performance and interpretability, indicating excellent generalization performance and clinical practicability.
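The adversarial block is described only at a high level in the abstract: an encoder plus an adversarial loss that aligns the latent distributions of the high- and low-magnification modalities. A minimal, purely illustrative sketch of that idea (not the authors' implementation; all shapes, names, and the linear discriminator are hypothetical stand-ins) is:

```python
# Illustrative sketch, NOT the MamlFormer code: a linear discriminator D
# tries to tell high- from low-magnification latent features apart; the
# encoders' adversarial objective is the negated discriminator loss, which
# pushes the two modality distributions toward each other.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def discriminator_loss(w, b, feats_high, feats_low):
    """Binary cross-entropy of a linear discriminator:
    label 1 = high-magnification, label 0 = low-magnification."""
    p_high = sigmoid(feats_high @ w + b)   # P(high | feature)
    p_low = sigmoid(feats_low @ w + b)
    eps = 1e-9                             # numerical safety for log
    return -(np.log(p_high + eps).mean()
             + np.log(1.0 - p_low + eps).mean()) / 2.0

# Toy latent features standing in for encoder outputs (hypothetical sizes:
# 64 patches per modality, 16-dimensional embeddings).
feats_high = rng.normal(loc=0.5, scale=1.0, size=(64, 16))
feats_low = rng.normal(loc=-0.5, scale=1.0, size=(64, 16))

w = rng.normal(size=16)
b = 0.0

d_loss = discriminator_loss(w, b, feats_high, feats_low)
# The encoders would be updated to MAXIMIZE d_loss (minimize -d_loss),
# i.e. to make the two latent distributions indistinguishable.
adv_loss = -d_loss
print(f"discriminator loss = {d_loss:.4f}, adversarial loss = {adv_loss:.4f}")
```

In a full adversarial training loop, the discriminator and the modality encoders would be updated alternately on these opposing objectives; this sketch only evaluates the two losses once to make the relationship concrete.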
