Pay attention to doctor-patient dialogues: Multi-modal knowledge graph attention image-text embedding for COVID-19 diagnosis
Information Fusion (IF 14.7), Pub Date: 2021-06-01, DOI: 10.1016/j.inffus.2021.05.015
Wenbo Zheng, Lan Yan, Chao Gou, Zhi-Cheng Zhang, Jun Jason Zhang, Ming Hu, Fei-Yue Wang

The sudden increase in coronavirus disease 2019 (COVID-19) cases puts high pressure on healthcare services worldwide. At this stage, fast, accurate, and early clinical assessment of disease severity is vital. In general, there are two issues to overcome: (1) current deep learning-based works suffer from a shortage of adequate multimodal data; (2) in this scenario, multimodal information (e.g., text, image) should be considered jointly to make accurate inferences. To address these challenges, we propose a multi-modal knowledge graph attention embedding for COVID-19 diagnosis. Our method not only learns relational embeddings from the nodes of a constructed knowledge graph but also has access to medical knowledge, aiming to improve classifier performance through a medical knowledge attention mechanism. The experimental results show that our approach significantly improves classification performance compared to other state-of-the-art techniques and remains robust for each individual modality of the multi-modal data. Moreover, we construct a new COVID-19 multi-modal dataset based on text mining, consisting of 1393 doctor-patient dialogues about COVID-19 patients with their 3706 images (347 X-ray + 2598 CT + 761 ultrasound), 607 non-COVID-19 patient dialogues with their 10754 images (9658 X-ray + 494 CT + 761 ultrasound), and fine-grained labels for all of them. We hope this work can provide insights to researchers working in this area and shift attention from medical images alone to doctor-patient dialogues and their corresponding medical images.
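To make the knowledge-graph-attention idea concrete, below is a minimal sketch of GAT-style attention over a small graph whose nodes carry dialogue-text, image, and medical-knowledge embeddings, followed by a mean readout and a classifier head. All module names, dimensions, and the toy graph are illustrative assumptions for this sketch, not the authors' implementation.

```python
# Illustrative sketch only: single-head graph attention over image/text/
# knowledge node embeddings, then a readout classifier. Names, dimensions,
# and the toy graph are assumptions, NOT the paper's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    """Single-head, GAT-style attention over node features."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: (N, in_dim) node features; adj: (N, N) 0/1 adjacency matrix.
        z = self.W(h)                                    # (N, out_dim)
        N = z.size(0)
        # Pairwise attention logits e_ij = a([z_i || z_j]).
        zi = z.unsqueeze(1).expand(N, N, -1)
        zj = z.unsqueeze(0).expand(N, N, -1)
        e = F.leaky_relu(self.a(torch.cat([zi, zj], dim=-1))).squeeze(-1)
        e = e.masked_fill(adj == 0, float("-inf"))       # attend to neighbors only
        alpha = torch.softmax(e, dim=-1)                 # (N, N) attention weights
        return F.elu(alpha @ z)                          # attention-weighted aggregation

class KGAttentionDiagnosis(nn.Module):
    """Toy classifier: graph attention over image/text/knowledge nodes,
    then a mean readout for a COVID-19 vs. non-COVID-19 prediction."""
    def __init__(self, feat_dim: int = 128, hidden: int = 64):
        super().__init__()
        self.gat = GraphAttentionLayer(feat_dim, hidden)
        self.head = nn.Linear(hidden, 2)

    def forward(self, node_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        h = self.gat(node_feats, adj)
        return self.head(h.mean(dim=0))                  # graph-level logits

# Usage with a toy 4-node graph: one dialogue-text node, two image nodes
# (e.g., CT and X-ray embeddings), and one medical-knowledge node.
feats = torch.randn(4, 128)
adj = torch.ones(4, 4)            # fully connected toy graph (incl. self-loops)
logits = KGAttentionDiagnosis()(feats, adj)
print(logits.shape)               # torch.Size([2])
```

In this sketch, the knowledge node participates in the same attention as the text and image nodes, which is one simple way an attention mechanism can let medical knowledge reweight the multi-modal evidence before classification.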




Updated: 2021-06-01