Extracting Structured Data from Physician-Patient Conversations By Predicting Noteworthy Utterances
arXiv - CS - Artificial Intelligence Pub Date : 2020-07-14 , DOI: arxiv-2007.07151
Kundan Krishna, Amy Pavel, Benjamin Schloss, Jeffrey P. Bigham, Zachary C. Lipton

Despite diverse efforts to mine various modalities of medical data, the conversations between physicians and patients at the time of care remain an untapped source of insights. In this paper, we leverage this data to extract structured information that might assist physicians with post-visit documentation in electronic health records, potentially lightening the clerical burden. In this exploratory study, we describe a new dataset consisting of conversation transcripts, post-visit summaries, corresponding supporting evidence (in the transcript), and structured labels. We focus on the tasks of recognizing relevant diagnoses and abnormalities in the review of organ systems (RoS). One methodological challenge is that the conversations are long (around 1500 words), making it difficult for modern deep-learning models to use them as input. To address this challenge, we extract noteworthy utterances: parts of the conversation likely to be cited as evidence supporting some summary sentence. We find that by first filtering for (predicted) noteworthy utterances, we can significantly boost predictive performance for recognizing both diagnoses and RoS abnormalities.
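
The filter-then-predict idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper predicts noteworthy utterances with a learned model, whereas the scorer below is a toy keyword heuristic, and every name here (`filter_noteworthy`, `toy_score`, `CUE_WORDS`, the sample conversation) is a hypothetical stand-in chosen only to show how a long transcript might be shortened before it reaches a downstream classifier.

```python
from typing import Callable, List, Sequence


def filter_noteworthy(
    utterances: Sequence[str],
    score: Callable[[str], float],
    top_k: int = 3,
) -> List[str]:
    """Keep the top_k highest-scoring utterances, preserving conversation order."""
    # Rank utterance indices by score, highest first (ties keep original order).
    ranked = sorted(
        range(len(utterances)),
        key=lambda i: score(utterances[i]),
        reverse=True,
    )
    # Restore chronological order among the selected utterances.
    keep = sorted(ranked[:top_k])
    return [utterances[i] for i in keep]


# Hypothetical cue words; a real system would use a trained utterance classifier.
CUE_WORDS = {"pain", "cough", "fever", "diabetes", "dizzy"}


def toy_score(utterance: str) -> float:
    """Assumed stand-in scorer: fraction of tokens that are cue words."""
    tokens = utterance.lower().replace(",", " ").replace(".", " ").split()
    if not tokens:
        return 0.0
    return sum(t in CUE_WORDS for t in tokens) / len(tokens)


# Made-up example conversation, not taken from the paper's dataset.
conversation = [
    "Good morning, how are you today?",
    "I have had a cough and a fever since Monday.",
    "Did you find parking okay?",
    "My diabetes has been hard to control.",
    "Okay, see you in two weeks.",
]

# Stage 1: keep only the utterances predicted to be noteworthy.
filtered = filter_noteworthy(conversation, toy_score, top_k=2)

# Stage 2 (not shown): pass the much shorter text to a diagnosis or RoS classifier.
classifier_input = " ".join(filtered)
```

Keeping the selected utterances in their original order is a design choice of this sketch, made so that the shortened input still reads as a coherent excerpt of the conversation.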

Updated: 2020-07-15