MT-clinical BERT: scaling clinical information extraction with multitask learning
Journal of the American Medical Informatics Association (IF 6.4). Pub Date: 2021-08-01. DOI: 10.1093/jamia/ocab126
Andriy Mulyar, Ozlem Uzuner, Bridget McInnes

Abstract
Objective
Clinical notes contain an abundance of important, but not readily accessible, information about patients. Systems that automatically extract this information rely on large amounts of training data, for which only limited resources exist. Furthermore, these systems are developed disjointly, meaning that no information can be shared among task-specific systems. This bottleneck unnecessarily complicates practical application, limits the performance of each individual solution, and incurs the engineering debt of managing multiple information extraction systems.
Materials and Methods
We address these challenges by developing Multitask-Clinical BERT: a single deep learning model that simultaneously performs 8 clinical tasks, spanning entity extraction, personal health information identification, language entailment, and semantic similarity, by sharing representations among tasks.
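The abstract gives no implementation details; as a rough sketch of the hard-parameter-sharing design described above, the Python snippet below pairs one shared BERT encoder with lightweight task-specific heads. The encoder checkpoint, head names, and label counts are illustrative assumptions, not the authors' configuration.

import torch.nn as nn
from transformers import AutoModel

class MultitaskClinicalModel(nn.Module):
    def __init__(self, encoder_name="bert-base-uncased",  # placeholder; any clinical BERT checkpoint works
                 num_entity_labels=10, num_nli_labels=3):
        super().__init__()
        # Shared encoder: every task's gradients update these weights.
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Task-specific heads (names and sizes are hypothetical).
        self.ner_head = nn.Linear(hidden, num_entity_labels)  # token-level entity/PHI tagging
        self.nli_head = nn.Linear(hidden, num_nli_labels)     # sentence-pair entailment
        self.sts_head = nn.Linear(hidden, 1)                  # similarity regression

    def forward(self, input_ids, attention_mask, task):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        if task == "ner":
            return self.ner_head(out.last_hidden_state)  # per-token logits
        pooled = out.last_hidden_state[:, 0]              # [CLS] representation
        return self.nli_head(pooled) if task == "nli" else self.sts_head(pooled)

Training such a model typically cycles through batches drawn from each task's dataset, so every task's loss backpropagates through the shared encoder; at inference, one encoder pass can serve all tasks, which is the source of the computational savings noted in the Conclusions.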
Results
We compare the performance of our multitasking information extraction system to state-of-the-art BERT sequential fine-tuning baselines. We observe a slight but consistent performance degradation in MT-Clinical BERT relative to sequential fine-tuning.
Discussion
These results intuitively suggest that learning a general clinical text representation capable of supporting multiple tasks comes at the cost of the ability to exploit dataset- or clinical-note-specific properties, compared with a single task-specific model.
Conclusions
We find that our single system performs competitively with all state-of-the-art task-specific systems while also offering substantial computational savings at inference.


Updated: 2021-09-20