Automated ICD coding via unsupervised knowledge integration (UNITE), International Journal of Medical Informatics



Automated ICD coding via unsupervised knowledge integration (UNITE).
International Journal of Medical Informatics (IF 3.7), Pub Date: 2020-04-04, DOI: 10.1016/j.ijmedinf.2020.104135
Aaron Sonabend W₁, Winston Cai₂, Yuri Ahuja₁, Ashwin Ananthakrishnan₃, Zongqi Xia₄, Sheng Yu₅, Chuan Hong₆


OBJECTIVE: Accurate coding is critical for medical billing and electronic medical record (EMR)-based research. Recent research has focused on developing supervised methods to automatically assign International Classification of Diseases (ICD) codes from clinical notes. However, supervised approaches rely on ICD code data stored in the hospital EMR system and are subject to bias arising from practice and coding behavior. Consequently, supervised algorithms trained in one EMR system may port poorly to external EMR systems.

METHOD: We developed an unsupervised knowledge integration (UNITE) algorithm to automatically assign ICD codes for a specific disease by analyzing clinical narrative notes via semantic relevance assessment. The algorithm was validated using coded ICD data for 6 diseases from the Partners HealthCare (PHS) Biobank and the Medical Information Mart for Intensive Care (MIMIC-III). We compared the performance of UNITE against penalized logistic regression (LR), topic modeling, and neural network models within each EMR system. We additionally evaluated the portability of UNITE by training at PHS Biobank and validating at MIMIC-III, and vice versa.

RESULTS: UNITE achieved an average AUC of 0.91 at PHS and 0.92 at MIMIC over 6 diseases, comparable to LR and the multilayer perceptron (MLP), and substantially outperformed topic models. With regard to portability, the performance of UNITE was consistent across different EMR systems and superior to that of LR, topic models, and neural network models.

CONCLUSION: UNITE accurately assigns ICD codes in EMRs without requiring human labor and has major advantages over commonly used machine learning approaches. In addition, UNITE attained stable performance and high portability across EMRs in different institutions.
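The results above are reported as an AUC averaged over 6 diseases. As a rough illustration of that metric only (not the authors' code), the following minimal sketch computes a per-disease AUC from the rank-based definition and then takes an unweighted mean. The function names `auc` and `average_auc`, the disease labels, and all numbers in the toy data are assumptions made up for this example; the abstract does not say whether the average was weighted.

```python
from statistics import mean


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_auc(per_disease):
    """per_disease maps a disease name to (labels, scores).
    Returns the unweighted mean of the per-disease AUCs (an assumption)."""
    return mean(auc(y, s) for y, s in per_disease.values())


# Hypothetical toy data: 1 = ICD code present, scores = model relevance scores.
toy = {
    "disease_a": ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4]),
    "disease_b": ([1, 1, 0, 0], [0.8, 0.3, 0.5, 0.1]),
}

print(average_auc(toy))  # 0.875
```

Under the same assumptions, a portability check would score the notes of one EMR system with a model built at the other and pass those labels and scores to `average_auc`.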


Updated: 2020-04-04
