Text preprocessing for improving hypoglycemia detection from clinical notes - A case study of patients with diabetes.
International Journal of Medical Informatics (IF 4.9), Pub Date: 2019-08-25, DOI: 10.1016/j.ijmedinf.2019.06.020
Lina Zhou, Tariq Siddiqui, Stephen L Seliger, Jacob B Blumenthal, Yin Kang, Rebecca Doerfler, Jeffrey C Fink

BACKGROUND AND OBJECTIVE Hypoglycemia is a common safety event when attempting to optimize glycemic control in diabetes mellitus (DM). While electronic medical records provide a natural ground for detecting and analyzing hypoglycemia, the ICD codes used in these databases may be invalid, insensitive, or non-specific in detecting new hypoglycemic events. We developed text preprocessing methods to improve automatic detection of hypoglycemia from clinical encounter text notes.

METHODS We set out to improve hypoglycemia detection from clinical notes by introducing three preprocessing methods: stop word filtering, medication signaling, and ICD narrative enrichment. To test the proposed methods, we selected clinical notes from the VA Maryland Healthcare System based on various combinations of three criteria suggestive of hypoglycemia: ICD-9 codes for diabetes and hypoglycemia, laboratory glucose values < 70 mg/dL, and text reference to a proximate hypoglycemia event. In addition, we constructed one dataset of 395 clinical notes from 2009 and another of 460 notes from 2014 to test the generality of the proposed methods. For each dataset, two physician judges manually reviewed individual clinical notes to determine whether hypoglycemia was present or absent, and a third physician judge served as the final adjudicator for disagreements.

RESULTS Each of the proposed preprocessing methods significantly increased the F1 score of hypoglycemia detection, by 5.3% to 7.4% on one dataset (p < .01). Among the methods, stop word filtering contributed most to the improvement (7.4%). Combining all the preprocessing methods led to a greater performance gain (p < .001) than using each method individually. Similar patterns were observed on the other dataset, where individual methods increased the F1 score by 7.7% to 9.4% (p < .001). On that dataset, however, combining the three methods did not yield additional performance gain.

CONCLUSION The proposed text preprocessing methods improved the performance of hypoglycemia detection from clinical text notes. Stop word filtering achieved the largest performance improvement. ICD narrative enrichment boosted the recall of detection. Combining the three preprocessing methods led to additional performance gains on one of the two datasets.
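
The three preprocessing methods named in the abstract could be chained roughly as in the sketch below. The abstract does not describe how each step was implemented, so everything here is an assumption made for illustration: the stop word list, the medication lexicon, the `DM_MEDICATION` marker token, the ICD-9 narrative strings, and all function names are hypothetical and are not taken from the study.

```python
import re

# All resources below are hypothetical placeholders; the study's actual
# stop word list, medication lexicon, and ICD narratives are not given
# in the abstract.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "was", "is", "in", "with"}
DIABETES_MEDICATIONS = {"insulin", "glipizide", "glyburide", "metformin"}
ICD9_NARRATIVES = {
    # Illustrative paraphrase of a code description, not an official string.
    "251.2": "hypoglycemia unspecified",
}
MEDICATION_SIGNAL = "DM_MEDICATION"  # assumed marker token


def tokenize(text):
    """Lowercase the note and split it into word and number tokens."""
    return re.findall(r"\d+\.\d+|[a-z0-9]+", text.lower())


def filter_stop_words(tokens):
    """Stop word filtering: drop tokens found in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def signal_medications(tokens):
    """Medication signaling (assumed form): add a generic marker after
    each recognized diabetes medication so different drug names share
    one feature."""
    signaled = []
    for token in tokens:
        signaled.append(token)
        if token in DIABETES_MEDICATIONS:
            signaled.append(MEDICATION_SIGNAL)
    return signaled


def enrich_with_icd_narratives(tokens, icd_codes):
    """ICD narrative enrichment (assumed form): append the words of the
    narrative description for each ICD code attached to the note."""
    enriched = list(tokens)
    for code in icd_codes:
        narrative = ICD9_NARRATIVES.get(code)
        if narrative:
            enriched.extend(narrative.split())
    return enriched


def preprocess(note_text, icd_codes=()):
    """Apply the three steps in sequence and return the token list."""
    tokens = tokenize(note_text)
    tokens = filter_stop_words(tokens)
    tokens = signal_medications(tokens)
    return enrich_with_icd_narratives(tokens, icd_codes)
```

Calling `preprocess("Patient was given insulin and the glucose was 62.", ["251.2"])` would produce a token list that a downstream text classifier could consume. The order of the steps is a design choice in this sketch: filtering runs first so that the marker token and the appended narrative words are never treated as stop words.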

Updated: 2019-11-01