Methods to improve the quality of smoking records in a primary care EMR database: exploring multiple imputation and pattern-matching algorithms. BMC Medical Informatics and Decision Making



BMC Medical Informatics and Decision Making (IF 3.3), Pub Date: 2020-03-14, DOI: 10.1186/s12911-020-1068-5
Stephanie Garies 1,2; Michael Cummings 3; Hude Quan 2; Kerry McBrien 1,2; Neil Drummond 1,2,3,4; Donna Manca 3; Tyler Williamson 2


BACKGROUND: Primary care electronic medical record (EMR) data are emerging as a useful source for secondary uses, such as disease surveillance, health outcomes research, and practice improvement. These data capture clinical details about patients' health status, as well as behavioural risk factors, such as smoking. While the importance of documenting smoking status in a healthcare setting is recognized, the quality of smoking data captured in EMRs is variable. This study was designed to test methods aimed at improving the quality of patient smoking information in a primary care EMR database.

METHODS: EMR data from community primary care settings, extracted by two regional practice-based research networks in Alberta, Canada, were used. Patients who had at least one encounter in the previous 2 years (2016-2018) and who met a validated definition of hypertension were included (n = 48,377). Multiple imputation was tested under two different assumptions for missing data (smoking status is missing at random, and smoking status is missing not at random). A third method tested a novel pattern-matching algorithm developed to augment smoking information in the primary care EMR database. External validity was examined by comparing the proportions of smoking categories generated by each method with those from a general population survey.

RESULTS: Among those with hypertension, 40.8% (n = 19,743) had smoking information that was either not recorded or not interpretable, and was therefore considered missing. Those with missing smoking data differed statistically by demographics, clinical features, and type of EMR system used in the clinic. Both multiple imputation methods produced fully complete smoking status information, with the proportion of current smokers estimated at 25.3% (data missing at random) and 12.5% (data missing not at random). The pattern-matching algorithm classified 18.2% of patients as current smokers, similar to the population-based survey (18.9%), but still resulted in missing smoking information for 23.6% of patients. The algorithm was estimated to be 93.8% accurate overall, but accuracy varied by smoking status category.

CONCLUSION: Multiple imputation and algorithmic pattern-matching can be used to improve EMR data post-extraction, but the recommended method depends on the purpose of secondary use (e.g. practice improvement or epidemiological analyses).
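The abstract does not give the rules of the pattern-matching algorithm, so the following Python sketch is only an illustration of the general idea: free-text smoking entries are matched against ordered patterns, and anything unmatched stays missing. The patterns, the category labels, and the function names `classify_smoking` and `category_proportions` are all assumptions made for this example, not the study's method.

```python
import re
from collections import Counter

# Hypothetical patterns, checked in order. These are NOT the rules used in
# the study; they only illustrate how rule-based matching could work.
PATTERNS = [
    ("never", re.compile(
        r"\b(never\s+smoke[drs]?|non-?\s?smoker|does\s+not\s+smoke)\b", re.I)),
    ("former", re.compile(
        r"\b(ex-?\s?smoker|former\s+smoker|quit|stopped\s+smoking)\b", re.I)),
    ("current", re.compile(
        r"\b(current\s+smoker|smoker|smokes|smoking)\b", re.I)),
]


def classify_smoking(text):
    """Return 'never', 'former', 'current', or None if nothing matches."""
    if not text:
        return None
    for label, pattern in PATTERNS:
        if pattern.search(text):
            return label
    return None


def category_proportions(notes):
    """Share of records per category; unmatched records count as 'missing'."""
    labels = [classify_smoking(note) or "missing" for note in notes]
    counts = Counter(labels)
    return {label: count / len(labels) for label, count in counts.items()}


if __name__ == "__main__":
    # Made-up example entries, not data from the study.
    example_notes = ["Non-smoker", "Ex-smoker, quit 2010", "Smokes 1 ppd", "BP 140/90"]
    print(category_proportions(example_notes))
```

The "never" and "former" patterns are checked before "current" so that an entry such as "ex-smoker" is not captured by the more general "smoker" pattern. As in the reported results, a rule-based approach of this kind leaves uninterpretable entries as missing rather than filling them in.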


Updated: 2020-04-22
