Electronic Health Record-Based Screening for Substance Abuse.
Big Data ( IF 4.6 ) Pub Date : 2018-09-01 , DOI: 10.1089/big.2018.0002
Farrokh Alemi 1 , Sanja Avramovic 1 , Mark D Schwartz 2

Abstract: Existing methods of screening for substance abuse (standardized questionnaires or clinicians simply asking) have proven difficult to initiate and maintain in primary care settings. This article reports on how predictive modeling can be used to screen for substance abuse using extant data in electronic health records (EHRs). We relied on data available through the Veterans Affairs Informatics and Computing Infrastructure (VINCI) for the years 2006 through 2016. We focused on 4,681,809 veterans who had at least two primary care visits, of whom 829,827 had a hospitalization. Data included 699 million outpatient and 17 million inpatient records. The dependent variable was substance abuse as identified from 89 diagnostic codes using the Agency for Healthcare Research and Quality classification of diseases. In addition, we included the diagnostic codes used for identification of prescription abuse. The independent variables were 10,292 inpatient and 13,512 outpatient diagnoses, plus 71 dummy variables, one for each year of age from 20 to 90. A modified naive Bayes model was used to aggregate the risk across predictors. The accuracy of the predictions was examined using the area under the receiver operating characteristic (AROC) curve on 20% of the data, randomly set aside for evaluation. Many physical and mental illnesses were associated with substance abuse. These associations supported findings reported in the literature regarding the impact of substance abuse on various diseases and vice versa. In the randomly set-aside validation data, the model accurately predicted substance abuse for inpatient (AROC = 0.884), outpatient (AROC = 0.825), and combined inpatient and outpatient (AROC = 0.840) data. When information that became available only after substance abuse was known was excluded, the cross-validated AROC remained high: 0.822 for inpatient and 0.817 for outpatient data. Data within EHRs can be used to detect existing substance abuse or predict potential future substance abuse.
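
The abstract does not spell out how the naive Bayes model was modified, so the following is only a minimal sketch of the general idea under stated assumptions: a plain naive Bayes scorer that sums per-diagnosis log likelihood ratios (presence of a code only, a simplification), evaluated by AROC on a 20% hold-out. The function names, the codes `dx_A`, `dx_B`, `dx_C`, and the synthetic rates are all hypothetical and are not taken from the article or from VINCI data.

```python
import math
import random


def fit_log_likelihood_ratios(records, labels, smoothing=1.0):
    """Estimate log[P(code | case) / P(code | non-case)] for each code.

    records: list of lists of diagnosis codes (one list per patient)
    labels:  list of 0/1 outcomes (1 = substance abuse recorded)
    Additive smoothing keeps ratios finite for rare codes.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_counts, neg_counts = {}, {}
    for codes, y in zip(records, labels):
        target = pos_counts if y else neg_counts
        for code in set(codes):  # count each code once per patient
            target[code] = target.get(code, 0) + 1
    llr = {}
    for code in set(pos_counts) | set(neg_counts):
        p_pos = (pos_counts.get(code, 0) + smoothing) / (n_pos + 2 * smoothing)
        p_neg = (neg_counts.get(code, 0) + smoothing) / (n_neg + 2 * smoothing)
        llr[code] = math.log(p_pos / p_neg)
    prior_log_odds = math.log((n_pos + smoothing) / (n_neg + smoothing))
    return llr, prior_log_odds


def score(codes, llr, prior_log_odds):
    """Aggregate risk: prior log odds plus the sum of per-code log ratios.

    Codes unseen in training contribute nothing (log ratio of 0).
    """
    return prior_log_odds + sum(llr.get(c, 0.0) for c in set(codes))


def auroc(scores, labels):
    """Probability that a random case outranks a random non-case (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def demo(seed=0, n=2000):
    """Train on 80% of synthetic patients, report AROC on the other 20%."""
    rng = random.Random(seed)
    # Hypothetical codes: (P(code | case), P(code | non-case)).
    code_rates = {"dx_A": (0.6, 0.1), "dx_B": (0.4, 0.2), "dx_C": (0.3, 0.3)}
    records, labels = [], []
    for _ in range(n):
        y = rng.random() < 0.2  # assumed 20% prevalence, for illustration only
        codes = [c for c, (p1, p0) in code_rates.items()
                 if rng.random() < (p1 if y else p0)]
        records.append(codes)
        labels.append(int(y))
    # Patients are generated independently, so a positional split is random.
    cut = int(0.8 * n)
    llr, prior = fit_log_likelihood_ratios(records[:cut], labels[:cut])
    test_scores = [score(r, llr, prior) for r in records[cut:]]
    return auroc(test_scores, labels[cut:])


if __name__ == "__main__":
    print(f"Hold-out AROC on synthetic data: {demo():.3f}")
```

Summing log likelihood ratios is what lets a naive Bayes scorer scale to tens of thousands of diagnosis predictors: each code is estimated independently, and a patient's score only touches the codes actually present in that patient's record.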

Updated: 2018-09-01