Imputation of missing values for electronic health record laboratory data
npj Digital Medicine (IF 15.2), Pub Date: 2021-10-11, DOI: 10.1038/s41746-021-00518-0
Jiang Li 1, Xiaowei S Yan 2, Durgesh Chaudhary 1, Venkatesh Avula 1, Satish Mudiganti 2, Hannah Husby 2, Shima Shahjouei 1, Ardavan Afshar 3, Walter F Stewart 4, Mohammed Yeasin 5, Ramin Zand 1, Vida Abedi 1, 6

Laboratory data from Electronic Health Records (EHR) are often used in prediction models, where the estimation bias and degraded model performance caused by missingness can be mitigated using imputation methods. We demonstrate the utility of imputation in two real-world EHR-derived cohorts, ischemic stroke from Geisinger and heart failure from Sutter Health, to: (1) characterize the patterns of missingness in laboratory variables; (2) simulate two missingness mechanisms, arbitrary and monotone; (3) compare cross-sectional and multi-level multivariate missing imputation algorithms applied to laboratory data; and (4) assess whether incorporating latent information derived from comorbidity data can improve the performance of the algorithms. The fourth aim was based on a case study of hemoglobin A1c under a univariate missing imputation framework. Overall, the pattern of missingness in EHR laboratory variables was not at random and was highly associated with patients’ comorbidity data, and the multi-level imputation algorithm showed smaller imputation error than the cross-sectional method.
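
As a rough illustration of the two simulated missingness patterns named in the abstract, the following is a minimal sketch in Python, not the authors' pipeline. It assumes NumPy, uses synthetic Gaussian values in place of real laboratory data, and swaps in a simple column-mean baseline for the cross-sectional and multi-level algorithms the paper compares. All function names, rates, and sizes are hypothetical.

```python
import numpy as np


def arbitrary_mask(shape, rate, rng):
    """Arbitrary pattern: every cell is dropped independently with probability `rate`."""
    return rng.random(shape) < rate


def monotone_mask(shape, row_rate, rng):
    """Monotone pattern: once a column is missing in a row, all later columns are too."""
    n_rows, n_cols = shape
    # First missing column per row, drawn from 1..n_cols-1 so column 0 stays observed.
    start = rng.integers(1, n_cols, size=n_rows)
    # Only a fraction `row_rate` of rows drop out; the rest are fully observed.
    drops = rng.random(n_rows) < row_rate
    start = np.where(drops, start, n_cols)
    return np.arange(n_cols)[None, :] >= start[:, None]


def mean_impute(values, mask):
    """Fill masked cells with the mean of the observed cells in the same column."""
    observed = np.where(mask, np.nan, values)
    col_means = np.nanmean(observed, axis=0)
    return np.where(mask, col_means[None, :], values)


def rmse_on_missing(truth, imputed, mask):
    """Root-mean-square error computed only over the cells that were masked."""
    diff = (truth - imputed)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


rng = np.random.default_rng(0)
# Synthetic stand-in for a patients-by-lab-tests matrix (assumed, not real EHR data).
labs = rng.normal(loc=100.0, scale=15.0, size=(500, 6))

masks = {
    "arbitrary": arbitrary_mask(labs.shape, 0.2, rng),
    "monotone": monotone_mask(labs.shape, 0.4, rng),
}
for name, mask in masks.items():
    imputed = mean_impute(labs, mask)
    print(name, round(rmse_on_missing(labs, imputed, mask), 2))
```

Because the true values are known before masking, the error can be scored only on the cells that were removed, which is the general idea behind comparing imputation algorithms on simulated missingness. Both masks here are drawn independently of the data, whereas the abstract reports that real EHR missingness is not at random, so this sketch illustrates the pattern shapes only.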




Updated: 2021-10-11