Quantitative disease risk scores from EHR with applications to clinical risk stratification and genetic studies
npj Digital Medicine ( IF 15.2 ) Pub Date : 2021-07-23 , DOI: 10.1038/s41746-021-00488-3
Danqing Xu 1 , Chen Wang 1, 2 , Atlas Khan 2 , Ning Shang 2 , Zihuai He 3, 4 , Adam Gordon 5 , Iftikhar J Kullo 6 , Shawn Murphy 7, 8 , Yizhao Ni 9 , Wei-Qi Wei 10 , Ali Gharavi 2 , Krzysztof Kiryluk 2 , Chunhua Weng 11 , Iuliana Ionita-Laza 1

Labeling clinical data from electronic health records (EHR) in health systems requires extensive expert knowledge and painstaking review by clinicians. Furthermore, existing phenotyping algorithms are not uniformly applied across large datasets, and their case definitions can be inconsistent from one algorithm to another. We describe here quantitative disease risk scores based on almost unsupervised methods that require minimal input from clinicians, can be applied to large datasets, and alleviate some of the main weaknesses of existing phenotyping algorithms. We show applications to phenotypic data on approximately 100,000 individuals in eMERGE, focusing on several complex diseases, including Chronic Kidney Disease, Coronary Artery Disease, Type 2 Diabetes, Heart Failure, and a few others. We demonstrate that, relative to existing approaches, the proposed methods have higher prediction accuracy, better identify phenotypic features relevant to the disease under consideration, perform better at clinical risk stratification, and can identify undiagnosed cases based on phenotypic features available in the EHR. Using genetic data from the eMERGE-seq panel, which includes sequencing data for 109 genes on 21,363 individuals from multiple ethnicities, we also show how the new quantitative disease risk scores help improve the power of genetic association studies relative to the standard use of disease phenotypes. The results demonstrate that quantitative disease risk scores derived from rich phenotypic EHR databases can characterize clinical risk for diseases of interest more meaningfully than the prevalent binary (case-control) classification.




Updated: 2021-07-23