PheMap: a multi-resource knowledge base for high-throughput phenotyping within electronic health records.
Journal of the American Medical Informatics Association (IF 6.4), Pub Date: 2020-09-24, DOI: 10.1093/jamia/ocaa104
Neil S Zheng 1 , QiPing Feng 2, 3 , V Eric Kerchberger 1, 2 , Juan Zhao 1 , Todd L Edwards 2, 4 , Nancy J Cox 2, 4 , C Michael Stein 2, 3, 5 , Dan M Roden 1, 2, 3, 5 , Joshua C Denny 1, 2 , Wei-Qi Wei 1

Abstract
Objective
Developing algorithms to extract phenotypes from electronic health records (EHRs) can be challenging and time-consuming. We developed PheMap, a high-throughput phenotyping approach that leverages multiple independent, online resources to streamline the phenotyping process within EHRs.
Materials and Methods
PheMap is a knowledge base of medical concepts with quantified relationships to phenotypes, extracted by natural language processing from publicly available resources. PheMap searches EHRs for each phenotype's quantified concepts and uses them to calculate an individual's probability of having that phenotype. We compared PheMap to clinician-validated phenotyping algorithms from the Electronic Medical Records and Genomics (eMERGE) network for type 2 diabetes mellitus (T2DM), dementia, and hypothyroidism using 84 821 individuals from Vanderbilt University Medical Center's BioVU DNA Biobank. We implemented PheMap-based phenotypes for genome-wide association studies (GWAS) for T2DM, dementia, and hypothyroidism, and phenome-wide association studies (PheWAS) for variants in FTO, HLA-DRB1, and TCF7L2.
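The concept-matching step described above can be illustrated with a minimal sketch. The abstract does not give PheMap's actual weighting or probability formula, so everything here is an assumption made for illustration: the `KNOWLEDGE_BASE` contents, the concept names and weights, the `phenotype_score` function, and the choice to normalize the matched weight by the phenotype's total weight. It is not the authors' method.

```python
from typing import Dict, Set

# Hypothetical knowledge base: phenotype -> {concept: weight}.
# Concept names and weights are invented for illustration only.
KNOWLEDGE_BASE: Dict[str, Dict[str, float]] = {
    "T2DM": {"metformin": 3.0, "hemoglobin A1c": 2.0, "polyuria": 1.0},
}


def phenotype_score(
    record_concepts: Set[str],
    phenotype: str,
    kb: Dict[str, Dict[str, float]] = KNOWLEDGE_BASE,
) -> float:
    """Return the share (0 to 1) of a phenotype's total concept weight
    that is present in one individual's record.

    Assumption: a normalized weighted sum stands in for the probability
    calculation, which the abstract does not specify.
    """
    weights = kb[phenotype]
    total = sum(weights.values())
    if total == 0:
        return 0.0
    # Add up the weights of the phenotype's concepts found in the record.
    matched = sum(w for concept, w in weights.items() if concept in record_concepts)
    return matched / total


# One individual's set of concepts extracted from the EHR (hypothetical).
patient = {"metformin", "hemoglobin A1c", "hypertension"}
score = phenotype_score(patient, "T2DM")
print(f"T2DM score: {score:.3f}")
```

In this toy example, two of the three T2DM concepts appear in the record, so the score is the matched weight (5.0) divided by the total weight (6.0). Concepts unrelated to the phenotype, such as "hypertension" here, do not affect the score.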
Results
In this initial iteration, the PheMap knowledge base contains quantified concepts for 841 disease phenotypes. For T2DM, dementia, and hypothyroidism, the accuracy of the PheMap phenotypes was >97% using a 50% threshold and eMERGE case-control status as the reference standard. In the GWAS analyses, PheMap-derived phenotype probabilities replicated 43 of 51 previously reported disease-associated variants for the 3 phenotypes. For 9 of the 11 top associations, PheMap provided an equivalent or more significant P value than eMERGE-based phenotypes. The PheMap-based PheWAS performed comparably to or better than a traditional phecode-based PheWAS. PheMap is publicly available online.
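The accuracy comparison reported above (phenotype probabilities dichotomized at a 50% threshold and checked against a reference case-control status) can be sketched as follows. The function name `accuracy_at_threshold`, the toy inputs, and the decision to treat a probability equal to the threshold as a case are assumptions for illustration. The abstract does not state how ties were handled.

```python
from typing import Sequence


def accuracy_at_threshold(
    probabilities: Sequence[float],
    reference: Sequence[bool],
    threshold: float = 0.5,
) -> float:
    """Fraction of individuals whose thresholded probability matches the
    reference case (True) or control (False) status."""
    if len(probabilities) != len(reference):
        raise ValueError("probabilities and reference must have the same length")
    if len(reference) == 0:
        raise ValueError("at least one individual is required")
    # Assumption: a probability at or above the threshold counts as a case.
    predicted = [p >= threshold for p in probabilities]
    correct = sum(pred == ref for pred, ref in zip(predicted, reference))
    return correct / len(reference)


# Toy data, not taken from the study.
probs = [0.9, 0.2, 0.6, 0.4]
status = [True, False, False, False]
acc = accuracy_at_threshold(probs, status)
print(f"accuracy: {acc:.2f}")
```

Here the third individual is called a case at the 50% threshold but is a control in the reference standard, so three of the four calls agree.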
Conclusions
PheMap significantly streamlines the process of extracting research-quality phenotype information from EHRs, with performance comparable to or better than current phenotyping approaches.


Updated: 2020-11-18