Deep representation learning of electronic health records to unlock patient stratification at scale.
npj Digital Medicine ( IF 15.2 ) Pub Date : 2020-07-17 , DOI: 10.1038/s41746-020-0301-z
Isotta Landi 1, 2 , Benjamin S Glicksberg 3, 4, 5 , Hao-Chih Lee 4, 5 , Sarah Cherng 4, 5 , Giulia Landi 6 , Matteo Danieletto 3, 4, 5 , Joel T Dudley 4, 5 , Cesare Furlanello 1, 7 , Riccardo Miotto 3, 4, 5

Deriving disease subtypes from electronic health records (EHRs) can guide next-generation personalized medicine. However, challenges in summarizing and representing patient data prevent widespread practice of scalable EHR-based stratification analysis. Here we present an unsupervised framework based on deep learning to process heterogeneous EHRs and derive patient representations that can efficiently and effectively enable patient stratification at scale. We considered EHRs of 1,608,741 patients from a diverse hospital cohort comprising a total of 57,464 clinical concepts. We introduce a representation learning model based on word embeddings, convolutional neural networks, and autoencoders (i.e., ConvAE) to transform patient trajectories into low-dimensional latent vectors. We evaluated these representations as broadly enabling patient stratification by applying hierarchical clustering to different multi-disease and disease-specific patient cohorts. ConvAE significantly outperformed several baselines in a clustering task to identify patients with different complex conditions, with 2.61 entropy and 0.31 purity average scores. When applied to stratify patients within a certain condition, ConvAE led to various clinically relevant subtypes for different disorders, including type 2 diabetes, Parkinson’s disease, and Alzheimer’s disease, largely related to comorbidities, disease progression, and symptom severity. With these results, we demonstrate that ConvAE can generate patient representations that lead to clinically meaningful insights. This scalable framework can help better understand varying etiologies in heterogeneous sub-populations and unlock patterns for EHR-based research in the realm of personalized medicine.
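The entropy and purity scores quoted above are external clustering metrics: they compare cluster assignments against known disease labels. The sketch below shows one common way to compute them, assuming the textbook definitions (purity as the fraction of patients carrying the majority label of their cluster, entropy as the size-weighted mean label entropy per cluster, in bits). The abstract does not state the exact formulas or logarithm base used in the paper, so these are assumptions. The function names and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict
from math import log2


def contingency(cluster_ids, true_labels):
    """Count the true labels that fall inside each cluster."""
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    if not true_labels:
        raise ValueError("at least one item is required")
    by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        by_cluster[cluster][label] += 1
    return by_cluster


def purity(cluster_ids, true_labels):
    """Fraction of items carrying the majority label of their cluster (higher is better)."""
    by_cluster = contingency(cluster_ids, true_labels)
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(true_labels)


def entropy(cluster_ids, true_labels):
    """Size-weighted mean label entropy per cluster, in bits (lower is better)."""
    by_cluster = contingency(cluster_ids, true_labels)
    n_total = len(true_labels)
    weighted = 0.0
    for counts in by_cluster.values():
        n_cluster = sum(counts.values())
        # Shannon entropy of the label distribution inside this cluster
        h = -sum((c / n_cluster) * log2(c / n_cluster) for c in counts.values())
        weighted += (n_cluster / n_total) * h
    return weighted


# Toy data, made up for illustration only (not from the paper)
clusters = [0, 0, 0, 1, 1, 1]
diseases = ["T2D", "T2D", "PD", "PD", "PD", "AD"]
print(f"purity  = {purity(clusters, diseases):.3f}")
print(f"entropy = {entropy(clusters, diseases):.3f}")
```

Under these definitions a perfect clustering has purity 1 and entropy 0, so higher purity and lower entropy both indicate clusters that align more closely with the disease labels.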




Updated: 2020-07-17