Using topological data analysis and pseudo time series to infer temporal phenotypes from electronic health records

https://doi.org/10.1016/j.artmed.2020.101930
Under a Creative Commons license
open access

Highlights

  • Topological Data Analysis and Pseudo Time Series used to discover Type 2 Diabetes temporal phenotypes.

  • Temporal phenotypes inferred from a state-space model based on hidden-state transitions.

  • Continuous transitions between states presented visually in an easily explainable way.

  • Mined phenotypes characterized by significant differences in disease deterioration.

Abstract

Temporal phenotyping enables clinicians to better understand observable characteristics of a disease as it progresses. Modelling disease progression in a way that captures interactions between phenotypes is inherently challenging. Temporal models that capture change in disease over time can identify the key features that characterize the disease subtypes underpinning progression trajectories. Such models will enable clinicians to identify early warning signs of progression in specific subtypes and therefore to make informed decisions tailored to individual patients. In this paper, we explore two approaches to building temporal phenotypes based on the topology of data: topological data analysis and pseudo time-series. Using type 2 diabetes data, we show that the topological data analysis approach is able to identify disease trajectories and that pseudo time-series can infer a state space model characterized by transitions between hidden states that represent distinct temporal phenotypes. Both approaches highlight lipid profiles as key factors in distinguishing the phenotypes.

Keywords

Type 2 diabetes
Unsupervised machine learning
Longitudinal studies
Electronic phenotyping
