Identifying subpopulations of septic patients: A temporal data-driven approach
Introduction
In intensive care units (ICUs), it is important to monitor patient health status over time. This necessity results in multiple measurements of a particular clinical variable across a given patient's stay. Because these measurements vary over time, they are usually referred to as temporal or longitudinal electronic health records (EHRs). Since temporal EHR data carry valuable information about the evolution of patient health status, researchers have been able to improve predictive modeling [1], [2], [3] and patient stratification [4] by using these additional data. However, analyzing longitudinal data comes with multiple challenges, such as irregular sampling rates and varying lengths of available measurements, as well as the inherent correlation of repeated measurements of a variable over time for the same patient.
The goal of this research is two-fold. First, through an exploratory analysis, we investigated the informativeness of the temporal vital signs data in our septic patient stratification via functional data analysis [5]. Second, we added other clinical information about the patients to the temporal vital signs data and performed cluster analysis to identify subpopulations of septic patients with similar clinical needs and trajectories in our data. An exploratory analysis of the included variables was used to interpret the identified clusters and derive insights. This information may be used to design customized care platforms for septic patients who share similar needs.
Longitudinal data analysis in ICU patients
ICUs have a heterogeneous population with different health status dynamics but similar needs for constant care [6,7]. The heterogeneity in ICUs adds to the importance of finding similar patients and detecting the underlying phenotypic groups. Recently, efforts have been made to employ the temporal information of heterogeneous EHR data to reveal subpopulations.
A large and growing body of literature has investigated vital sign trajectories to discover patient subpopulations in order to identify
Study sample
This study utilized a subset of data from patients admitted to the ICUs of the Beth Israel Deaconess Medical Center between 2008 and 2012 (MIMIC III database [23]), as provided in [18]; data extraction was done using the code provided by the authors [24]. The data from 2001 to 2007 were excluded to focus on the MIMIC population for which antibiotic prescriptions were recorded. Because the MIMIC database is de-identified and public, patient consent and research ethics approval were waived. From
FPC score extraction
Applying PACE to the vital signs with the fraction-of-variance-explained threshold set to 98% resulted in three eigenfunctions for HR and RR, four for MBP, SysBP and SpO2, and five for Temp. The scores for these eigenfunctions were extracted for each patient to be used as features for the clustering phase.
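The idea of keeping just enough eigenfunctions to reach a fraction-of-variance-explained (FVE) threshold can be illustrated with a minimal sketch. This is not PACE itself, which estimates scores by conditional expectation from sparse, irregularly sampled measurements. The sketch below assumes instead that each patient's curve has already been placed on a common dense time grid, and it uses a plain SVD-based functional PCA. The function name `fpc_scores` and the synthetic data are hypothetical and only serve the illustration.

```python
import numpy as np


def fpc_scores(curves, fve_threshold=0.98):
    """Simplified dense-grid FPCA (an assumption; not the PACE estimator).

    curves: array of shape (n_patients, n_timepoints) on a common time grid.
    Returns the FPC scores, the retained (discretized) eigenfunctions,
    and the cumulative FVE for the retained components.
    """
    centered = curves - curves.mean(axis=0)
    # Rows of vt are orthonormal discretized eigenfunctions of the
    # sample covariance; squared singular values are proportional
    # to the variance carried by each component.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance = s ** 2
    fve = np.cumsum(variance) / variance.sum()
    # Smallest number of components whose cumulative FVE reaches the threshold.
    k = min(int(np.searchsorted(fve, fve_threshold)) + 1, len(fve))
    eigenfunctions = vt[:k]
    scores = centered @ eigenfunctions.T
    return scores, eigenfunctions, fve[:k]


# Synthetic stand-in for one vital sign: two smooth modes plus small noise.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)
n_patients = 40
demo_curves = (
    rng.normal(0.0, 1.0, (n_patients, 1)) * np.sin(2 * np.pi * grid)
    + rng.normal(0.0, 0.5, (n_patients, 1)) * np.cos(2 * np.pi * grid)
    + rng.normal(0.0, 0.01, (n_patients, grid.size))
)
demo_scores, demo_eigenfunctions, demo_fve = fpc_scores(demo_curves)
```

Each row of `demo_scores` is one patient's feature vector for that vital sign; in a setting like the one described above, the score vectors of all vital signs would be concatenated before clustering.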
Step 1: investigating the informativeness of temporal data over cross-sectional data
This section presents the results for hierarchical clustering with Euclidean metric (HE), hierarchical clustering with cosine metric (HC), DBSCAN with Euclidean metric (DE) and DBSCAN with cosine
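The four configurations named above can be sketched as follows, assuming a feature matrix with one row per patient. Several choices here are assumptions rather than the authors' settings: average linkage for the hierarchical methods, the number of clusters, the DBSCAN `eps` and `min_samples` values, and the label `DC` for DBSCAN with the cosine metric (the source text is cut off before naming it). The function name `cluster_four_ways` and the toy data are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.cluster import DBSCAN


def cluster_four_ways(features, n_clusters=2, eps_euclidean=1.0,
                      eps_cosine=0.05, min_samples=5):
    """Run hierarchical clustering and DBSCAN under two distance metrics.

    All hyperparameter defaults are illustrative assumptions.
    Returns a dict mapping a configuration label to an array of cluster labels.
    """
    labels = {}
    # Hierarchical clustering; average linkage is assumed because it is
    # defined for both Euclidean and cosine distances.
    for name, metric in (("HE", "euclidean"), ("HC", "cosine")):
        tree = linkage(features, method="average", metric=metric)
        labels[name] = fcluster(tree, t=n_clusters, criterion="maxclust")
    # DBSCAN; points labeled -1 are treated as noise.
    for name, metric, eps in (("DE", "euclidean", eps_euclidean),
                              ("DC", "cosine", eps_cosine)):
        model = DBSCAN(eps=eps, min_samples=min_samples, metric=metric)
        labels[name] = model.fit_predict(features)
    return labels


# Toy data: two well-separated groups that also differ in direction,
# so both metrics can tell them apart.
rng = np.random.default_rng(0)
toy_features = np.vstack([
    rng.normal([5.0, 0.0], 0.3, (30, 2)),
    rng.normal([0.0, 5.0], 0.3, (30, 2)),
])
results = cluster_four_ways(toy_features)
```

The Euclidean metric compares magnitudes as well as shape, whereas the cosine metric compares only the direction of the feature vectors, which is why the two can produce different groupings on real patient data.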
Discussion
This study focused on leveraging temporal vital sign data along with other clinical characteristics in clustering septic patients. Summarizing longitudinal vital sign measurements by their average discards the temporal dynamics of patient health status, which can be helpful in phenotyping septic patients. Our results from the exploratory analysis in Step 1 (Section 4.2.1) implied that including information about vital sign trends results in the emergence of subpopulations
Conclusions
In this paper, we deployed hierarchical clustering and DBSCAN to discover phenotypes in Sepsis-3 patients while including the temporal aspect of vital signs. We used t-SNE to construct a two-dimensional mapping of patients based on patient similarities derived from Euclidean and cosine metrics. In the first step, we demonstrated that including temporal characteristics of vital signs helps in stratifying septic patients, independent of the clustering method and patient similarity metric. In the second
Declaration of competing interest
None declared.
Acknowledgement
This study was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grants (RGPIN-2014-04743, RGPIN-2020-04382) and an Early Researcher Award from the Ontario Ministry of Research, Innovation and Science (ER15-11-188).
References (49)
- et al. Incorporating temporal EHR data in predictive models for risk stratification of renal function deterioration. J. Biomed. Inf. (2015)
- et al. Ensemble fuzzy models in personalized medicine: application to vasopressors administration. Eng. Appl. Artif. Intell. (2016)
- et al. Septic shock prediction for ICU patients via coupled HMM walking on sequential contrast patterns. J. Biomed. Inf. (2017)
- et al. A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. (1987)
- Cluster-wise assessment of cluster stability. Comput. Stat. Data Anal. (2007)
- et al. Serum anion gap predicts lactate poorly, but may be used to identify sepsis patients at risk for death: a cohort study. J. Crit. Care (2018)
- et al. A novel temporal similarity measure for patients based on irregularly measured data in electronic health records
- et al. Localized supervised metric learning on temporal physiological data
- et al. Patient similarity in prediction models based on health data: a scoping review. JMIR Medical Informatics (2017)
- et al. Functional Data Analysis (2005)