Transatlantic transferability of a new reinforcement learning model for optimizing haemodynamic treatment for critically ill patients with sepsis
Artificial Intelligence in Medicine (IF 7.5) Pub Date: 2020-12-15, DOI: 10.1016/j.artmed.2020.102003
Luca Roggeveen 1, Ali El Hassouni 2, Jonas Ahrendt 3, Tingjie Guo 3, Lucas Fleuren 1, Patrick Thoral 3, Armand RJ Girbes 3, Mark Hoogendoorn 2, Paul WG Elbers 3

Introduction

In recent years, reinforcement learning (RL) has gained traction in the healthcare domain. In particular, RL methods have been explored for haemodynamic optimization of septic patients in the Intensive Care Unit. Most hospitals, however, lack the data and expertise for model development, necessitating the transfer of models developed on external datasets. This approach assumes that models generalize across different patient populations, an assumption whose validity has not previously been tested. In addition, knowledge on model safety and reliability is limited. These challenges need to be addressed to further facilitate the implementation of RL models in clinical practice.

Method

We developed and validated a new reinforcement learning model for haemodynamic optimization in sepsis on the MIMIC intensive care database from the USA, using a dueling double deep Q network. We then transferred this model to the European AmsterdamUMCdb intensive care database. t-Distributed Stochastic Neighbor Embedding and Sequential Organ Failure Assessment scores were used to explore the differences between the two patient populations. We applied off-policy policy evaluation methods to quantify model performance. In addition, we introduced and applied a novel deep policy inspection to analyse how the optimal policy relates to the different phases of sepsis and sepsis treatment, providing interpretable insight with which to assess model safety and reliability.
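The dueling double deep Q network referenced above splits the Q-function into a state-value stream and an action-advantage stream, and forms its bootstrap target by letting the online network select the next action while a frozen target network scores it. The sketch below is a minimal, hypothetical PyTorch rendering of that idea, not the authors' code: the state dimension, hidden size and discretized fluid/vasopressor action space are assumptions made only for illustration.

import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    # Dueling architecture: shared trunk, then separate V(s) and A(s, a) heads.
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)               # state-value stream V(s)
        self.advantage = nn.Linear(hidden, n_actions)   # advantage stream A(s, a)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        h = self.trunk(state)
        v, a = self.value(h), self.advantage(h)
        # Q(s, a) = V(s) + A(s, a) - mean_a A(s, a) keeps the two streams identifiable.
        return v + a - a.mean(dim=1, keepdim=True)

def double_dqn_target(online: DuelingQNet, target: DuelingQNet,
                      reward: torch.Tensor, next_state: torch.Tensor,
                      done: torch.Tensor, gamma: float = 0.99) -> torch.Tensor:
    # Double DQN: the online network selects the next action, the target
    # network evaluates it, which reduces Q-value overestimation.
    with torch.no_grad():
        next_action = online(next_state).argmax(dim=1, keepdim=True)
        next_q = target(next_state).gather(1, next_action).squeeze(1)
        return reward + gamma * (1.0 - done) * next_q

In a sepsis setting, the actions would typically be a small grid of discretized intravenous fluid and vasopressor dose bins, and the state a vector of vital signs, laboratory values and demographics; these are design assumptions here, not details stated in this abstract.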

Results

The off-policy evaluation revealed that the optimal policy outperformed the physician policy on both datasets, despite marked differences between the two patient populations and physicians' policies. Our novel deep policy inspection method yielded insightful results, showing that the model could initiate therapy adequately and adjust therapy intensity to illness severity and disease progression, which indicated safe and reliable model behaviour. Compared with current physician behaviour, the developed policy favours more liberal use of vasopressors combined with more restrained use of fluid therapy, in line with previous work.
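Off-policy evaluation here means estimating the return the learned policy would have achieved from trajectories generated by clinicians. The abstract does not say which estimator was used, so the following is only a generic weighted importance sampling sketch under that assumption; trajectories are hypothetical lists of (behaviour probability, evaluation probability, reward) tuples for each action actually taken.

import numpy as np

def wis_estimate(trajectories, gamma: float = 0.99) -> float:
    # Weighted importance sampling: reweight each observed trajectory by how
    # likely the evaluation policy is to have produced it, then normalize.
    weights, returns = [], []
    for traj in trajectories:
        ratio, ret = 1.0, 0.0
        for t, (pi_b, pi_e, reward) in enumerate(traj):
            ratio *= pi_e / max(pi_b, 1e-8)   # cumulative importance ratio
            ret += (gamma ** t) * reward      # discounted return
        weights.append(ratio)
        returns.append(ret)
    weights, returns = np.asarray(weights), np.asarray(returns)
    return float(np.sum(weights * returns) / max(np.sum(weights), 1e-8))

A higher estimate for the learned policy than for the behaviour (physician) policy, computed on held-out trajectories, is the kind of evidence on which the comparison above rests.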

Conclusion

We created a reinforcement learning model for optimal bedside haemodynamic management and, for the first time, demonstrated model transferability between populations from the USA and Europe. We proposed new methods for deep policy inspection that integrate expert domain knowledge. This is expected to facilitate progression towards bedside clinical decision support for the treatment of critically ill patients.



Updated: 2020-12-16