Penalized Q-learning for dynamic treatment regimens
Statistica Sinica (IF 1.5), Pub Date: 2015-01-01, DOI: 10.5705/ss.2012.364
R. Song, W. Wang, D. Zeng, M. R. Kosorok

A dynamic treatment regimen incorporates both accrued information and long-term effects of treatment from specially designed clinical trials. As these trials become increasingly popular in conjunction with longitudinal data from clinical studies, the development of statistical inference for optimal dynamic treatment regimens is a high priority. In this paper, we propose a new machine learning framework called penalized Q-learning, under which valid statistical inference is established. We also propose a new statistical procedure, individual selection, together with corresponding methods for incorporating it within penalized Q-learning. Extensive numerical studies comparing the proposed methods with existing methods under a variety of scenarios demonstrate that the proposed approach is both inferentially and computationally superior. The methods are illustrated with a depression clinical trial study.
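To make the Q-learning backbone described above concrete, the sketch below implements two-stage Q-learning for a dynamic treatment regimen with a lasso-penalized stage-2 regression standing in for the penalization step. The simulated data, variable names, and use of scikit-learn are illustrative assumptions only; the paper's penalized Q-learning instead places the penalty on the individual-level treatment effects (the basis of individual selection) to obtain valid inference, which this minimal sketch does not reproduce.

```python
# Minimal illustrative sketch of two-stage Q-learning (not the authors' exact
# penalized estimator).  A lasso penalty on the stage-2 interaction terms
# stands in for the individual-selection penalty discussed in the paper.
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(0)
n = 500

# Simulated two-stage trial: H1, H2 are patient histories, A1, A2 in {-1, +1}
# are randomized treatments, and Y is the final outcome (all hypothetical).
H1 = rng.normal(size=(n, 2))
A1 = rng.choice([-1.0, 1.0], size=n)
H2 = np.column_stack([H1, A1 + rng.normal(scale=0.5, size=n)])
A2 = rng.choice([-1.0, 1.0], size=n)
Y = H2 @ np.array([1.0, -0.5, 0.3]) + A2 * (0.8 * H2[:, 2]) + rng.normal(size=n)

# Stage 2: regress Y on main effects and treatment interactions; the lasso
# shrinks small interaction (treatment-effect) coefficients toward zero.
X2 = np.column_stack([H2, A2[:, None] * H2])
stage2 = Lasso(alpha=0.05).fit(X2, Y)

# Optimal stage-2 rule: pick the treatment sign that maximizes the fitted Q2,
# i.e., the sign of each individual's estimated stage-2 treatment effect.
psi2 = stage2.coef_[H2.shape[1]:]
effect2 = H2 @ psi2
opt_A2 = np.where(effect2 >= 0, 1.0, -1.0)

# Pseudo-outcome: predicted outcome under the optimal stage-2 decision.
# The max over A2 embedded here is the source of the non-regularity that
# penalized Q-learning is designed to handle.
X2_opt = np.column_stack([H2, opt_A2[:, None] * H2])
pseudo_Y = stage2.predict(X2_opt)

# Stage 1: regress the pseudo-outcome on stage-1 history and its interaction
# with A1, then read off the estimated stage-1 decision rule.
X1 = np.column_stack([H1, A1[:, None] * H1])
stage1 = LinearRegression().fit(X1, pseudo_Y)

psi1 = stage1.coef_[H1.shape[1]:]
opt_A1 = np.where(H1 @ psi1 >= 0, 1.0, -1.0)
print("Estimated stage-1 rule assigns treatment +1 to",
      int((opt_A1 > 0).sum()), "of", n, "patients")
```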
