Valid post-selection inference in Robust Q-learning

arXiv - STAT - Methodology, Pub Date: 2022-08-05, DOI: arxiv-2208.03233
Jeremiah Jones, Ashkan Ertefaie, Robert L. Strawderman

Constructing an optimal adaptive treatment strategy becomes complex when there are a large number of potential tailoring variables. In such scenarios, many of these extraneous variables may contribute little or no benefit to an adaptive strategy while increasing implementation costs and putting an undue burden on patients. Although existing methods allow selection of the informative prognostic factors, statistical inference is complicated by the data-driven selection process. To remedy this deficiency, we adapt the Universal Post-Selection Inference procedure to the semiparametric Robust Q-learning method and the unique challenges encountered in such multistage decision methods. In the process, we also identify a uniform improvement to confidence intervals constructed in this post-selection inference framework. Under certain rate assumptions, we provide theoretical results that demonstrate the validity of confidence regions and tests constructed from our proposed procedure. The performance of our method is compared to the Selective Inference framework through simulation studies, demonstrating the strengths of our procedure and its applicability to multiple selection mechanisms.

Updated: 2022-08-08
