Functional Sequential Treatment Allocation
Journal of the American Statistical Association (IF 3.0), Pub Date: 2021-01-19, DOI: 10.1080/01621459.2020.1851236
Anders Bredahl Kock 1, 2 , David Preinerstorfer 3 , Bezirgen Veliyev 2

Abstract

Consider a setting in which a policy maker assigns subjects to treatments, observing each outcome before the next subject arrives. Initially, it is unknown which treatment is best, but the sequential nature of the problem permits learning about the effectiveness of the treatments. While the multi-armed bandit literature has shed much light on the situation when the policy maker compares the effectiveness of the treatments through their mean, much less is known about other targets. This is restrictive, because a cautious decision maker may prefer to target a robust location measure such as a quantile or a trimmed mean. Furthermore, socio-economic decision making often requires targeting purpose-specific characteristics of the outcome distribution, such as its inherent degree of inequality, welfare, or poverty. In the present article, we introduce and study sequential learning algorithms when the distributional characteristic of interest is a general functional of the outcome distribution. Minimax expected regret optimality results are obtained within the subclass of explore-then-commit policies, and for the unrestricted class of all policies. Supplementary materials for this article are available online.
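
To make the idea of an explore-then-commit policy with a functional target concrete, the following is a minimal illustrative sketch in Python. It is not the algorithm from the article: the function names (`empirical_quantile`, `explore_then_commit`), the round-robin exploration scheme, the fixed exploration length, and the simulated Gaussian treatments are all assumptions made for illustration only. The sketch assigns the first subjects to treatments in turn, evaluates a user-supplied functional (for example a quantile instead of the mean) on each treatment's observed outcomes, and then commits to the treatment with the highest estimated value.

```python
import random
import statistics


def empirical_quantile(samples, q):
    """Return a simple empirical q-quantile (an order statistic) of samples."""
    ordered = sorted(samples)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


def explore_then_commit(arms, horizon, explore_per_arm, functional):
    """Assign `horizon` subjects with a basic explore-then-commit rule.

    arms: list of zero-argument callables; each call returns one outcome.
    explore_per_arm: number of exploration subjects per treatment (assumed fixed).
    functional: maps a list of outcomes to a number; higher is treated as better.
    Returns (index of the committed treatment, list of all assignments).
    """
    outcomes = [[] for _ in arms]
    assignments = []

    # Exploration phase: round-robin over the treatments.
    explore_total = min(horizon, explore_per_arm * len(arms))
    for t in range(explore_total):
        k = t % len(arms)
        outcomes[k].append(arms[k]())
        assignments.append(k)

    # Commitment phase: pick the treatment whose estimated functional is largest.
    scores = [functional(obs) if obs else float("-inf") for obs in outcomes]
    best = max(range(len(arms)), key=lambda k: scores[k])
    assignments.extend([best] * (horizon - len(assignments)))
    return best, assignments


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical treatments: the first has the higher mean but is much riskier.
    arms = [lambda: random.gauss(1.0, 3.0), lambda: random.gauss(0.8, 0.5)]
    by_mean, _ = explore_then_commit(arms, 1000, 200, statistics.mean)
    by_quantile, _ = explore_then_commit(
        arms, 1000, 200, lambda s: empirical_quantile(s, 0.25)
    )
    print("committed treatment (mean target):", by_mean)
    print("committed treatment (0.25-quantile target):", by_quantile)
```

Passing the functional as an argument reflects the abstract's point that the target need not be the mean: swapping in a quantile, a trimmed mean, or an inequality measure changes which treatment counts as best without changing the allocation logic.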




Updated: 2021-01-19