A regret lower bound for assortment optimization under the capacitated MNL model with arbitrary revenue parameters
Probability in the Engineering and Informational Sciences (IF 1.1), Pub Date: 2021-09-01, DOI: 10.1017/s0269964821000395
Yannik Peeters, Arnoud V. den Boer

In this note, we consider dynamic assortment optimization with incomplete information under the capacitated multinomial logit (MNL) choice model. Recently, it has been shown that the regret (the cumulative expected revenue loss caused by offering suboptimal assortments) that any decision policy endures is bounded from below by a constant times $\sqrt{NT}$, where $N$ denotes the number of products and $T$ denotes the time horizon. This result is shown under the assumption that the product revenues are constant, and thus leaves open the question of whether a lower regret rate can be achieved for nonconstant revenue parameters. We show that this is not the case: for any vector of product revenues, there is a positive constant such that the regret of any policy is bounded from below by this constant times $\sqrt{NT}$. Our result implies that policies that achieve ${\mathcal{O}}(\sqrt{NT})$ regret are asymptotically optimal for all product revenue parameters.
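To make the quantities in the abstract concrete, here is a minimal Python sketch of expected revenue and regret under a capacitated MNL model. It is not taken from the paper: it assumes the standard MNL parameterization with the no-purchase weight normalized to 1, and the names `v` (preference weights), `w` (product revenues), `K` (capacity), and all three function names are hypothetical, chosen only for illustration. The optimal assortment is found by brute force, which is practical only for small $N$.

```python
from itertools import combinations


def expected_revenue(assortment, v, w):
    """Expected revenue of offering `assortment` once under the MNL model.

    Assumes product i is bought with probability
    v[i] / (1 + sum of v[j] over offered j); the 1 is the no-purchase weight.
    """
    denom = 1.0 + sum(v[i] for i in assortment)
    return sum(w[i] * v[i] for i in assortment) / denom


def best_assortment(v, w, K):
    """Brute-force search over all nonempty assortments of at most K products."""
    n = len(v)
    candidates = (S for k in range(1, K + 1) for S in combinations(range(n), k))
    return max(candidates, key=lambda S: expected_revenue(S, v, w))


def regret(offered, v, w, K):
    """Cumulative expected revenue loss of the offered sequence of assortments."""
    best = expected_revenue(best_assortment(v, w, K), v, w)
    return sum(best - expected_revenue(S, v, w) for S in offered)


# Toy instance with nonconstant revenues (all numbers are illustrative).
v = [1.0, 2.0, 1.0]
w = [1.0, 0.8, 0.2]
K = 2
print(best_assortment(v, w, K))
print(regret([(0,), (0, 1)], v, w, K))
```

In this sketch the policy knows `v`; in the setting of the note, the preference weights are unknown and must be learned from purchase observations, which is what forces regret of order $\sqrt{NT}$.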




Updated: 2021-09-01