Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising, arXiv - CS - Systems and Control

arXiv - CS - Systems and Control. Pub Date: 2020-06-29, DOI: arxiv-2006.16312
Xiaotian Hao, Zhaoqing Peng, Yi Ma, Guan Wang, Junqi Jin, Jianye Hao, Shan Chen, Rongquan Bai, Mingzhou Xie, Miao Xu, Zhenzhe Zheng, Chuan Yu, Han Li, Jian Xu, Kun Gai

In e-commerce, advertising is essential for merchants to reach their target users. The typical objective is to maximize the advertiser's cumulative revenue over a period of time under a budget constraint. In real applications, an advertisement (ad) usually needs to be exposed to the same user multiple times before the user finally contributes revenue (e.g., places an order). However, existing advertising systems mainly focus on the immediate revenue of single ad exposures and ignore the contribution of each exposure to the final conversion, so they usually fall into suboptimal solutions. In this paper, we formulate sequential advertising strategy optimization as a dynamic knapsack problem. We propose a theoretically guaranteed bilevel optimization framework, which significantly reduces the solution space of the original optimization problem while ensuring solution quality. To improve the exploration efficiency of reinforcement learning, we also devise an effective action space reduction approach. Extensive offline and online experiments show the superior performance of our approaches over state-of-the-art baselines in terms of cumulative revenue.
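To make the knapsack framing in the abstract concrete, the sketch below treats each user's whole ad sequence as one knapsack item with an estimated long-term revenue and an estimated total cost, and fills a fixed budget greedily by revenue-to-cost ratio. This is a generic textbook heuristic shown for illustration only; it is not the paper's bilevel framework and includes no reinforcement learning. The names `UserSequence` and `select_users` and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserSequence:
    """Hypothetical knapsack item: one user's full ad-exposure sequence."""
    user_id: str
    expected_revenue: float  # assumed estimate of long-term revenue
    expected_cost: float     # assumed estimate of total budget consumed


def select_users(candidates, budget):
    """Greedy knapsack heuristic: take items in descending
    revenue/cost order, skipping any item that no longer fits."""
    ranked = sorted(
        (c for c in candidates if c.expected_cost > 0),
        key=lambda c: c.expected_revenue / c.expected_cost,
        reverse=True,
    )
    chosen, spent = [], 0.0
    for c in ranked:
        if spent + c.expected_cost <= budget:
            chosen.append(c)
            spent += c.expected_cost
    return chosen, spent


# Illustrative data only (not from the paper).
candidates = [
    UserSequence("u1", expected_revenue=10.0, expected_cost=4.0),
    UserSequence("u2", expected_revenue=6.0, expected_cost=3.0),
    UserSequence("u3", expected_revenue=9.0, expected_cost=6.0),
    UserSequence("u4", expected_revenue=1.0, expected_cost=2.0),
]
chosen, spent = select_users(candidates, budget=9.0)
print([c.user_id for c in chosen], spent)
```

The greedy rule is not guaranteed to be optimal for a 0/1 knapsack, and in a sequential setting the revenue and cost of each item depend on the exposure policy itself, which is why the abstract describes the problem as dynamic.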


Updated: 2020-07-01
