Reinforcement learning enabled dynamic bidding strategy for instant delivery trading, Computers & Industrial Engineering


Computers & Industrial Engineering (IF 6.7) Pub Date: 2021-08-02, DOI: 10.1016/j.cie.2021.107596
Chaojie Guo, Russell G. Thompson, Greg Foliente, Xiaoshuai Peng

Auctions have been identified as a potentially effective way to improve the efficiency of instant delivery, because they can enable collaboration and improve consolidation. Instant delivery markets are complex, dynamic systems driven by highly random demand. Conventional bidding strategies require perfect market information and cannot adapt effectively as requests evolve. To address this problem, this paper proposes an auction-based trading platform for freight transportation procurement and develops a Reinforcement Learning (RL) enabled dynamic bidding strategy to optimize a carrier's behavior in sequential auctions. The strategy uses three RL algorithms (Q-learning, Deep Q Network, and experience-replay-based Q-learning) to improve the carrier's bidding ability. Simulation results show that, compared with the conventional bidding strategy, the RL enabled dynamic bidding strategy with any of the three algorithms helps the carrier win more auctions and earn more profit in a competitive marketplace. In addition, its advantages are more pronounced and its performance is more stable in more uncertain market environments.
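
The abstract does not state how the paper defines states, actions or rewards, so the following is only a toy sketch of how tabular Q-learning could choose bids across a sequence of auctions. Every specific here is an assumption made for illustration: the state is the carrier's remaining capacity, the action is a markup over a unit delivery cost, the lowest competing bid is drawn uniformly from [1.0, 1.5], and the names `train_bidder` and `greedy_markup` are hypothetical.

```python
import random


def train_bidder(markups=(0.0, 0.1, 0.2, 0.3, 0.4), capacity=3,
                 episodes=2000, auctions_per_episode=10,
                 alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning over a toy sequence of lowest-bid-wins auctions."""
    rng = random.Random(seed)
    actions = range(len(markups))
    # Q-table keyed by (remaining capacity, markup index); assumed state design.
    q = {(cap, a): 0.0 for cap in range(1, capacity + 1) for a in actions}

    for _ in range(episodes):
        cap = capacity
        for step in range(auctions_per_episode):
            # Epsilon-greedy choice of markup.
            if rng.random() < epsilon:
                a = rng.randrange(len(markups))
            else:
                a = max(actions, key=lambda i: q[(cap, i)])

            cost = 1.0                          # assumed unit delivery cost
            bid = cost * (1.0 + markups[a])
            rival_bid = rng.uniform(1.0, 1.5)   # assumed lowest competing bid
            won = bid < rival_bid               # lowest bid wins the request
            reward = bid - cost if won else 0.0
            next_cap = cap - 1 if won else cap

            # Episode ends when capacity is used up or auctions run out.
            done = next_cap == 0 or step == auctions_per_episode - 1
            best_next = 0.0 if done else max(q[(next_cap, i)] for i in actions)

            # Standard Q-learning update.
            q[(cap, a)] += alpha * (reward + gamma * best_next - q[(cap, a)])
            if done:
                break
            cap = next_cap
    return q


def greedy_markup(q, cap, markups):
    """Return the markup with the highest learned value at this capacity."""
    best = max(range(len(markups)), key=lambda i: q[(cap, i)])
    return markups[best]
```

Calling `train_bidder()` and then `greedy_markup(q, 3, markups)` returns the markup the learned table prefers at full capacity. A Deep Q Network or an experience-replay variant, as named in the abstract, would replace the table or the update schedule; neither is shown here.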


Updated: 2021-08-10
