Dynamic holding control to avoid bus bunching: A multi-agent deep reinforcement learning framework, Transportation Research Part C: Emerging Technologies



Transportation Research Part C: Emerging Technologies (IF 7.6), Pub Date: 2020-05-25, DOI: 10.1016/j.trc.2020.102661
Jiawei Wang, Lijun Sun

Bus bunching has been a long-standing problem that undermines the efficiency and reliability of public transport services. The most popular countermeasure in practice is to introduce static and dynamic holding control. However, most previous holding control strategies mainly consider local information with a pre-specified headway/schedule, while the global coordination of the whole bus fleet and its long-term effect are often overlooked. To efficiently incorporate global coordination and long-term operation in bus holding, in this paper we propose a multi-agent deep reinforcement learning (MDRL) framework to develop dynamic and flexible holding control strategies for a bus route. Specifically, we model each bus as an agent that interacts with not only its leader/follower but also all other vehicles in the fleet. To better explore potential strategies, we develop an effective headway-based reward function in the proposed framework. In the learning framework, we model fleet coordination by using a basic actor-critic scheme along with a joint action tracker to better characterize the complex interactions among agents in policy learning, and we apply proximal policy optimization to improve learning performance. We conduct extensive numerical experiments to evaluate the proposed MDRL framework against multiple baseline models that only rely on local information. Our results demonstrate the superiority of the proposed framework and show the promise of applying MDRL in the coordinative control of public transport vehicle fleets in real-world operations.
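The abstract names a headway-based reward but does not give its form. As a rough illustration only, the sketch below computes each bus's forward headway on a circular route and scores the fleet with the negative coefficient of variation of those headways. The circular-route layout, the constant `speed`, the function names, and the reward formula itself are all assumptions made for this example; they are not taken from the paper.

```python
from statistics import mean, pstdev


def forward_headways(positions, route_length, speed):
    """Time gap (s) from each bus to the bus ahead of it on a circular route.

    Assumes positions are in metres within [0, route_length) and that all
    buses travel at the same constant speed (m/s). Both are simplifications.
    """
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    headways = [0.0] * len(positions)
    for rank, i in enumerate(order):
        leader = order[(rank + 1) % len(order)]
        if rank == len(order) - 1:
            # The front-most bus follows the rear-most one around the loop.
            gap = positions[leader] + route_length - positions[i]
        else:
            gap = positions[leader] - positions[i]
        headways[i] = gap / speed
    return headways


def headway_reward(headways):
    """Hypothetical reward: 0 for perfectly even headways, negative otherwise."""
    return -pstdev(headways) / mean(headways)


even = forward_headways([0, 2500, 5000, 7500], route_length=10000, speed=5)
bunched = forward_headways([0, 100, 5000, 7500], route_length=10000, speed=5)
print(even, headway_reward(even))
print(bunched, headway_reward(bunched))
```

In this toy setup the evenly spaced fleet scores higher than the fleet with two buses bunched 100 m apart, which is the behaviour a headway-equalising reward is meant to encourage.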


Updated: 2020-05-25
