Multi-Objective Vehicle Rebalancing for Ridehailing System using a Reinforcement Learning Approach


arXiv - CS - Systems and Control. Pub Date: 2020-07-14, DOI: arxiv-2007.06801
Yuntian Deng, Hao Chen, Shiping Shao, Jiacheng Tang, Jianzong Pi, Abhishek Gupta

We consider the problem of designing a rebalancing algorithm for a large-scale ridehailing system with asymmetric demand. We pose the rebalancing problem within a semi-Markov decision problem (SMDP) framework, with closed queues of vehicles serving stationary but asymmetric demand over a large city with multiple nodes (representing neighborhoods). We assume that passengers queue up at every node until they are matched with a vehicle. The goal of the SMDP is to minimize a convex combination of the passengers' waiting time and the total empty vehicle miles traveled. The resulting SMDP appears difficult to solve for a closed-form expression of the rebalancing strategy. We therefore use a deep reinforcement learning algorithm to determine an approximately optimal solution to the SMDP. The trained policy is compared with other well-known rebalancing algorithms, which are designed to address other objectives (such as minimizing the demand drop probability) for the ridehailing problem.
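
To make the objective concrete, the convex combination of the two costs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract gives no notation, so the function name `rebalancing_cost`, the weight `alpha`, and the arguments `waiting_time` and `empty_miles` are hypothetical, and the two quantities are assumed to be already expressed on comparable scales.

```python
def rebalancing_cost(waiting_time: float, empty_miles: float, alpha: float) -> float:
    """Convex combination of two costs (illustrative names, not from the paper).

    alpha = 1 penalizes only passenger waiting time;
    alpha = 0 penalizes only empty vehicle miles traveled.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1] for a convex combination")
    return alpha * waiting_time + (1.0 - alpha) * empty_miles


# Equal weighting of 10 units of waiting and 4 units of empty travel.
cost = rebalancing_cost(10.0, 4.0, alpha=0.5)
print(cost)  # 7.0
```

Because the SMDP minimizes this quantity, a reinforcement learning agent would typically receive its negative as the reward; whether the paper sets up the reward this way is not stated in the abstract.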


Updated: 2020-07-15
