Abstract
Autonomous on-demand services, such as GOGOX (formerly GoGoVan) in Hong Kong, provide a platform on which users request services and suppliers meet those demands. On such a platform, suppliers are free to accept or reject the demands dispatched to them, which makes online matching between demands and suppliers challenging. Existing methods dispatch demands in rounds. They base each dispatching decision on the predicted response patterns of suppliers to the demands in the current round, but none of them considers the impact of future demands and suppliers on the current decision. This can lead to a dispatching decision that is suboptimal from a future perspective. To solve this problem, we propose a novel demand dispatching model using deep reinforcement learning. In this model, we treat each demand as an agent. The action of each agent, i.e., the dispatching decision for each demand, is determined by a centralized algorithm in a coordinated way. The model works in two steps. (1) It learns the demand's expected value in each spatiotemporal state from historical transition data. (2) Based on the learned values, it performs many-to-many dispatching with a combinatorial optimization algorithm that considers both the immediate rewards and the expected values of demands in the next round. To obtain a higher total reward, demands with a high expected value in the future (short response time) may be delayed to the next round. Conversely, demands with a low expected value in the future (long response time) are dispatched immediately. Through extensive experiments on real-world datasets, we show that the proposed model outperforms existing models in terms of Cancellation Rate and Average Response Time.
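The two steps can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a tabular TD(0) update in place of the deep value network, and a brute-force one-to-one assignment with a delay option in place of the many-to-many combinatorial optimization. The names `learn_values` and `dispatch`, the discount `gamma`, the learning rate `alpha`, and the reward numbers are all hypothetical.

```python
def learn_values(transitions, gamma=0.9, alpha=0.1, epochs=50):
    """Step 1 (assumed tabular TD(0), not the paper's deep network).

    transitions: iterable of (state, reward, next_state) tuples;
    next_state is None when the demand's episode ends.
    """
    values = {}
    for _ in range(epochs):
        for state, reward, next_state in transitions:
            v_next = 0.0 if next_state is None else values.get(next_state, 0.0)
            v = values.get(state, 0.0)
            # TD(0): move V(s) toward r + gamma * V(s')
            values[state] = v + alpha * (reward + gamma * v_next - v)
    return values


def dispatch(immediate, delayed_value, gamma=0.9):
    """Step 2 (assumed brute force, only suitable for tiny inputs).

    immediate[d][s]: reward for dispatching demand d to supplier s now.
    delayed_value[d]: learned value of demand d's state in the next round.
    Returns (best_total, plan); plan[d] is a supplier index,
    or None when demand d is delayed to the next round.
    """
    n_demands = len(immediate)
    n_suppliers = len(immediate[0]) if immediate else 0
    best = (float("-inf"), None)

    def search(d, used, total, plan):
        nonlocal best
        if d == n_demands:
            if total > best[0]:
                best = (total, list(plan))
            return
        # Option A: delay demand d and collect its discounted future value.
        plan.append(None)
        search(d + 1, used, total + gamma * delayed_value[d], plan)
        plan.pop()
        # Option B: dispatch demand d to any supplier that is still free.
        for s in range(n_suppliers):
            if s not in used:
                used.add(s)
                plan.append(s)
                search(d + 1, used, total + immediate[d][s], plan)
                plan.pop()
                used.remove(s)

    search(0, set(), 0.0, [])
    return best


# Demand 0 has a high expected value next round, demand 1 has none.
total, plan = dispatch(
    immediate=[[5.0, 1.0], [4.0, 2.0]],
    delayed_value=[10.0, 0.0],
)
print(total, plan)
```

In this toy instance the best plan delays demand 0, whose future value is high, and dispatches demand 1 immediately, which mirrors the behaviour described in the abstract. A practical system would replace the exhaustive search with a polynomial-time assignment solver.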
Index Terms
- Exploring Deep Reinforcement Learning for Task Dispatching in Autonomous On-Demand Services