research-article

Exploring Deep Reinforcement Learning for Task Dispatching in Autonomous On-Demand Services

Published: 21 April 2021

Abstract

Autonomous on-demand services, such as GOGOX (formerly GoGoVan) in Hong Kong, provide a platform for users to request services and for suppliers to meet those demands. On such a platform, suppliers are free to accept or reject the demands dispatched to them, which makes online matching between demands and suppliers challenging. Existing methods dispatch demands in rounds. Their dispatching decisions are based on the predicted response patterns of suppliers to the demands in the current round, but they do not consider the impact of future demands and suppliers on the current decision. This can lead to dispatching decisions that are suboptimal in the long run. To solve this problem, we propose a novel demand dispatching model using deep reinforcement learning. In this model, we treat each demand as an agent. The action of each agent, i.e., the dispatching decision for each demand, is determined by a centralized algorithm in a coordinated way. The model works in two steps. (1) It learns each demand’s expected value in each spatiotemporal state from historical transition data. (2) Based on the learned values, it performs Many-To-Many dispatching with a combinatorial optimization algorithm that considers both the immediate rewards and the expected values of demands in the next round. To obtain a higher total reward, demands with a high expected future value (short response time) may be delayed to the next round. Conversely, demands with a low expected future value (long response time) are dispatched immediately. Through extensive experiments on real-world datasets, we show that the proposed model outperforms existing models in terms of Cancellation Rate and Average Response Time.
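The two steps in the abstract can be illustrated with a deliberately small sketch. It is not the model from the article: it assumes a tabular TD(0) update in place of a deep network, and a brute-force one-to-one assignment in place of the Many-To-Many combinatorial optimization. The function names `learn_values` and `dispatch_round`, the `(zone, time_slot)` state encoding, and all numbers are hypothetical and chosen only for illustration.

```python
from itertools import permutations

def learn_values(transitions, gamma=0.9, alpha=0.1, epochs=200):
    """Step 1 (assumed tabular TD(0), not the article's network):
    V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))."""
    values = {}
    for _ in range(epochs):
        for state, reward, next_state in transitions:
            # A next_state of None marks a terminal transition.
            v_next = 0.0 if next_state is None else values.get(next_state, 0.0)
            v = values.get(state, 0.0)
            values[state] = v + alpha * (reward + gamma * v_next - v)
    return values

def dispatch_round(immediate, delay_value):
    """Step 2 (assumed one-to-one, brute force; only viable for tiny inputs).
    immediate[d][s]: reward for dispatching demand d to supplier s now.
    delay_value[d]: discounted expected value of holding d until next round.
    Returns (best_total, plan), where plan[d] is a supplier index or None."""
    n_demands = len(immediate)
    n_suppliers = len(immediate[0]) if immediate else 0
    # One "delay" slot (None) per demand, so any demand may be held back.
    slots = list(range(n_suppliers)) + [None] * n_demands
    best_total, best_plan = float("-inf"), None
    for perm in permutations(slots, n_demands):
        total = sum(
            delay_value[d] if s is None else immediate[d][s]
            for d, s in enumerate(perm)
        )
        if total > best_total:
            best_total, best_plan = total, list(perm)
    return best_total, best_plan

# Hypothetical history: (state, reward, next_state) with state = (zone, slot).
# Demands in zone "A" tend to be answered well at slot 1; zone "B" poorly.
transitions = [
    (("A", 0), 0.0, ("A", 1)),
    (("A", 1), 1.0, None),
    (("B", 0), 0.0, ("B", 1)),
    (("B", 1), 0.2, None),
]
gamma = 0.9
values = learn_values(transitions, gamma=gamma)

# Two demands at slot 0 (zones A and B) compete for a single supplier.
immediate = [[0.5], [0.5]]
delay_value = [gamma * values[("A", 1)], gamma * values[("B", 1)]]
total, plan = dispatch_round(immediate, delay_value)
```

In this toy setting the zone A demand has the higher expected future value, so the sketch holds it for the next round and gives the only supplier to the zone B demand, which mirrors the delay behaviour described in the abstract. A realistic system would need a polynomial-time assignment solver rather than enumeration.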


      • Published in

        ACM Transactions on Knowledge Discovery from Data  Volume 15, Issue 3
        June 2021
        533 pages
        ISSN:1556-4681
        EISSN:1556-472X
        DOI:10.1145/3454120

        Copyright © 2021 Association for Computing Machinery.

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 21 April 2021
        • Accepted: 1 December 2020
        • Revised: 1 November 2020
        • Received: 1 March 2020


        Qualifiers

        • research-article
        • Refereed