
Soft Actor-Critic for Navigation of Mobile Robots

  • Regular Paper
  • Published: 2021
  • Journal of Intelligent & Robotic Systems

Abstract

This paper presents a study of two deep reinforcement learning techniques for the navigation of mobile robots: the Soft Actor-Critic (SAC) algorithm, which is compared with the Deep Deterministic Policy Gradient (DDPG) algorithm under the same conditions. To make the robot reach a target in an environment, both networks receive as inputs 10 laser range findings, the previous linear and angular velocities, and the relative position and angle of the mobile robot with respect to the target. As outputs, the networks produce the linear and angular velocities of the mobile robot. The reward function was designed to give a positive reward only when the agent reaches the target and a negative reward when it collides with any object. The proposed architecture was applied successfully in two simulated environments, and a comparison between the two techniques was made based on the obtained results, demonstrating that the SAC algorithm achieves superior performance for the navigation of mobile robots compared with DDPG (code available at https://github.com/dranaju/project).

Materials Availability

Available on GitHub: https://github.com/dranaju/project.

Acknowledgements

We would like to thank Fabio Ugalde Pereira for sharing the idea and the environments with symmetric and asymmetric map formats, and all participants of VersusAI for exchanging ideas and thoughts in the areas of Robotics and Artificial Intelligence.

Author information

Authors and Affiliations

Authors

Contributions

- Junior Costa de Jesus conceived the research, wrote the article, designed and programmed the experiments, and collected and processed the test data.

- Ricardo Bedin Grando wrote the article and collected and processed the test data.

- Victor Augusto Kich wrote the article, programmed the experiments, and collected and processed the test data.

- Alisson Henrique Kolling wrote the article, programmed the experiments, and collected and processed the test data.

- Marco Antonio de Souza Leite Cuadros contributed to the discussion and conception of the main ideas of the article and provided valuable comments.

- Daniel Fernando Tello Gamarra conceived the research, wrote the article, and contributed to the discussion of its main ideas.

Corresponding author

Correspondence to Junior Costa de Jesus.

Ethics declarations

Ethical Approval

The article has the approval of all the authors.

Consent to Participate

All the authors gave their consent to participate in this article.

Consent for Publication

The authors gave their consent for the publication of this article.

Competing interests

The authors declare no conflicts of interest or competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

de Jesus, J., Kich, V.A., Kolling, A.H. et al. Soft Actor-Critic for Navigation of Mobile Robots. J Intell Robot Syst 102, 31 (2021). https://doi.org/10.1007/s10846-021-01367-5

Keywords

Navigation