Abstract
Peg-in-hole assembly with narrow clearance is a typical contact-rich robotic task in industrial manufacturing. Robot learning allows robots to acquire assembly skills for this task directly, without modeling and recognizing the complex contact states. However, learning such skills remains challenging for robots because collecting large amounts of transition data and transferring skills to new tasks are both difficult, which inevitably leads to low training efficiency. This paper formulates the assembly task as a Markov decision process and proposes a model-accelerated reinforcement learning method to learn the assembly policy efficiently. In this method, the assembly policy is learned within the maximum entropy reinforcement learning framework and executed with an impedance controller, which ensures efficient exploration while allowing skills to be transferred between tasks. To reduce sample complexity and improve training efficiency, the proposed method learns the environment dynamics with a Gaussian process while training the policy; the learned dynamics model is then used to improve target value estimation and to generate virtual data that augment the transition samples. The method robustly learns assembly skills while minimizing real-world interaction, which makes it suitable for realistic assembly scenarios. To verify the proposed method, experiments on an industrial robot are conducted, and the results demonstrate that the proposed method improves training efficiency by 31% compared with the same method without model acceleration, and that the learned skill can be transferred to new tasks to accelerate the training of new policies.
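The core acceleration idea described in the abstract — fitting a Gaussian-process dynamics model alongside policy training and using it to generate virtual transitions that augment the replay buffer — can be illustrated with a minimal sketch. This is not the paper's implementation: the toy one-dimensional dynamics, the `GPDynamics` class, and the `augment_replay_buffer` helper are all illustrative assumptions, and the GP here is a bare RBF-kernel regressor rather than a tuned model.

```python
import numpy as np

class GPDynamics:
    """Minimal Gaussian-process regression of s' = f(s, a) with an RBF kernel."""

    def __init__(self, length_scale=1.0, noise=1e-4):
        self.length_scale = length_scale
        self.noise = noise

    def _kernel(self, A, B):
        # Squared-exponential kernel between two sets of (state, action) vectors.
        d = A[:, None, :] - B[None, :, :]
        return np.exp(-0.5 * np.sum(d ** 2, axis=-1) / self.length_scale ** 2)

    def fit(self, X, y):
        # Precompute K^{-1} y for the posterior mean.
        self.X = X
        K = self._kernel(X, X) + self.noise * np.eye(len(X))
        self.alpha = np.linalg.solve(K, y)

    def predict(self, Xq):
        # Posterior mean prediction of the next state.
        return self._kernel(Xq, self.X) @ self.alpha

def augment_replay_buffer(buffer, model, n_virtual, action_sampler, rng):
    """Dyna-style augmentation: branch the learned model from real states."""
    states = np.array([t[0] for t in buffer])
    idx = rng.integers(0, len(states), size=n_virtual)
    for s in states[idx]:
        a = action_sampler(rng)
        sa = np.concatenate([s, a])[None, :]
        s_next = model.predict(sa)[0]
        buffer.append((s, a, s_next))  # virtual transition from the model
    return buffer

# Toy usage: true dynamics s' = s + 0.1 * a, learned from 50 real transitions.
rng = np.random.default_rng(0)
S = rng.uniform(-1, 1, (50, 1))
A = rng.uniform(-1, 1, (50, 1))
Sn = S + 0.1 * A
model = GPDynamics()
model.fit(np.hstack([S, A]), Sn)
buffer = [(S[i], A[i], Sn[i]) for i in range(50)]
buffer = augment_replay_buffer(buffer, model, 20,
                               lambda r: r.uniform(-1, 1, 1), rng)
```

Augmenting the buffer this way lets each real interaction support several gradient updates, which is the mechanism behind the reported reduction in real-world samples; the paper additionally uses the model for target value estimation, which this sketch omits.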
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant Nos. 91748114, 51535004 and the Innovation Group Program of Hubei Province, China under Grant No. 2017CFA003.
Cite this article
Zhao, X., Zhao, H., Chen, P. et al. Model accelerated reinforcement learning for high precision robotic assembly. Int J Intell Robot Appl 4, 202–216 (2020). https://doi.org/10.1007/s41315-020-00138-z