Abstract
Previous work on automatically acquiring skills for hierarchical reinforcement learning has highlighted benefits such as mitigating the curse of dimensionality, improving exploration, and speeding up value propagation, but it has paid little attention to evaluating the effect of each individual skill on these factors. In this paper, we show that, depending on the given task, a skill may or may not be useful for learning it. Moreover, related work on automatic skill acquisition focuses on detecting subgoals, i.e., the skill termination condition, and lacks a precise method for extracting the initiation set of a skill. We therefore propose two methods for evaluating skills, together with two further methods for pruning their initiation sets. Experimental results show significant learning improvements on several test domains after evaluating and pruning skills.
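To make the terminology concrete, the sketch below expresses a skill as an option in the options framework of Sutton, Precup, and Singh (1999): a triple of initiation set, internal policy, and termination condition. The `prune_initiation_set` function and the utility scores are purely hypothetical illustrations of what "pruning the initiation set" could look like; they are not the evaluation or pruning methods proposed in the paper.

```python
# Sketch of the options framework: an option o = (I, pi, beta) has an
# initiation set I, an internal policy pi, and a termination condition beta.
# The pruning rule below (drop initiation states whose estimated utility
# falls under a threshold) is a hypothetical illustration only.
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class Option:
    init_set: Set[int]                    # states where the option may start
    policy: Callable[[int], int]          # pi(s): state -> primitive action
    termination: Callable[[int], float]   # beta(s): probability of stopping


def prune_initiation_set(option: Option,
                         utility: Dict[int, float],
                         threshold: float) -> Option:
    """Keep only initiation states whose estimated utility (e.g. how much
    starting the skill there helps the learner) meets the threshold."""
    kept = {s for s in option.init_set if utility.get(s, 0.0) >= threshold}
    return Option(kept, option.policy, option.termination)


# Toy example: a skill that may start in states {0, 1, 2, 3} and
# terminates deterministically in state 3.
skill = Option(init_set={0, 1, 2, 3},
               policy=lambda s: 0,
               termination=lambda s: 1.0 if s == 3 else 0.0)
scores = {0: 0.9, 1: 0.2, 2: 0.7, 3: 0.1}
pruned = prune_initiation_set(skill, scores, threshold=0.5)
print(sorted(pruned.init_set))  # -> [0, 2]
```

A pruned option behaves identically once started; restricting where it can be initiated only shrinks the agent's action set in low-utility states.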
Davoodabadi Farahani, M., Mozayani, N. Evaluating skills in hierarchical reinforcement learning. Int. J. Mach. Learn. & Cyber. 11, 2407–2420 (2020). https://doi.org/10.1007/s13042-020-01141-3