Acquiring reusable skills in intrinsically motivated reinforcement learning

Journal of Intelligent Manufacturing

Abstract

This paper proposes a novel incremental model for acquiring skills and using them in Intrinsically Motivated Reinforcement Learning (IMRL). In this model, the learning process is divided into two phases. In the first phase, the agent explores the environment and acquires task-independent skills using different intrinsic motivation mechanisms. We present two intrinsic motivation factors for acquiring skills: detecting states that can lead to other states (being a cause) and detecting states that help the agent move to a different region of the environment (discounted relative novelty). In the second phase, the agent evaluates the acquired skills to find suitable ones for accomplishing a specific task. Despite the importance of assessing task-independent skills for performing a task, the idea of evaluating skills and pruning them has not been considered in the IMRL literature. In this article, two methods are presented for evaluating previously learned skills based on the value function of the assigned task. Using such a two-phase learning model together with the skill evaluation capability helps the agent acquire task-independent skills that can be transferred to other similar tasks. Experimental results in four domains show that the proposed method significantly increases learning speed.
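To make the two-phase idea concrete, the following is a minimal Python sketch, not the authors' algorithm: phase 1 scores states with two simple intrinsic-motivation proxies (a "cause" score counting distinct successors, and a discounted relative-novelty score along a trajectory), and phase 2 prunes candidate skills by comparing the task value of each skill's subgoal against the value of the states the skill starts from. All function names and the specific pruning rule here are illustrative assumptions.

```python
# Minimal, illustrative sketch of the two-phase idea described above -- NOT the
# authors' implementation. The intrinsic-motivation proxies and the pruning
# rule are simplifying assumptions.
from collections import defaultdict


def causality_score(transitions):
    """Phase 1 proxy for 'being a cause': score each state by how many
    distinct successor states it has led to during exploration."""
    return {s: len(succ) for s, succ in transitions.items()}


def relative_novelty(visit_counts, trajectory, gamma=0.9):
    """Phase 1 proxy for discounted relative novelty: for each interior state
    of a trajectory, compare the discounted novelty of the states that follow
    it with the discounted novelty of the states that preceded it."""
    novelty = [1.0 / (1.0 + visit_counts[s]) for s in trajectory]
    scores = {}
    for i in range(1, len(trajectory) - 1):
        before = sum(gamma ** (i - j) * novelty[j] for j in range(i))
        after = sum(gamma ** (j - i) * novelty[j] for j in range(i + 1, len(trajectory)))
        scores[trajectory[i]] = after / max(before, 1e-6)
    return scores


def evaluate_skills(skills, V):
    """Phase 2: keep a candidate skill only if its subgoal has a higher task
    value than the average value of the states the skill can start from."""
    kept = []
    for subgoal, initiation_set in skills:
        avg_init = sum(V.get(s, 0.0) for s in initiation_set) / max(len(initiation_set), 1)
        if V.get(subgoal, 0.0) > avg_init:
            kept.append((subgoal, initiation_set))
    return kept


if __name__ == "__main__":
    # Toy exploration data: states are letters; 'D' is a doorway-like state.
    transitions = {"A": {"B"}, "B": {"A", "D"}, "D": {"B", "E", "F"}, "E": {"D"}, "F": {"D"}}
    print(causality_score(transitions))  # 'D' gets the highest cause score

    visits = defaultdict(int, {"A": 9, "B": 7, "D": 3, "E": 1, "F": 1})
    print(relative_novelty(visits, ["A", "B", "D", "E", "F"]))

    # A candidate skill = (subgoal, initiation states); V is the task value function.
    skills = [("D", ["A", "B"]), ("B", ["E", "F"])]
    V = {"A": 0.1, "B": 0.2, "D": 0.6, "E": 0.8, "F": 0.7}
    print(evaluate_skills(skills, V))  # only the skill that reaches 'D' survives
```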

Notes

  1. The results of the OGAHC algorithm are plotted for \( \rho = 1 \), which gave the best results among the \( \rho \) values tested in [55].

References

  • Aissani, N., Bekrar, A., Trentesaux, D., & Beldjilali, B. (2012). Dynamic scheduling for multi-site companies: A decisional approach based on reinforcement multi-agent learning. Journal of Intelligent Manufacturing, 23, 2513–2529.

  • Aubret, A., Matignon, L., & Hassas, S. (2019). A survey on intrinsic motivation in reinforcement learning. Preprint arXiv:1908.06976.

  • Barto, A. G., & Mahadevan, S. (2003). Recent advances in hierarchical reinforcement learning. Discrete Event Dynamic Systems, 13(4), 341–379.

  • Barto, A. G., & Simsek, O. (2005). Intrinsic motivation for reinforcement learning systems. In Proceedings of the thirteenth Yale workshop on adaptive and learning systems.

  • Barto, A. G., Singh, S., & Chentanez, N. (2004). Intrinsically motivated learning of hierarchical collections of skills. In Proceedings of the 3rd international conference on development and learning (ICDL 2004), Salk Institute, San Diego.

  • Bellemare, M., Srinivasan, S., Ostrovski, G., Schaul, T., Saxton, D., & Munos, R. (2016). Unifying count-based exploration and intrinsic motivation. Advances in Neural Information Processing Systems (pp. 1471–1479).

  • Berlyne, D. E. (1960). Conflict, arousal, and curiosity. New York: McGraw-Hill.

  • Bonarini, A., Lazaric, A., Restelli, M., & Vitali, P. (2006). Self-development framework for reinforcement learning agents. In Proceedings of the 5th international conference on development and learning ICDL (Vol. 178, pp. 355–362).

  • Brandes, U. (2001). A faster algorithm for betweenness centrality. The Journal of Mathematical Sociology, 25(2), 163–177.

  • Chen, C., Xia, B., Zhou, B., & Lifeng, X. (2015). A reinforcement learning based approach for a multiple-load carrier scheduling problem. Journal of Intelligent Manufacturing, 26, 1233–1245.

  • Davoodabadi, M., & Beigy, H. (2011). A new method for discovering subgoals and constructing options in reinforcement learning. In Proceedings of the 5th Indian international conference on artificial intelligence (IICAI-11) (pp. 441–450).

  • Davoodabadi Farahani, M., & Mozayani, N. (2019). Automatic construction and evaluation of macro-actions in reinforcement learning. Applied Soft Computing, 82, 105574.

  • Davoodabadi Farahani, M., & Mozayani, N. (2020). Evaluating skills in hierarchical reinforcement learning. International Journal of Machine Learning and Cybernetics. https://doi.org/10.1007/s13042-020-01141-3.

  • Dhakan, P., Merrick, K., Rañó, I., & Siddique, N. (2018). Intrinsic rewards for maintenance, approach, avoidance, and achievement goal types. Frontiers in Neurorobotics, 12(October), 1–16.

  • Florensa, C., Held, D., Geng, X., & Abbeel, P. (2018). Automatic goal generation for reinforcement learning agents. In International conference on machine learning (pp. 1514–1523).

  • Forestier, S., & Oudeyer, P. Y. (2016). Overlapping waves in tool use development: A curiosity-driven computational model. In The sixth joint IEEE international conference on developmental learning and epigenetic robotics (pp. 238–245).

  • Groos, K. (1901). The play of man: Chapter 8: The theory of play. D. Appleton.

  • Haber, N., Mrowca, D., Fei-Fei, L., & Yamins, D. (2018). Emergence of structured behaviors from curiosity-based intrinsic motivation. Preprint arXiv:1802.07461.

  • Hester, T., & Stone, P. (2012). Intrinsically motivated model learning for a developing curious agent. In AAMAS adaptive learning agents (ALA) workshop.

  • Hester, T., & Stone, P. (2017). Intrinsically motivated model learning for developing curious robots. Artificial Intelligence, 247, 170–186.

  • Houthooft, R., Chen, X., Duan, Y., Schulman, J., De Turck, F., & Abbeel, P. (2016). Vime: Variational information maximizing exploration. Advances in Neural Information Processing Systems (pp. 1109–1117).

  • Jensen, P., Morini, M., Karsai, M., Venturini, T., Vespignani, A., Jacomy, M., et al. (2015). Detecting global bridges in networks. Journal of Complex Networks, 4(3), 319–329.

  • Jong, N. K., Hester, T., & Stone, P. (2008). The utility of temporal abstraction in reinforcement learning. In Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems-Volume 1 (pp. 299–306).

  • Konidaris, G., Kuindersma, S., Barto, A., & Grupen, R. (2010). Constructing skill trees for reinforcement learning agents from demonstration trajectories. Advances in Neural Information Processing Systems (NIPS).

  • Lee, M.-J., Choi, S., & Chung, C.-W. (2016). Efficient algorithms for updating betweenness centrality in fully dynamic graphs. Information Sciences, 326, 278–296.

  • Li, R. (2019). Reinforcement learning applications. Preprint arXiv:1908.06973.

  • Lin, L. J. (1992). Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine Learning, 8(3), 293–321.

  • Mann, T., & Mannor, S. (2014). Scaling up approximate value iteration with options: Better policies with fewer iterations. In Proceedings of the 31st international conference on machine learning.

  • Mannor, S., Menache, I., Hoze, A., & Klein, U. (2004). Dynamic abstraction in reinforcement learning via clustering. In Proceedings of the twenty-first international conference on Machine learning (p. 71).

  • McGovern, A., & Sutton, R. S. (1998). Macro-actions in reinforcement learning: An empirical analysis. Technical Report 98-70, University of Massachusetts, Department of Computer Science.

  • Merrick, K. E. (2012). Intrinsic motivation and introspection in reinforcement learning. IEEE Transactions on Autonomous Mental Development, 4(4), 315–329.

  • Metzen, J. H. (2013). Learning graph-based representations for continuous reinforcement learning domains. In Lecture Notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics) (Vol. 8188 LNAI, No. PART 1, pp. 81–96).

  • Metzen, J. H. (2014). Learning the structure of continuous Markov decision processes. PhD thesis, Universität Bremen.

  • Metzen, J. H., & Kirchner, F. (2013). Incremental learning of skill collections based on intrinsic motivation. Frontiers in Neurorobotics, 7(July), 1–12.

  • Mirolli, M., & Baldassarre, G. (Eds.). (2013a). Intrinsically motivated learning in natural and artificial systems. Heidelberg: Springer.

  • Mirolli, M., & Baldassarre, G. (2013b). Functions and mechanisms of intrinsic motivations. In G. Baldassarre & M. Mirolli (Eds.), Intrinsically motivated learning in natural and artificial systems (pp. 49–72). Berlin: Springer.

  • Moerman, W. (2009). Hierarchical reinforcement learning: Assignment of behaviours to subpolicies by self-organization. PhD thesis, Utrecht University.

  • Mohamed, S., & Rezende, D. J. (2015). Variational information maximisation for intrinsically motivated reinforcement learning. Advances in Neural Information Processing Systems (pp. 2125–2133).

  • Murata, J. (2008). Controlled use of subgoals in reinforcement learning. In Robotics, automation and control (pp. 167–182).

  • Newman, M. E. J., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69(2), 026113.

  • Oudeyer, P.-Y., & Kaplan, F. (2007). What is intrinsic motivation? A typology of computational approaches. Frontiers in Neurorobotics, 1, 6.

  • Oudeyer, P. Y., Kaplan, F., & Hafner, V. V. (2007). Intrinsic motivation systems for autonomous mental development. IEEE Transactions on Evolutionary Computation, 11(2), 265–286.

  • Pathak, D., Agrawal, P., Efros, A. A., & Darrell, T. (2017). Curiosity-driven exploration by self-supervised prediction. In IEEE computer society conference on computer vision and pattern recognition workshops (pp. 16–17).

  • Piaget, J. (1962). Play, dreams and imitation (Vol. 24). New York: Norton.

  • Santucci, V., Baldassarre, G., & Mirolli, M. (2016). GRAIL: A goal-discovering robotic architecture for intrinsically-motivated learning. IEEE Transactions on Cognitive and Developmental Systems, 8(3), 214–231.

  • Schembri, M., Mirolli, M., & Baldassarre, G. (2007). Evolving internal reinforcers for an intrinsically motivated reinforcement-learning robot. In 2007 IEEE 6th International conference on development and learning, ICDL (pp. 282–287).

  • Siddique, N., Dhakan, P., Rano, I., & Merrick, K. (2017). A review of the relationship between novelty, intrinsic motivation and reinforcement learning. Journal of Behavioral Robotics, 8(1), 58–69.

  • Simşek, O. (2008). Behavioral building blocks for autonomous agents: Description, identification, and learning. PhD Thesis, University of Massachusetts Amherst.

  • Stout, A., & Barto, A. G. (2010). Competence progress intrinsic motivation. In Proceedings of the ninth IEEE international conference on development and learning.

  • Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. IEEE Transactions on Neural Networks, 9(5), 1054.

  • Sutton, R. S., Precup, D., & Singh, S. (1998). Intra-option learning about temporally abstract actions. In Proceedings of the fifteenth international conference on machine learning (pp. 556–564).

  • Sutton, R. S., Precup, D., & Singh, S. (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1), 181–211.

  • Thrun, S. (1995). Exploration in active learning. In Handbook of brain science and neural networks.

Author information

Corresponding author

Correspondence to Nasser Mozayani.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Davoodabadi Farahani, M., Mozayani, N. Acquiring reusable skills in intrinsically motivated reinforcement learning. J Intell Manuf 32, 2147–2168 (2021). https://doi.org/10.1007/s10845-020-01629-3
