Designing an adaptive production control system using reinforcement learning
Journal of Intelligent Manufacturing (IF 8.3), Pub Date: 2020-07-14, DOI: 10.1007/s10845-020-01612-y
Andreas Kuhnle, Jan-Philipp Kaiser, Felix Theiß, Nicole Stricker, Gisela Lanza

Modern production systems face enormous challenges due to rising customer requirements resulting in complex production systems. The operational efficiency in the competitive industry is ensured by an adequate production control system that manages all operations in order to optimize key performance indicators. Currently, control systems are mostly based on static and model-based heuristics, requiring significant human domain knowledge and, hence, do not match the dynamic environment of manufacturing companies. Data-driven reinforcement learning (RL) showed compelling results in applications such as board and computer games as well as first production applications. This paper addresses the design of RL to create an adaptive production control system by the real-world example of order dispatching in a complex job shop. As RL algorithms are “black box” approaches, they inherently prohibit a comprehensive understanding. Furthermore, the experience with advanced RL algorithms is still limited to single successful applications, which limits the transferability of results. In this paper, we examine the performance of the state, action, and reward function RL design. When analyzing the results, we identify robust RL designs. This makes RL an advantageous control system for highly dynamic and complex production systems, mainly when domain knowledge is limited.




Updated: 2020-07-15