Flexible control of Discrete Event Systems using environment simulation and Reinforcement Learning
Applied Soft Computing (IF 8.7), Pub Date: 2021-07-18, DOI: 10.1016/j.asoc.2021.107714
Kallil M.C. Zielinski, Lucas V. Hendges, João B. Florindo, Yuri K. Lopes, Richardson Ribeiro, Marcelo Teixeira, Dalcimar Casanova

Discrete Event Systems (DESs) are classically modeled as Finite State Machines (FSMs) and controlled in a maximally permissive, controllable, and nonblocking way using Supervisory Control Theory (SCT). While SCT is powerful for orchestrating the events of a DES, it fails to process events whose control depends on probabilistic assumptions. In this research, we show that some events can be handled as usual in SCT, while others can be processed using Artificial Intelligence. We present a tool that converts SCT controllers into Reinforcement Learning (RL) simulation environments, where they become suitable for intelligent processing. We then propose an RL-based approach that recognizes the context in which a selected set of stochastic events occurs and treats them accordingly, aiming to find suitable decisions that complement the deterministic outcomes of SCT. The result is an efficient combination of safe and flexible control, which tends to maximize performance for a class of DESs that evolve probabilistically. Two RL algorithms, State–Action–Reward–State–Action (SARSA) and N-step SARSA, are tested on the control of a flexible automotive plant. Results suggest a nine-fold performance improvement when using the proposed combination compared with non-intelligent decisions.
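
To make the idea concrete, below is a minimal, hypothetical sketch (not the authors' tool or plant model) of how a supervisor automaton could be exposed as an RL simulation environment and driven by tabular SARSA. The SupervisorEnv class, its transition table, the reward values, and the 10% disturbance probability are illustrative assumptions; only the on-policy SARSA update, Q(s,a) ← Q(s,a) + α[r + γ·Q(s',a') − Q(s,a)], follows the standard algorithm named in the abstract.

```python
# Illustrative sketch only: a toy supervisor FSM wrapped as an RL environment,
# with tabular SARSA choosing among the controllable events the supervisor enables.
# All model details (states, events, rewards, disturbance) are hypothetical.
import random
from collections import defaultdict

class SupervisorEnv:
    """Toy FSM: states 0..3, controllable events 'a'/'b', reward on reaching a marked state."""
    TRANSITIONS = {
        (0, 'a'): 1, (0, 'b'): 2,
        (1, 'a'): 3, (1, 'b'): 0,
        (2, 'a'): 0, (2, 'b'): 3,
    }
    MARKED = {3}

    def reset(self):
        self.state = 0
        return self.state

    def allowed_events(self, state):
        # Events the supervisor enables in this state (safe, maximally permissive).
        return [e for (s, e) in self.TRANSITIONS if s == state]

    def step(self, event):
        next_state = self.TRANSITIONS[(self.state, event)]
        # Stochastic disturbance: occasionally an uncontrollable event resets the plant.
        if random.random() < 0.1:
            next_state = 0
        reward = 1.0 if next_state in self.MARKED else -0.01
        done = next_state in self.MARKED
        self.state = next_state
        return next_state, reward, done

def sarsa(env, episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1):
    Q = defaultdict(float)

    def choose(state):
        # Epsilon-greedy selection restricted to supervisor-enabled events.
        events = env.allowed_events(state)
        if random.random() < epsilon:
            return random.choice(events)
        return max(events, key=lambda e: Q[(state, e)])

    for _ in range(episodes):
        s = env.reset()
        a = choose(s)
        done = False
        while not done:
            s2, r, done = env.step(a)
            if done:
                target = r
            else:
                a2 = choose(s2)
                # On-policy SARSA: bootstrap from the action actually selected next.
                target = r + gamma * Q[(s2, a2)]
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            if not done:
                s, a = s2, a2
    return Q

if __name__ == "__main__":
    Q = sarsa(SupervisorEnv())
    print({k: round(v, 3) for k, v in Q.items()})
```

An N-step SARSA variant would accumulate N rewards before bootstrapping from the value of the state–action pair reached N steps later; that extension is omitted here for brevity.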




Updated: 2021-07-23