Symmetry reduction for deep reinforcement learning active control of chaotic spatiotemporal dynamics
Physical Review E (IF 2.4), Pub Date: 2021-07-16, DOI: 10.1103/physreve.104.014210
Kevin Zeng, Michael D. Graham

Deep reinforcement learning (RL) is a data-driven, model-free method capable of discovering complex control strategies for macroscopic objectives in high-dimensional systems, making its application toward flow control promising. Many systems of flow control interest possess symmetries that, when neglected, can significantly inhibit the learning and performance of a naive deep RL approach. Using a test-bed consisting of the Kuramoto-Sivashinsky equation (KSE), equally spaced actuators, and a goal of minimizing dissipation and power cost, we demonstrate that by moving the deep RL problem to a symmetry-reduced space, we can alleviate limitations inherent in the naive application of deep RL. We demonstrate that symmetry-reduced deep RL yields improved data efficiency as well as improved control policy efficacy compared to policies found by naive deep RL. Interestingly, the policy learned by the symmetry-aware control agent drives the system toward an equilibrium state of the forced KSE that is connected by continuation to an equilibrium of the unforced KSE, despite having been given no explicit information regarding its existence. That is, to achieve its goal, the RL algorithm discovers and stabilizes an equilibrium state of the system. Finally, we demonstrate that the symmetry-reduced control policy is robust to observation and actuation signal noise, as well as to system parameters it has not observed before.
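The central mechanism of the symmetry-reduced formulation is that observations are mapped into a canonical (symmetry-reduced) frame before they reach the policy, and the policy's per-actuator actions are mapped back to the physical frame before being applied to the system. The sketch below illustrates that pattern for the discrete translation symmetry set by equally spaced actuators on a periodic domain; it is a minimal illustration rather than the authors' implementation, and the grid size, actuator count, and the max-based choice of representative state are assumptions made here for concreteness.

```python
import numpy as np

N_GRID = 64          # grid points on the periodic domain (assumed value)
N_ACT = 8            # number of equally spaced actuators (assumed value)
CELL = N_GRID // N_ACT   # grid points between adjacent actuators

def to_reduced_frame(u):
    """Shift the state u by a multiple of the actuator spacing so that the
    actuator sitting on the largest local value of u becomes actuator 0.
    Any two states related by a discrete translation that maps the actuator
    array onto itself are sent to the same representative, so the policy
    does not have to relearn the same behavior at each shifted position."""
    samples = u[::CELL][:N_ACT]          # u sampled at the actuator locations
    k = int(np.argmax(samples))          # index of the chosen reference actuator
    return np.roll(u, -k * CELL), k

def to_physical_frame(action_reduced, k):
    """Undo the discrete shift: actuator 0 in the reduced frame corresponds
    to physical actuator k, so roll the per-actuator action by k."""
    return np.roll(action_reduced, k)

# Minimal usage sketch inside an RL loop (env and agent are placeholders):
# obs, _ = env.reset()
# u_red, k = to_reduced_frame(obs)
# a_red = agent.act(u_red)                 # the policy only ever sees reduced states
# obs, reward, done, trunc, info = env.step(to_physical_frame(a_red, k))
```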

Updated: 2021-07-16