Supervised learning algorithms for controlling underactuated dynamical systems
Physica D: Nonlinear Phenomena (IF 4), Pub Date: 2020-06-18, DOI: 10.1016/j.physd.2020.132621
Bharat Monga, Jeff Moehlis

Control of underactuated dynamical systems has been studied for decades in robotics and is now emerging in other fields such as neuroscience. Most advances have come from model-based control theory, which is limited when the system under study is too complex for a model to be constructed. This calls for data-driven control methods such as machine learning, which in recent years has spread to many fields, including control theory. However, the success of such algorithms has depended on the availability of large datasets. Moreover, because of their black-box nature, it is challenging to analyze how such algorithms work, which may be crucial in applications where failure is very costly. In this paper, we develop two related, novel supervised learning algorithms. The algorithms are powerful enough to control a wide variety of complex underactuated dynamical systems, yet have a simple and intelligent structure that allows them to work with a sparse data set even in the presence of noise. Our algorithms take the state of the dynamical system as feedback and output a bang-bang (binary) control input. They learn this control input by maximizing a reward function over both short and long time horizons. We demonstrate the versatility of our algorithms by applying them to a diverse range of problems: switching between bistable states, changing the phase of an oscillator, desynchronizing a population of synchronized coupled oscillators, and stabilizing an unstable fixed point. For most of these applications, we are able to explain why our algorithms work by using traditional dynamical systems and control theory. We also compare our learning algorithms with some traditional control algorithms and explain why ours work better.
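
To make the terms "bang-bang control input", "state feedback", and "short-horizon reward" concrete, here is a minimal Python sketch. It is not the authors' algorithm and involves no learning: it assumes a known toy model and, at each step, picks whichever of two input values gives the higher reward after a short simulated rollout. The damped double-well system, the input magnitude `u_max`, the horizon length, the reward, and all function names are hypothetical choices made for illustration. The system is underactuated in the sense that the input acts only on the velocity `v`, while the goal (switching from the stable state near x = -1 to the one near x = +1) is stated in terms of the position `x`.

```python
def step(x, v, u, dt=0.01):
    """One explicit-Euler step of the assumed toy model:
    x' = v,  v' = x - x**3 - v + u  (damped double well, input on v only)."""
    return x + dt * v, v + dt * (x - x**3 - v + u)


def short_horizon_reward(x, v, u, horizon=20):
    """Hold the input u constant for `horizon` steps and score the end state.
    The reward (an assumption) is the negative squared distance of x from +1."""
    for _ in range(horizon):
        x, v = step(x, v, u)
    return -(x - 1.0) ** 2


def bang_bang(x, v, u_max=0.5):
    """State feedback with a binary output: return -u_max or +u_max,
    whichever yields the higher short-horizon reward from the state (x, v)."""
    return max((-u_max, u_max), key=lambda u: short_horizon_reward(x, v, u))


def simulate(x=-1.0, v=0.0, n_steps=3000):
    """Closed-loop run starting at the left stable state (x = -1, v = 0)."""
    inputs = []
    for _ in range(n_steps):
        u = bang_bang(x, v)
        inputs.append(u)
        x, v = step(x, v, u)
    return x, v, inputs


x_final, v_final, inputs = simulate()
print(f"final x = {x_final:.3f}, distinct inputs = {sorted(set(inputs))}")
```

In this sketch the controller only ever emits two values, and the rollout is what lets it anticipate how an input applied to `v` will eventually move `x`. A learning-based method of the kind the abstract describes would replace the known model with data; how the paper does that is not covered here.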




Updated: 2020-06-18