A real-time HIL control system on rotary inverted pendulum hardware platform based on double deep Q-network
Measurement and Control ( IF 1.3 ) Pub Date : 2021-03-17 , DOI: 10.1177/00202940211000380
Yanyan Dai 1 , KiDong Lee 1 , SukGyu Lee 1
Rotary inverted pendulum systems are a standard benchmark model for nonlinear control. Without a deep understanding of control theory, it is difficult to control a rotary inverted pendulum platform using classic control engineering models, as shown in section 2.1. Therefore, instead of relying on classic control theory, this paper controls the platform by training and testing a reinforcement learning algorithm. Reinforcement learning (RL) has achieved many recent successes, but there is little research on quickly testing high-frequency RL algorithms in a real hardware environment. In this paper, we propose a real-time hardware-in-the-loop (HIL) control system to train and test a deep reinforcement learning algorithm, from simulation through to real hardware implementation. The agent is implemented with a Double Deep Q-Network (DDQN) with prioritized experience replay, requiring no deep understanding of classical control engineering. For the real experiment, to swing up the rotary inverted pendulum and move it smoothly, we define 21 actions for swing-up and balancing. Compared with the Deep Q-Network (DQN), the DDQN with prioritized experience replay reduces the overestimation of Q values and decreases training time. Finally, the paper presents experimental results comparing classic control theory with different reinforcement learning algorithms.
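The overestimation reduction mentioned above comes from how Double DQN forms its learning target: the online network *selects* the next action while the target network *evaluates* it, instead of taking a max over a single network as vanilla DQN does. A minimal sketch of this target computation is shown below, assuming batched NumPy arrays; the function name and shapes are illustrative, not from the paper's implementation.

```python
import numpy as np

def ddqn_targets(rewards, next_q_online, next_q_target, dones, gamma=0.99):
    """Compute Double DQN bootstrap targets for a batch of transitions.

    rewards       : (B,)   reward for each transition
    next_q_online : (B, A) online-network Q values for the next state
    next_q_target : (B, A) target-network Q values for the next state
    dones         : (B,)   1.0 if the episode ended, else 0.0
    """
    # Online network selects the greedy next action...
    best_actions = np.argmax(next_q_online, axis=1)
    # ...but the target network evaluates that action, decoupling
    # selection from evaluation and reducing overestimation bias.
    next_values = next_q_target[np.arange(len(best_actions)), best_actions]
    return rewards + gamma * next_values * (1.0 - dones)
```

A vanilla DQN target would instead use `np.max(next_q_target, axis=1)`, so any upward noise in the target network's estimates is always propagated; decoupling the two roles is what DDQN changes.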



