A self-adaptive SAC-PID control approach based on reinforcement learning for mobile robots
International Journal of Robust and Nonlinear Control (IF 3.9), Pub Date: 2021-07-09, DOI: 10.1002/rnc.5662
Xinyi Yu, Yuehai Fan, Siyu Xu, Linlin Ou

Proportional–integral–derivative (PID) control is the most widely used control method in industrial control, robot control, and other fields. However, traditional PID control performs poorly when the system cannot be modeled accurately and the operating environment changes in real time. To tackle these problems, we propose a self-adaptive, model-free SAC-PID control approach based on reinforcement learning for the automatic control of mobile robots. A new hierarchical structure is developed, consisting of an upper controller based on soft actor-critic (SAC), one of the most competitive continuous control algorithms, and a lower controller based on incremental PID controllers. SAC receives the dynamic information of the mobile robot as input and simultaneously outputs the optimal parameters of the incremental PID controllers, which compensate in real time for the error between the path and the mobile robot. In addition, a combination of the 24-neighborhood method and polynomial fitting is developed to improve the adaptability of the SAC-PID control method to complex environments. The effectiveness of the SAC-PID control method is verified on several paths of varying difficulty, both in Gazebo and on a real Mecanum-wheeled mobile robot. Furthermore, compared with fuzzy PID control, the SAC-PID method shows strong robustness, generalization, and real-time performance.
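
The hierarchical structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `IncrementalPID`, the function `placeholder_policy`, and all gain values are hypothetical names and numbers chosen for illustration. The sketch only shows the standard incremental (velocity-form) PID update, with the gains supplied anew at every step by an upper-level policy, which in the paper would be the trained SAC actor.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class IncrementalPID:
    """Incremental (velocity-form) PID controller.

    The gains are passed in at every step so that an upper-level
    policy can retune them online (hypothetical interface).
    """
    e_prev1: float = 0.0  # error at step k-1
    e_prev2: float = 0.0  # error at step k-2
    u: float = 0.0        # accumulated control output

    def step(self, error: float, kp: float, ki: float, kd: float) -> float:
        # Standard incremental PID law:
        # du[k] = Kp*(e[k] - e[k-1]) + Ki*e[k] + Kd*(e[k] - 2*e[k-1] + e[k-2])
        du = (kp * (error - self.e_prev1)
              + ki * error
              + kd * (error - 2.0 * self.e_prev1 + self.e_prev2))
        self.u += du
        # Shift the error history for the next step.
        self.e_prev2, self.e_prev1 = self.e_prev1, error
        return self.u


def placeholder_policy(error: float) -> Tuple[float, float, float]:
    """Hypothetical stand-in for the SAC actor.

    A trained actor would map the robot's state to (Kp, Ki, Kd);
    here fixed, made-up gains are returned so the sketch runs.
    """
    return 2.0, 0.5, 0.1


if __name__ == "__main__":
    pid = IncrementalPID()
    for tracking_error in [1.0, 0.8, 0.5, 0.2, 0.0]:
        kp, ki, kd = placeholder_policy(tracking_error)
        command = pid.step(tracking_error, kp, ki, kd)
        print(f"error={tracking_error:.2f} command={command:.3f}")
```

Because the incremental form outputs only a change in the control signal, the gains can be altered between steps without recomputing an accumulated integral term, which is one plausible reason for pairing it with a policy that retunes the gains online.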

Updated: 2021-07-09