Adaptive Neuro-Fuzzy PID Controller based on Twin Delayed Deep Deterministic Policy Gradient Algorithm
Neurocomputing (IF 6) Pub Date: 2020-08-01, DOI: 10.1016/j.neucom.2020.03.063
Qian Shi , Hak-Keung Lam , Chengbin Xuan , Ming Chen

Abstract This paper presents an adaptive neuro-fuzzy PID controller based on the twin delayed deep deterministic policy gradient (TD3) algorithm for nonlinear systems. In this approach, the observation of the environment is embedded with information from a multiple-input single-output (MISO) fuzzy inference system (FIS), and a specially defined fuzzy PID controller, structured as a neural network (NN), serves as the actor in the TD3 algorithm, which achieves automatic tuning of the fuzzy PID controller's gains. From the control perspective, the controller combines the merits of both the FIS and the PID controller and uses a reinforcement learning algorithm to optimize its parameters. From the reinforcement learning point of view, embedding prior knowledge into the fuzzy PID controller incorporated in the actor network helps reduce the learning difficulty during training. The proposed method was tested on the cart-pole system in a simulation environment and compared against a linear PID controller, demonstrating the robustness and generalization of the proposed approach.
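The core idea, a PID control law whose gains are produced by a learned policy at every step, can be sketched as follows. This is a minimal illustration, not the paper's implementation: `stub_actor` is a hypothetical stand-in for the trained TD3 neuro-fuzzy actor, and the observation layout is assumed to follow the usual cart-pole convention.

```python
def stub_actor(observation):
    """Hypothetical placeholder for the TD3 actor.

    In the paper, a neuro-fuzzy network computes the gains from the
    FIS-embedded observation; here fixed gains are returned purely
    for illustration.
    """
    return 2.0, 0.1, 0.5  # (Kp, Ki, Kd)


class AdaptivePID:
    """Discrete-time PID controller whose gains are supplied each step."""

    def __init__(self, dt=0.02):
        self.dt = dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error, gains):
        kp, ki, kd = gains
        self.integral += error * self.dt                  # integral term
        derivative = (error - self.prev_error) / self.dt  # derivative term
        self.prev_error = error
        return kp * error + ki * self.integral + kd * derivative


# One control step on a cart-pole-like state [x, x_dot, theta, theta_dot]:
pid = AdaptivePID(dt=0.02)
obs = [0.0, 0.0, 0.1, 0.0]
error = obs[2]  # drive the pole angle toward zero
u = pid.step(error, stub_actor(obs))
```

In the actual method the gains are not fixed: the TD3 critic pair evaluates the resulting trajectories, and the actor's fuzzy-PID parameters are updated by the deterministic policy gradient, so the mapping from observation to gains adapts over training.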

Updated: 2020-08-01