Indirect adaptive fuzzy-regulated optimal control for unknown continuous-time nonlinear systems
Frontiers of Information Technology & Electronic Engineering (IF 2.7), Pub Date: 2021-01-08, DOI: 10.1631/fitee.1900610
Haiyun Zhang, Deyuan Meng, Jin Wang, Guodong Lu

We present a novel indirect adaptive fuzzy-regulated optimal control scheme for continuous-time nonlinear systems with unknown dynamics, mismatches, and disturbances. First, the Hamilton-Jacobi-Bellman (HJB) equation associated with the performance function is derived for the original nonlinear system. Unlike existing adaptive dynamic programming (ADP) approaches, the scheme uses a special non-quadratic variable performance function as the reinforcement medium in the actor-critic architecture. An adaptive fuzzy-regulated critic structure is constructed to configure the weighting matrix of the performance function so as to approximate and balance the HJB equation. A concurrent self-organizing learning technique adaptively updates the critic weights. Based on this critic, an adaptive optimal feedback controller is developed as the actor, using a new form of augmented Riccati equation to optimize the fuzzy-regulated variable performance function in real time. The result is an online indirect adaptive optimal control mechanism, implemented as an actor-critic structure, that adapts both the optimal cost and the optimal control policy in continuous time. Convergence and closed-loop stability of the proposed system are proved. Simulation examples and comparisons show the effectiveness and advantages of the proposed method.
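
For orientation, the textbook form of the HJB equation that actor-critic ADP schemes typically approximate can be written as below. This is only a sketch under stated assumptions, not the paper's formulation: it assumes an input-affine system and a conventional cost with a fixed quadratic control penalty, whereas the abstract describes a non-quadratic variable performance function. The symbols f, g, Q, R, and V are introduced here for illustration and do not come from the abstract.

```latex
% Assumed (illustrative) setting: input-affine dynamics and a standard cost.
\dot{x} = f(x) + g(x)\,u, \qquad
V(x(t)) = \int_{t}^{\infty} \bigl( Q(x) + u^{\top} R\, u \bigr)\, d\tau,
\quad R = R^{\top} \succ 0 .

% HJB equation for the optimal value function V^{*}:
0 = \min_{u} \Bigl[ Q(x) + u^{\top} R\, u
      + \nabla V^{*}(x)^{\top} \bigl( f(x) + g(x)\,u \bigr) \Bigr].

% Setting the gradient with respect to u to zero gives the optimal policy:
u^{*}(x) = -\tfrac{1}{2}\, R^{-1} g(x)^{\top} \nabla V^{*}(x).

% Substituting u^{*} back yields the HJB equation in V^{*} alone:
0 = Q(x) + \nabla V^{*}(x)^{\top} f(x)
    - \tfrac{1}{4}\, \nabla V^{*}(x)^{\top} g(x) R^{-1} g(x)^{\top} \nabla V^{*}(x).
```

In this assumed setting, the critic approximates V^{*} and the actor implements u^{*}. For linear dynamics and quadratic Q(x), the last equation reduces to the standard algebraic Riccati equation, which gives some context for the "augmented Riccati equation" mentioned in the abstract.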




Updated: 2021-01-08