A novel model-free robust saturated reinforcement learning-based controller for quadrotors guaranteeing prescribed transient and steady state performance
Aerospace Science and Technology (IF 5.0), Pub Date: 2021-09-24, DOI: 10.1016/j.ast.2021.107128
Omid Elhaki 1 , Khoshnam Shojaei 1, 2

To improve trajectory tracking for quadrotors subject to control input saturation, a novel model-free saturated prescribed performance reinforcement learning framework is proposed in the presence of model uncertainties, nonlinearities and external disturbances. Saturation functions are employed to deal with input saturation, and the actuator's saturation nonlinearity is compensated by an intelligent method to reduce the saturation effects. Moreover, prescribed performance control is used to ensure an adjustable transient and steady-state response for the tracking errors, and adaptive robust controllers are introduced to handle the effects of external disturbances online. The controller is built around a reinforcement learning method based on actor-critic neural networks. The actor neural network estimates the nonlinearities, the actuator saturation nonlinearity, and the model uncertainties, while the critic neural network estimates the reinforcement signals, which regulate the control action of the actor neural network online. The proposed actor-critic-based control structure is model-free and depends only on the measurable signals of the closed-loop system. This independence from the system dynamics significantly reduces the computational load of the controller, so the proposed control method is computationally cost-effective. The adaptive robust controllers and the proposed actor-critic structures are trained online, and the convergence behavior of their learning laws is investigated as part of the stability analysis. For the stability proof, Lyapunov's direct method is used to show that all error variables of the closed-loop nonlinear control system are uniformly ultimately bounded. Finally, simulations together with quantitative comparisons verify the efficiency and usefulness of the proposed control scheme.
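The abstract does not give the exact functions used, so the following is only an illustrative sketch of two ingredients it names: a prescribed performance envelope for the tracking error and a saturation function for the control input. The exponentially decaying envelope, the tanh-based saturation, and all names and parameter values (`performance_bound`, `saturate`, `rho0`, `rho_inf`, `decay`, `u_max`) are assumptions chosen for illustration, not the paper's definitions.

```python
import math


def performance_bound(t, rho0=1.0, rho_inf=0.05, decay=2.0):
    """Assumed envelope: rho(t) = (rho0 - rho_inf) * exp(-decay * t) + rho_inf.

    rho0 bounds the initial error, rho_inf bounds the steady-state error,
    and decay sets how quickly the envelope shrinks (transient behavior).
    """
    return (rho0 - rho_inf) * math.exp(-decay * t) + rho_inf


def within_bound(error, t, **bound_params):
    """Check whether a tracking error lies strictly inside the envelope at time t."""
    return abs(error) < performance_bound(t, **bound_params)


def saturate(u, u_max):
    """Assumed smooth saturation: u_max * tanh(u / u_max).

    The output magnitude never exceeds u_max and is close to u for small inputs.
    """
    return u_max * math.tanh(u / u_max)


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 5.0):
        print(f"t={t:4.1f}  bound={performance_bound(t):.4f}")
    for u in (0.1, 1.0, 10.0):
        print(f"u={u:5.1f}  saturated={saturate(u, u_max=2.0):.4f}")
```

In this sketch, tuning `decay` and `rho_inf` adjusts the transient and steady-state limits on the error, which is the sense in which the abstract calls the response "adjustable".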




Updated: 2021-10-04