Reinforcement learning based closed‐loop reference model adaptive flight control system design
International Journal of Adaptive Control and Signal Processing (IF 3.1) Pub Date: 2020-10-07, DOI: 10.1002/acs.3181
Burak Yuksek, Gokhan Inalhan

In this study, we present a reinforcement learning (RL)-based flight control system design method to improve the transient response performance of a closed-loop reference model (CRM) adaptive control system. The methodology, known as RL-CRM, relies on generating a dynamic adaptation strategy by applying RL to the variable factor in the feedback path gain matrix of the reference model. An actor-critic RL agent is designed using performance-driven reward functions and tracking-error observations from the environment. In the training phase, a deep deterministic policy gradient algorithm is used to learn the time-varying adaptation strategy for the design parameter in the reference model feedback gain matrix. The proposed control structure makes it possible to learn numerous adaptation strategies across a wide range of flight and vehicle conditions without being driven by high-fidelity simulators, flight testing, or real flight operations. The performance of the proposed system was evaluated on an identified and verified mathematical model of an agile quadrotor platform. Monte-Carlo simulations and worst-case analysis were also performed on a benchmark helicopter example model. In comparison to classical model reference adaptive control and CRM-adaptive control system designs, the proposed RL-CRM adaptive flight control system design improves the transient response performance on all associated metrics and provides the capability to operate over a wide range of parametric uncertainties.
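The abstract describes a DDPG actor-critic agent that observes the tracking error and adapts the variable factor in the reference model's feedback path gain. The sketch below is only an illustration of that structure, not the authors' implementation: it assumes the standard CRM form x_m_dot = A_m x_m + B_m r + L (x - x_m) with L = -ℓI, and it substitutes a hypothetical placeholder (rl_gain_factor) for the trained actor network; the matrices, step size, and gain bounds are illustrative.

```python
# Minimal sketch (assumptions noted above): a closed-loop reference model whose
# feedback-gain factor is supplied at each step by an RL policy.
import numpy as np

# Reference-model dynamics: x_m_dot = A_m x_m + B_m r + L(ell) (x - x_m)
A_m = np.array([[0.0, 1.0], [-4.0, -2.8]])   # assumed stable reference dynamics
B_m = np.array([[0.0], [4.0]])
dt = 0.01                                     # integration step (illustrative)

def feedback_gain(ell: float) -> np.ndarray:
    """CRM feedback-path gain; here L = -ell * I with ell >= 0 (assumption)."""
    return -ell * np.eye(2)

def rl_gain_factor(tracking_error: np.ndarray) -> float:
    """Placeholder for the trained actor (a DDPG policy in the paper).
    Maps tracking-error observations to the time-varying gain factor."""
    return float(np.clip(5.0 * np.linalg.norm(tracking_error), 0.0, 20.0))

def crm_step(x_m: np.ndarray, x: np.ndarray, r: float) -> np.ndarray:
    """One Euler step of the closed-loop reference model."""
    e = x - x_m                               # tracking error seen by the agent
    L = feedback_gain(rl_gain_factor(e))      # RL-adapted variable factor
    x_m_dot = A_m @ x_m + B_m.flatten() * r + L @ e
    return x_m + dt * x_m_dot

# Example: propagate the reference model against a fixed (fake) plant state.
x_m = np.zeros(2)
x = np.array([0.1, 0.0])
for _ in range(5):
    x_m = crm_step(x_m, x, r=1.0)
print(x_m)
```

Increasing the feedback factor pulls the reference model toward the plant state and smooths the transient; the paper's contribution is letting the RL agent schedule that factor online instead of fixing it at design time.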

Updated: 2020-10-07