Adaptive Image-Based Visual Servoing for Hovering Control of Quad-rotor
IEEE Transactions on Cognitive and Developmental Systems (IF 5), Pub Date: 2020-09-01, DOI: 10.1109/tcds.2019.2908923
Haobin Shi, Lin Shi, Gang Sun, Kao-Shing Hwang

Image-based visual servoing (IBVS) uses visual feedback to achieve precise positioning and motion control with respect to a relatively stationary target, but problems with convergence and stability persist. The servoing gain is critical to both, yet in most IBVS applications it is set heuristically to a constant. This paper proposes an integrated method that uses reinforcement learning (RL) to adjust the servoing gain adaptively for IBVS control. The proposed method learns a policy that determines the value of the servoing gain on the fly. To ensure that the RL converges rapidly, truncated $Q$-learning (TQL), which performs its updates with truncated temporal differences (TDs), is used as the learning algorithm. A nonuniform partitioning of the state space serves as the state encoder for RL and allows a more efficient policy. Actions are selected with a strategy based on the Metropolis criterion from simulated annealing, which balances exploration and exploitation and thereby accelerates learning. The integrated IBVS control system is tested in experiments on hovering control of a quad-rotor helicopter. Simulation and experimental results show that the integrated IBVS method is more stable and converges more rapidly than other methods.
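The abstract does not spell out the action-selection rule, so the following is only a minimal sketch of one common Metropolis-style formulation, not the authors' implementation. The function name `metropolis_select`, its parameters, and the acceptance rule `exp((Q_candidate - Q_greedy) / T)` are assumptions made for illustration; here each action index would stand for one candidate value of the servoing gain.

```python
import math
import random


def metropolis_select(q_values, temperature, rng=random):
    """Pick an action index with a Metropolis-style acceptance test.

    Hypothetical helper: q_values holds the Q-value of each discrete action
    (for example, each candidate servoing gain) in the current state.
    """
    # Greedy action: the one with the highest current Q-value.
    greedy = max(range(len(q_values)), key=lambda a: q_values[a])
    if temperature <= 0:
        return greedy  # no exploration once the temperature reaches zero

    # Propose a uniformly random action as the exploratory candidate.
    candidate = rng.randrange(len(q_values))

    # delta <= 0 by construction, so the acceptance probability is in (0, 1].
    delta = q_values[candidate] - q_values[greedy]
    if rng.random() < math.exp(delta / temperature):
        return candidate  # accept the exploratory action
    return greedy  # reject it and exploit instead
```

With a high temperature almost every random candidate is accepted, so the agent explores; as the temperature is lowered during training (the annealing schedule is left to the caller in this sketch), the rule approaches greedy selection.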

Updated: 2020-09-01