Actor–Critic Reinforcement Learning and Application in Developing Computer-Vision-Based Interface Tracking
Engineering (IF 10.1) Pub Date: 2021-08-14, DOI: 10.1016/j.eng.2021.04.027
Oguzhan Dogru, Kirubakaran Velswamy, Biao Huang

This paper synchronizes control theory with computer vision by formalizing object tracking as a sequential decision-making process. A reinforcement learning (RL) agent successfully tracks an interface between two liquids, a variable that is often critical to monitor in the chemical, petrochemical, metallurgical, and oil industries. The method uses fewer than 100 images to create an environment, from which the agent generates its own data without the need for expert knowledge. Unlike supervised learning (SL) methods that rely on a very large number of parameters, this approach requires far fewer, which naturally reduces its maintenance cost. Beyond this frugality, the agent is robust to environmental uncertainties such as occlusion, intensity changes, and excessive noise. In a closed-loop control context, an interface-location-based deviation is chosen as the optimization goal during training. The methodology showcases RL for real-time object-tracking applications in the oil sands industry. Along with a presentation of the interface-tracking problem, this paper provides a detailed review of one of the most effective RL methodologies: the actor–critic policy.
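For readers new to the actor–critic formulation the abstract highlights, the following is a minimal sketch of a one-step advantage actor–critic update in PyTorch. It is illustrative only: the discrete action set (shift a tracking window up, keep it, or shift it down), the feature dimension `obs_dim`, and a reward defined as the negative interface-location deviation are assumptions made here for concreteness, since the abstract does not describe the paper's actual architecture or training setup.

```python
import torch
import torch.nn as nn

class ActorCritic(nn.Module):
    """Shared-trunk actor-critic network. Layer sizes and the use of a
    pre-extracted feature vector (rather than raw frames) are illustrative
    assumptions, not the paper's reported design."""

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.actor = nn.Linear(hidden, n_actions)  # policy logits
        self.critic = nn.Linear(hidden, 1)         # state-value estimate

    def forward(self, obs: torch.Tensor):
        h = self.trunk(obs)
        return self.actor(h), self.critic(h).squeeze(-1)

# Hypothetical action set: shift the tracking window up, keep it, shift it down.
net = ActorCritic(obs_dim=128, n_actions=3)
opt = torch.optim.Adam(net.parameters(), lr=3e-4)

def act(obs: torch.Tensor) -> torch.Tensor:
    """Sample an action from the current stochastic policy (the actor)."""
    logits, _ = net(obs)
    return torch.distributions.Categorical(logits=logits).sample()

def update(obs, action, reward, next_obs, done, gamma: float = 0.99):
    """One-step advantage actor-critic update. `reward` is assumed to be the
    negative interface-location deviation, mirroring the control-oriented
    optimization goal described in the abstract; `done` is a 0/1 float."""
    logits, value = net(obs)
    log_prob = torch.distributions.Categorical(logits=logits).log_prob(action)
    with torch.no_grad():
        _, next_value = net(next_obs)
        target = reward + gamma * (1.0 - done) * next_value  # TD target
    advantage = target - value
    actor_loss = -(log_prob * advantage.detach()).mean()  # policy gradient
    critic_loss = advantage.pow(2).mean()                 # value regression
    opt.zero_grad()
    (actor_loss + critic_loss).backward()
    opt.step()
```

The split visible above, a sampled policy (actor) corrected by a learned value baseline (critic), is the subject of the paper's review; the advantage term is the critic's contribution that reduces the variance of the actor's policy-gradient update.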


