\(\text{C}^{3}\text{Net}\): end-to-end deep learning for efficient real-time visual active camera control
Journal of Real-Time Image Processing (IF 2.9), Pub Date: 2021-02-20, DOI: 10.1007/s11554-021-01077-z
Christos Kyrkou

The need for automated real-time visual systems in applications such as smart camera surveillance, smart environments, and drones necessitates the improvement of methods for visual active monitoring and control. Traditionally, the active monitoring task has been handled through a pipeline of modules such as detection, filtering, and control. However, such pipelines are difficult to optimize jointly, and their various parameters are hard to tune for real-time processing in resource-constrained systems. In this paper, a deep Convolutional Camera Controller Neural Network is proposed that goes directly from visual information to camera movement, providing an efficient solution to the active vision problem. It is trained end-to-end, without bounding box annotations, to control a camera and follow multiple targets from raw pixel values. Evaluation in both a simulation framework and a real experimental setup indicates that the proposed solution is robust to varying conditions and achieves better monitoring performance than traditional approaches, in terms of both the number of targets monitored and the effective monitoring time. The proposed approach is also computationally less demanding and can run at over 10 FPS (\(\sim 4\times\) speedup) on an embedded smart camera, providing a practical and affordable solution to real-time active monitoring.
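
To make the "raw pixels to camera movement" idea concrete, here is a minimal sketch of a single forward pass through a tiny convolutional controller. It is not the architecture from the paper: the class name `TinyCameraController`, the layer sizes, the random (untrained) weights, and the choice of a normalized (pan, tilt) pair as the output are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random


def conv2d_relu(image, kernel):
    """Valid 2D cross-correlation of a grayscale image, followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out


class TinyCameraController:
    """Hypothetical stand-in for an end-to-end controller (assumed design)."""

    def __init__(self, image_size=8, num_filters=2, kernel_size=3, seed=0):
        rng = random.Random(seed)
        self.kernels = [[[rng.uniform(-1, 1) for _ in range(kernel_size)]
                         for _ in range(kernel_size)]
                        for _ in range(num_filters)]
        side = image_size - kernel_size + 1
        num_features = num_filters * side * side
        # Linear head with one weight row per assumed output (pan, tilt).
        self.head = [[rng.uniform(-1, 1) for _ in range(num_features)]
                     for _ in range(2)]

    def __call__(self, image):
        # Flatten the feature maps so spatial layout reaches the linear head.
        features = [v for k in self.kernels
                    for row in conv2d_relu(image, k) for v in row]
        # tanh keeps each command in [-1, 1]; scaling to motor units is omitted.
        pan, tilt = (math.tanh(sum(w * f for w, f in zip(weights, features)))
                     for weights in self.head)
        return pan, tilt


# Usage: one random 8x8 "frame" in, one (pan, tilt) command out.
frame_rng = random.Random(1)
frame = [[frame_rng.random() for _ in range(8)] for _ in range(8)]
controller = TinyCameraController()
pan, tilt = controller(frame)
```

The point of the sketch is the shape of the computation: there is no separate detection, filtering, or control stage, only one differentiable mapping from the frame to the command. Because the weights here are random, the output carries no meaning; in an end-to-end setup they would be learned from data.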




Updated: 2021-02-21