Online supervised attention-based recurrent depth estimation from monocular video
PeerJ Computer Science (IF 3.5), Pub Date: 2020-11-23, DOI: 10.7717/peerj-cs.317
Dmitrii Maslov 1 , Ilya Makarov 1, 2

Autonomous driving depends heavily on depth information for safe operation. Recently, substantial progress has been made in both supervised and self-supervised methods for depth reconstruction. However, most current approaches focus on single-frame depth estimation, where the quality limit is hard to beat because of the general limitations of supervised learning with deep neural networks. One way to improve the quality of existing methods is to utilize temporal information from frame sequences. In this paper, we study ways of integrating a recurrent block into a common supervised depth estimation pipeline. We propose a novel method that takes advantage of the convolutional gated recurrent unit (convGRU) and convolutional long short-term memory (convLSTM). We compare the use of convGRU and convLSTM blocks and determine the best model for the real-time depth estimation task. We carefully study the training strategy and provide new deep neural network architectures, based on an attention mechanism, for depth estimation from monocular video using information from past frames. We demonstrate the efficiency of exploiting temporal information by comparing our best recurrent method with existing image-based and video-based solutions for monocular depth reconstruction.
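
To make the idea of a convolutional recurrent block concrete, here is a minimal, dependency-free sketch of a single-channel convGRU cell that carries a hidden feature map from frame to frame. It is an illustration only, not the architecture from the paper: the class name `ConvGRUCell`, the helper `conv2d_same`, the 3x3 kernel size, the random weight initialization, the absence of bias terms, and the update convention h_t = (1 - z) * h_{t-1} + z * n are all assumptions chosen for brevity.

```python
import math
import random


def conv2d_same(x, k):
    """Cross-correlate a 2D map x with kernel k, zero-padded so the output keeps x's shape."""
    h, w = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for a in range(kh):
                for b in range(kw):
                    ii, jj = i + a - ph, j + b - pw
                    if 0 <= ii < h and 0 <= jj < w:
                        s += k[a][b] * x[ii][jj]
            out[i][j] = s
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


class ConvGRUCell:
    """Hypothetical single-channel convGRU cell (illustrative, no bias terms)."""

    def __init__(self, seed=0, ksize=3):
        rng = random.Random(seed)

        def kernel():
            return [[rng.uniform(-0.5, 0.5) for _ in range(ksize)] for _ in range(ksize)]

        # One input kernel and one hidden-state kernel per gate:
        # z = update gate, r = reset gate, n = candidate state.
        self.wx = {g: kernel() for g in "zrn"}
        self.wh = {g: kernel() for g in "zrn"}

    def _gate(self, name, x, hidden, act):
        a = conv2d_same(x, self.wx[name])
        b = conv2d_same(hidden, self.wh[name])
        return [[act(a[i][j] + b[i][j]) for j in range(len(x[0]))] for i in range(len(x))]

    def step(self, x, h):
        """Consume one frame's feature map x and the previous hidden map h; return the new hidden map."""
        rows, cols = len(x), len(x[0])
        z = self._gate("z", x, h, sigmoid)
        r = self._gate("r", x, h, sigmoid)
        # The reset gate scales the previous state before it enters the candidate.
        rh = [[r[i][j] * h[i][j] for j in range(cols)] for i in range(rows)]
        n = self._gate("n", x, rh, math.tanh)
        # Blend the old state and the candidate with the update gate.
        return [[(1.0 - z[i][j]) * h[i][j] + z[i][j] * n[i][j] for j in range(cols)]
                for i in range(rows)]


if __name__ == "__main__":
    cell = ConvGRUCell(seed=1)
    # Three toy 4x4 "frames" standing in for per-frame encoder features.
    frames = [[[0.1 * (t + i + j) for j in range(4)] for i in range(4)] for t in range(3)]
    hidden = [[0.0] * 4 for _ in range(4)]
    for frame in frames:
        hidden = cell.step(frame, hidden)  # state persists across frames
    print(len(hidden), len(hidden[0]))
```

In an online setting, only the latest frame and the stored hidden map are needed at each step, which is what makes this kind of block attractive for real-time use. A convLSTM cell follows the same pattern with an additional cell state and separate input, forget, and output gates.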

Updated: 2020-11-23