MiniNet: An extremely lightweight convolutional neural network for real-time unsupervised monocular depth estimation
ISPRS Journal of Photogrammetry and Remote Sensing (IF 10.6), Pub Date: 2020-06-23, DOI: 10.1016/j.isprsjprs.2020.06.004
Jun Liu, Qing Li, Rui Cao, Wenming Tang, Guoping Qiu

Predicting depth from a single image is an attractive research topic because it provides one more dimension of information that enables machines to better perceive the world. Recently, deep learning has emerged as an effective approach to monocular depth estimation. Since obtaining labeled data is costly, there is a recent trend to move from supervised learning to unsupervised learning for monocular depth. However, most unsupervised methods that achieve high depth prediction accuracy require deep network architectures that are too heavy and complex to run on embedded devices with limited storage and memory. To address this issue, we propose a powerful new network with a recurrent module that attains the capability of a deep network while remaining extremely lightweight, enabling real-time, high-performance unsupervised monocular depth prediction from video sequences. In addition, a novel efficient upsample block is proposed to fuse features from the associated encoder layer and recover the spatial size of feature maps with a small number of model parameters. We validate the effectiveness of our approach via extensive experiments on the KITTI dataset. Our new model runs at about 110 frames per second (fps) on a single GPU, 37 fps on a single CPU, and 2 fps on a Raspberry Pi 3. Moreover, it achieves higher depth accuracy with nearly 33 times fewer model parameters than state-of-the-art models. To the best of our knowledge, this work is the first extremely lightweight neural network trained on monocular video sequences for real-time unsupervised monocular depth estimation, which opens up the possibility of implementing deep learning-based real-time unsupervised monocular depth prediction on low-cost embedded devices.
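The abstract does not specify how the efficient upsample block is built; the sketch below is only a plausible, minimal PyTorch illustration of the general idea described: bilinear upsampling to recover spatial size, fusion with the matching encoder feature, and a depthwise-separable convolution to keep the parameter count low. The class name, channel sizes, and fusion-by-concatenation choice are assumptions, not the authors' implementation.

```python
# Minimal sketch (assumed design, not the MiniNet code) of a
# parameter-efficient decoder upsample block with encoder skip fusion.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LightweightUpsampleBlock(nn.Module):
    def __init__(self, in_ch: int, skip_ch: int, out_ch: int):
        super().__init__()
        fused_ch = in_ch + skip_ch
        # Depthwise + pointwise convolution uses far fewer parameters
        # than a standard 3x3 convolution over the fused features.
        self.depthwise = nn.Conv2d(fused_ch, fused_ch, kernel_size=3,
                                   padding=1, groups=fused_ch, bias=False)
        self.pointwise = nn.Conv2d(fused_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)

    def forward(self, x: torch.Tensor, skip: torch.Tensor) -> torch.Tensor:
        # Recover spatial resolution, then fuse with the encoder feature map.
        x = F.interpolate(x, size=skip.shape[-2:], mode="bilinear",
                          align_corners=False)
        x = torch.cat([x, skip], dim=1)
        return F.relu(self.bn(self.pointwise(self.depthwise(x))))


if __name__ == "__main__":
    block = LightweightUpsampleBlock(in_ch=64, skip_ch=32, out_ch=32)
    decoder_feat = torch.randn(1, 64, 32, 104)   # low-resolution decoder feature
    encoder_feat = torch.randn(1, 32, 64, 208)   # skip feature from the encoder
    print(block(decoder_feat, encoder_feat).shape)  # torch.Size([1, 32, 64, 208])
```

A block of this kind would be stacked once per decoder stage; the depthwise-separable structure is one common way to hit the "small number of model parameters" goal the abstract states, but the paper itself should be consulted for the actual design.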




Last updated: 2020-06-23