Self-Supervised Visual Feature Learning With Deep Neural Networks: A Survey,IEEE Transactions on Pattern Analysis and Machine Intelligence



IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 23.6), Pub Date: 2020-05-04, DOI: 10.1109/tpami.2020.2992393
Longlong Jing, Yingli Tian

Large-scale labeled data are generally required to train deep neural networks in order to obtain better performance in visual feature learning from images or videos for computer vision applications. To avoid the extensive cost of collecting and annotating large-scale datasets, self-supervised learning methods, a subset of unsupervised learning methods, have been proposed to learn general image and video features from large-scale unlabeled data without using any human-annotated labels. This paper provides an extensive review of deep learning-based self-supervised general visual feature learning methods from images or videos. First, the motivation, general pipeline, and terminologies of this field are described. Then the common deep neural network architectures that are used for self-supervised learning are summarized. Next, the schema and evaluation metrics of self-supervised learning methods are reviewed, followed by the commonly used datasets for images, videos, audio, and 3D data, as well as the existing self-supervised visual feature learning methods. Finally, quantitative performance comparisons of the reviewed methods on benchmark datasets are summarized and discussed for both image and video feature learning. The paper concludes with a set of promising future directions for self-supervised visual feature learning.


Updated: 2020-05-04
