Data Augmentation for Object Detection via Differentiable Neural Rendering,arXiv - CS - Computer Vision and Pattern Recognition



arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2021-03-04, DOI: arxiv-2103.02852
Guanghan Ning, Guang Chen, Chaowei Tan, Si Luo, Liefeng Bo, Heng Huang

Training a robust object detector is challenging when annotated data is scarce. Existing approaches to this problem include semi-supervised learning, which interpolates labeled data from unlabeled data, and self-supervised learning, which exploits signals within unlabeled data via pretext tasks. Without changing the supervised learning paradigm, we introduce an offline data augmentation method for object detection that semantically interpolates the training data with novel views. Specifically, our proposed system generates controllable views of training images based on differentiable neural rendering, together with corresponding bounding box annotations that require no human intervention. First, we extract pixel-aligned image features and project them into point clouds while estimating depth maps. We then re-project the point clouds with a target camera pose and render a novel-view 2D image. Objects are marked in the point clouds as keypoints so that their annotations can be recovered in the new views. The method is fully compatible with online data augmentation methods such as affine transforms and image mixup. Extensive experiments show that our method, as a cost-free tool to enrich images and labels, can significantly boost the performance of object detection systems with scarce training data. Code is available at https://github.com/Guanghan/DANR.
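
The annotation-recovery step described in the abstract can be illustrated with a small geometric sketch. This is not the code from the DANR repository: it assumes a simple pinhole camera, a caller-supplied depth lookup in place of the learned depth estimator, and a camera motion limited to a yaw rotation plus a translation. The function names (`unproject`, `transform`, `project`, `reproject_box`) and the choice of a regular grid of keypoints inside each box are hypothetical, chosen only for illustration. The sketch lifts keypoints to 3D, moves them into the target camera frame, projects them back to pixels, and takes their extent as the new bounding box.

```python
import math


def unproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with a depth value to a 3D point in the source camera frame."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def transform(point, yaw_rad, translation):
    """Rotate a point about the vertical (y) axis, then translate it.

    Assumption: this maps source-frame coordinates to target-frame coordinates.
    """
    x, y, z = point
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    tx, ty, tz = translation
    return (c * x + s * z + tx, y + ty, -s * x + c * z + tz)


def project(point, fx, fy, cx, cy):
    """Project a 3D point to pixel coordinates; return None if it lies behind the camera."""
    x, y, z = point
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def reproject_box(box, depth_at, intrinsics, yaw_rad, translation, samples=5):
    """Recover a bounding box in the novel view from keypoints sampled in the source box.

    box        -- (x_min, y_min, x_max, y_max) in source-image pixels
    depth_at   -- callable (u, v) -> depth, standing in for an estimated depth map
    intrinsics -- (fx, fy, cx, cy) of the assumed pinhole camera
    samples    -- keypoints per side of the sampling grid (must be at least 2)
    """
    x0, y0, x1, y1 = box
    projected = []
    for i in range(samples):
        for j in range(samples):
            u = x0 + (x1 - x0) * i / (samples - 1)
            v = y0 + (y1 - y0) * j / (samples - 1)
            p3d = unproject(u, v, depth_at(u, v), *intrinsics)
            p2d = project(transform(p3d, yaw_rad, translation), *intrinsics)
            if p2d is not None:
                projected.append(p2d)
    if not projected:
        return None  # the object is not visible from the target pose
    us = [p[0] for p in projected]
    vs = [p[1] for p in projected]
    return (min(us), min(vs), max(us), max(vs))


if __name__ == "__main__":
    intrinsics = (100.0, 100.0, 50.0, 50.0)  # hypothetical fx, fy, cx, cy
    flat_depth = lambda u, v: 2.0            # hypothetical constant-depth scene
    print(reproject_box((40, 40, 60, 60), flat_depth, intrinsics, 0.0, (0.2, 0.0, 0.0)))
```

Because the new box is derived from the same geometry that drives the rendered view, no manual relabeling is needed. A box that returns `None` or falls mostly outside the image would have to be dropped or clipped, and that policy is left open here.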


Updated: 2021-03-05
