Weakly-supervised multi-class object localization using only object counts as labels
arXiv - CS - Artificial Intelligence, Pub Date: 2021-02-23, DOI: arxiv-2102.11743
Kyle Mills, Isaac Tamblyn

We demonstrate the use of an extensive deep neural network (EDNN) to localize instances of objects in images. The EDNN can accurately perform multi-class counting using only ground-truth count values as labels. Without being given any conceptual information, object annotations, or pixel segmentation information, the network formulates its own conceptual representation of the items in the image. Given images labelled only with the counts of the objects present, the structure of the EDNN can be exploited to localize the objects within the visual field. We also demonstrate that a trained EDNN can count objects in images much larger than those on which it was trained. To demonstrate our technique, we introduce seven new datasets: five progressively harder MNIST digit-counting datasets and two datasets of 3D-rendered rubber ducks in various situations. On most of these datasets, the EDNN achieves greater than 99% test-set accuracy in counting objects.
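The abstract does not spell out the architecture, so the following is only a minimal sketch of why an extensive (additive) count could yield localization and transfer to larger images. It assumes the total count is the sum of per-tile contributions produced by one shared function. The names `tile_contributions` and `toy_counter`, the tile size, and the bright-pixel rule are all hypothetical stand-ins for a trained network, and the sketch ignores any overlap or context between tiles.

```python
import numpy as np

def tile_contributions(image, focus, per_tile_fn):
    """Apply one shared function to every non-overlapping focus tile.

    Returns a (rows, cols) map of per-tile contributions. Summing the map
    gives the image-level count; the map itself is a coarse localization.
    """
    h, w = image.shape
    assert h % focus == 0 and w % focus == 0, "image must tile evenly"
    rows, cols = h // focus, w // focus
    contrib = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            tile = image[i * focus:(i + 1) * focus, j * focus:(j + 1) * focus]
            contrib[i, j] = per_tile_fn(tile)
    return contrib

def toy_counter(tile):
    # Hypothetical stand-in for the trained per-tile network:
    # it simply counts bright pixels, one pixel per "object".
    return float((tile > 0.5).sum())

# A small image of the size such a model might be "trained" on.
small = np.zeros((8, 8))
small[1, 1] = 1.0
small[6, 5] = 1.0

# A larger image of a different shape; the same per-tile function is reused.
large = np.zeros((16, 24))
large[0, 0] = 1.0
large[9, 20] = 1.0
large[15, 3] = 1.0

small_map = tile_contributions(small, 4, toy_counter)
large_map = tile_contributions(large, 4, toy_counter)

small_count = small_map.sum()   # image-level count for the small image
large_count = large_map.sum()   # image-level count for the large image
occupied_tiles = np.argwhere(large_map > 0).tolist()  # tiles that contribute
```

Because the image-level output is a plain sum over tiles, nothing in the sketch depends on the image size, and the tiles with non-zero contributions indicate roughly where the counted objects sit. A real model would learn the per-tile function from count labels alone instead of using a fixed rule.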

Updated: 2021-02-24