BING: Binarized normed gradients for objectness estimation at 300fps
Computational Visual Media (IF 17.3), Pub Date: 2019-04-08, DOI: 10.1007/s41095-018-0120-1
Ming-Ming Cheng, Yun Liu, Wen-Yan Lin, Ziming Zhang, Paul L. Rosin, Philip H. S. Torr

Training a generic objectness measure to produce object proposals has recently become of significant interest. We observe that generic objects with well-defined closed boundaries can be detected by looking at the norm of gradients, with a suitable resizing of their corresponding image windows to a small fixed size. Based on this observation and computational reasons, we propose to resize the window to 8 × 8 and use the norm of the gradients as a simple 64D feature to describe it, for explicitly training a generic objectness measure. We further show how the binarized version of this feature, namely binarized normed gradients (BING), can be used for efficient objectness estimation, which requires only a few atomic operations (e.g., add, bitwise shift, etc.). To improve localization quality of the proposals while maintaining efficiency, we propose a novel fast segmentation method and demonstrate its effectiveness for improving BING’s localization performance, when used in multi-thresholding straddling expansion (MTSE) post-processing. On the challenging PASCAL VOC2007 dataset, using 1000 proposals per image and intersection-over-union threshold of 0.5, our proposal method achieves a 95.6% object detection rate and 78.6% mean average best overlap in less than 0.005 second per image.
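
As a rough illustration of the feature described in the abstract, the sketch below resizes a grayscale window to 8 × 8, takes the norm of its gradients as a 64D vector, keeps only the top bits of each value as a crude stand-in for binarization, and computes the intersection-over-union used to judge proposals. The function names, the nearest-neighbour resizing, the central-difference gradient, the clipped L1 norm, and the choice of 4 retained bits are all assumptions made for this example; the abstract does not specify them, and this is not the authors' implementation.

```python
import numpy as np


def resize_nearest(window, size=8):
    """Resize a 2-D grayscale window to size x size by nearest-neighbour sampling.

    Assumption: the abstract only says the window is resized to 8 x 8,
    not which interpolation is used.
    """
    h, w = window.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return window[np.ix_(rows, cols)]


def normed_gradient_feature(window, size=8):
    """Return a 64D uint8 feature: gradient norm of the resized window.

    Assumption: central differences and an L1 norm clipped to 255.
    """
    small = resize_nearest(window.astype(np.int32), size)
    gx = np.zeros_like(small)
    gy = np.zeros_like(small)
    gx[:, 1:-1] = small[:, 2:] - small[:, :-2]  # horizontal difference
    gy[1:-1, :] = small[2:, :] - small[:-2, :]  # vertical difference
    ng = np.minimum(np.abs(gx) + np.abs(gy), 255)
    return ng.astype(np.uint8).ravel()


def binarize_top_bits(feature, n_bits=4):
    """Approximate each byte by its n_bits most significant bits (hypothetical choice)."""
    shift = 8 - n_bits
    return (feature >> shift) << shift


def iou(a, b):
    """Intersection-over-union of two boxes given as (x0, y0, x1, y1)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


# Synthetic window: a bright rectangle on a dark background.
window = np.zeros((32, 48), dtype=np.uint8)
window[8:24, 12:36] = 200
feature = normed_gradient_feature(window)
binary = binarize_top_bits(feature)
```

Under the evaluation setting quoted in the abstract, a proposal would count as covering a ground-truth object when `iou(proposal, ground_truth)` is at least 0.5.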

Updated: 2019-04-08