Weighted boxes fusion: Ensembling boxes from different object detection models
Image and Vision Computing (IF 4.7), Pub Date: 2021-02-03, DOI: 10.1016/j.imavis.2021.104117
Roman Solovyev, Weimin Wang, Tatiana Gabruseva

Object detection is a crucial task in computer vision systems, with a wide range of applications in autonomous driving, medical imaging, retail, security, face recognition, robotics, and other fields. Today, neural network-based models are used to localize and classify instances of objects of particular classes. When real-time inference is not required, ensembles of models help to achieve better results.

In this work, we present a novel method for fusing predictions from different object detection models: weighted boxes fusion. Our algorithm utilizes confidence scores of all proposed bounding boxes to construct averaged boxes.
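The idea of building an averaged box from confidence scores can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract only states that confidence scores of all proposed boxes are used to construct averaged boxes. The function names (`iou`, `fuse_cluster`, `fuse_boxes`), the greedy IoU-based grouping, the threshold value 0.55, and the use of the mean confidence as the fused score are all assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse_cluster(boxes: List[Box], scores: List[float]) -> Tuple[Box, float]:
    """Confidence-weighted average of the boxes in one cluster.

    Assumes every score is positive. Using the mean score as the fused
    confidence is an illustrative choice, not taken from the abstract.
    """
    total = sum(scores)
    fused = tuple(
        sum(s * b[i] for b, s in zip(boxes, scores)) / total for i in range(4)
    )
    return fused, total / len(scores)


def fuse_boxes(
    boxes: List[Box], scores: List[float], iou_thr: float = 0.55
) -> List[Tuple[Box, float]]:
    """Greedily group overlapping boxes (assumed strategy) and fuse each group."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    clusters: List[Tuple[List[Box], List[float]]] = []
    fused: List[Tuple[Box, float]] = []
    for i in order:
        for k, (fused_box, _) in enumerate(fused):
            if iou(boxes[i], fused_box) > iou_thr:
                # Add the box to the matching cluster and recompute its average.
                clusters[k][0].append(boxes[i])
                clusters[k][1].append(scores[i])
                fused[k] = fuse_cluster(*clusters[k])
                break
        else:
            # No existing fused box overlaps enough: start a new cluster.
            clusters.append(([boxes[i]], [scores[i]]))
            fused.append((boxes[i], scores[i]))
    return fused
```

In this sketch every overlapping box contributes to the fused coordinates in proportion to its confidence, so no prediction in a group is simply discarded; lower-scoring boxes just have less influence on the averaged box.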

We tested the method on several datasets and evaluated it in the context of the Open Images and COCO Object Detection challenges, achieving top results in these challenges. The 3D version of boxes fusion was successfully applied by the winning teams of the Waymo Open Dataset and Lyft 3D Object Detection for Autonomous Vehicles challenges. The source code is publicly available on GitHub (Solovyev, 2019 [31]).

We present a novel method for combining predictions in ensembles of different object detection models: weighted boxes fusion. This method significantly improves the quality of the fused predicted rectangles for an ensemble.

We tested the method on several datasets and evaluated it in the context of the Open Images and COCO Object Detection challenges, where it helped to achieve top results. The source code is publicly available on GitHub.




Updated: 2021-02-15