Journal of Ambient Intelligence and Humanized Computing, Pub Date: 2021-06-19, DOI: 10.1007/s12652-021-03340-4
Shaodan Lin, Kexin Zhu, Chen Feng, Zhide Chen
Object detection is a classic problem in computer vision, and its main bottleneck lies in the fusion of multi-scale features. In this paper, we systematically study neural network architecture design choices for real-time object detection and propose Align-Yolact to improve instance segmentation accuracy. First, we propose a weighted bounding box, which improves bounding-box localization accuracy. Second, we add a bi-directional feature pyramid network to the feature fusion stage, which improves mask quality and small-target accuracy. Owing to these optimizations and better backbones, we achieve state-of-the-art (SOTA) results in both detection efficiency and accuracy.
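The abstract does not spell out how the bi-directional feature pyramid network combines features, so the following is only a minimal sketch under an assumption: it uses the "fast normalized fusion" rule commonly associated with BiFPN, where each input feature map gets a non-negative learnable weight and the weighted sum is divided by the sum of the weights. The function name `fast_normalized_fusion`, the flat-list representation of feature maps, and the sample values are all hypothetical and are not taken from the paper.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature maps (flat lists of floats) with per-input weights.

    Assumed rule (not confirmed by the abstract):
        out = sum_i(w_i * f_i) / (eps + sum_j(w_j)), with w_i clamped to >= 0.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature map")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("feature maps must share a shape (resize them first)")

    # Clamp weights at zero (ReLU) so the normalization cannot divide by a
    # negative or cancelling sum.
    w = [max(0.0, x) for x in weights]
    total = sum(w) + eps

    # Weighted average of the inputs, element by element.
    return [
        sum(wi * f[k] for wi, f in zip(w, features)) / total
        for k in range(length)
    ]


# Hypothetical example: fuse a top-down feature with a lateral feature.
p_top_down = [1.0, 2.0, 3.0]
p_lateral = [3.0, 2.0, 1.0]
fused = fast_normalized_fusion([p_top_down, p_lateral], [1.0, 1.0])
```

With equal weights the result is close to the element-wise mean of the two inputs; in a trained network the weights would be learned, letting the model decide how much each scale contributes.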
Title: Align-Yolact: a one-stage semantic segmentation network for real-time object detection