CAM: A fine-grained vehicle model recognition method based on visual attention model
Image and Vision Computing (IF 4.2), Pub Date: 2020-10-02, DOI: 10.1016/j.imavis.2020.104027
Ye Yu, Longdao Xu, Wei Jia, Wenjia Zhu, Yunxiang Fu, Qiang Lu

Vehicle model recognition (VMR) is a typical fine-grained classification task in computer vision. To improve the representation power of classical CNN networks for this task, we focus on enhancing subtle feature differences and their spatial encoding via an attention mechanism, and propose a novel architectural unit, which we term the "convolutional attention model" (CAM). CAM adopts a two-stage attention mechanism for VMR: the global feature map attention (GFMA) algorithm, applied at the lower part of the main network flow to enhance subtle feature differences from the beginning, and the feature spatial relationship attention (FSRA) algorithm, applied at the higher part to enhance the spatial relationships among features. Experiments on the benchmark CompCars web-nature and Stanford Cars datasets demonstrate the effectiveness of CAM when integrated with several classical CNN architectures: CAM improves top-1 recognition accuracy by an average of 1.15% and top-5 accuracy by an average of 0.78%.
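The abstract does not specify the internal design of GFMA and FSRA, so the PyTorch sketch below only illustrates the stated two-stage placement: an attention block on an early (lower) feature map and another on a late (higher) feature map of a standard backbone. The module internals are assumptions, namely a squeeze-and-excitation style channel re-weighting standing in for GFMA and a non-local style spatial self-attention standing in for FSRA; the `CAMResNet` wrapper, the layer1/layer4 insertion points, and the class count are illustrative only.

```python
# Illustrative sketch only: GFMA/FSRA internals are assumed, not taken from the paper.
import torch
import torch.nn as nn
import torchvision.models as models


class GFMA(nn.Module):
    """Assumed global feature map attention: channel re-weighting of an early feature map."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # emphasize subtle channel-wise differences early in the network


class FSRA(nn.Module):
    """Assumed feature spatial relationship attention: self-attention over spatial positions."""
    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, 1)
        self.key = nn.Conv2d(channels, channels // 8, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).view(b, -1, h * w).permute(0, 2, 1)   # (b, hw, c')
        k = self.key(x).view(b, -1, h * w)                      # (b, c', hw)
        attn = torch.softmax(torch.bmm(q, k), dim=-1)           # (b, hw, hw)
        v = self.value(x).view(b, -1, h * w)                    # (b, c, hw)
        out = torch.bmm(v, attn.permute(0, 2, 1)).view(b, c, h, w)
        return x + self.gamma * out  # residual encoding of spatial relationships


class CAMResNet(nn.Module):
    """ResNet-50 with the assumed GFMA after an early stage and FSRA after a late stage."""
    def __init__(self, num_classes: int):
        super().__init__()
        r = models.resnet50(weights=None)
        self.stem = nn.Sequential(r.conv1, r.bn1, r.relu, r.maxpool)
        self.layer1, self.layer2 = r.layer1, r.layer2
        self.layer3, self.layer4 = r.layer3, r.layer4
        self.gfma = GFMA(256)    # lower part of the main network flow (assumed placement)
        self.fsra = FSRA(2048)   # higher part of the network (assumed placement)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(2048, num_classes)

    def forward(self, x):
        x = self.gfma(self.layer1(self.stem(x)))
        x = self.layer3(self.layer2(x))
        x = self.fsra(self.layer4(x))
        return self.fc(self.pool(x).flatten(1))


# Quick shape check (431 is used here only as an example class count).
model = CAMResNet(num_classes=431)
logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 431])
```

Because both blocks keep the input and output shapes of the feature maps they wrap, the same insertion pattern can in principle be applied to other classical CNN backbones, which matches the abstract's claim that CAM is integrated with several architectures.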



Updated: 2020-10-17