Baggage Image Retrieval with Attention-Based Network for Security Checks, International Journal of Pattern Recognition and Artificial Intelligence



International Journal of Pattern Recognition and Artificial Intelligence ( IF 0.9 ) Pub Date : 2021-05-14 , DOI: 10.1142/s0218001421550090
Gan Huang₁, Li Yang₁, Ding Zhang₂, Xiaofeng Wang₁, Yanfu Wang₁


Baggage image search is the task of finding the same piece of baggage across several cameras, which can improve the efficiency of security checks. Because existing state-of-the-art image retrieval methods require a significant amount of training data, we investigate the few-shot setting, in which only a few samples are available for each piece of baggage. We introduce a framework called AttentionNet, which uses weak region-wise annotations as attention cues to improve retrieval performance. Specifically, we integrate the semantic and recognition tasks in a shared convolutional neural network through multi-task learning. A semantic attention mechanism explicitly re-weights the feature map to emphasize the foreground. To prevent overfitting in few-shot training, we adopt a variant of the triplet loss to perform deep metric learning with an online hard triplet mining strategy. Once trained, AttentionNet identifies a probe image by computing the cosine distance between its embedding and those of the images in the gallery. Experiments show that our method achieves state-of-the-art accuracy in image search. For instance, we obtain an 82.9% Rank-1 score on the MVB dataset, substantially outperforming existing methods.
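The abstract names three mechanisms without giving their details: attention re-weighting of the feature map, a triplet loss with online hard mining, and cosine-distance matching scored by Rank-1. The following is a minimal, dependency-free sketch of how such pieces are commonly written, not the authors' implementation. The function names, the margin of 0.3, the use of cosine distance inside the loss, and the batch-hard mining rule (farthest positive, closest negative per anchor) are all assumptions made for illustration; the paper only states that it uses "a variant" of the triplet loss.

```python
import math


def cosine_distance(a, b):
    """Cosine distance 1 - cos(a, b) between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def reweight(feature_map, attention):
    """Scale each location's feature vector by its attention weight.

    feature_map: list of per-location feature vectors (hypothetical layout).
    attention:   one weight per location; near 1 for foreground, near 0 for background.
    """
    return [[w * v for v in feat] for feat, w in zip(feature_map, attention)]


def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """Assumed batch-hard variant: for every anchor in the batch, pair its
    farthest same-label sample with its closest different-label sample."""
    losses = []
    for i, anchor in enumerate(embeddings):
        pos = [cosine_distance(anchor, e)
               for j, e in enumerate(embeddings)
               if j != i and labels[j] == labels[i]]
        neg = [cosine_distance(anchor, e)
               for j, e in enumerate(embeddings)
               if labels[j] != labels[i]]
        if not pos or not neg:
            continue  # this anchor cannot form a valid triplet in this batch
        losses.append(max(0.0, max(pos) - min(neg) + margin))
    return sum(losses) / len(losses) if losses else 0.0


def rank1_accuracy(probes, probe_labels, gallery, gallery_labels):
    """Fraction of probes whose nearest gallery embedding has the same identity."""
    hits = 0
    for probe, label in zip(probes, probe_labels):
        nearest = min(range(len(gallery)),
                      key=lambda k: cosine_distance(probe, gallery[k]))
        if gallery_labels[nearest] == label:
            hits += 1
    return hits / len(probes)
```

In this sketch, calling `rank1_accuracy` with probe and gallery embeddings produced by a trained network would give the kind of Rank-1 score the abstract reports. Mining the hardest triplets inside each batch is one common way to get useful gradients from only a few samples per identity, which may be why a hard-mining strategy suits the few-shot setting described here.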


Updated: 2021-05-14
