Multipattern Mining Using Pattern-Level Contrastive Learning and Multipattern Activation Map.
IEEE Transactions on Neural Networks and Learning Systems (IF 10.2), Pub Date: 2024-07-08, DOI: 10.1109/tnnls.2022.3218073
Xuefeng Liang, Zhihui Liang, Huiwen Shi, Xiaosong Zhang, Ying Zhou, Yifan Ma

Visual patterns are basic elements in images and represent the discernible regularity in the visual world. Thus, mining visual patterns is a fundamental task in computer vision. Most previous studies assume that only one visual pattern exists in a category and build a one-to-one mapping using the category label. In reality, however, many categories include multiple patterns, so the mapping from patterns to category is many-to-one. Without pattern-level information, few existing pattern mining methods can discover and distinguish the varied patterns in a category. To tackle this problem, we propose a novel framework, PaclMap, which learns medium-grained features to represent patterns. It comprises unsupervised pattern-level contrastive learning and a multipattern activation map. Their joint optimization encourages the network to mine patterns in a category that are both discriminative and frequent. Extensive experiments conducted on four benchmark datasets (Place-20, ImageNet Large Scale Visual Recognition Challenge (ILSVRC)-20, Visual Object Classes (VOC), and Travel) demonstrate that PaclMap outperforms six state-of-the-art methods, with average improvements of 2.9% in accuracy and 12.3% in frequency, respectively.

Updated: 2022-11-10