Source-Guided Target Feature Reconstruction for Cross-Domain Classification and Detection
IEEE Transactions on Image Processing ( IF 10.6 ) Pub Date : 2024-04-09 , DOI: 10.1109/tip.2024.3384766
Yifan Jiao, Hantao Yao, Bing-Kun Bao, Changsheng Xu

Existing cross-domain classification and detection methods usually apply a consistency constraint between a target sample and its self-augmentation for unsupervised learning, without considering the essential source knowledge. In this paper, we propose a Source-guided Target Feature Reconstruction (STFR) module for cross-domain visual tasks, which applies source visual words to reconstruct the target features. Since the reconstructed target features contain source knowledge, they can be treated as a bridge connecting the source and target domains; using them for consistency learning therefore enhances the target representation and reduces domain bias. Technically, source visual words are selected and updated according to the source feature distribution, and are applied to reconstruct a given target feature via a weighted combination strategy. After that, consistency constraints are built between the reconstructed and original target features for domain alignment. Furthermore, STFR is theoretically connected with the optimal transport algorithm, which explains the rationality of the proposed module. Extensive experiments on nine benchmarks and two cross-domain visual tasks prove the effectiveness of the proposed STFR module, e.g., 1) cross-domain image classification: average accuracies of 91.0%, 73.9%, and 87.4% on Office-31, Office-Home, and VisDA-2017, respectively; 2) cross-domain object detection: mAP of 44.50% on Cityscapes $\rightarrow$ Foggy Cityscapes, car AP of 78.10% on Cityscapes $\rightarrow$ KITTI, and MR$^{-2}$ of 8.63%, 12.27%, 22.10%, and 40.58% on COCOPersons $\rightarrow$ Caltech, CityPersons $\rightarrow$ Caltech, COCOPersons $\rightarrow$ CityPersons, and Caltech $\rightarrow$ CityPersons, respectively.
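To make the reconstruction step concrete, the following is a minimal sketch of the idea the abstract describes, not the authors' implementation: each target feature is rebuilt as a weighted combination of source visual words, and a consistency loss is placed between the reconstructed and original features. The cosine-similarity weighting, the softmax temperature, and the MSE form of the consistency term are illustrative assumptions; the paper's actual visual-word selection/update rule and loss may differ.

    # Hypothetical sketch of source-guided target feature reconstruction.
    import torch
    import torch.nn.functional as F

    def reconstruct_target_features(target_feats, source_words, temperature=0.1):
        """Reconstruct each target feature as a weighted combination of
        source visual words (weights = softmax over cosine similarity)."""
        t = F.normalize(target_feats, dim=-1)   # (B, D) unit-norm target features
        w = F.normalize(source_words, dim=-1)   # (K, D) unit-norm visual words
        sim = t @ w.t() / temperature           # (B, K) similarity logits
        weights = F.softmax(sim, dim=-1)        # (B, K) combination weights
        return weights @ source_words           # (B, D) reconstructed features

    def consistency_loss(target_feats, source_words):
        """Consistency constraint between reconstructed and original features
        (MSE here is an assumption; the paper may use a different term)."""
        recon = reconstruct_target_features(target_feats, source_words)
        return F.mse_loss(recon, target_feats)

    # Toy usage: 8 target features, 16 source visual words, 64 dimensions.
    target = torch.randn(8, 64)
    words = torch.randn(16, 64)
    print(consistency_loss(target, words).item())

Because the reconstruction lies in the span of the source visual words, pulling the original target feature toward it injects source knowledge into the target representation, which is the bridging effect the abstract attributes to STFR.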

Updated: 2024-04-09