Collaborative representation with curriculum classifier boosting for unsupervised domain adaptation

https://doi.org/10.1016/j.patcog.2020.107802

Highlights

  • We present a novel unsupervised domain adaptation solution based on collaborative representation, which seeks out the close samples between domains and uses them to assist further predictions. Extensive experiments validate the effectiveness of our method. More generally, it solves domain adaptation problems without explicitly reducing the domain discrepancy, which distinguishes it from previous methods.

  • Curriculum sample choosing is proposed to select the close samples between domains based on the reconstruction residual. These samples are then added to the training set for subsequent predictions.

  • We propose distance-aware sparsity regularization to learn more reasonable representations, so that samples closer to the query sample receive larger weights.

Abstract

Domain adaptation aims at leveraging the rich knowledge in a source domain to build an accurate classifier in a different but related target domain. Most prior methods attempt to align features or reduce domain discrepancy by means of statistical properties, yet they ignore the differences among samples. In this paper, we put forward a novel solution based on collaborative representation for classifier adaptation. Similar to instance re-weighting, we aim to learn an adaptive classifier through multi-stage inference and instance rearranging. Specifically, a curriculum-learning-based sample selection scheme is proposed, and the chosen samples are iteratively integrated into the training set. To cope with the distribution mismatch between the two domains, we propose distance-aware sparsity regularization to learn more flexible representations. Extensive experiments verify that the proposed method is comparable or superior to state-of-the-art methods.

Introduction

Traditional machine learning deals with the scenario in which training and testing data follow the same distribution. Its performance degrades, however, when there is a clear gap between the training and testing data. Recently, the literature has witnessed increasing interest in developing domain adaptation (DA) methods for cross-domain recognition. The basic assumption is that the training (source domain) and testing (target domain) data are different but related, and the task is to learn cross-domain knowledge and make accurate predictions in the target domain. Domain adaptation has shown remarkable performance in computer vision [1], speech recognition [2] and many other areas.

Previous works seek an appropriate feature space that reduces domain discrepancy. Ben-David et al. analyzed domain adaptation theoretically and pointed out that one should reduce both the source error and the distance between domains [3]. A common approach is to minimize a statistic based on moments, either the first (maximum mean discrepancy, MMD [4]) or the second (CORrelation ALignment, CORAL [5]). Pan et al. mapped the data to a Hilbert space with the goal of minimizing MMD and proposed Transfer Component Analysis (TCA), whose major contribution is a closed-form solution for learning transfer features [6]. Extending TCA, Long et al. further adapted within-class features by assigning pseudo labels to target samples [7]. Sun et al. aligned the covariances of the two domains in CORAL [5]. Inspired by the rapid development of deep learning, domain adaptation with deep architectures has also come a long way. A number of deep methods simply align features in the intermediate layers, much like shallow methods: building on the classical AlexNet, Tzeng et al. added an MMD layer before the softmax and proposed Deep Domain Confusion [8], and Long et al. showed that adapting more layers improves the results [9]. Another interesting idea is adversarial training, which trains a feature extractor and a domain discriminator simultaneously: the feature extractor generates features from the two domains to fool the discriminator, while the domain discriminator tries to identify which domain each feature comes from [10].
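As a concrete, non-authoritative illustration of this moment-matching idea, the following minimal Python sketch computes the linear-kernel MMD statistic and a CORAL-style transform as described above. The function names and the convention that rows of `Xs`/`Xt` are samples are our own assumptions, not code from any of the cited papers.

```python
import numpy as np

def linear_mmd2(Xs, Xt):
    """Squared MMD with a linear kernel: the distance between
    the source and target feature means (first moment)."""
    return float(np.sum((Xs.mean(axis=0) - Xt.mean(axis=0)) ** 2))

def coral_align(Xs, Xt, eps=1e-5):
    """CORAL-style alignment: whiten the source features, then
    re-color them with the target covariance (second moment)."""
    def mat_pow(C, p):
        # Symmetric matrix power via eigendecomposition.
        w, V = np.linalg.eigh(C)
        return (V * np.maximum(w, eps) ** p) @ V.T
    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])
    return Xs @ mat_pow(Cs, -0.5) @ mat_pow(Ct, 0.5)
```

Minimizing `linear_mmd2`, or applying `coral_align` once as a preprocessing step, is exactly the kind of explicit discrepancy reduction that the method proposed in this paper avoids.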

Although feature-based methods have been widely used, they still need a classifier to make predictions. This naturally leads to the idea of classifier adaptation [11]: since a classifier trained on the source domain is biased when applied to the target domain due to the distribution mismatch, the goal is to learn a generalized classifier that can be applied to related domains under domain shift. Collaborative representation (CR) based classification [12] is a variant of the nearest-subspace classifier that shows attractive performance in many areas, and adaptation with CR has also attracted the attention of researchers. Tang et al. exploited local-neighbor geometrical information to learn adaptive representations of target samples [13]. Zhang et al. learned a common dictionary for the two domains and presented a kernelized CR method for domain adaptation [14].
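For readers unfamiliar with CR, here is a minimal sketch (ours, not the paper's adapted variant) of the standard CRC decision rule [12]: code the query over all training samples with an l2-regularized least-squares fit, then assign the class whose samples yield the smallest reconstruction residual. The residual normalization by the coefficient norm used in some CRC variants is omitted for brevity.

```python
import numpy as np

def crc_classify(X, y, q, lam=0.1):
    """Collaborative representation based classification (sketch).
    X: (d, n) matrix whose columns are training samples,
    y: (n,) labels, q: (d,) query vector, lam: l2 regularization."""
    n = X.shape[1]
    # Ridge coding over ALL samples: alpha = (X^T X + lam*I)^{-1} X^T q
    alpha = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ q)
    best_c, best_r = None, np.inf
    for c in np.unique(y):
        mask = (y == c)
        # Residual when q is rebuilt from class-c samples only
        r = np.linalg.norm(q - X[:, mask] @ alpha[mask])
        if r < best_r:
            best_c, best_r = c, r
    return best_c, best_r
```

The residual returned alongside the label is the quantity our curriculum scheme later uses to judge how easy a target sample is.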

Existing classifier adaptation methods try to learn the decision boundary and align features simultaneously. Like feature-extraction-based DA methods, they usually minimize the statistical discrepancy between the two domains and are therefore sensitive to outliers and class weight bias [15]. In this paper, we investigate the similarity among samples in the two domains and apply different measures to the different domains. The key underlying idea of this paper is two-fold. (1) Since there is a distribution gap between the two domains, the cluster assumption and the manifold assumption do not always hold across domains. We further assume that this violation manifests only on a subset of samples, so the importance of source samples varies from sample to sample. Consequently, we propose distance-aware sparsity regularization for source samples, which helps the model learn flexible representations under domain shift. (2) Existing methods make inferences on target samples independently, so only the target-to-source correlation is modeled during prediction and the intrinsic discrimination power within the target domain (target-to-target) is ignored. Inspired by the success of curriculum learning, we propose curriculum sample choosing to rank and select the samples with the most reliable labels; these samples are then integrated with the source samples to assist predictions for the hard samples. Our contributions can be summarized as follows.

  • We present a novel unsupervised domain adaptation solution based on collaborative representation, which seeks out the close samples between domains and uses them to assist further predictions. Extensive experiments validate the effectiveness of our method. More generally, it solves domain adaptation problems without explicitly reducing the domain discrepancy, which distinguishes it from previous methods.

  • To recognize the domain discrepancy and adjust decision boundaries accordingly, we propose distance-aware sparsity regularization. Compared to a uniform sparsity penalty, it encourages samples close to the target domain to contribute more to the learned representation than those far away.

  • To better exploit the discrimination power within the target domain itself, we propose curriculum sample choosing, which employs a multi-stage inference framework. The underlying philosophy is that samples differ in how difficult they are to predict, and easy samples, along with their assigned labels, can assist subsequent predictions for the hard samples (a schematic sketch follows this list).
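To make the two contributions above concrete, the sketch below wires them together in schematic form. It is an illustration under assumptions, not the paper's implementation: the exact penalty and selection rule are given in Section 3, and `predict_fn`, `sigma`, `keep` and `n_rounds` are hypothetical names and parameters.

```python
import numpy as np

def distance_aware_weights(X_src, q, sigma=1.0):
    """Illustrative distance-aware sparsity weights: source samples
    (rows of X_src) farther from the query q get a heavier penalty,
    so closer samples tend to receive larger representation weights."""
    d = np.linalg.norm(X_src - q, axis=1)
    return np.exp(d / sigma)

def curriculum_predict(X_src, y_src, X_tgt, predict_fn, n_rounds=3, keep=0.3):
    """Schematic multi-stage inference: each round labels the 'easy'
    fraction of remaining target samples (smallest residuals) and folds
    them into the training set before predicting the harder ones."""
    X_tr, y_tr = X_src.copy(), y_src.copy()
    remaining = np.arange(len(X_tgt))
    labels = np.full(len(X_tgt), -1)
    for _ in range(n_rounds):
        if remaining.size == 0:
            break
        # predict_fn returns (predicted labels, reconstruction residuals)
        preds, residuals = predict_fn(X_tr, y_tr, X_tgt[remaining])
        easy = np.argsort(residuals)[:max(1, int(keep * remaining.size))]
        labels[remaining[easy]] = preds[easy]
        X_tr = np.vstack([X_tr, X_tgt[remaining[easy]]])
        y_tr = np.concatenate([y_tr, preds[easy]])
        remaining = np.delete(remaining, easy)
    if remaining.size > 0:  # final pass for the hardest samples
        labels[remaining], _ = predict_fn(X_tr, y_tr, X_tgt[remaining])
    return labels
```

A `predict_fn` built on the CRC sketch above (transposing to its column convention and calling `crc_classify` once per query) suffices to run this loop end to end.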

The rest of the paper is organized as follows. Section 2 details the domain adaptation problem and related works, then introduces collaborative representation based classification. Our method is presented in Section 3, and the experimental evaluation in Section 4. Finally, we summarize the paper and discuss future work in Section 5.

Section snippets

Related works

In this section, we first introduce the relationship between transfer learning and domain adaptation and then elaborate on collaborative representation based classification.

Methodology

In this section, we describe the proposed method in detail. We first define the domain adaptation problem with the necessary notation, then present the two components of our method.

Experiments

In this section, we first introduce three commonly used domain adaptation datasets: ImageCLEF, Office-Home and Office31, then report the results of the proposed method and other approaches. A parameter sensitivity analysis and an ablation study are given to show the robustness and effectiveness of our method.

Conclusion

In this paper, we propose a novel model for unsupervised domain adaptation. Unlike existing methods, we do not attempt to reduce the domain discrepancy via statistical properties or adversarial training. Curriculum sample choosing is proposed to select the close samples based on the representation residual, and K-fold training is proposed to integrate them into the training set deliberately. To learn flexible representations with a dynamic training-set size, we present distance-aware sparsity regularization.

Declaration of Competing Interest

The authors declare that they have no financial or personal relationships with other people or organizations that could inappropriately influence this work, and no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, this manuscript.

Acknowledgement

This research was funded by the National Natural Science Foundation of China, grant number 62076204.

References (37)

  • M. Long et al., Transferable representation learning with deep adaptation networks, IEEE Trans. Pattern Anal. Mach. Intell. (2018).
  • Z. Pei et al., Multi-adversarial domain adaptation, in: National Conference on Artificial Intelligence, 2018.
  • L. Zhang, Transfer adaptation learning: a decade survey, arXiv:1903.04687.
  • L. Zhang et al., Sparse representation or collaborative representation: which helps face recognition?, in: International Conference on Computer Vision, 2011.
  • S. Tang et al., Domain adaptation of image classification exploiting target adaptive collaborative local-neighbor representation, in: Active Media Technology, 2016.
  • G. Zhang et al., Domain adaptive collaborative representation based classification, Multimed. Tools Appl. (2019).
  • H. Yan et al., Mind the class weight bias: weighted maximum mean discrepancy for unsupervised domain adaptation, in: Computer Vision and Pattern Recognition, 2017.
  • S.J. Pan et al., A survey on transfer learning, IEEE Trans. Knowl. Data Eng. (2010).

Chao Han was born in 1995. He is currently pursuing the Ph.D. degree in the School of Electronics and Information, Northwestern Polytechnical University, Xi’an, China. His research interests include transfer learning and machine learning.

Deyun Zhou was born in 1964. He received the B.Eng. and Ph.D. degrees from Northwestern Polytechnical University. Since 1991, he has been a teacher at Northwestern Polytechnical University. He was promoted to associate professor and full professor in 1993 and 1997, respectively. His research interests are broadly in the area of integrated control and intelligent systems, with applications to avionics systems, information fusion and mission planning.

Yu Xie was born in 1993. He is currently pursuing the Ph.D. degree in pattern recognition and intelligent systems at the Key Laboratory of Intelligent Perception and Image Understanding of the Ministry of Education of China, Xidian University, Xi’an, China. His research interests include deep learning and network structure analytics.

Maoguo Gong was born in 1979. He received the B.Eng. and Ph.D. degrees from Xidian University. Since 2006, he has been a teacher at Xidian University. He was promoted to associate professor and full professor in 2008 and 2010, respectively, both with exceptive admission. His research interests are broadly in the area of computational intelligence, with applications to optimization, learning, data mining and image understanding. He has published over one hundred papers in journals and conferences, and holds over twenty granted patents as the first inventor. He is leading or has completed over ten projects as Principal Investigator, funded by the National Natural Science Foundation of China, the National Key Research and Development Program of China, and others. He was the recipient of the prestigious National Program for Support of Top-notch Young Professionals (selected by the Central Organization Department of China), the Excellent Young Scientist Foundation (selected by the National Natural Science Foundation of China), the New Century Excellent Talent in University (selected by the Ministry of Education of China), the Young Teacher Award of the Fok Ying Tung Education Foundation, and the National Natural Science Award of China. He is an Executive Committee Member of the Chinese Association for Artificial Intelligence, a Senior Member of IEEE and the Chinese Computer Federation, and an Associate Editor or Editorial Board Member for over five journals including the IEEE Transactions on Evolutionary Computation and the IEEE Transactions on Neural Networks and Learning Systems.

Yu Lei received the B.S. degree in electronic engineering and the Ph.D. degree in control science and engineering from Xidian University, Xi’an, China, in 2009 and 2015, respectively. He is currently a Lecturer with Northwestern Polytechnical University. His current research interests include computational intelligence and image understanding.

Jiao Shi received the B.S. degree in electronic engineering and the Ph.D. degree in control science and engineering from Xidian University, Xi’an, China, in 2009 and 2015, respectively. She is currently a Lecturer with Northwestern Polytechnical University. Her current research interests include computational intelligence and image understanding.
