Multiple-source adaptation theory and algorithms

Zhang, Ningshan; Mohri, Mehryar; Hoffman, Judy

doi:10.1007/s10472-020-09716-0

Published: 05 November 2020

Volume 89, pages 237–270, (2021)

Annals of Mathematics and Artificial Intelligence

Abstract

We present a general theoretical and algorithmic analysis of the problem of multiple-source adaptation, a key learning problem in applications. We derive new normalized solutions with strong theoretical guarantees for the cross-entropy loss and other similar losses. We also provide new guarantees that hold in the case where the conditional probabilities for the source domains are distinct. We further present a novel analysis of the convergence properties of density estimation used in distribution-weighted combinations, and study their effects on the learning guarantees. Moreover, we give new algorithms for determining the distribution-weighted combination solution for the cross-entropy loss and other losses. We report the results of a series of experiments with real-world datasets. We find that our algorithm outperforms competing approaches by producing a single robust predictor that performs well on any target mixture distribution. Altogether, our theory, algorithms, and empirical results provide a full solution for the multiple-source adaptation problem with very practical benefits.
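For readers unfamiliar with the term, the following is a minimal sketch of the basic distribution-weighted combination as it is commonly written in the earlier multiple-source adaptation literature: h_z(x) = Σ_k z_k D_k(x) h_k(x) / Σ_j z_j D_j(x). It is not the normalized cross-entropy solution derived in this article, whose exact form is not given in the abstract. The Gaussian densities, the constant predictors, the fallback for zero density, and all function names are assumptions made purely for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian; stands in for an estimated source density D_k."""
    u = (x - mean) / std
    return math.exp(-0.5 * u * u) / (std * math.sqrt(2.0 * math.pi))


def distribution_weighted_combination(x, z, densities, predictors):
    """Return sum_k z_k D_k(x) h_k(x) / sum_j z_j D_j(x).

    z: mixture weights on the simplex (non-negative, summing to 1).
    densities: callables D_k(x), hypothetical density estimates, one per source.
    predictors: callables h_k(x), one pre-trained predictor per source.
    """
    weights = [z_k * d_k(x) for z_k, d_k in zip(z, densities)]
    total = sum(weights)
    if total == 0.0:
        # Assumption: if x has no mass under any source, average the predictors.
        return sum(h(x) for h in predictors) / len(predictors)
    return sum(w * h(x) for w, h in zip(weights, predictors)) / total


# Toy setup (illustrative only): two sources centred at -2 and +2,
# each with a constant predictor.
densities = [
    lambda x: gaussian_pdf(x, -2.0, 1.0),
    lambda x: gaussian_pdf(x, 2.0, 1.0),
]
predictors = [lambda x: 0.0, lambda x: 1.0]
z = [0.5, 0.5]

at_midpoint = distribution_weighted_combination(0.0, z, densities, predictors)
near_source_1 = distribution_weighted_combination(-2.0, z, densities, predictors)
near_source_2 = distribution_weighted_combination(2.0, z, densities, predictors)
```

In this toy setup the combined prediction follows whichever source is more likely to have generated the input: close to the first predictor near -2, close to the second near +2, and an even blend at the midpoint. This is the intuition behind a single predictor that remains reasonable across target mixtures.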

Acknowledgements

This work was partly funded by NSF CCF-1535987, IIS-1618662, and a Google Research Award.

Author information

Authors and Affiliations

New York University, New York, NY, 10012, USA
Ningshan Zhang
Google Research and Courant Institute of Mathematical Sciences, New York, NY, 10012, USA
Mehryar Mohri
School of Interactive Computing, Georgia Institute of Technology, Atlanta, GA, 30332, USA
Judy Hoffman

Corresponding author

Correspondence to Ningshan Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, N., Mohri, M. & Hoffman, J. Multiple-source adaptation theory and algorithms. Ann Math Artif Intell 89, 237–270 (2021). https://doi.org/10.1007/s10472-020-09716-0

Accepted: 12 October 2020
Published: 05 November 2020
Issue Date: March 2021
DOI: https://doi.org/10.1007/s10472-020-09716-0

Mathematics Subject Classification (2010)

68T05
