Multiple-source adaptation theory and algorithms

Annals of Mathematics and Artificial Intelligence

Abstract

We present a general theoretical and algorithmic analysis of the problem of multiple-source adaptation, a key learning problem in applications. We derive new normalized solutions with strong theoretical guarantees for the cross-entropy loss and other similar losses. We also provide new guarantees that hold in the case where the conditional probabilities for the source domains are distinct. We further present a novel analysis of the convergence properties of density estimation used in distribution-weighted combinations, and study their effects on the learning guarantees. Moreover, we give new algorithms for determining the distribution-weighted combination solution for the cross-entropy loss and other losses. We report the results of a series of experiments with real-world datasets. We find that our algorithm outperforms competing approaches by producing a single robust predictor that performs well on any target mixture distribution. Altogether, our theory, algorithms, and empirical results provide a full solution for the multiple-source adaptation problem with very practical benefits.
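As a minimal illustration of what a distribution-weighted combination computes, the sketch below uses the basic form known from earlier work on multiple-source adaptation, h_z(x) = Σ_k z_k D_k(x) h_k(x) / Σ_j z_j D_j(x). It is not the normalized cross-entropy solution derived in this article. The Gaussian source densities, the constant base predictors, the fallback for points with zero mass, and all function names are hypothetical choices made only for this example.

```python
import math
from typing import Callable, Sequence

Density = Callable[[float], float]
Predictor = Callable[[float], float]


def gaussian_density(mean: float, std: float = 1.0) -> Density:
    """Pdf of N(mean, std^2); a hypothetical stand-in for a source density D_k."""
    def pdf(x: float) -> float:
        u = (x - mean) / std
        return math.exp(-0.5 * u * u) / (std * math.sqrt(2.0 * math.pi))
    return pdf


def distribution_weighted_combination(
    z: Sequence[float],
    densities: Sequence[Density],
    predictors: Sequence[Predictor],
) -> Predictor:
    """Build h_z(x) = sum_k z_k D_k(x) h_k(x) / sum_j z_j D_j(x)."""
    def h_z(x: float) -> float:
        # Per-source weight: mixture weight times source density at x.
        weights = [z_k * d_k(x) for z_k, d_k in zip(z, densities)]
        total = sum(weights)
        if total == 0.0:
            # Assumption for this sketch: where no source has mass,
            # fall back to plain z-weighting of the base predictors.
            weights, total = list(z), sum(z)
        return sum(w * h_k(x) for w, h_k in zip(weights, predictors)) / total
    return h_z


# Toy setup (illustrative only): two sources centered at -1 and +1,
# with base predictors that always output 0 and 1 respectively.
densities = [gaussian_density(-1.0), gaussian_density(1.0)]
predictors = [lambda x: 0.0, lambda x: 1.0]
h = distribution_weighted_combination([0.5, 0.5], densities, predictors)

print(h(0.0))  # midpoint between the two sources
print(h(1.0))  # at the mode of the second source
```

With equal mixture weights, the combined prediction at x = 0 is the midpoint of the two base predictions, and it moves toward the second predictor as x approaches that source's mode.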

Acknowledgements

This work was partly funded by NSF CCF-1535987, IIS-1618662, and a Google Research Award.

Author information

Corresponding author

Correspondence to Ningshan Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, N., Mohri, M. & Hoffman, J. Multiple-source adaptation theory and algorithms. Ann Math Artif Intell 89, 237–270 (2021). https://doi.org/10.1007/s10472-020-09716-0

  • DOI: https://doi.org/10.1007/s10472-020-09716-0
