

Deep transfer operator learning for partial differential equations under conditional shift

A preprint version of the article is available at arXiv.

Abstract

Transfer learning enables the transfer of knowledge gained while learning to perform one task (source) to a related but different task (target), hence addressing the expense of data acquisition and labelling, potential computational power limitations and dataset distribution mismatches. We propose a new transfer learning framework for task-specific learning (functional regression in partial differential equations) under conditional shift based on the deep operator network (DeepONet). Task-specific operator learning is accomplished by fine-tuning task-specific layers of the target DeepONet using a hybrid loss function that allows for the matching of individual target samples while also preserving the global properties of the conditional distribution of the target data. Inspired by conditional embedding operator theory, we minimize the statistical distance between labelled target data and the surrogate prediction on unlabelled target data by embedding conditional distributions onto a reproducing kernel Hilbert space. We demonstrate the advantages of our approach for various transfer learning scenarios involving nonlinear partial differential equations under diverse conditions due to shifts in the geometric domain and model dynamics. Our transfer learning framework enables fast and efficient learning of heterogeneous tasks despite considerable differences between the source and target domains.
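As described above, the target network is trained with a hybrid objective: a pointwise regression loss on the few labelled target samples plus a conditional embedding operator discrepancy (CEOD) term that compares conditional distributions after embedding them in a reproducing kernel Hilbert space. The sketch below is a minimal illustration of such an objective, not the authors' released implementation (see ref. 40); the Gaussian kernel, the regularization lam, the weight alpha, the helper names rbf_gram, ceod_loss and hybrid_loss, and the schematic model(x) call (a DeepONet actually takes a branch/trunk input pair) are all assumptions made for illustration.

    import torch

    def rbf_gram(a, b, sigma=1.0):
        # Pairwise Gaussian (RBF) kernel matrix between the rows of a and b.
        return torch.exp(-torch.cdist(a, b) ** 2 / (2.0 * sigma ** 2))

    def ceod_loss(x_l, y_l, x_u, y_hat_u, lam=1e-3, sigma=1.0):
        # Squared Hilbert-Schmidt distance between the empirical conditional
        # embedding operators estimated from labelled target pairs (x_l, y_l)
        # and from surrogate predictions (x_u, y_hat_u) on unlabelled inputs.
        n_l, n_u = x_l.shape[0], x_u.shape[0]
        Kx_ll, Kx_uu = rbf_gram(x_l, x_l, sigma), rbf_gram(x_u, x_u, sigma)
        Kx_ul = rbf_gram(x_u, x_l, sigma)
        Ky_ll, Ky_uu = rbf_gram(y_l, y_l, sigma), rbf_gram(y_hat_u, y_hat_u, sigma)
        Ky_lu = rbf_gram(y_l, y_hat_u, sigma)
        Rl = Kx_ll + lam * torch.eye(n_l)   # regularized Gram matrices on the inputs
        Ru = Kx_uu + lam * torch.eye(n_u)
        # Each inner product <C_a, C_b>_HS expands into a trace of Gram-matrix products.
        t_ll = torch.trace(torch.linalg.solve(Rl, Ky_ll) @ torch.linalg.solve(Rl, Kx_ll))
        t_uu = torch.trace(torch.linalg.solve(Ru, Ky_uu) @ torch.linalg.solve(Ru, Kx_uu))
        t_lu = torch.trace(torch.linalg.solve(Rl, Ky_lu) @ torch.linalg.solve(Ru, Kx_ul))
        return t_ll + t_uu - 2.0 * t_lu

    def hybrid_loss(model, x_l, y_l, x_u, alpha=1.0):
        # Regression loss on the few labelled target samples plus the CEOD term
        # that matches the conditional distribution on unlabelled target inputs.
        y_pred_l, y_hat_u = model(x_l), model(x_u)
        mse = torch.mean((y_pred_l - y_l) ** 2)
        return mse + alpha * ceod_loss(x_l, y_l, x_u, y_hat_u)

In the proposed framework, only the task-specific (final) layers of the target DeepONet would be updated with such an objective, while the remaining layers retain the weights learned on the source task.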


Fig. 1: The proposed transfer learning framework to approximate PDE solutions using DeepONet.
Fig. 2: A schematic representation of the operator learning benchmarks and transfer learning (TL) scenarios under consideration in this work.

Data availability

All of the datasets in the study were generated directly from the code in ref. 40.

Code availability

The code used in this study is available in a public GitHub repository (ref. 40).

References

  1. Chen, R. T., Rubanova, Y., Bettencourt, J. & Duvenaud, D. K. Neural ordinary differential equations. In Advances in Neural Information Processing Systems (eds Garnett, R. et al.) 31 (NeurIPS, 2018).

  2. Raissi, M., Perdikaris, P. & Karniadakis, G. E. Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378, 686–707 (2019).

  3. Li, Z. et al. Fourier neural operator for parametric partial differential equations. In Proc. International Conference on Learning Representations (ICLR, 2021).

  4. Lu, L., Jin, P., Pang, G., Zhang, Z. & Karniadakis, G. E. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nat. Mach. Intell. 3, 218–229 (2021).

  5. Chatterjee, T., Chakraborty, S., Goswami, S., Adhikari, S. & Friswell, M. I. Robust topological designs for extreme metamaterial micro-structures. Sci. Rep. 11, 1–14 (2021).

  6. Olivier, A., Shields, M. D. & Graham-Brady, L. Bayesian neural networks for uncertainty quantification in data-driven materials modeling. Comput. Methods Appl. Mech. Eng. 386, 114079 (2021).

  7. Niu, S., Liu, Y., Wang, J. & Song, H. A decade survey of transfer learning (2010–2020). IEEE Trans. Artif. Intell. 1, 151–166 (2020).

  8. Gao, Y. & Mosalam, K. M. Deep transfer learning for image-based structural damage recognition. Comput. Aided Civ. Inf. Eng. 33, 748–768 (2018).

  9. Yang, X., Zhang, Y., Lv, W. & Wang, D. Image recognition of wind turbine blade damage based on a deep learning model with transfer learning and an ensemble learning classifier. Renew. Energy 163, 386–397 (2021).

  10. Ruder, S., Peters, M. E., Swayamdipta, S. & Wolf, T. Transfer learning in natural language processing. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorials (eds Strube, M. & Sarkar, A.) 15–18 (Association for Computational Linguistics, 2019).

  11. Zhang, S. et al. Combining cross-modal knowledge transfer and semi-supervised learning for speech emotion recognition. Knowl. Based Syst. 229, 107340 (2021).

  12. Zhuang, F. et al. A comprehensive survey on transfer learning. Proc. IEEE 109, 43–76 (2020).

  13. Certo, S. T., Busenbark, J. R., Woo, H.-S. & Semadeni, M. Sample selection bias and Heckman models in strategic management research. Strateg. Manag. J. 37, 2639–2657 (2016).

  14. Chen, X., Wang, S., Wang, J. & Long, M. Representation subspace distance for domain adaptation regression. In Proc. 38th International Conference on Machine Learning 1749–1759 (PMLR, 2021).

  15. Pardoe, D. & Stone, P. Boosting for regression transfer. In Proc. 27th International Conference on Machine Learning 863–870 (PMLR, 2010).

  16. Wang, X., Huang, T.-K. & Schneider, J. Active transfer learning under model shift. In Proc. 31st International Conference on Machine Learning 1305–1313 (PMLR, 2014).

  17. Du, S. S., Koushik, J., Singh, A. & Póczos, B. Hypothesis transfer learning via transformation functions. In Advances in Neural Information Processing Systems 30 (NeurIPS, 2017).

  18. Zhang, K., Schöolkopf, B., Muandet, K. & Wang, Z. Domain adaptation under target and conditional shift. In Proc. International Conference on Machine Learning 819–827 (PMLR, 2013).

  19. Chen, G., Li, Y. & Liu, X. Transfer learning under conditional shift based on fuzzy residual. IEEE Trans. Cybernetics 52, 960–970 (2020).

  20. Liu, X., Li, Y., Meng, Q. & Chen, G. Deep transfer learning for conditional shift in regression. Knowl. Based Syst. 227, 107216 (2021).

  21. Zhang, X. & Garikipati, K. Machine learning materials physics: multi-resolution neural networks learn the free energy and nonlinear elastic response of evolving microstructures. Comput. Methods Appl. Mech. Eng. 372, 113362 (2020).

  22. Goswami, S., Anitescu, C., Chakraborty, S. & Rabczuk, T. Transfer learning enhanced physics informed neural network for phase-field modeling of fracture. Theor. Appl. Fracture Mech. 106, 102447 (2020).

  23. Desai, S., Mattheakis, M., Joy, H., Protopapas, P. & Roberts, S. One-shot transfer learning of physics-informed neural networks. In Proc. 2nd AI4Science Workshop at the 39th International Conference on Machine Learning (ICML) (ICML, 2022).

  24. Chen, X. et al. Transfer learning for deep neural network-based partial differential equations solving. Adv. Aerodyn. 3, 1–14 (2021).

  25. Penwarden, M., Zhe, S., Narayan, A. & Kirby, R. M. Physics-informed neural networks (PINNs) for parameterized PDEs: a metalearning approach. Preprint at https://arxiv.org/abs/2110.13361 (2021).

  26. Wang, H., Planas, R., Chandramowlishwaran, A. & Bostanabad, R. Mosaic flows: a transferable deep learning framework for solving PDEs on unseen domains. Comput. Methods Appl. Mech. Eng. 389, 114424 (2022).

  27. Neyshabur, B., Sedghi, H. & Zhang, C. What is being transferred in transfer learning? In Advances in Neural Information Processing Systems 33, 512–523 (NeurIPS, 2020).

  28. Tripura, T. & Chakraborty, S. Wavelet neural operator: a neural operator for parametric partial differential equations. Preprint at https://arxiv.org/abs/2205.02191 (2022).

  29. Li, Z. et al. Neural operator: graph kernel network for partial differential equations. In Proc. ICLR 2020 Workshop DeepDiffEq Program Chairs (ICLR, 2020).

  30. Lu, L. et al. A comprehensive and fair comparison of two neural operators (with practical extensions) based on FAIR data. Comput. Methods Appl. Mech. Eng. 393, 114778 (2022).

  31. Ahmed, N., Rafiq, M., Rehman, M., Iqbal, M. & Ali, M. Numerical modeling of three dimensional Brusselator reaction diffusion system. AIP Adv. 9, 015205 (2019).

  32. Lee, Y. K. & Park, B. U. Estimation of Kullback–Leibler divergence by local likelihood. Ann. Inst. Stat. Math. 58, 327–340 (2006).

  33. Yu, S., Shaker, A., Alesiani, F. & Principe, J. C. Measuring the discrepancy between conditional distributions: methods, properties and applications. In Proc. 29th International Joint Conference on Artificial Intelligence 2777–2784 (IJCAI, 2020).

  34. Muandet, K. et al. Kernel mean embedding of distributions: a review and beyond. Found. Trends Mach. Learn. 10, 1–141 (2017).

  35. Gretton, A., Borgwardt, K. M., Rasch, M. J., Schölkopf, B. & Smola, A. A kernel two-sample test. J. Mach. Learn. Res. 13, 723–773 (2012).

  36. Song, L., Fukumizu, K. & Gretton, A. Kernel embeddings of conditional distributions: a unified kernel framework for nonparametric inference in graphical models. IEEE Signal Process. Mag. 30, 98–111 (2013).

  37. Song, L., Huang, J., Smola, A. & Fukumizu, K. Hilbert space embeddings of conditional distributions with applications to dynamical systems. In Proc. 26th Annual International Conference on Machine Learning 961–968 (ACM, 2009).

  38. Saxe, A. M. et al. On the information bottleneck theory of deep learning. J. Stat. Mech. 2019, 124020 (2019).

  39. Yosinski, J., Clune, J., Bengio, Y. & Lipson, H. How transferable are features in deep neural networks? In Advances in Neural Information Processing Systems 27 (NeurIPS, 2014).

  40. Kontolati, K., Goswami, S., Shields, M. D. & Karniadakis, G. E. TL-DeepONet: Codes For Deep Transfer Operator Learning for Partial Differential Equations Under Conditional Shift (Zenodo, 2022); https://doi.org/10.5281/zenodo.7195684

Download references

Acknowledgements

For K.K. and M.D.S., this material is based upon work supported by the US Department of Energy, Office of Science, Office of Advanced Scientific Computing Research under award no. DE-SC0020428. S.G. and G.E.K. would like to acknowledge support by the DOE project PhILMs (award no. DE-SC0019453) and the OSD/AFOSR MURI grant FA9550-20-1-0358.

Author information

Authors and Affiliations

Authors

Contributions

S.G. and K.K. were responsible for the data curation, formal analysis, methodology, software, validation and visualization. M.D.S. and G.E.K. acquired funding, and were responsible for the administration, resources and supervision of the project. S.G., K.K. and M.D.S. performed the investigations. All authors conceptualized the project, and wrote, reviewed and edited the manuscript.

Corresponding authors

Correspondence to Somdatta Goswami, Katiana Kontolati, Michael D. Shields or George Em Karniadakis.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Marios Mattheakis, Ethan Pickering and Shaan Desai for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Table 1 Training cost in seconds (s) for all Darcy flow problems (TL1–TL4)
Extended Data Table 2 Relative $L_2$ error (%) and training cost in seconds (s) for training the target domain without $\mathcal{L}_{\mathrm{CEOD}}$ for the Darcy problem on a triangular domain with a notch (TL3)
Extended Data Table 3 Uncertainty propagation and moment estimation of the Brusselator response, comparing standard MCS with a DeepONet trained on the target domain and with TL-DeepONet

Extended Data Fig. 1 Representative results for the Darcy model (TL1–TL4).

The network takes as input the spatially varying conductivity field and approximates the hydraulic head over the domain. Error fields represent the point-wise error computed as $\left|\frac{f(\mathbf{x}^T)-\mathbf{y}^T}{\mathbf{y}^T}\right|$, where $\mathbf{y}^T$ and $f(\mathbf{x}^T)$ are the reference response and the model prediction, respectively.
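For reference, such an error field can be computed in a few lines; the snippet below is a hypothetical illustration (the array names and the small eps guard against near-zero reference values are assumptions, not part of the published code).

    import numpy as np

    def relative_error_field(y_ref, y_pred, eps=1e-12):
        # Point-wise relative error |f(x^T) - y^T| / |y^T| over the output grid.
        return np.abs(y_pred - y_ref) / (np.abs(y_ref) + eps)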

Extended Data Fig. 2 Representative results for the elasticity model (TL5 and TL6).

The DeepONet takes as input the loading condition applied on the right edge of the plate (left frames) and outputs the displacement field (middle frames). Error fields, shown in the right frames, represent the point-wise error computed as $(\mathbf{y}^T - f(\mathbf{x}^T))$, where $\mathbf{y}^T$ and $f(\mathbf{x}^T)$ are the reference response and the model prediction, respectively.

Extended Data Fig. 3 Representative results for the Brusselator reaction-diffusion system (TL7 and TL8).

The network takes as input the initial random field depicting the concentration of one of the species. TL7 approximates the transfer of knowledge from a system with damped oscillations to one with overdamped oscillations, whereas TL8 represents the transfer to a system with periodic oscillations. Error fields represent the point-wise error computed as $\left|\frac{f(\mathbf{x}^T)-\mathbf{y}^T}{\mathbf{y}^T}\right|$, where $\mathbf{y}^T$ and $f(\mathbf{x}^T)$ are the reference response and the model prediction, respectively.

Supplementary information

Supplementary Information

Supplementary Figs. 1–4 and Tables 1–8.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Goswami, S., Kontolati, K., Shields, M.D. et al. Deep transfer operator learning for partial differential equations under conditional shift. Nat Mach Intell 4, 1155–1164 (2022). https://doi.org/10.1038/s42256-022-00569-2
