Abstract
Transfer learning enables the transfer of knowledge gained while learning to perform one task (source) to a related but different task (target), hence addressing the expense of data acquisition and labelling, potential computational power limitations and dataset distribution mismatches. We propose a new transfer learning framework for task-specific learning (functional regression in partial differential equations) under conditional shift based on the deep operator network (DeepONet). Task-specific operator learning is accomplished by fine-tuning task-specific layers of the target DeepONet using a hybrid loss function that allows for the matching of individual target samples while also preserving the global properties of the conditional distribution of the target data. Inspired by conditional embedding operator theory, we minimize the statistical distance between labelled target data and the surrogate prediction on unlabelled target data by embedding conditional distributions onto a reproducing kernel Hilbert space. We demonstrate the advantages of our approach for various transfer learning scenarios involving nonlinear partial differential equations under diverse conditions due to shifts in the geometric domain and model dynamics. Our transfer learning framework enables fast and efficient learning of heterogeneous tasks despite considerable differences between the source and target domains.
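The statistical distance minimized here rests on kernel mean embeddings: two distributions are compared by the distance between their mean embeddings in a reproducing kernel Hilbert space, i.e. the maximum mean discrepancy (MMD). As a minimal sketch of that building block only — not the authors' DeepONet implementation — the following computes a biased MMD² estimate with a Gaussian kernel (the kernel choice, bandwidth and function names are illustrative assumptions):

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel matrix between the rows of x and y.
    d2 = np.sum(x**2, 1)[:, None] + np.sum(y**2, 1)[None, :] - 2.0 * x @ y.T
    return np.exp(-d2 / (2.0 * sigma**2))

def mmd2(x, y, sigma=1.0):
    # Biased estimate of the squared maximum mean discrepancy between the
    # empirical distributions of x and y: the RKHS distance between their
    # kernel mean embeddings. Zero iff the embeddings coincide.
    kxx = gaussian_kernel(x, x, sigma)
    kyy = gaussian_kernel(y, y, sigma)
    kxy = gaussian_kernel(x, y, sigma)
    return kxx.mean() + kyy.mean() - 2.0 * kxy.mean()

rng = np.random.default_rng(0)
same = mmd2(rng.normal(size=(200, 3)), rng.normal(size=(200, 3)))
shifted = mmd2(rng.normal(size=(200, 3)), rng.normal(2.0, 1.0, size=(200, 3)))
print(same < shifted)  # distributions that differ give a larger MMD
```

In a hybrid loss of the kind described, such a discrepancy term (here between surrogate predictions and labelled target samples) would be added to a standard per-sample regression loss, so that training matches individual targets while also aligning the two conditional distributions.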
Data availability
All of the datasets in the study were generated directly from the code in ref. 40.
Code availability
The code used in this study is available in a publicly available GitHub repository (ref. 40).
References
Chen, R. T., Rubanova, Y., Bettencourt, J. & Duvenaud, D. K. Neural ordinary differential equations. In Advances in Neural Information Processing Systems (eds Garnett, R. et al.) 31 (NeurIPS, 2018).
Raissi, M., Perdikaris, P. & Karniadakis, G. E. Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378, 686–707 (2019).
Li, Z. et al. Fourier neural operator for parametric partial differential equations. In Proc. International Conference on Learning Representations (ICLR, 2021).
Lu, L., Jin, P., Pang, G., Zhang, Z. & Karniadakis, G. E. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nat. Mach. Intell. 3, 218–229 (2021).
Chatterjee, T., Chakraborty, S., Goswami, S., Adhikari, S. & Friswell, M. I. Robust topological designs for extreme metamaterial micro-structures. Sci. Rep. 11, 1–14 (2021).
Olivier, A., Shields, M. D. & Graham-Brady, L. Bayesian neural networks for uncertainty quantification in data-driven materials modeling. Comput. Methods Appl. Mech. Eng. 386, 114079 (2021).
Niu, S., Liu, Y., Wang, J. & Song, H. A decade survey of transfer learning (2010–2020). IEEE Trans. Artif. Intell. 1, 151–166 (2020).
Gao, Y. & Mosalam, K. M. Deep transfer learning for image-based structural damage recognition. Comput. Aided Civ. Inf. Eng. 33, 748–768 (2018).
Yang, X., Zhang, Y., Lv, W. & Wang, D. Image recognition of wind turbine blade damage based on a deep learning model with transfer learning and an ensemble learning classifier. Renew. Energy 163, 386–397 (2021).
Ruder, S., Peters, M. E., Swayamdipta, S. & Wolf, T. Transfer learning in natural language processing. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorials (eds Strube, M. & Sarkar, A.) 15–18 (Association for Computational Linguistics, 2019).
Zhang, S. et al. Combining cross-modal knowledge transfer and semi-supervised learning for speech emotion recognition. Knowl. Based Syst. 229, 107340 (2021).
Zhuang, F. et al. A comprehensive survey on transfer learning. Proc. IEEE 109, 43–76 (2020).
Certo, S. T., Busenbark, J. R., Woo, H.-S. & Semadeni, M. Sample selection bias and Heckman models in strategic management research. Strateg. Manag. J. 37, 2639–2657 (2016).
Chen, X., Wang, S., Wang, J. & Long, M. Representation subspace distance for domain adaptation regression. In Proc. 38th International Conference on Machine Learning 1749–1759 (PMLR, 2021).
Pardoe, D. & Stone, P. Boosting for regression transfer. In Proc. 27th International Conference on Machine Learning 863–870 (PMLR, 2010).
Wang, X., Huang, T.-K. & Schneider, J. Active transfer learning under model shift. In Proc. 31st International Conference on Machine Learning 1305–1313 (PMLR, 2014).
Du, S. S., Koushik, J., Singh, A. & Póczos, B. Hypothesis transfer learning via transformation functions. In Advances in Neural Information Processing Systems 30 (NeurIPS, 2017).
Zhang, K., Schölkopf, B., Muandet, K. & Wang, Z. Domain adaptation under target and conditional shift. In Proc. International Conference on Machine Learning 819–827 (PMLR, 2013).
Chen, G., Li, Y. & Liu, X. Transfer learning under conditional shift based on fuzzy residual. IEEE Trans. Cybern. 52, 960–970 (2020).
Liu, X., Li, Y., Meng, Q. & Chen, G. Deep transfer learning for conditional shift in regression. Knowl. Based Syst. 227, 107216 (2021).
Zhang, X. & Garikipati, K. Machine learning materials physics: multi-resolution neural networks learn the free energy and nonlinear elastic response of evolving microstructures. Comput. Methods Appl. Mech. Eng. 372, 113362 (2020).
Goswami, S., Anitescu, C., Chakraborty, S. & Rabczuk, T. Transfer learning enhanced physics informed neural network for phase-field modeling of fracture. Theor. Appl. Fracture Mech. 106, 102447 (2020).
Desai, S., Mattheakis, M., Joy, H., Protopapas, P. & Roberts, S. One-shot transfer learning of physics-informed neural networks. In Proc. 2nd AI4Science Workshop at the 39th International Conference on Machine Learning (ICML) (ICML, 2022).
Chen, X. et al. Transfer learning for deep neural network-based partial differential equations solving. Adv. Aerodyn. 3, 1–14 (2021).
Penwarden, M., Zhe, S., Narayan, A. & Kirby, R. M. Physics-informed neural networks (PINNs) for parameterized PDEs: a metalearning approach. Preprint at https://arxiv.org/abs/2110.13361 (2021).
Wang, H., Planas, R., Chandramowlishwaran, A. & Bostanabad, R. Mosaic flows: a transferable deep learning framework for solving PDEs on unseen domains. Comput. Methods Appl. Mech. Eng. 389, 114424 (2022).
Neyshabur, B., Sedghi, H. & Zhang, C. What is being transferred in transfer learning? In Advances in Neural Information Processing Systems 33, 512–523 (NeurIPS, 2020).
Tripura, T. & Chakraborty, S. Wavelet neural operator: a neural operator for parametric partial differential equations. Preprint at https://arxiv.org/abs/2205.02191 (2022).
Li, Z. et al. Neural operator: graph kernel network for partial differential equations. In Proc. ICLR 2020 Workshop DeepDiffEq Program Chairs (ICLR, 2020).
Lu, L. et al. A comprehensive and fair comparison of two neural operators (with practical extensions) based on FAIR data. Comput. Methods Appl. Mech. Eng. 393, 114778 (2022).
Ahmed, N., Rafiq, M., Rehman, M., Iqbal, M. & Ali, M. Numerical modeling of three dimensional Brusselator reaction diffusion system. AIP Adv. 9, 015205 (2019).
Lee, Y. K. & Park, B. U. Estimation of Kullback–Leibler divergence by local likelihood. Ann. Inst. Stat. Math. 58, 327–340 (2006).
Yu, S., Shaker, A., Alesiani, F. & Principe, J. C. Measuring the discrepancy between conditional distributions: methods, properties and applications. In Proc. 29th International Joint Conference on Artificial Intelligence 2777–2784 (2020).
Muandet, K. et al. Kernel mean embedding of distributions: a review and beyond. Found. Trends Mach. Learn. 10, 1–141 (2017).
Gretton, A., Borgwardt, K. M., Rasch, M. J., Schölkopf, B. & Smola, A. A kernel two-sample test. J. Mach. Learn. Res. 13, 723–773 (2012).
Song, L., Fukumizu, K. & Gretton, A. Kernel embeddings of conditional distributions: a unified kernel framework for nonparametric inference in graphical models. IEEE Signal Process. Mag. 30, 98–111 (2013).
Song, L., Huang, J., Smola, A. & Fukumizu, K. Hilbert space embeddings of conditional distributions with applications to dynamical systems. In Proc. 26th Annual International Conference on Machine Learning 961–968 (2009).
Saxe, A. M. et al. On the information bottleneck theory of deep learning. J. Stat. Mech. 2019, 124020 (2019).
Yosinski, J., Clune, J., Bengio, Y. & Lipson, H. How transferable are features in deep neural networks? In Advances in Neural Information Processing Systems 27 (NeurIPS, 2014).
Kontolati, K., Goswami, S., Shields, M. D. & Karniadakis, G. E. TL-DeepONet: Codes For Deep Transfer Operator Learning for Partial Differential Equations Under Conditional Shift (Zenodo, 2022); https://doi.org/10.5281/zenodo.7195684
Acknowledgements
For K.K. and M.D.S., this material is based upon work supported by the US Department of Energy, Office of Science, Office of Advanced Scientific Computing Research under award no. DE-SC0020428. S.G. and G.E.K. would like to acknowledge support by the DOE project PhILMs (award no. DE-SC0019453) and the OSD/AFOSR MURI grant FA9550-20-1-0358.
Author information
Authors and Affiliations
Contributions
S.G. and K.K. were responsible for the data curation, formal analysis, methodology, software, validation and visualization. M.D.S. and G.E.K. acquired funding, and were responsible for the administration, resources and supervision of the project. S.G., K.K. and M.D.S. performed the investigations. All authors conceptualized the project, and wrote, reviewed and edited the manuscript.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Machine Intelligence thanks Marios Mattheakis, Ethan Pickering and Shaan Desai for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Representative results for the Darcy model (TL1-TL4).
The network takes as input the spatially varying conductivity field and approximates the hydraulic head over the domain. Error fields represent the point-wise error computed as \(\left|\frac{f({{{{\bf{x}}}}}^{T})-{{{{\bf{y}}}}}^{T}}{{{{{\bf{y}}}}}^{T}}\right|\), where yT and f(xT) are the reference response and the model prediction, respectively.
Extended Data Fig. 2 Representative results for the elasticity model (TL5 and TL6).
The DeepONet takes as input the loading condition applied on the right edge of the plate (left frames) and outputs the displacement field (middle frames). Error fields, shown in the right frames, represent the point-wise error computed as (yT − f(xT)), where yT and f(xT) are the reference response and the model prediction, respectively.
Extended Data Fig. 3 Representative results for the Brusselator reaction-diffusion system (TL7 and TL8).
The network takes as input the initial random field depicting the concentration of one of the species. TL7 represents the transfer of knowledge from a system with damped oscillations to one with overdamped oscillations, whereas TL8 represents the transfer to a system with periodic oscillations. Error fields represent the point-wise error computed as \(\left|\frac{f({{{{\bf{x}}}}}^{T})-{{{{\bf{y}}}}}^{T}}{{{{{\bf{y}}}}}^{T}}\right|\), where yT and f(xT) are the reference response and the model prediction, respectively.
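The extended-data legends above use two point-wise error measures: a relative error for the Darcy and Brusselator models and a signed absolute error for the elasticity model. A minimal sketch of both, assuming `pred` and `ref` are arrays holding the model prediction f(xT) and the reference response yT on the same grid (the `eps` guard against a vanishing reference is our addition, not part of the paper's formula):

```python
import numpy as np

def relative_error_field(pred, ref, eps=1e-12):
    # Point-wise relative error |(f(x^T) - y^T) / y^T| (Darcy, Brusselator);
    # eps keeps the division finite where the reference response is zero.
    return np.abs((pred - ref) / (ref + eps))

def absolute_error_field(pred, ref):
    # Point-wise signed error (y^T - f(x^T)) (elasticity model).
    return ref - pred
```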
Supplementary information
Supplementary Information
Supplementary Figs. 1–4 and Tables 1–8.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Goswami, S., Kontolati, K., Shields, M.D. et al. Deep transfer operator learning for partial differential equations under conditional shift. Nat Mach Intell 4, 1155–1164 (2022). https://doi.org/10.1038/s42256-022-00569-2
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1038/s42256-022-00569-2
This article is cited by
- On the geometry transferability of the hybrid iterative numerical solver for differential equations. Computational Mechanics (2023)