Abstract
A partial least squares regression is proposed for estimating the function-on-function regression model in which a functional response and multiple functional predictors consist of random curves, and quadratic and interaction effects of the predictors are included. Direct estimation of a function-on-function regression model is usually an ill-posed problem. To overcome this difficulty, in practice, the functional data, which belong to an infinite-dimensional space, are generally projected onto a finite-dimensional space of basis functions, converting the function-on-function regression model into a multivariate regression model of the basis expansion coefficients. In the estimation phase of the proposed method, the functional variables are approximated by a finite-dimensional basis function expansion. We show that the partial least squares regression constructed via a functional response, multiple functional predictors, and quadratic/interaction terms of the functional predictors is equivalent to the partial least squares regression constructed using basis expansions of the functional variables. From the latter regression, we provide an explicit formula for the partial least squares estimate of the coefficient function of the function-on-function regression model. Because the true forms of the models are generally unspecified, we propose a forward procedure for model selection. The finite-sample performance of the proposed method is examined in several Monte Carlo experiments and two empirical data analyses, and the results compare favorably with those of an existing method.
References
Aguilera AM, Ocana FA, Valderrama MJ (1999) Forecasting with unequally spaced data by a functional principal component approach. Test 8(1):233–254
Aguilera AM, Escabias M, Preda C, Saporta G (2010) Using basis expansions for estimating functional PLS regression: applications with chemometric data. Chemom Intell Lab Syst 104(2):289–305
Aguilera AM, Escabias M, Preda C, Saporta G (2016) Penalized versions of functional PLS regression. Chemom Intell Lab Syst 154:80–92
Beyaztas U, Shang HL (2020) On function-on-function regression: partial least squares approach. Environ Ecol Stat 27(1):95–114
Chiou J-M, Müller HG, Wang JL (2004) Functional response models. Stat Sin 14:675–693
Chiou J-M, Yang Y-F, Chen Y-T (2016) Multivariate functional linear regression and prediction. J Multivar Anal 146:301–312
Cuevas A (2014) A partial overview of the theory of statistics with functional data. J Stat Plan Inference 147:1–23
Dayal BS, MacGregor JF (1997) Improved PLS algorithms. J Chemom 11(1):73–85
de Jong S (1993) SIMPLS: an alternative approach to partial least squares regression. Chemom Intell Lab Syst 18(3):251–263
De Lathauwer L, De Moor B, Vandewalle J (2000) A multilinear singular value decomposition. SIAM J Matrix Anal Appl 21(4):1253–1278
Delaigle A, Hall P (2012a) Achieving near perfect classification for functional data. J R Stat Soc Ser B 74(2):267–286
Delaigle A, Hall P (2012b) Methodology and theory for partial least squares applied to functional data. Ann Stat 40(1):322–352
Escabias M, Aguilera AM, Valderrama MJ (2007) Functional PLS logit regression model. Comput Stat Data Anal 51(10):4891–4902
Escoufier Y (1970) Echantillonnage dans une population de variables aléatoires réelles. Publications de l'Institut de Statistique de l'Université de Paris 19(4):1–47
Febrero-Bande M, Galeano P, González-Manteiga W (2017) Functional principal component regression and functional partial least-squares regression: an overview and a comparative study. Int Stat Rev 85(1):61–83
Ferraty F, Vieu P (2006) Nonparametric functional data analysis. Springer, New York
Fuchs K, Scheipl F, Greven S (2015) Penalized scalar-on-functions regression with interaction term. Comput Stat Data Anal 81:38–51
He G, Müller HG, Wang JL (2000) Extending correlation and regression from multivariate to functional data. In: Puri ML (ed) Asymptotics in statistics and probability. VSP International Science Publishers, Leiden, pp 197–210
He G, Müller HG, Wang JL, Yang W (2010) Functional linear regression via canonical analysis. Bernoulli 16(3):705–729
Horvath L, Kokoszka P (2012) Inference for functional data with applications. Springer, New York
Hsing T, Eubank R (2015) Theoretical foundations of functional data analysis, with an introduction to linear operators. Wiley, Chennai
Hyndman RJ, Shang HL (2009) Forecasting functional time series. J Korean Stat Soc 38(3):199–211
Ivanescu AE, Staicu A-M, Scheipl F, Greven S (2015) Penalized function-on-function regression. Comput Stat 30(2):539–568
Kokoszka P, Reimherr M (2017) Introduction to functional data analysis. CRC Press, Boca Raton
Krämer N, Boulesteix A-L, Tutz G (2008) Penalized partial least squares with applications to B-spline transformations and functional data. Chemom Intell Lab Syst 94(1):60–69
Luo R, Qi X (2018) FRegSigCom: functional regression using signal compression approach. R package version 0.3.0. https://CRAN.R-project.org/package=FRegSigCom
Luo R, Qi X (2019) Interaction model and model selection for function-on-function regression. J Comput Graph Stat 28(2):309–322
Matsui H (2020) Quadratic regression for functional response models. Econom Stat 13:125–136
Matsui H, Kawano S, Konishi S (2009) Regularized functional regression modeling for functional response and predictors. J Math Ind 1(A3):17–25
Müller H-G, Yao F (2008) Functional additive models. J Am Stat Assoc 103(484):1534–1544
Pan H (2011) Bivariate B-splines and its applications in spatial data analysis. PhD thesis, Texas A&M University
Preda C, Saporta G (2005) PLS regression on a stochastic process. Comput Stat Data Anal 48(1):149–158
Preda C, Saporta G, Lévéder C (2007) PLS classification of functional data. Comput Stat 22(2):223–235
Ramsay JO, Dalzell CJ (1991) Some tools for functional data analysis. J R Stat Soc B 53(3):539–572
Ramsay JO, Silverman BW (2002) Applied functional data analysis. Springer, New York
Ramsay JO, Silverman BW (2006) Functional data analysis. Springer, New York
Reiss PT, Ogden TR (2007) Functional principal component regression and functional partial least squares. J Am Stat Assoc 102(479):984–996
Scheipl F, Greven S (2016) Identifiability in penalized function-on-function regression models. Electron J Stat 10:495–526
Sun Y, Wang Q (2020) Function-on-function quadratic regression models. Comput Stat Data Anal 142:106814
Tucker RS (1938) The reasons for price rigidity. Am Econ Rev 28(1):41–54
Usset J, Staicu A-M, Maity A (2016) Interaction models for functional regression. Comput Stat Data Anal 94:317–329
Wang W (2014) Linear mixed function-on-function regression models. Biometrics 70(4):794–801
Wold H (1974) Causal flows with latent variables: partings of the ways in the light of NIPALS modelling. Eur Econ Rev 5(1):67–86
Wood SN, Bravington MV, Hedley SL (2008) Soap film smoothing. J R Stat Soc B 70(5):931–955
Yao F, Müller H-G (2010) Functional quadratic regression. Biometrika 97(1):49–64
Yao F, Müller H-G, Wang J-L (2005) Functional linear regression analysis for longitudinal data. Ann Stat 33(6):2873–2903
Acknowledgements
The authors thank two anonymous referees for their careful reading of our manuscript and valuable suggestions and comments, which have helped us produce a much-improved paper. The authors also acknowledge Dr. Ruiyan Luo and Dr. Xin Qi at Georgia State University for assistance with R code in the simulation studies.
Appendix
1.1 Identifiability of the proposed model
Model identifiability, an important issue in the estimation phase, was discussed theoretically for the function-on-function regression model with only main effects given in (1.1) by He et al. (2000) and Chiou et al. (2004). Later, Scheipl and Greven (2016) studied the identifiability of such models in realistic applications. As stated in Proposition 3.1 of Scheipl and Greven (2016), the coefficient function \(\beta _m(s,t)\) in (1.1) is identifiable if and only if the kernel space of the covariance operator of the functional predictors (\(K^{{\mathcal {X}}}\)) contains only the zero function, i.e., \(\ker (K^{{\mathcal {X}}}) = \lbrace 0 \rbrace \). Luo and Qi (2019) extended this identifiability condition from the main-effects model to the model with interaction and quadratic effects. They defined the following \(\lbrace p (p + 3)/2 \rbrace \times \lbrace p (p+3)/2 \rbrace \) dimensional covariance matrix of the predictor functions:
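In block form, \(\pmb {\Sigma }\) can be written (with illustrative argument conventions; see Luo and Qi 2019 for the precise statement) as:

\[
\pmb {\Sigma } =
\begin{pmatrix}
\pmb {A}(s, r) & \pmb {B}(s, r, u) \\
\pmb {B}^\top (r, s, u) & \pmb {C}(s, r, s^\prime , r^\prime )
\end{pmatrix},
\]

so that the \(\lbrace p (p+3)/2 \rbrace \) rows arise from stacking the \(p\) predictors with their \(p (p+1)/2\) distinct pairwise products.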
where the \(p \times p\) and \(p \times \lbrace p (p+1)/2 \rbrace \) dimensional submatrices \(\pmb {A}(s,r)\) and \(\pmb {B}(s,r,u)\) equal \(\text {Cov} \left( {\mathcal {X}}(s), {\mathcal {X}}(r) \right) \) and \(\text {Cov} \left( {\mathcal {X}}(s), {\mathcal {X}}(r) {\mathcal {X}}(u) \right) \), respectively, and the \(\lbrace p (p+1)/2 \rbrace \times \lbrace p (p+1)/2 \rbrace \) dimensional submatrix \(\pmb {C}(s, r, s^\prime , r^\prime )\) equals \(\text {Cov} \left( {\mathcal {X}}(s) {\mathcal {X}}(r), {\mathcal {X}}(s^\prime ) {\mathcal {X}}(r^\prime ) \right) \). Luo and Qi (2019) then proved that the coefficients in model (1.3) are identifiable if the kernel space of \(\pmb {\Sigma }\) in (5.9) contains only the zero function; see Proposition S.1 of Luo and Qi (2019) for the proof. Consequently, our proposed method given in (2.4) is also identifiable, since it is a reformulated version of model (1.3).
1.2 Proof of Proposition 2.2
Proof
The proof of Proposition 2.2 extends the proof of Proposition 2 of Aguilera et al. (2010). Denote by \(\Lambda = \pmb {\Phi }^{1/2} \pmb {c}\) and \(\Pi = \pmb {\Psi }^{1/2} \pmb {d}\) the column random vectors.
For any random variable \(Z \in \mathcal {L}_2[0,1]\), the following holds when \(h = 1\):
The residuals obtained from the first iteration are given by:
Then, we have \(\Lambda _1 = \pmb {\Phi }^{1/2} \pmb {c}_1\) and \(\Pi _1 = \pmb {\Psi }^{1/2} \pmb {d}_1\), where \(\pmb {c}_1\) and \(\pmb {d}_1\) are the random vectors of the basis coefficients of \({\mathcal {Y}}_1(t)\) and \(\pmb {{\mathcal {X}}}_1(s,r)\), respectively. For functional PLS, the residuals calculated from the first iteration are as follows:
On the other hand, the residuals from the PLS regression are obtained as follows:
and thus we have \(\pmb {c}_1 = \pmb {c} - \eta _1 \frac{\mathbb {E} \left[ \pmb {c} \eta _1 \right] }{\mathbb {E} \left[ \eta _1^2 \right] }\) and \(\pmb {d}_1 = \pmb {d} - \eta _1 \frac{\mathbb {E} \left[ \pmb {d} \eta _1 \right] }{\mathbb {E} \left[ \eta _1^2 \right] }\), that is, \(W^{{\mathcal {Y}}_1} = W^{\Lambda _1}\) and \(W^{\pmb {{\mathcal {X}}}_1} = W^{\Pi _1}\). Now assume that \(W^{{\mathcal {Y}}_{\ell }} = W^{\Lambda _{\ell }}\) and \(W^{\pmb {{\mathcal {X}}}_{\ell }} = W^{\Pi _{\ell }}\) for each \(\ell \le h\); we show that \(W^{{\mathcal {Y}}_{h+1}} = W^{\Lambda _{h+1}}\) and \(W^{\pmb {{\mathcal {X}}}_{h+1}} = W^{\Pi _{h+1}}\). At step \(h+1\), the following hold:
As in the case \(h=1\), and using the orthogonality of \(\eta _i\) for \(i = 1, \ldots , h\), it follows that \(\Lambda _{h+1} = \pmb {\Phi }^{1/2} \pmb {c}_{h+1}\) and \(\Pi _{h+1} = \pmb {\Psi }^{1/2} \pmb {d}_{h+1}\), which concludes the proof. \(\square \)
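The identity underlying this equivalence is that the \(\mathcal {L}_2\) inner product of two basis-expanded curves equals the Euclidean inner product of their \(\pmb {\Phi }^{1/2}\)-transformed coefficient vectors, \(\langle {\mathcal {Y}}_i, {\mathcal {Y}}_j \rangle = \pmb {c}_i^\top \pmb {\Phi } \pmb {c}_j = \Lambda _i^\top \Lambda _j\). This can be checked numerically; the sketch below is illustrative (not the authors' code), using a small Fourier basis and a Riemann-sum Gram matrix:

```python
import numpy as np

# Small Fourier basis on [0, 1], evaluated on a fine grid.
t = np.linspace(0.0, 1.0, 2001)
dt = t[1] - t[0]
basis = np.vstack(
    [np.ones_like(t)]
    + [np.sin(2 * np.pi * k * t) for k in (1, 2)]
    + [np.cos(2 * np.pi * k * t) for k in (1, 2)]
)  # shape (K, T), K = 5

# Gram matrix Phi_kl ~ integral of phi_k(t) phi_l(t) dt (Riemann sum).
Phi = (basis * dt) @ basis.T

# Random basis coefficients for 10 curves, and the curves themselves.
rng = np.random.default_rng(0)
C = rng.standard_normal((10, basis.shape[0]))
curves = C @ basis  # shape (10, T)

# L2 inner products of the curves (same Riemann sum) ...
G_l2 = (curves * dt) @ curves.T

# ... versus Euclidean inner products of Lambda = C @ Phi^{1/2}.
w, V = np.linalg.eigh(Phi)  # Phi is symmetric positive semi-definite
Phi_half = V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T
Lam = C @ Phi_half
G_euc = Lam @ Lam.T

print(np.allclose(G_l2, G_euc))  # -> True
```

Because both Gram matrices use the same discretization, the two inner-product matrices agree up to floating-point error, which is why a multivariate PLS on the transformed coefficients reproduces the functional PLS components.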
1.3 Station names for the North Dakota weather data
See Table 4.
Cite this article
Beyaztas, U., Shang, H.L. A partial least squares approach for function-on-function interaction regression. Comput Stat 36, 911–939 (2021). https://doi.org/10.1007/s00180-020-01058-z