Abstract
In this article, we consider a new robust estimation procedure for the partial functional linear model (PFLM), with the slope function approximated by spline basis functions. The procedure applies a modified Huber's function whose tail is replaced by the exponential squared loss (ESL) to achieve robustness against outliers. A data-driven procedure is presented for selecting the tuning parameters of the new estimation method, enabling better robustness and efficiency than existing methods in the presence of outliers or non-normal errors. We construct robust estimators of both the parametric coefficients and the slope function in the PFLM, and establish asymptotic properties of the resulting estimators. The finite-sample performance of the proposed method is studied through simulations and illustrated with a data example.
References
Aneiros, G., & Vieu, P. (2015). Partial linear modelling with multi-functional covariates. Computational Statistics, 30(3), 647–671.
Aneiros-Pérez, G., & Vieu, P. (2006). Semi-functional partial linear regression. Statistics and Probability Letters, 76(11), 1102–1110.
Cai, T. T., & Hall, P. (2006). Prediction in functional linear regression. The Annals of Statistics, 34(5), 2159–2179.
Cai, T. T., & Yuan, M. (2012). Minimax and adaptive prediction for functional linear regression. Journal of the American Statistical Association, 107(499), 1201–1216.
Cardot, H., Ferraty, F., & Sarda, P. (1999). Functional linear model. Statistics and Probability Letters, 45(1), 11–22.
Chen, D., Hall, P., & Müller, H. G. (2011). Single and multiple index functional regression models with nonparametric link. The Annals of Statistics, 39(3), 1720–1747.
Crambes, C., Kneip, A., & Sarda, P. (2009). Smoothing splines estimators for functional linear regression. The Annals of Statistics, 37(1), 35–72.
de Boor, C. (2001). A practical guide to splines. New York: Springer.
DeVore, R. A., & Lorentz, G. G. (1993). Constructive approximation. Berlin: Springer.
Du, J., Xu, D., & Cao, R. (2018). Estimation and variable selection for partially functional linear models. Journal of the Korean Statistical Society, 47(4), 436–449.
Ferraty, F., & Vieu, P. (2006). Nonparametric functional data analysis: Theory and practice. New York: Springer.
Hall, P., & Horowitz, J. L. (2007). Methodology and convergence rates for functional linear regression. The Annals of Statistics, 35(1), 70–91.
Horváth, L., & Kokoszka, P. (2012). Inference for functional data with applications (Vol. 200). New York: Springer.
Hsing, T., & Eubank, R. (2015). Theoretical foundations of functional data analysis, with an introduction to linear operators. New York: Wiley.
Hu, Y., Xue, L., Zhao, J., & Zhang, L. (2020). Skew-normal partial functional linear model and homogeneity test. Journal of Statistical Planning and Inference, 204, 116–127.
Huang, J. Z. (2003). Local asymptotics for polynomial spline regression. The Annals of Statistics, 31(5), 1600–1635.
Jiang, Y., Wang, Y. G., Fu, L., & Wang, X. (2019). Robust estimation using modified Huber’s functions with new tails. Technometrics, 61(1), 111–122.
Kato, K. (2012). Estimation in functional linear quantile regression. The Annals of Statistics, 40(6), 3108–3136.
Kokoszka, P., & Reimherr, M. (2017). Introduction to functional data analysis. Boca Raton: CRC Press.
Kong, D., Staicu, A. M., & Maity, A. (2016a). Classical testing in functional linear models. Journal of Nonparametric Statistics, 28(4), 813–838.
Kong, D., Xue, K., Yao, F., & Zhang, H. H. (2016b). Partially functional linear regression in high dimensions. Biometrika, 103(1), 147–159.
Lian, H. (2011). Functional partial linear model. Journal of Nonparametric Statistics, 23(1), 115–128.
Lin, Z., Cao, J., Wang, L., & Wang, H. (2017). Locally sparse estimator for functional linear regression models. Journal of Computational and Graphical Statistics, 26(2), 306–318.
Lu, Y., Du, J., & Sun, Z. (2014). Functional partially linear quantile regression model. Metrika, 77(2), 317–332.
Ma, S. (2016). Estimation and inference in functional single-index models. Annals of the Institute of Statistical Mathematics, 68(1), 181–208.
Maronna, R. A., Martin, R. D., Yohai, V. J., & Salibián-Barrera, M. (2018). Robust statistics: Theory and methods (with R). New York: Wiley.
Ramsay, J. O., & Silverman, B. W. (2002). Applied functional data analysis: Methods and case studies. New York: Springer.
Ramsay, J. O., & Silverman, B. W. (2005). Functional data analysis (2nd ed.). New York: Springer.
Sang, P., Lockhart, R. A., & Cao, J. (2018). Sparse estimation for functional semiparametric additive models. Journal of Multivariate Analysis, 168, 105–118.
Shi, P., & Li, G. (1995). Global convergence rates of B-spline M-estimators in nonparametric regression. Statistica Sinica, 5(1), 303–318.
Shin, H. (2009). Partial functional linear regression. Journal of Statistical Planning and Inference, 139(10), 3405–3418.
Stone, C. J. (1982). Optimal global rates of convergence for nonparametric regression. The Annals of Statistics, 10(4), 1040–1053.
Tekbudak, M. Y., Alfaro-Córdoba, M., Maity, A., & Staicu, A. M. (2019). A comparison of testing methods in scalar-on-function regression. AStA Advances in Statistical Analysis, 103(3), 411–436.
Wang, X., Jiang, Y., Huang, M., & Zhang, H. (2013). Robust variable selection with exponential squared loss. Journal of the American Statistical Association, 108(502), 632–643.
Welsh, A. (1986). Bahadur representations for robust scale estimators based on regression residuals. The Annals of Statistics, 14(3), 1246–1251.
Yohai, V. J. (1987). High breakdown-point and high efficiency robust estimates for regression. The Annals of Statistics, 15(2), 642–656.
Yu, P., Zhang, Z., & Du, J. (2016). A test of linearity in partial functional linear regression. Metrika, 79(8), 953–969.
Yu, P., Zhu, Z., & Zhang, Z. (2018). Robust exponential squared loss-based estimation in semi-functional linear regression models. Computational Statistics, 34(2), 503–525.
Zhang, D., Lin, X., & Sowers, M. (2007). Two-stage functional mixed models for evaluating the effect of longitudinal covariate profiles on a scalar outcome. Biometrics, 63(2), 351–362.
Zhou, J., Chen, Z., & Peng, Q. (2016a). Polynomial spline estimation for partial functional linear regression models. Computational Statistics, 31(3), 1107–1129.
Zhou, J., Du, J., & Sun, Z. (2016b). M-estimation for partially functional linear regression model based on splines. Communications in Statistics-Theory and Methods, 45(21), 6436–6446.
Zhou, S., Shen, X., & Wolfe, D. (1998). Local asymptotics for regression splines and confidence regions. The Annals of Statistics, 26(5), 1760–1782.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (11971001), and the Beijing Natural Science Foundation (1182002).
Appendix: Proofs
Let \(N_{s, q}\) be the normalized B-splines and set \(B_{s, q}=J_{n}^{1/2} N_{s, q}, s=1, \ldots , J_{n}\). We first present two lemmas, which will be used to prove the theorems.
Lemma 1
If condition (C3) holds, then there are positive constants \(M_{1}\) and \(M_{2}\) such that, for any spline function \(\sum _{s=1}^{J_{n}}\gamma _{s}B_{s,q}\),
\( M_{1}\Vert {\varvec{\gamma }}\Vert ^{2}\le \Big \Vert \sum _{s=1}^{J_{n}}\gamma _{s}B_{s,q}\Big \Vert ^{2}\le M_{2}\Vert {\varvec{\gamma }}\Vert ^{2}, \)
where \({\varvec{\gamma }}=(\gamma _{1},\ldots ,\gamma _{J_{n}})^{T}\).
This lemma follows from Theorem 4.2 of Chapter 5 in DeVore and Lorentz (1993).
Lemma 2
If conditions (C1), (C3) and (C6) hold, there exist two positive constants \(M_{3}\) and \(M_{4}\) with \(M_{3}<M_{4}<\infty \) such that, as \(n\rightarrow \infty \), with probability tending to one all the eigenvalues of \(n^{-1}\sum _{i=1}^{n}{\varvec{H}}_{i}{\varvec{H}}_{i}^{T}\) fall between \(M_{3}\) and \(M_{4}\).
Lemma 2 follows from Lemma 1 in Zhou et al. (2016a).
1.1 Proof of Theorem 1
According to Theorem XII.1 of de Boor (2001), for \(\beta (t)\) satisfying condition (C2), there exists a spline function \(\widetilde{\beta }(t)=\sum _{s=1}^{J_{n}}\widetilde{\gamma }_{s}B_{s,q}(t) =\mathbf{B} _{q}(t)^{T}\widetilde{{\varvec{\gamma }}}\) with \(\widetilde{{\varvec{\gamma }}}\in \mathbb {R}^{J_{n}}\) such that
\( \sup _{t\in \mathcal {T}}|\widetilde{\beta }(t)-\beta (t)|=O(J_{n}^{-r}), \quad \mathrm{(A.1)} \)
and consequently
\( \Vert \widetilde{\beta }-\beta \Vert =O(J_{n}^{-r}). \quad \mathrm{(A.2)} \)
Let \(\xi _{n}=J_{n}^{-r}\). We show that, for any given \(\delta >0\), there exists a large constant \(C\) such that
\( P\left\{ \inf _{\Vert {\varvec{u}}\Vert =C}\mathcal {L}_{n}({\varvec{\alpha }}+\xi _{n}{\varvec{u}}_{1}, \widetilde{{\varvec{\gamma }}}+\xi _{n}{\varvec{u}}_{2})>\mathcal {L}_{n}({\varvec{\alpha }}, \widetilde{{\varvec{\gamma }}})\right\} \ge 1-\delta , \quad \mathrm{(A.3)} \)
where \({\varvec{u}}_{1}\) is a p-dimensional vector, \({\varvec{u}}_{2}\) is a \(J_{n}\)-dimensional vector, and \({\varvec{u}}=({\varvec{u}}^{T}_{1}, {\varvec{u}}^{T}_{2})^{T}\). This implies that, with probability at least \(1-\delta \), there exists a local minimizer of \(\mathcal {L}_{n}({\varvec{\alpha }},{\varvec{\gamma }})\) in the ball \(\{({\varvec{\alpha }}^{T}, \widetilde{{\varvec{\gamma }}}^{T})^{T}+\xi _{n} {{\varvec{u}}}: \Vert {\varvec{u}}\Vert \le C\}\), so that \(\Vert \widehat{ {\varvec{\alpha }}}-{\varvec{\alpha }}\Vert =O_{p}(\xi _{n})\) and \(\Vert {\widehat{{\varvec{\gamma}}}}-{\widetilde{{\varvec{\gamma}}}}\Vert =O_{p}(\xi _{n})\).
Define \(R_{i}=\int _{\mathcal {T}}X_{i}(t)\beta (t)\mathrm {d}t -{\varvec{H}}_{i}^{T}\widetilde{{\varvec{\gamma }}}\), \(i=1,\ldots ,n\). By Hölder's inequality, condition (C1) and (A.2), it holds that \( \mathbb {E}R_{1}=\mathbb {E}\langle X_{1}, \beta -\widetilde{\beta }\rangle \le \mathbb {E}\Vert X_{1}\Vert \Vert \beta -\widetilde{\beta }\Vert =O(J_{n}^{-r})\). Similarly, we can get \(\mathbb {E}R^{2}_{1}=O(J_{n}^{-2r})\).
Since \(J_{n}\asymp n^{1/(2r+1)}\), by Taylor’s expansion, we have
where \(\vartheta _{i}\) is between \(\epsilon _{i}\) and \(\epsilon _{i}-\xi _{n}{{\varvec{Z}}}^{T}_{i}{{\varvec{u}}}_{1} -\xi _{n}{\varvec{H}}^{T}_{i}{\varvec{u}}_{2}\).
For \(I_{1}\), by Taylor’s expansion, one has
where \(\epsilon _{i}^{*}\) is between \(\epsilon _{i}\) and \(\epsilon _{i}+R_{i}\), and \({{\varvec{D}}}_{i}=({{\varvec{Z}}}_{i}^{T}, {{\varvec{H}}}_{i}^{T})^{T}\).
By condition (C6) and the classical central limit theorem, we obtain that
Since \(\epsilon \) is independent of \(({\varvec{Z}}, X)\), we have \(\mathbb {E}\left[ \psi ^{\prime }_{\tau , h}\left( \epsilon \right) {\varvec{D}} R\right] =\mathbb {E}\psi ^{\prime }_{\tau , h}\left( \epsilon \right) \mathbb {E}[{\varvec{D}} R]\). Invoking (A.1) and conditions (C1) and (C4), we can infer that, for \(1\le l\le p\),
and for \(1\le s\le J_{n}\),
Thus, by condition (C6), we have \(\mathbb {E}\left[ \psi ^{\prime }_{\tau , h}\left( \epsilon \right) {\varvec{D}} R\right] =O(J^{-r}_{n})\). Similarly, we can prove that \(\mathrm{Var}[\psi ^{\prime }_{\tau , h}\left( \epsilon \right) {\varvec{D}} R]=o(1)\). Then, we can obtain that
which implies that
Therefore, after some direct calculations, we can deduce that \(|I_{1}|=O_{p}\left( n\xi _{n}^{2}\Vert {{\varvec{u}}}\Vert \right) \).
For \(I_{2}\), by similar arguments, we can get
Then, by choosing a sufficiently large \(C\), \(I_{2}\) dominates \(I_{1}\) on \(\Vert {\varvec{u}}\Vert =C\). Similarly, we can prove that
Since \(\xi _{n}\rightarrow 0\), it follows that \(\xi _{n}\Vert {\varvec{u}}\Vert \rightarrow 0\) with \(\Vert {\varvec{u}}\Vert =C\). Therefore, \(I_{3}\) is also dominated by \(I_{2}\) on \(\Vert {\varvec{u}}\Vert =C\). Moreover, by conditions (C4) and (C5) and Lemma 2, we obtain \(I_{2}>0\), which completes the proof of Eq. (A.3). According to Lemma 1, we can infer that \(\Vert \widehat{\beta }(t)-\widetilde{\beta }(t)\Vert ^2\asymp \Vert \widehat{{\varvec{\gamma }}}-\widetilde{{\varvec{\gamma }}}\Vert ^2=O_{p}(J_{n}^{-2r})\). Then, we have
The proof of Theorem 1 is completed. \({\square}\)
For \(j=1,\ldots , p\), let \(\breve{Z}_{ij}=\{\psi ^{\prime }_{\tau , h}(\epsilon _{i})\}^{1/2} Z_{ij}\), \(\breve{X}_{i}=\{\psi ^{\prime }_{\tau , h}(\epsilon _{i})\}^{1/2}X_{i}\) and \(\breve{\eta }_{ij}=\{\psi ^{\prime }_{\tau , h}(\epsilon _{i})\}^{1/2}\eta _{ij}\). By condition (C5), we know that \(\breve{\eta }_{ij}=\breve{Z}_{ij}-\langle \breve{X}_{i}, g_{j}\rangle \). Since \(\epsilon _{i}\) is independent of \((X_{i}, {\varvec{Z}}_{i})\), it follows that \(\mathbb {E}\left[ \breve{{\varvec{\eta }}}_{1}\breve{{\varvec{\eta }}}_{1}^{T}\right] =\mathbb {E}\psi ^{\prime }_{\tau , h}(\epsilon ){\varvec{\Sigma }}\), where \(\breve{{\varvec{\eta }}}_{i}=(\breve{\eta }_{i1},\ldots ,\breve{\eta }_{ip})^{T}\). Denote by \(\breve{{\varvec{Z}}}=(\breve{{\varvec{Z}}}_{1}, \ldots , \breve{{\varvec{Z}}}_{n})^{T}\), \(\breve{{\varvec{H}}}=(\breve{{\varvec{H}}}_{1}, \ldots , \breve{{\varvec{H}}}_{n})^{T}\) and \(\breve{{\varvec{A}}}=\breve{{\varvec{H}}}(\breve{{\varvec{H}}}^{T}\breve{{\varvec{H}}})^{-1}\breve{{\varvec{H}}}^{T}\), where \(\breve{{\varvec{Z}}}_{i}=\{\psi ^{\prime }_{\tau , h}(\epsilon _{i})\}^{1/2}{\varvec{Z}}_{i}\) and \(\breve{{\varvec{H}}}_{i}=\{\psi ^{\prime }_{\tau , h}(\epsilon _{i})\}^{1/2}{\varvec{H}}_{i}\), \(i=1,\ldots ,n\). The following lemma follows from Lemma 2 in Zhou et al. (2016a).
Lemma 3
Under conditions (C1)–(C6), as \(n\rightarrow \infty \), we have \(n^{-1}\breve{{\varvec{Z}}}^{T}({\varvec{I}}_{n}-\breve{{\varvec{A}}})\breve{{\varvec{Z}}} =\mathbb {E}[\psi ^{\prime }_{\tau , h}(\epsilon )]{\varvec{\Sigma }}+o_{p}(1)\), where \({\varvec{I}}_{n}\) denotes the \(n\times n\) identity matrix.
1.2 Proof of Theorem 2
Let
where \({\varvec{\eta }}=({\varvec{\alpha }}^{T}, {\varvec{\gamma }}^{T})^{T}\) and \({{\varvec{D}}}_{i}=({{\varvec{Z}}}_{i}^{T}, {{\varvec{H}}}_{i}^{T})^{T}\), \(i=1,\ldots ,n\).
Denote by \({\varvec{\eta }}_{0}=({\varvec{\alpha }}^{T}, \widetilde{{\varvec{\gamma }}}^{T})^{T}\). By Taylor’s expansion, there exists a vector \({\varvec{\eta }}^{*}\) on the line segment between \({\varvec{\eta }}_{0}\) and \(\widehat{{\varvec{\eta }}}\) such that
In light of Theorem 1, we know that, as \(n\rightarrow \infty \), with probability tending to 1, \(\mathcal {L}_{n}({{\varvec{\alpha }}}, {{\varvec{\gamma }}})\) attains a local minimum at \((\widehat{{\varvec{\alpha }}}, \widehat{{\varvec{\gamma }}})\). Then, by Eq. (5), it follows that \(\phi (\widehat{{\varvec{\eta }}})=0\). Therefore, we have
where \(R_{i}=\int _{\mathcal {T}}X_{i}(t)\beta (t)\mathrm {d}t -{\varvec{H}}_{i}^{T}\widetilde{{\varvec{\gamma }}}\) and \(\zeta _{i}\) is between \(\epsilon _{i}+R_{i}\) and \(\epsilon _{i}+R_{i}-{\varvec{D}}_{i}^{T}(\widehat{{\varvec{\eta }}}-{\varvec{\eta }}_{0})\). From Theorem 1, we know that \(R_{i}=O_{p}(J_{n}^{-r})\) and \(\Vert \widehat{{\varvec{\eta }}}-{\varvec{\eta }}_{0}\Vert =O_{p}\left( J_{n}^{-r}\right) \). Using arguments similar to those in the proof of Theorem 1, we can prove that
and
Combining (A.4), (A.5) and (A.6), after some simple calculations, we get
and
Note that \({{\varvec{D}}}_{i}^{T} ( \widehat{{\varvec{\eta }}}-{{\varvec{\eta }}}_{0})={{\varvec{Z}}}_{i}^{T} ( \widehat{{\varvec{\alpha }}}-{{\varvec{\alpha }}})+{{\varvec{H}}}_{i}^{T} ( \widehat{{\varvec{\gamma }}}-\widetilde{{\varvec{\gamma }}})\). For convenience, denote \({\varvec{\Phi }}_{n}=n^{-1}\sum _{i=1}^{n}\psi ^{\prime }_{\tau , h}(\epsilon _{i}){\varvec{H}}_{i}{\varvec{H}}_{i}^{T}\) and \({\varvec{\Psi }}_{n}=n^{-1}\sum _{i=1}^{n}\psi ^{\prime }_{\tau , h}(\epsilon _{i}){\varvec{H}}_{i}{\varvec{Z}}_{i}^{T}\).
Applying Lemma 2 and by some direct calculations based on Eq. (A.8), we get
Plugging this into Eq. (A.7), we can obtain
Observe that
and
Then, it follows that
Similarly, we can get
Hence, letting \(\widetilde{{\varvec{Z}}}_{i}= {\varvec{Z}}_{i}-{\varvec{\Psi }}_{n}^{T}{\varvec{\Phi }}_{n}^{-1}{\varvec{H}}_{i}\), it is easy to show that
Note that
Invoking Lemma 3, we can obtain that
Furthermore, letting \({\varvec{\Phi }}=\mathbb {E}[\psi ^{\prime }_{\tau , h}\left( \epsilon \right) {\varvec{H}}{\varvec{H}}^{T}]\) and \({\varvec{\Psi }}=\mathbb {E}[\psi ^{\prime }_{\tau , h}\left( \epsilon \right) {\varvec{H}}{\varvec{Z}}^{T}]\), we can prove that \({\varvec{\Phi }}_{n}={\varvec{\Phi }}+O_{p}(n^{-1/2})\) and \({\varvec{\Psi }}_{n}={\varvec{\Psi }}+O_{p}(n^{-1/2})\). Denote \(\widetilde{{\varvec{Z}}}^{*}_{i}= {\varvec{Z}}_{i}-{\varvec{\Psi }}^{T}{\varvec{\Phi }}^{-1}{\varvec{H}}_{i}\). Then, we have
which implies that
Since \(\epsilon _{i}\) is independent of \(({\varvec{Z}}_{i},X_{i})\), by condition (C6), we can get that \(\mathbb {E}[\psi _{\tau , h}(\epsilon _{1})\widetilde{{\varvec{Z}}}^{*}_{1}]=0\) and \(\mathbb {E}[\psi ^{2}_{\tau , h}(\epsilon _{1})\widetilde{{\varvec{Z}}}^{*}_{1}\widetilde{{\varvec{Z}}}^{*T}_{1}]=\mathbb {E}\psi ^{2}_{\tau , h}(\epsilon _{1}){\varvec{\Sigma }}\). Using the central limit theorem, we have
Note that
Since \(\widetilde{{\varvec{Z}}}_{i}-\widetilde{{\varvec{Z}}}^{*}_{i}=O_{p}(n^{-1/2})\) and \(\mathbb {E}\psi _{\tau , h}(\epsilon _{i})=0\), we can prove that
which implies that
Next, we need to prove that
For the jth element of \(\widetilde{{\varvec{Z}}}_{i}\), \(j=1,\ldots ,p\),
where \(\breve{{\varvec{Z}}}_{,j}\) is the jth column of \(\breve{{\varvec{Z}}}\), that is, \(\breve{{\varvec{Z}}}_{,j}=(\{\psi ^{\prime }_{\tau , h}(\epsilon _{1})\}^{1/2}Z_{1j}, \ldots , \{\psi ^{\prime }_{\tau , h}(\epsilon _{n})\}^{1/2}Z_{nj})^{T}\). Let \(R_{i}^{*}=\{\psi ^{\prime }_{\tau , h}(\epsilon _{i})\}^{1/2}R_{i}\), \(i=1,\ldots ,n\). Then, by the definition of \(R_{i}\) and condition (C5), one has
Denote by \(\widehat{g}_{j}(t)=\mathbf{B}_{q}(t)^{T}(\breve{{\varvec{H}}}^{T}\breve{{\varvec{H}}})^{-1}\breve{{\varvec{H}}}^{T}\breve{{\varvec{Z}}}_{,j}\). According to Theorem 2 in Zhou et al. (2016a), it holds that \(\Vert \widehat{g}_{j}-g_{j}\Vert =O_{p}(J_{n}^{-r})\). Therefore, we can infer that
which implies that \(\mathbb {E}[\psi ^{\prime }_{\tau , h}(\epsilon _{i})\widetilde{{\varvec{Z}}}_{i}R_{i}]=O\left( J^{-2r}_{n}\right) \). Similarly, we can prove that \(\mathrm{Var}[\psi ^{\prime }_{\tau , h}(\epsilon _{i})\widetilde{{\varvec{Z}}}_{i}R_{i}]=o(1)\). Then, we can obtain that
which implies that (A.12) holds. Then, Theorem 2 follows from (A.9)–(A.12) and Slutsky's theorem. \({\square}\)
1.3 Proofs of Theorem 3 and Corollary 1
Using arguments analogous to those for Theorem 3 and Corollary 1 in Zhou et al. (2016a), the proofs can be obtained immediately, so we omit the details. \({\square}\)
1.4 Proof of Theorem 4
Observe that \( \widehat{Y}_{n+1}={{\varvec{Z}}}_{n+1}^{T}\widehat{{\varvec{\alpha }}}+\int _{\mathcal {T}}X_{n+1}(t) \widehat{\beta }(t)\mathrm {d}t\) and \(Y_{n+1}={{\varvec{Z}}}_{n+1}^{T} {\varvec{\alpha }}+\int _{\mathcal {T}}X_{n+1}(t)\beta (t)\mathrm {d}t\). By the \(C_{r}\) inequality and Hölder's inequality, we have
By Theorems 1 and 2 and conditions (C1) and (C4), we can deduce that
This completes the proof of Theorem 4. \({\square}\)