
Global statistical inference for the difference between two regression mean curves with covariates possibly partially missing

  • Regular Article, published in Statistical Papers

Abstract

In two-sample problems it is of interest to examine the difference between the two regression curves, or to detect whether certain functions adequately describe the overall trend of the difference. In this paper, we propose a simultaneous confidence band (SCB) with asymptotically correct coverage probability as a global inference method for the difference curve, based on weighted local linear kernel regression estimates in each sample. Our procedure allows for random designs, different sample sizes, heteroscedastic errors and, in particular, missing covariates. Simulation studies investigating the finite sample properties of the new SCB support our asymptotic theory. The proposed SCB is used to analyze two data sets: one concerns human event-related potentials data that are fully observed, and the other concerns the Canada 2010/2011 youth student survey data with partially missing covariates, leading to a number of discoveries.
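To fix ideas, here is a minimal numerical sketch (not the authors' implementation) of the inverse-probability-weighted local linear estimator that the SCB is centered on: in each sample, the mean curve is fitted by weighted local linear regression with weights \(\delta_i \pi_i^{-1} K_h(X_i - x)\), so records with missing covariates drop out while the selection bias is corrected, and the difference of the two fits estimates \(m_1 - m_2\). The simulated design, mean functions, selection model, kernel, and bandwidth below are all illustrative assumptions.

```python
import numpy as np

def ipw_local_linear(x_grid, X, Y, delta, pi, h):
    """Local linear fit of E[Y|X=x] with inverse-probability weights
    delta/pi: terms with delta = 0 (missing covariate) vanish, and the
    1/pi factor corrects the resulting selection bias (a sketch)."""
    est = np.empty(len(x_grid))
    for j, x in enumerate(x_grid):
        u = X - x
        w = (delta / pi) * np.exp(-0.5 * (u / h) ** 2)   # Gaussian kernel K_h
        S0, S1, S2 = w.sum(), (w * u).sum(), (w * u ** 2).sum()
        T0, T1 = (w * Y).sum(), (w * u * Y).sum()
        est[j] = (S2 * T0 - S1 * T1) / (S0 * S2 - S1 ** 2)  # local intercept
    return est

rng = np.random.default_rng(0)
n1, n2, h = 2000, 1500, 0.2
X1, X2 = rng.uniform(0, 1, n1), rng.uniform(0, 1, n2)
Y1 = X1 ** 2 + 0.3 * rng.standard_normal(n1)
Y2 = np.sin(X2) + 0.3 * rng.standard_normal(n2)
# Covariates missing at random: the observation probability depends on
# the always-observed response, mimicking the MAR setting of the paper.
pi1 = 1.0 / (1.0 + np.exp(-(1.0 + Y1)))
pi2 = 1.0 / (1.0 + np.exp(-(1.0 + Y2)))
d1, d2 = rng.binomial(1, pi1), rng.binomial(1, pi2)

grid = np.linspace(0.1, 0.9, 17)
diff = (ipw_local_linear(grid, X1, Y1, d1, pi1, h)
        - ipw_local_linear(grid, X2, Y2, d2, pi2, h))
```

An SCB for \(m_1 - m_2\) would then take the form `diff` plus/minus a critical value times an estimated standard error over the grid, with the critical value obtained from the Gaussian-process approximation established in the appendix.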


References

  • Al Ahmari T, Alomar A, Al Beeybe J, Asiri N, Al Ajaji R, Al Masoud R, Al-Hazzaa H (2017) Associations of self-esteem with body mass index and body image among Saudi college-age females. Eat Weight Disord St 1:1–9

  • Baringhaus L, Kolbe D (2015) Two-sample tests based on empirical Hankel transforms. Stat Papers 56:597–617

  • Bickel P, Rosenblatt M (1973) On some global measures of the deviations of density function estimates. Ann Stat 1:1071–1095

  • Cai L, Gu L, Wang Q, Wang S (2020) Simultaneous confidence bands for nonparametric regression with missing covariate data. Revised manuscript submitted for publication. https://www.researchgate.net/publication/339642319

  • Cai L, Li L, Huang S, Ma L, Yang L (2020) Oracally efficient estimation for dense functional data with holiday effects. TEST 29:282–306

  • Cai L, Liu R, Wang S, Yang L (2019) Simultaneous confidence bands for mean and variance functions based on deterministic design. Stat Sin 29:505–525

  • Cai L, Yang L (2015) A smooth simultaneous confidence band for conditional variance function. TEST 24:632–655

  • Cai T, Low M, Ma Z (2014) Adaptive confidence bands for nonparametric regression functions. J Am Stat Assoc 109:1054–1070

  • Cao G, Wang L, Li Y, Yang L (2016) Oracle-efficient confidence envelopes for covariance functions in dense functional data. Stat Sin 26:359–383

  • Cao G, Yang L, Todem D (2012) Simultaneous inference for the mean function based on dense functional data. J Nonparametr Stat 24:359–377

  • Claeskens G, Van Keilegom I (2003) Bootstrap confidence bands for regression curves and their derivatives. Ann Stat 31:1852–1884

  • Cozzucoli P (2010) Simultaneous confidence intervals on partial means of classes in the two-stage stratified sampling. Stat Papers 51:673–685

  • Eubank R, Speckman P (1993) Confidence bands in nonparametric regression. J Am Stat Assoc 88:1287–1301

  • Fan J, Gijbels I (1996) Local polynomial modelling and its applications. Chapman and Hall, London

  • González-Manteiga W, Crujeiras RM (2013) An updated review of Goodness-of-Fit tests for regression models. TEST 22:361–411

  • Hall P, Titterington D (1988) On confidence bands in nonparametric density estimation and regression. J Multivar Anal 27:228–254

  • Härdle W (1989) Asymptotic maximal deviation of M-smoothers. J Multivar Anal 29:163–179

  • Härdle W, Marron J (1991) Bootstrap simultaneous error bars for nonparametric regression. Ann Stat 19:778–796

  • Hosmer D, Lemeshow S (2005) Applied logistic regression, 2nd edn. Wiley, New York

  • Huang X, Wang L, Yang L, Kravchenko AN (2008) Management practice effects on relationships of grain yields with topography and precipitation. Agronomy J 100:1463–1471

  • Johnston G (1982) Probabilities of maximal deviations for nonparametric regression function estimates. J Multivar Anal 12:402–414

  • Kling K, Hyde J, Showers C, Buswell B (1999) Gender differences in self-esteem: a meta-analysis. Psychol Bull 125:470–500

  • Leadbetter M, Lindgren G, Rootzén H (1983) Extremes and related properties of random sequences and processes. Springer-Verlag, New York

  • Liu W, Wu W (2010) Simultaneous nonparametric inference of time series. Ann Stat 38:2388–2421

  • Mojirsheibani M, Reese T (2017) Kernel regression estimation for incomplete data with applications. Stat Papers 58:185–209

  • Munk A, Dette H (1998) Nonparametric comparison of several regression functions: exact and asymptotic theory. Ann Stat 26:2339–2368

  • Neumeyer N, Sperlich S (2006) Comparison of separable components in different samples. Scand J Stat 33:477–501

  • Pardo-Fernández JC, Jiménez-Gamero MD, El Ghouch A (2015a) Tests for the equality of conditional variance functions in nonparametric regression. Electron J Stat 9:1826–1851

  • Pardo-Fernández JC, Jiménez-Gamero MD, El Ghouch A (2015b) A non-parametric ANOVA-type test for regression curves based on characteristic functions. Scand J Stat 42:197–213

  • Park C, Hannig J, Kang K-H (2014) Nonparametric comparison of multiple regression curves in scale-space. J Comput Graph Stat 23:657–677

  • Rivas-Martínez G, Jiménez-Gamero M, Moreno-Rebollo J (2019) A two-sample test for the error distribution in nonparametric regression based on the characteristic function. Stat Papers 60:1369–1395

  • Rosenblatt M (1952) Remarks on a multivariate transformation. Ann Math Stat 23:470–472

  • Silverman B (1986) Density estimation for statistics and data analysis. Chapman and Hall, London

  • Song Q, Yang L (2009) Spline confidence bands for variance functions. J Nonparametr Stat 21:589–609

  • Wang C, Wang S, Zhao L, Ou S-T (1997) Weighted semiparametric estimation in regression analysis with missing covariate data. J Am Stat Assoc 92:512–525

  • Wang J, Yang L (2009) Polynomial spline confidence bands for regression curves. Stat Sin 19:325–342

  • Wu W, Zhao Z (2007) Inference of trends in time series. J R Stat Soc Ser B 69:391–410

  • Xia Y (1998) Bias-corrected confidence bands in nonparametric regression. J R Stat Soc Ser B 60:797–811

  • Zhao S, Bakoyannis G, Lourens S, Tu W (2020) Comparison of nonlinear curves and surfaces. Comput Stat Data Anal 150:106987

  • Zhou S, Wang D, Zhu J (2020) Construction of simultaneous confidence bands for a percentile hyper-plane with predictor variables constrained in an ellipsoidal region. Stat Papers 61:1335–1346

  • Zhou Z, Wu W (2010) Simultaneous inference of linear models with time-varying coefficients. J R Stat Soc Ser B 72:513–531

  • Zi X, Zou C, Liu Y (2012) Two-sample empirical likelihood method for difference between coefficients in linear regression model. Stat Papers 53:83–93

  • Zuckerman M, Li C, Hall J (2016) When men and women differ in self-esteem and when they don’t: a meta-analysis. J Res Pers 64:34–51


Acknowledgements

We would like to thank the Editors and two referees for their constructive and helpful comments that substantially improved an earlier version of this paper. This research was supported in part by the National Natural Science Foundation of China Award NSFC #11901521, First Class Discipline of Zhejiang–A (Zhejiang Gongshang University–Statistics), Zhejiang Province Statistical Research Program #20TJQN04, and the Simons Foundation Mathematics and Physical Sciences Program Award #499650.

Author information

Correspondence to Suojin Wang.


Appendix

For any functions \(\varphi _{n}\left( x\right) \) and \(\phi _{n}\left( x\right) \), \(x\in {\mathcal {D}}\), we write \(\varphi _{n}\left( x\right) =U_{p}\left( \phi _{n}\left( x\right) \right) \) if \(\varphi _{n}\left( x\right) /\phi _{n}\left( x\right) \) is bounded uniformly in probability for \(x\in {\mathcal {D}}\), and \(\varphi _{n}\left( x\right) =u_{p}\left( \phi _{n}\left( x\right) \right) \) if \(\varphi _{n}\left( x\right) /\phi _{n}\left( x\right) \) tends to 0 as \(n\rightarrow \infty \) uniformly in probability for \(x\in {\mathcal {D}}\).

A.1 Preliminaries

In this subsection, we give some lemmas that are needed in our theoretical development. The following lemma is Theorem 1 in Cai et al. (2020).

Lemma A.1

Under Assumptions (A1)–(A5), as \(n\rightarrow \infty \),

$$\begin{aligned} {\tilde{m}}_{k}(x)-m_{k}(x)=\psi _{k}\left( x\right) +U_{p}\left( h^{2}\right) , \end{aligned}$$
(A.1)

where \(\psi _{k}\left( x\right) =n_{k}^{-1}f_{k}^{-1}\left( x\right) \sum \limits _{i=1}^{n_{k}}\frac{\delta _{ik}}{\pi _{ik}}K_{h}\left( X_{ik}-x\right) \varepsilon _{ik}\) for \(k=1,2.\)

By (A.4) in Cai et al. (2020), one has that conditional on \(\Delta _{nk}=n_{0k}\), \(\psi _{k}\left( x\right) \) is a stochastic process with mean zero and variance \( n_{k}^{-2}n_{0k}h^{-1}d_{k}\left( x\right) \tau _0 \left\{ 1+u_{p}\left( 1\right) \right\} \). Here \(\{ n_{0k}\}\) is a sequence of numbers related to \(n_k\) with \(1\le n_{0k}\le n_k\). By (7), there exists a constant \(r_k>0\) such that \(r_k \le \Delta _{nk}/n_{k}\le 1\) in probability as \(n_k \rightarrow \infty \). Therefore, we only need to consider \(n_{0k} \ge r_k n_k\); that is, \(n_{0k}\) and \(n_k\) are of the same order as \(n_k \rightarrow \infty \).

Because the data are i.i.d., conditioning on \(\Delta _{nk}=n_{0k}\) is equivalent to conditioning on \({\varvec{\delta }}_k=(\delta _{1k},\ldots ,\delta _{n_{k}k})^T\) having \(n_{0k}\) elements equal to 1 and the remaining \((n_k-n_{0k})\) elements equal to 0. Without loss of generality, let \(\delta _{ik}=1\) for \(i=1,2,\ldots ,n_{0k}\) and \( \delta _{ik}=0\) for \(i=n_{0k}+1,\ldots ,n_{k}\), \(k=1,2\). Then conditional on \(\Delta _{nk}=n_{0k}\), one can write

$$\begin{aligned} \psi _{k}\left( x\right) =f_{k}^{-1} \left( x\right) n_{k}^{-1} \sum \limits _{i=1}^{n_{0k}}\frac{1}{\pi _{ik}}K_{h}\left( X_{ik}-x\right) \varepsilon _{ik}. \end{aligned}$$
(A.2)

Furthermore, conditional on \(\Delta _{nk}=n_{0k}\) define a rescaled stochastic process of \(\psi _{k}\left( x\right) \):

$$\begin{aligned} \xi _{n_{0k}}\left( x\right) =\left( n _{0k}h\right) ^{1/2}d_{k}^{-1/2}\left( x\right) f_{k}^{-1}\left( x\right) n_{0k}^{-1}\sum \limits _{i=1}^{n_{0k}}\frac{1}{\pi _{ik}}K_{h}\left( X_{ik}-x\right) \varepsilon _{ik}. \end{aligned}$$

Also define a Gaussian stochastic process \(\xi _{n_{0k}}^{*}\left( x\right) \):

$$\begin{aligned} \xi _{n_{0k}}^{*}\! \left( x\right) \! =\! h^{1/2}\! d_{nk}^{-1/2} \! \left( x\right) \! f_{k}^{-1}\! \left( x\right) \! \int \!\! \int \!\!\frac{1}{\pi _{k}\left( m_{k}\! \left( u\right) \! +\! \varepsilon _{k}\right) }K_{h} \! \left( u\! - \!x\right) dW_{\!n_{0k}k}\! \left( T\!\left( u,\varepsilon _{k}\right) \right) , \end{aligned}$$

where \(d_{nk}\left( x\right) =f_{k}^{-2}\left( x\right) f_{X_{1k}|\delta _{1k}=1}\left( x\right) \mathop {\mathrm{E}}\left\{ \varepsilon _{1k}^{2}\pi _{k}^{-2}\left( m_{k}\left( x\right) +\varepsilon _{1k}\right) I\left( \left| \varepsilon _{1k}\right| \le \kappa _{n}\right) \big | X_{1k}=x,\delta _{1k}=1\right\} \) with \(\kappa _{n}=n^{\gamma }\), \(2/\left( 3\eta \right) \le \gamma \le 6^{-1}\), in which \(\eta \) is given in Assumption (A2), and \(W_{n_{0k}k}\left( T\left( u,\varepsilon _{k}\right) \right) \) is a sequence of Wiener processes. Here \(T\left( \cdot ,\cdot \right) \) is the Rosenblatt quantile transformation of \((X_k,\varepsilon _k)\) in Rosenblatt (1952), which produces mutually independent uniform random variables on \(\left[ 0,1\right] ^{2}\). By Lemmas A.6–A.8 in Cai et al. (2020), one obtains the following result.

Lemma A.2

Under Assumptions (A1)–(A5), as \(n_{0k}\rightarrow \infty \),

$$\begin{aligned} \sup _{x\in \left[ a_{0},b_{0}\right] }\left| \xi _{n_{0k}}^{*}\left( x\right) -\xi _{n_{0k}}\left( x\right) \right| =o_{p}\left( \log ^{-1/2}n\right) , \end{aligned}$$
(A.3)

and \(\xi _{n_{0k}}^{*}\left( x\right) \) is a Gaussian process with mean zero and covariance function uniformly approximated by \(\tau _h(x-x^{\prime })\).

A.2 Proofs of the main results in Section 2

Proof of Proposition 1

Let \(G_{nk}^{*}\left( x\right) =(n\Delta _{nk})^{1/2} n_k^{-1}d_{k}^{1/2}\left( x\right) \xi _{n_{0k}}^{*}\left( x\right) \). Lemma A.2 implies that conditional on \(\Delta _{nk}=n_{0k} \), \(G_{nk}^{*}\left( x\right) \) is a Gaussian process with mean zero and covariance function uniformly approximated by

$$\begin{aligned} \Sigma _{nk}\left( x,x^{\prime } \right) =c_{k}^{2}d_{k}^{1/2}\left( x\right) d_{k}^{1/2}\left( x^{\prime }\right) \tau _h(x-x^{\prime }). \end{aligned}$$

Next, notice that conditional on \(\Delta _{nk}=n_{0k}\), \( \sqrt{nh} \psi _{k}\left( x\right) =(n \Delta _{nk})^{1/2} n_k^{-1}d_{k}^{1/2} \left( x\right) \xi _{n_{0k}}\left( x\right) \). This together with (A.3) implies that conditional on \(\Delta _{nk}=n_{0k}\),

$$\begin{aligned} \sup _{x\in \left[ a_{0},b_{0}\right] }\left| G_{nk}^{*}\left( x\right) -\sqrt{nh}\psi _{k}\left( x\right) \right| =o_{p}\left( \log ^{-1/2}n\right) . \end{aligned}$$

As discussed below Lemma A.1, \(n_{0k}\), \(n_k\) and \(n\) are of the same order as \(n \rightarrow \infty \). Since \(h\ll n^{-1/5}\log ^{-1/5}n\) by Assumption (A5), by (A.1) and the definition of \(G_{nk}(x)\) one has that, conditional on \(\Delta _{nk}=n_{0k}\),

$$\begin{aligned} \sup _{x\in \left[ a_{0},b_{0}\right] }\left| G_{nk}^{*}\left( x\right) -G_{nk}(x)\right|\le & {} \sup _{x\in \left[ a_{0},b_{0}\right] }\left| G_{nk}^{*}\left( x\right) -\sqrt{nh}\psi _{k}\left( x\right) \right| +U_{p}\left( \sqrt{nh}h^{2}\right) \\= & {} o_{p}\left( \log ^{-1/2}n\right) = o_{p}\left( \log ^{-1/2}n_{0k}\right) , \end{aligned}$$

as \(n_{0k} \rightarrow \infty \). That is, for every \(\varepsilon >0\), there exist \(M_{\varepsilon }\) and \(N_{0k}\) such that

$$\begin{aligned} P\left\{ \log ^{1/2}n\sup _{x\in \left[ a_{0},b_{0}\right] }\left| G_{nk}^{*}\left( x\right) -G_{nk}(x)\right| >M_{\varepsilon } \big \vert \Delta _{nk}=n_{0k}\right\} \le \varepsilon /2 \end{aligned}$$

for all \(n_{0k}\ge N_{0k}\). On the other hand, (7) implies that there exists \(N_{k}>N_{0k}\) such that \(P\left( \Delta _{nk}\ge N_{0k}\right) >1-\varepsilon /2\) for \(n_{k}\ge N_{k}\). Therefore, for \(n_{k} \ge N_{k}\),

$$\begin{aligned}&P\left\{ \log ^{1/2}n\sup _{x\in \left[ a_{0},b_{0}\right] }\left| G_{nk}^{*}\left( x\right) -G_{nk}(x)\right|>M_{\varepsilon }\right\} \\&= \sum \limits _{n_{0k}=0}^{n_{k}} P\left\{ \log ^{1/2}n\sup _{x\in \left[ a_{0},b_{0}\right] }\left| G_{nk}^{*}\left( x\right) -G_{nk}(x)\right|> M_{\varepsilon }\big \vert \Delta _{nk}=n_{0k} \right\} P\left( \Delta _{nk}=n_{0k}\right) \\&\le \sum \limits _{n_{0k}=N_{0k}}^{n_{k}}P\left\{ \log ^{1/2}n\sup _{x\in \left[ a_{0},b_{0}\right] }\left| G_{nk}^{*}\left( x\right) -G_{nk}(x)\right| > M_{\varepsilon }\big \vert \Delta _{nk}=n_{0k} \right\} P\left( \Delta _{nk}=n_{0k}\right) \\&\quad +\varepsilon /2 \\&\le \varepsilon /2+\varepsilon /2=\varepsilon . \end{aligned}$$

By definition, this means that unconditionally \(\sup _{x\in \left[ a_{0},b_{0}\right] }\left| G_{nk}^{*}\left( x\right) -G_{nk}(x)\right| =o_{p}\left( \log ^{-1/2}n\right) \), completing the proof. \(\square \)

Proof of Theorem 1

It is readily seen that s(x) is bounded away from zero for all \( x\in [a_{0},b_{0}]\). By Equation (4), it is clear that

$$\begin{aligned} \sup _{x\in \left[ a_{0},b_{0}\right] }\left| D_{n}\left( x\right) -D_{n}^{*}\left( x\right) \right| =o_{p}\left( \log ^{-1/2}n\right) . \end{aligned}$$
(A.4)

Applying the triangle inequality twice, one obtains that, for any real numbers a and b,

$$\begin{aligned} \left| \, \vert a \vert - \vert b \vert \, \right| \le \left| a - b \right| . \end{aligned}$$
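For completeness, the two applications are the following: writing \(a=(a-b)+b\) and \(b=(b-a)+a\),

```latex
\begin{aligned}
|a| &= |(a-b)+b| \le |a-b|+|b|
  &&\Longrightarrow & |a|-|b| &\le |a-b|,\\
|b| &= |(b-a)+a| \le |a-b|+|a|
  &&\Longrightarrow & -\left( |a|-|b| \right) &\le |a-b|,
\end{aligned}
```

and the two one-sided bounds together give \(\left| \, \vert a \vert - \vert b \vert \, \right| \le \left| a-b\right| \).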

Thus for any given \(\epsilon > 0\), it is readily shown that for any \(t \in {\mathbb {R}}\)

$$\begin{aligned} P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}\left( x\right) | \le t\right)\le & {} P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}^{*}\left( x\right) | \le t + \epsilon \right) \\&+ P\left( \sup _{x\in [a_{0},b_{0}]}\left| \, |D_{n}(x) |- | D_{n}^{*}(x)|\,\right| \ge \epsilon \right) \\\le & {} P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}^{*}\left( x\right) | \le t + \epsilon \right) \\&+ P\left( \sup _{x\in [a_{0},b_{0}]}\left| D_{n}(x) - D_{n}^{*}(x) \right| \ge \epsilon \right) . \end{aligned}$$

Therefore,

$$\begin{aligned}&\sup _{t\in {\mathbb {R}}} \left\{ P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}\left( x\right) | \le t\right) - P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}^{*}\left( x\right) | \le t + \epsilon \right) \right\} \\&\quad \le P\left( \sup _{x\in [a_{0},b_{0}]}\left| D_{n}(x) - D_{n}^{*}(x) \right| \ge \epsilon \right) , \end{aligned}$$

which with (A.4) concludes that

$$\begin{aligned} {\overline{\lim }}_{n\rightarrow \infty }\sup _{t\in {\mathbb {R}}} \left\{ P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}\left( x\right) | \le t\right) - P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } |D_{n}^{*}\left( x\right) | \le t + \epsilon \right) \right\} \le 0. \end{aligned}$$
(A.5)

Similarly, by symmetry,

$$\begin{aligned}&{\underline{\lim }}_{n\rightarrow \infty } \sup _{t\in {\mathbb {R}}} \left\{ P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}\left( x\right) |\le t\right) - P\left( \sup _{x\in \left[ a_{0},b_{0}\right] }| D_{n}^{*}\left( x\right) | \le t - \epsilon \right) \right\} \nonumber \\&\quad ={\underline{\lim }}_{n\rightarrow \infty } \sup _{t\in {\mathbb {R}}} \left\{ P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}\left( x\right) | \le t + \epsilon \right) - P\left( \sup _{x\in \left[ a_{0},b_{0} \right] }| D_{n}^{*}\left( x\right) | \le t \right) \right\} \nonumber \\&\quad =-{\overline{\lim }}_{n\rightarrow \infty } \sup _{t\in {\mathbb {R}}} \! \left\{ \!P\left( \sup _{x\in \left[ a_{0},b_{0}\right] }\! | D_{n}^{*}\left( x\right) | \le t\right) - \!P\left( \sup _{x\in \left[ a_{0},b_{0}\right] }\!| D_{n}\left( x\right) | \le t + \epsilon \right) \! \right\} \nonumber \\&\quad \ge 0. \end{aligned}$$
(A.6)

Note that in light of Lemma A.2, \( D_{n}^{*}\left( x\right) \) can be viewed asymptotically as a Gaussian process with mean zero and covariance function \(\Sigma _n(x,x^{\prime })\), which converges to \(\Sigma (x,x^{\prime })\) (equal to 1 when \(x = x^{\prime }\) and to 0 otherwise). It follows that \(S_n^* =\sup _{x\in \left[ a_{0},b_{0}\right] }| D_{n}^{*}\left( x\right) | \) has a smooth limiting distribution, say \(\Psi (t)\). Therefore, one can easily obtain that

$$\begin{aligned} \lim _{n\rightarrow \infty } \sup _{t\in {\mathbb {R}}} | P(S_n^* \le t ) - \Psi (t)| = 0, \end{aligned}$$

and that \(\lim _{\epsilon \rightarrow 0}\sup _{t\in {\mathbb {R}}} |\Psi (t+\epsilon ) -\Psi (t) |= 0\). These two equations imply that for any small \(\epsilon > 0\)

$$\begin{aligned} \lim _{n\rightarrow \infty } \sup _{t\in {\mathbb {R}}} P(t < S_n^* \le t+\epsilon ) = o(\epsilon ). \end{aligned}$$
(A.7)

Moreover, by (A.5) and (A.7) one has that

$$\begin{aligned}&{\overline{\lim }}_{n\rightarrow \infty } \sup _{t\in {\mathbb {R}}} \left\{ P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}\left( x\right) |\le t\right) - P\left( \sup _{x\in \left[ a_{0},b_{0}\right] }| D_{n}^{*}\left( x\right) | \le t \right) \right\} \\&\quad \le {\overline{\lim }}_{n\rightarrow \infty }\sup _{t\in {\mathbb {R}}} \left\{ P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}\left( x\right) | \le t\right) - P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } |D_{n}^{*}\left( x\right) | \le t + \epsilon \right) \right\} \\&\qquad +\;{\overline{\lim }}_{n\rightarrow \infty } \sup _{t\in {\mathbb {R}}} P\left( t < \sup _{x\in \left[ a_{0},b_{0}\right] }| D_{n}^{*}\left( x\right) | \le t+\epsilon \right) \\&\quad \le 0 + o(\epsilon ) = o(\epsilon ). \end{aligned}$$

Since \(\epsilon > 0\) can be arbitrarily small and the left-hand side does not depend on \(\epsilon \),

$$\begin{aligned} {\overline{\lim }}_{n\rightarrow \infty }\sup _{t\in {\mathbb {R}}} \left\{ P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}\left( x\right) | \le t\right) - P\left( \sup _{x\in \left[ a_{0},b_{0}\right] }| D_{n}^{*}\left( x\right) | \le t \right) \right\} \le 0. \end{aligned}$$

Likewise, by (A.6) and (A.7) one can obtain that

$$\begin{aligned} {\underline{\lim }}_{n\rightarrow \infty } \sup _{t\in {\mathbb {R}}} \left\{ P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}\left( x\right) | \le t\right) - P\left( \sup _{x\in \left[ a_{0},b_{0}\right] }| D_{n}^{*}\left( x\right) | \le t \right) \right\} \ge 0. \end{aligned}$$

Therefore,

$$\begin{aligned} \lim _{n\rightarrow \infty } \left| \sup _{t\in {\mathbb {R}}} \left\{ P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}\left( x\right) | \le t\right) - P\left( \sup _{x\in \left[ a_{0},b_{0}\right] }| D_{n}^{*}\left( x\right) | \le t \right) \right\} \right| = 0. \end{aligned}$$
(A.8)

Similarly, one can also obtain

$$\begin{aligned} \lim _{n\rightarrow \infty } \left| \sup _{t\in {\mathbb {R}}} \left\{ P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}^{*}\left( x\right) | \le t\right) - P\left( \sup _{x\in \left[ a_{0},b_{0}\right] }| D_{n}\left( x\right) | \le t \right) \right\} \right| = 0. \end{aligned}$$
(A.9)

Define \(|y|_+ = y\) if \(y\ge 0\) and \(|y|_+ = 0\) if \(y < 0\), and \(|y|_- = -y\) if \(y\le 0\) and \(|y|_- = 0\) if \(y > 0\). Then \(|k(t)| = |k(t)|_+ +|k(t)|_-\) for any real function k(t) and \(t \in {\mathbb {R}}\). Thus,

$$\begin{aligned} \sup _{t\in {\mathbb {R}}} \left| k(t) \right|\le & {} \sup _{t\in {\mathbb {R}}} \left| k(t) \right| _+ +\sup _{t\in {\mathbb {R}}} \left| k(t) \right| _- \nonumber \\\le & {} \left| \sup _{t\in {\mathbb {R}}} k(t) \right| +\left| \sup _{t\in {\mathbb {R}}} \left\{ - k(t) \right\} \right| . \end{aligned}$$
(A.10)
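The second inequality in (A.10) can be checked directly from the definitions of \(\left| \cdot \right| _{+}\) and \(\left| \cdot \right| _{-}\):

```latex
\begin{aligned}
\sup_{t\in\mathbb{R}} \left| k(t)\right|_{+}
  &= \max\left\{ \sup_{t\in\mathbb{R}} k(t),\, 0\right\}
   \le \left| \sup_{t\in\mathbb{R}} k(t)\right|,\\
\sup_{t\in\mathbb{R}} \left| k(t)\right|_{-}
  &= \max\left\{ \sup_{t\in\mathbb{R}} \left\{ -k(t)\right\},\, 0\right\}
   \le \left| \sup_{t\in\mathbb{R}} \left\{ -k(t)\right\} \right|.
\end{aligned}
```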

Letting \(k(t) = P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}\left( x\right) | \le t\right) - P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}^{*}\left( x\right) | \le t \right) \) in (A.10) and using (A.8) and (A.9) one has that

$$\begin{aligned}&{\overline{\lim }}_{n\rightarrow \infty } \sup _{t\in {\mathbb {R}}} \left| P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}\left( x\right) |\le t\right) - P\left( \sup _{x\in \left[ a_{0},b_{0}\right] }| D_{n}^{*}\left( x\right) | \le t \right) \right| \\&\quad \le {\overline{\lim }}_{n\rightarrow \infty } \left| \sup _{t\in {\mathbb {R}}} \left\{ P\left( \sup _{x\in \left[ a_{0},b_{0}\right] } | D_{n}\left( x\right) | \le t \right) - P\left( \sup _{x\in \left[ a_{0},b_{0}\right] }| D_{n}^{*}\left( x\right) | \le t \right) \right\} \right| \\&\qquad + \,{\overline{\lim }}_{n\rightarrow \infty } \left| \sup _{t\in {\mathbb {R}}} \! \left\{ \!P\left( \sup _{x\in \left[ a_{0},b_{0} \right] }\! | D_{n}^{*}\left( x\right) | \le t\right) \!-\! P\left( \sup _{x\in \left[ a_{0},b_{0}\right] }\!| D_{n}\left( x\right) | \le t \right) \! \right\} \right| \! = 0, \end{aligned}$$

completing the proof of Theorem 1.

A.3 Proofs of some statements in the Introduction

This subsection contains the main steps showing that, under certain strong conditions, the standardized version of the estimation error \({\tilde{m}}_{1}(x)-{\tilde{m}}_{2}(x)-\left( m_{1}(x)-m_{2}(x)\right) \) has a standard Gumbel extreme value distribution, as stated in Theorem A.1 below. The proofs are similar to those in Cai et al. (2020) for the single-population problem. Theorem A.1 reveals that (i) when the data are fully observed, if \(\sigma _{1}^{2}(x)\) is proportional to \(\sigma _{2}^{2}(x)\) and \(f_{1}\left( x\right) \) and \( f_{2}\left( x\right) \) are equal, the extreme value distribution of the estimation error is a Gumbel distribution; and (ii) when the covariates are MAR, the result also holds under the strong condition that \(d_1(x)\) is proportional to \(d_2(x)\), which is difficult to check. The following lemmas are needed in the proof of Theorem A.1.

Lemma A.3

If the Gaussian process \(\zeta \left( s\right) ,0\le s\le T\) is stationary with mean zero and variance one, and covariance function satisfying

$$\begin{aligned} \mathrm {corr}\left( \zeta \left( s\right) ,\zeta \left( s+t\right) \right) = {\mathbb {E}}\zeta \left( s\right) \zeta \left( s+t\right) =1-C|t|^{a}+o(|t|^{a})\ \text { as } t\rightarrow 0 \end{aligned}$$

for some constants \(C>0\) and \(0< a \le 2\), then as \(T\rightarrow \infty \),

$$\begin{aligned} {\mathbb {P}}\left[ \rho _{T}\left\{ \sup _{s\in \left[ 0,T\right] }\left| \zeta \left( s\right) \right| -\gamma _{T}\right\} <z\right] \rightarrow \exp \left\{ -2\exp \left( -z\right) \right\} ,\forall z\in {\mathbb {R}}, \end{aligned}$$

in which \(\rho _{T}=\left( 2\log T\right) ^{1/2}\) and

$$\begin{aligned} \gamma _{T}=\rho _{T}+\rho _{T}^{-1}\left\{ \frac{2-a}{2a}\log \left( \rho _{T}^{2}/2\right) +\log \left( C^{1/a}H_{a}\left( 2\pi \right) ^{-1/2}2^{\left( 2-a\right) /2a}\right) \right\} , \end{aligned}$$

where \(H_a\) is a certain strictly positive constant (\(H_{1}=1,H_{2}=\pi ^{-1/2}\)).

This result is a direct conclusion of Theorems 11.1.5 and 12.3.5 of Leadbetter et al. (1983).

By (A.4) in Cai et al. (2020), one can easily obtain that conditional on \(\Delta _{nk}=n_{0k}\), \(\psi _{1}\left( x\right) -\psi _{2}\left( x\right) \) is a stochastic process with mean 0 and variance

$$\begin{aligned} \left\{ n_{1}^{-2}h^{-1}\Delta _{n1}v_{1}(x)+n_{2}^{-2}h^{-1}\Delta _{n2}v_{2}(x)\right\} \left\{ 1+u_{p}\left( 1\right) \right\} , \end{aligned}$$

where \(v_{k}(x) = \tau _0d_k(x)\), for \( k=1,2\). Define the standardized version of \(\psi _{1}\left( x\right) -\psi _{2}\left( x\right) \) as follows:

$$\begin{aligned}&\xi _{n}\left( x\right) = \frac{\psi _{1}\left( x\right) -\psi _{2}\left( x\right) }{ \sqrt{\left\{ n_{1}^{-2}h^{-1}\Delta _{n1}v_{1}(x)+n_{2}^{-2}h^{-1}\Delta _{n2}v_{2}(x)\right\} }} \\& =\frac{n_{1}h^{1/2}\Delta _{n1}^{-1/2}v_{1}^{-1/2}(x)\psi _{1} \left( x\right) }{\sqrt{1+r_{n}\left( x\right) }}-\frac{ n_{2}h^{1/2}\Delta _{n2}^{-1/2}v_{2}^{-1/2}(x)\psi _{2}\left( x\right) }{\sqrt{ 1+r_{n}^{-1}\left( x\right) }} \end{aligned}$$

for \(x\in \left[ a_0,b_{0}\right] \), where \(r_{n}\left( x\right) =\frac{n_{1}^{2}\Delta _{n2}d_{2}(x)}{n_{2}^{2}\Delta _{n1}d_{1}(x)}\).

Lemma A.4

Under Assumptions (A1)–(A5), if there exists a constant \(c>0\) such that \(d_2(x)=cd_1(x)\), then one has that as \(n_{01},n_{02}\rightarrow \infty \), for any \(t\in {\mathbb {R}}\),

$$\begin{aligned} P\!\left\{ \!\rho _{h}\!\left( \sup _{x\in \left[ a_{0},b_{0}\right] } \! |\xi _{n}\left( x\right) |\! -\! \gamma _{h}\!\right) \! \le t \bigg \vert \Delta _{n1}=n_{01},\Delta _{n2}=n_{02}\!\right\} \! \rightarrow \!\exp \left\{ -2\exp \!\left( -t\right) \right\} , \end{aligned}$$

where

$$\begin{aligned}&\rho _{h} =\sqrt{-2\log \left( h/\left( b_{0}-a_{0}\right) \right) } ,\,\,\,\gamma _{h}=\rho _{h}+2^{-1}\rho _{h}^{-1}\log \left( 4^{-1} \pi ^{-2}C_{K} \right) , \\&C_{K} =\tau _0^{-1} \int \left\{ K^{(1)}\left( u\right) \right\} ^{2}du. \end{aligned}$$

Proof of Lemma A.4

By Lemmas A.6–A.8 in Cai et al. (2020), it is readily seen that \(\xi _{n}\left( x\right) \) has the same absolute maximum asymptotic distribution as

$$\begin{aligned} \xi _{n}^{*}\left( x\right) =\frac{h^{1/2}\int K_{h}\left( u-x\right) dW_{1}\left( u\right) }{\tau _0^{1/2}\sqrt{1+r_{n}\left( x\right) }}-\frac{h^{1/2}\int K_{h}\left( u-x\right) dW_{2}\left( u\right) }{\tau _0^{1/2}\sqrt{1+r_{n}^{-1}\left( x\right) }}, \end{aligned}$$

where \(W_{1}\left( u\right) \) and \(W_{2}\left( u\right) \) are two independent two-sided Wiener processes on \(\left( -\infty ,+\infty \right) \) . The absolute maximum of \(\xi _{n}^{*}\left( x\right) \) has the following probability law:

$$\begin{aligned}&{\mathcal {L}}\left\{ \xi _{n}^{*}\left( x\right) ,x\in \left[ a_0,b_{0} \right] \right\} = \\&{\mathcal {L}}\left\{ \! \frac{h^{-1/2}\!\int \! K\left( u/h-t\right) dW_{1}\!\left( u\right) }{\tau _0^{1/2}\sqrt{1+r_{n}\left( ht\right) }}\!-\! \frac{h^{-1/2}\!\int \! K\left( u/h-t\right) dW_{2}\!\left( u\right) }{\tau _0^{1/2}\sqrt{1+r_{n}^{-1}\left( ht\right) }},t\! \in \!\left[ \frac{a_{0}}{h}, \frac{b_{0}}{h}\right] \!\right\} \\&={\mathcal {L}}\left\{ \frac{\int K\left( s-t\right) dW_{1}\!\left( s\right) }{ \tau _0^{1/2}\sqrt{1+r_{n}\left( ht\right) }}-\frac{\int K\left( s-t\right) dW_{2}\!\left( s\right) }{\tau _0^{1/2} \sqrt{1+r_{n}^{-1}\left( ht\right) }},t\in \left[ \frac{a_{0}}{h},\frac{b_{0}}{h}\right] \right\} . \end{aligned}$$

Let \(\eta _{n}\left( t\right) =\frac{\int K\left( s-t\right) dW_{1}\left( s\right) }{\tau _0^{1/2}\sqrt{ 1+r_{n}\left( ht\right) }}-\frac{\int K\left( s-t\right) dW_{2}\left( s\right) }{\tau _0^{1/2}\sqrt{1+r_{n}^{-1}\left( ht\right) }}\). Then \(\eta _{n}\left( t\right) \) is a Gaussian process with mean 0,  variance 1 and covariance function

$$\begin{aligned} \mathop {\mathrm{cov}}\left\{ \eta _{n}\left( t_{1}\right) , \eta _{n}\left( t_{2}\right) \right\}= & {} \mathop {\mathrm{E}}\left\{ \! \left( \! \frac{\int K\left( s-t_{1}\right) dW_{1}\!\left( s\right) }{\tau _0^{1/2}\sqrt{1+r_{n}\left( ht_{1}\right) } }-\frac{\int K\left( s-t_{1}\right) dW_{2}\!\left( s\right) }{\tau _0^{1/2}\sqrt{1+r_{n}^{-1}\left( ht_{1}\right) }}\! \right) \right. \\&\times \! \left. \left( \!\frac{\int K\left( s-t_{2}\right) dW_{1}\!\left( s\right) }{ \tau _0^{1/2}\sqrt{1+r_{n}\left( ht_{2}\right) }}-\frac{ \int K\left( s-t_{2}\right) dW_{2}\!\left( s\right) }{\tau _0^{1/2}\sqrt{1+r_{n}^{-1}\left( ht_{2}\right) }}\!\right) \! \right\} \\= & {} \mathop {\mathrm{E}}\left( \frac{\int K\left( s-t_{1}\right) dW_{1}\left( s\right) }{\tau _0^{1/2}\sqrt{1+r_{n}\left( ht_{1}\right) } }\frac{\int K\left( s-t_{2}\right) dW_{1}\left( s\right) }{\tau _0^{1/2}\sqrt{1+r_{n}\left( ht_{2}\right) }}\right) \\&+\mathop {\mathrm{E}}\left( \frac{\int K\left( s-t_{1}\right) dW_{2}\left( s\right) }{\tau _0^{1/2}\sqrt{1+r_{n}^{-1}\left( ht_{1}\right) }}\frac{\int K\left( s-t_{2}\right) dW_{2}\left( s\right) }{ \tau _0^{1/2}\sqrt{1+r_{n}^{-1}\left( ht_{2}\right) }} \right) \\= & {} \frac{\int K\left( s-t_{1}\right) K\left( s-t_{2}\right) ds}{\tau _0 \sqrt{1+r_{n}\left( ht_{1}\right) }\sqrt{1+r_{n}\left( ht_{2}\right) }}+\frac{\int K\left( s-t_{1}\right) K\left( s-t_{2}\right) ds }{\tau _0 \sqrt{1+r_{n}^{-1}\left( ht_{1}\right) }\sqrt{ 1+r_{n}^{-1}\left( ht_{2}\right) }} \\= & {} \tau _0^{-1}\int K\left( s-t_{1}\right) K\left( s-t_{2}\right) ds =\tau _0^{-1}\int K\left( s\right) K\left( s+t_1-t_{2}\right) ds, \end{aligned}$$

where the last two equalities hold because \(d_2(\cdot )=cd_1(\cdot )\) implies \(r_n(ht_1)=r_n(ht_2)=c\), so that \((1+c)^{-1}+(1+c^{-1})^{-1}=1\) and the two terms combine; the final equality follows from the change of variables \(s\mapsto s+t_{2}\).
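As a quick numerical sanity check (outside the proof), the collapse of the two covariance terms rests on the identity \((1+c)^{-1}+(1+c^{-1})^{-1}=1\) for every \(c>0\); the helper name below is ours, for illustration only:

```python
def weight_sum(c):
    """Sum of the two variance weights when r_n(h t_1) = r_n(h t_2) = c."""
    return 1.0 / (1.0 + c) + 1.0 / (1.0 + 1.0 / c)

# The weights sum to 1 for every c > 0, which is why the two covariance
# terms combine into tau_0^{-1} * int K(s - t_1) K(s - t_2) ds.
for c in (0.1, 0.5, 1.0, 2.0, 10.0):
    assert abs(weight_sum(c) - 1.0) < 1e-12
```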

Define \(\eta ^{*}_{n}\left( t\right) =\eta _{n}\left( t+a_0/h\right) \) for \(0 \le t\le b_0/h-a_0/h\). Then the covariance function of \(\eta ^{*}_{n}\left( t\right) \) satisfies

$$\begin{aligned} \mathop {\mathrm{cov}}\left\{ \eta ^{*} _{n}\left( s\right) , \eta ^{*} _{n}\left( s+t\right) \right\}= & {} \tau _0^{-1}\int K\left( u\right) K\left( u+t\right) du \\= & {} 1-C_Kt^2/2+o(t^2), \ \text {as}\ t\rightarrow 0, \end{aligned}$$

fulfilling the conditions in Lemma A.3 with \(T=b_0/h-a_0/h\) and \(a=2\). Therefore, one has that
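The expansion \(1-C_Kt^{2}/2+o(t^{2})\) can be checked numerically for a concrete kernel. The sketch below takes the Gaussian kernel and assumes, as is standard for such stationary-Gaussian-limit arguments, that \(\tau _0=\int K^{2}(u)du\) and \(C_K=\int K'(u)^{2}du/\tau _0\) (these identifications are our reading, not stated on this page):

```python
import numpy as np

du = 1e-4
u = np.arange(-10.0, 10.0, du)
K = np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)   # Gaussian kernel
tau0 = np.sum(K**2) * du                     # tau_0 = int K(u)^2 du
Kprime = -u * K                              # K'(u) for the Gaussian kernel
C_K = np.sum(Kprime**2) * du / tau0          # int K'^2 / int K^2 = 1/2 here

def rho(t):
    """Correlation tau_0^{-1} int K(u) K(u + t) du of the limit process."""
    Kshift = np.exp(-(u + t)**2 / 2) / np.sqrt(2 * np.pi)
    return np.sum(K * Kshift) * du / tau0

# Local expansion rho(t) = 1 - C_K t^2 / 2 + o(t^2) as t -> 0.
t = 0.01
assert abs(C_K - 0.5) < 1e-4
assert abs((1.0 - rho(t)) / t**2 - C_K / 2) < 1e-3
```

For the Gaussian kernel one has \(\rho (t)=\exp (-t^{2}/4)\) in closed form, so the quadratic coefficient \(C_K/2=1/4\) can also be read off directly.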

$$\begin{aligned}&P\! \left\{ \! \rho _{h}\! \left[ \! \sup _{t\in \left[ 0,T\right] }| \eta ^{*} _{n}\left( t \right) | -\gamma _{h}\right] \! \le t \bigg \vert \Delta _{n1}=n_{01},\Delta _{n2}=n_{02} \! \right\} \! \rightarrow \exp \left\{ -2\exp \left( -t\right) \right\} , \end{aligned}$$

as \(n_{01},n_{02}\rightarrow \infty \). Hence, \( P\{ \rho _{h}( \sup _{t\in \left[ a_0/h,b_0/h\right] }| \eta _{n}\left( t \right) | -\gamma _{h} ) \le t \vert \Delta _{n1}=n_{01}\), \(\Delta _{n2}=n_{02}\} \) and \( P\{ \rho _{h}\left( \sup _{t\in \left[ a_0,b_{0}\right] }| \xi _{n}^{*}\left( t\right) | -\gamma _{h} \right) \le t \vert \Delta _{n1}=n_{01},\Delta _{n2}=n_{02}\} \) have the same limiting distribution as well, completing the proof. \(\square \)

Theorem A.1

Under Assumptions (A1)–(A5), if there exists a constant \(c>0\) such that \(d_2(x)=cd_1(x)\), then one has that for any \(t\in {\mathbb {R}}\)

$$\begin{aligned}&P\left\{ \rho _{h}\left[ \sup _{x\in \left[ a_{0},b_{0}\right] }\frac{ \left| \left( {\tilde{m}}_{1}(x)-{\tilde{m}}_{2}(x)\right) -\left( m_{1}(x)-m_{2}(x)\right) \right| }{h^{-1/2}\sqrt{n_{1}^{-2}\Delta _{n1}v_{1}(x)+n_{2}^{-2}\Delta _{n2}v_{2}(x)}}-\gamma _{h}\right] \le t\right\} \\& \rightarrow \exp \left\{ -2\exp \left( -t\right) \right\} , \end{aligned}$$

as \(n\rightarrow \infty \), where \(\rho _{h},\gamma _{h},v_{k}(x),k=1,2,\) are given in Lemma A.4.

Proof of Theorem A.1

By Lemma A.4 and using the total probability formula similar to the proof of Theorem 2 in Cai et al. (2020), one can immediately obtain that

$$\begin{aligned}&P\left\{ \rho _{h}\left[ \sup _{x\in \left[ a_{0},b_{0}\right] }\frac{ \left| \psi _{1}\left( x\right) -\psi _{2}\left( x\right) \right| }{h^{-1/2} \sqrt{n_{1}^{-2}\Delta _{n1}v_{1}(x)+n_{2}^{-2}\Delta _{n2}v_{2}(x)}}-\gamma _{h}\right] \le t\right\} \\& \rightarrow \exp \left\{ -2\exp \left( -t\right) \right\} , \end{aligned}$$

as \(n\rightarrow \infty \). Furthermore, according to Lemma A.1, one has that

$$\begin{aligned} \left( {\tilde{m}}_{1}(x)-{\tilde{m}}_{2}(x)\right) -\left( m_{1}(x)-m_{2}(x)\right) =\psi _{1}\left( x\right) - \psi _{2}\left( x\right) +U_{p}\left( h^{2}\right) . \end{aligned}$$

This together with \(h^2\sqrt{nh\log n}=\{hn^{1/5}\log ^{1/5} n\}^{5/2}\rightarrow 0\), which follows from \(h\ll n^{-1/5}\log ^{-1/5} n \) in Assumption (A5), and Slutsky's theorem yields

$$\begin{aligned}&P\left\{ \rho _{h}\left[ \sup _{x\in \left[ a_{0},b_{0}\right] }\frac{ \left| \left( {\tilde{m}}_{1}(x)-{\tilde{m}}_{2}(x)\right) -\left( m_{1}(x)-m_{2}(x)\right) \right| }{h^{-1/2}\sqrt{n_{1}^{-2}\Delta _{n1}v_{1}(x)+n_{2}^{-2}\Delta _{n2}v_{2}(x)}}-\gamma _{h}\right] \le t\right\} \\&\rightarrow \exp \left\{ -2\exp \left( -t\right) \right\} . \end{aligned}$$

\(\square \)
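For practical use of Theorem A.1, the SCB critical value is obtained by inverting the Gumbel-type limit: solving \(\exp \{-2\exp (-t)\}=1-\alpha \) gives \(t_{1-\alpha }=-\log \{-\log (1-\alpha )/2\}\). A minimal sketch (function name is ours):

```python
import math

def gumbel_scb_quantile(alpha):
    """(1 - alpha)-quantile of the limit law exp{-2 exp(-t)} in Theorem A.1."""
    return -math.log(-math.log(1.0 - alpha) / 2.0)

q95 = gumbel_scb_quantile(0.05)
# Check that it inverts the limiting distribution function.
assert abs(math.exp(-2.0 * math.exp(-q95)) - 0.95) < 1e-12
```

Undoing the standardization \(\rho _{h}(\sup -\gamma _{h})\le t\) in the theorem then yields the band half-width \((\gamma _{h}+t_{1-\alpha }/\rho _{h})\,h^{-1/2}\sqrt{n_{1}^{-2}\Delta _{n1}v_{1}(x)+n_{2}^{-2}\Delta _{n2}v_{2}(x)}\) around \({\tilde{m}}_{1}(x)-{\tilde{m}}_{2}(x)\).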

Cai, L., Wang, S. Global statistical inference for the difference between two regression mean curves with covariates possibly partially missing. Stat Papers 62, 2573–2602 (2021). https://doi.org/10.1007/s00362-020-01208-x
