Appendix
In order to prove Theorem 1, we need the following lemmas.
Lemma 1
Assume that for \(i=1,2, E[g_i(X, \beta _0)g_i^{\prime }(X, \beta _0)]\) is positive definite, \(\frac{\partial g_i(X, \beta )}{\partial \beta }\) is continuous in a neighborhood of the true value \(\beta _0\), \(E \left[ \left( \frac{\partial g_i(X, \beta _0)}{\partial \beta ^{\prime }}\right) \left( \frac{\partial g_i(X, \beta _0)}{\partial \beta ^{\prime }}\right) ^{\prime } \right]\), \(E \left[ \frac{\partial ^2 g_i(X, \beta )}{\partial \beta \partial \beta ^{\prime }}\right]\), \(E\left[ \left( \frac{\partial g_i(X, \beta )}{\partial \beta ^{\prime }}\right) ^{\prime } g_i(X, \beta ) \right]\) and \(E\parallel g_i(X, \beta ) \parallel ^3\) are all bounded in the neighborhood of the true value \(\beta _0\). Then, as \(n \rightarrow \infty\), \(\exists \tilde{\beta }\), \(\tilde{\lambda }=\lambda (\tilde{\beta })\) with probability 1 satisfying,
$$\begin{aligned} Q_{1n}(\tilde{\beta }, \tilde{\lambda }) = 0, \quad Q_{2n}(\tilde{\beta }, \tilde{\lambda }) = 0 \text { and } \parallel \tilde{\beta }-\beta _0 \parallel = O_p\left( m^{-\frac{1}{2}}\right) , \end{aligned}$$
where
$$\begin{aligned} Q_{1n}(\beta , \lambda )&=\sum _l \frac{1}{1 + \lambda ^{\prime }(\beta ) \theta _l^{-1}g(x_l, \beta )}\theta _l^{-1}g(x_l, \beta ),\\ Q_{2n}(\beta ,\lambda )&=\sum _l \frac{1}{1 + \lambda ^{\prime }(\beta ) \theta _l^{-1}g(x_l, \beta )} \theta _l^{-1} \left( \frac{\partial g(x_l, \beta )}{\partial \beta }\right) ^{\prime } \lambda (\beta ). \end{aligned}$$
Proof
First we will show
$$\begin{aligned} \lambda (\beta )&= \epsilon _k O_p\left( m^{-\frac{1}{2}}\right) \\&= \left[ \frac{1}{n} \sum _{l=1}^n \theta _l^{-2} g(x_l, \beta ) g^{\prime }(x_l, \beta ) \right] ^{-1} \left[ \frac{1}{n} \sum _{l=1}^n \theta _l^{-1} g(x_l, \beta )\right] +\epsilon _k o_p\left( m^{-\frac{1}{2}}\right) , \end{aligned}$$
where \(\epsilon _k = \min \{\theta _k, 1- \theta _k\}\) and \(m=n\epsilon _k = \min \{k, n-k\}\).
Let \(\beta -\beta _0 = um^{-\frac{1}{2}}\) for \(\beta \in \{\beta :\parallel \beta -\beta _0\parallel =m^{-\frac{1}{2}}\}\) where \(\parallel u\parallel =1\). Let \(\lambda\) be the solution of the equation \(f(\lambda )=0\) given by the first score function defined in Sect. 3.
$$\begin{aligned} f(\lambda ) = \frac{1}{n} \sum _{l=1}^n \frac{\theta _l^{-1}}{1 + \lambda ^{\prime }(\beta ) \theta _l^{-1} g(x_l,\beta )} g(x_l,\beta ) = 0. \end{aligned}$$
(A.1)
Let \(\lambda =\rho u\) where \(u=(\beta -\beta _0)m^\frac{1}{2}\) and \(\parallel u\parallel =1\).
$$\begin{aligned} 0&=\,\parallel f(\rho u)\parallel \\&\ge |u^{\prime } f(\rho u)| \\&= \frac{1}{n} \left| u^{\prime } \left( \sum _l \theta _l^{-1} g(x_l,\beta ) -\rho \sum _l \frac{\theta _l^{-2} g(x_l, \beta )u^{\prime } g(x_l, \beta )}{1 + \rho u^{\prime } \theta _l^{-1} g(x_l, \beta )} \right) \right| \\&\ge \frac{\rho }{n} u^{\prime } \left( \sum _l \frac{\theta _l^{-2} g(x_l, \beta ) g^{\prime }(x_l, \beta )}{1 + \rho u^{\prime } \theta _l^{-1} g(x_l, \beta )}\right) u -\frac{1}{n} \left| \sum _{j=1}^p e_j^{\prime } \sum _l \theta _l^{-1}g(x_l, \beta )\right| \\&\qquad (\text {where}\ e_j\ \text {is the unit vector in the} \ j^{th}\ \text {coordinate direction.}) \\&\ge \frac{\rho u^{\prime } Su}{1 + \rho \theta _l g^*} - O_p\left( m^{-\frac{1}{2}}\right) ,\\&\qquad \left( \text {where}\ g^*={\displaystyle \max _{l}} g(x_l, \beta )\ \text {and} \ S=\frac{1}{n}\sum _l \theta _l^{-2} g(x_l,\beta ) g^{\prime }(x_l,\beta ).\right) \end{aligned}$$
Since \(u^{\prime } Su \ge \sigma _p + o_p(1)\), where \(\sigma _p>0\) is the smallest eigenvalue of \(\Sigma\), it follows that
$$\begin{aligned} \frac{\rho }{1 + \rho \theta _l g^*} = O_p\left( m^{-\frac{1}{2}}\right) . \end{aligned}$$
So, \(\parallel \lambda \parallel = \rho = O_p(m^{-\frac{1}{2}})\).
Let \(\gamma _l = \lambda ^{\prime }(\beta )\theta _l^{-1} g(x_l, \beta )\). Then, \({\displaystyle \max _{l}} |\gamma _l| =O_p\left( m^{-\frac{1}{2}}\right) o_p\left( m^{\frac{1}{2}}\right) = o_p(1)\).
Expanding (A.1),
$$\begin{aligned} 0 = f(\lambda )&= \frac{1}{n} \sum _l \theta _l^{-1} g(x_l, \beta ) \left[ 1 - \gamma _l + \frac{\gamma _l^2}{1+\gamma _l} \right] \nonumber \\&= \frac{1}{n} \sum _l \theta _l^{-1} g(x_l, \beta ) - \frac{1}{n} \sum _l \theta _l^{-1} g(x_l, \beta )\gamma _l + \frac{1}{n} \sum _l \theta _l^{-1} g(x_l, \beta ) \frac{\gamma _l^2}{1+\gamma _l} \nonumber \\&= E(\theta _l^{-1} g(x_l, \beta )) - S \lambda + \frac{1}{n} \sum _l \theta _l^{-1} g(x_l, \beta ) \frac{\gamma _l^2}{1+\gamma _l}. \end{aligned}$$
(A.2)
The last equality holds since \(\frac{1}{n} \sum _l \theta _l^{-1} g(x_l, \beta )\gamma _l = \frac{1}{n} \sum _l \theta _l^{-1} g(x_l, \beta ) \theta _l^{-1} g^{\prime }(x_l, \beta ) \lambda = S \lambda\).
By substituting \(\gamma _l\), the final term of (A.2) is bounded by:
$$\begin{aligned} \frac{1}{n} \sum _l \parallel \theta _l^{-1} g(x_l, \beta )\parallel ^3 \parallel \lambda \parallel ^2 |1+\gamma _l|^{-1} = o_p\left( m^{\frac{1}{2}}\right) O_p(m^{-1}) O_p(1) = o_p\left( m^{-\frac{1}{2}}\right) . \end{aligned}$$
Therefore,
$$\begin{aligned} 0&= E(\theta _l^{-1} g(x_l, \beta )) - S \lambda + o_p\left( m^{-\frac{1}{2}}\right) \nonumber \\&\Rightarrow \lambda = S^{-1} E(\theta _l^{-1} g(x_l, \beta )) + o_p\left( m^{-\frac{1}{2}}\right) \nonumber \\&\Rightarrow \lambda = \left[ \frac{1}{n} \sum _{l=1}^n \theta _l^{-2} g(x_l, \beta ) g^{\prime }(x_l, \beta ) \right] ^{-1} \left[ \frac{1}{n} \sum _{l=1}^n \theta _l^{-1} g(x_l, \beta )\right] + o_p\left( m^{-\frac{1}{2}}\right) . \end{aligned}$$
(A.3)
Now, denote \(V_n(\beta )=\frac{1}{n} \sum _{l=1}^n \theta _l^{-2} g(x_l, \beta )g^{\prime }(x_l, \beta )\), \(\bar{g}(\beta ) =\frac{1}{n}\sum _{l=1}^n \theta _l^{-1} g(x_l, \beta )\), and \(\varepsilon =\epsilon _k o_p(m^{-\frac{1}{2}})\). So (A.3) can be rewritten as,
$$\begin{aligned} \lambda (\beta ) = V_n(\beta )^{-1} \bar{g}(\beta ) + \varepsilon . \end{aligned}$$
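This linearization can be checked numerically. The following sketch (a toy one-dimensional moment function of our own choosing, not the paper's model) solves the score equation \(f(\lambda )=0\) by Newton's method and compares the root with the leading term \(V_n^{-1}\bar{g}\):

```python
import numpy as np

# Toy illustration (our own example): solve the empirical-likelihood score
#   f(lam) = (1/n) * sum_l g_l / (1 + lam * g_l) = 0
# by Newton's method, then compare with the linearized solution
#   lam_lin = [ (1/n) sum g_l^2 ]^{-1} [ (1/n) sum g_l ],  i.e.  V_n^{-1} * gbar.
rng = np.random.default_rng(0)
g = rng.normal(loc=0.05, scale=1.0, size=2000)  # moment values with a small mean

lam = 0.0
for _ in range(50):               # Newton iterations on the scalar score
    w = 1.0 + lam * g
    f = np.mean(g / w)            # f(lam)
    fp = -np.mean(g**2 / w**2)    # f'(lam) < 0, so the root is unique
    lam -= f / fp

lam_lin = np.mean(g) / np.mean(g**2)  # leading term of the expansion
print(lam, lam_lin)                   # the two agree up to higher-order terms
```

The exact root and the one-step approximation differ only by a term of smaller order, mirroring the \(o_p(m^{-\frac{1}{2}})\) remainder above.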
Since \(\gamma _l = \lambda ^{\prime }(\beta )\theta _l^{-1} g(x_l, \beta )\), we have \(\sum _{l=1}^n |\gamma _l|^3 = o_p(1)\).
Let \(a_m\) be any constant sequence such that \(a_m \rightarrow \infty\), and \(a_m m^{-\frac{1}{2}}\rightarrow 0\). Denote the ball \(B(\beta _0, a_m) = \{\beta |\parallel \beta -\beta _0 \parallel \le a_m m^{-\frac{1}{2}}\}\) and the surface of the ball \(\partial B(\beta _0, a_m) = \{\beta |\beta =\beta _0 + \phi a_m m^{-\frac{1}{2}}, \parallel \phi \parallel =1\}\). For any \(\beta \in \partial B(\beta _0, a_m)\), we have
$$\begin{aligned} V_n(\beta )&= \frac{1}{n} \sum _{l=1}^n \theta _l^{-2} g(x_l, \beta )g^{\prime }(x_l, \beta )\\&= \frac{n}{k} \frac{1}{k}\sum _{l=1}^k g_1(x_l, \beta _0)g_1^{\prime }(x_l, \beta _0) +\frac{n}{n-k} \frac{1}{n-k} \sum _{l=k+1}^n g_2(x_l, \beta _0) g_2^{\prime }(x_l, \beta _0)+o_p(\epsilon _k^{-1})\\&= \frac{n}{k} E g_1(x_l, \beta _0)g_1^{\prime }(x_l, \beta _0) +\frac{n}{n-k} E g_2(x_l, \beta _0)g_2^{\prime }(x_l, \beta _0) + o_p(\epsilon _k^{-1})\\&\le \epsilon _k^{-1}\left[ E g_1(x_l, \beta _0) g_1^{\prime }(x_l, \beta _0) + E g_2(x_l, \beta _0) g_2^{\prime }(x_l, \beta _0)\right] + o_p(\epsilon _k^{-1}), \end{aligned}$$
and
$$\begin{aligned} \bar{g}(\beta _0)&= \frac{1}{n} \sum _{l=1}^n \theta _l^{-1} g(x_l, \beta _0) \\&= \frac{1}{k} \sum _{l=1}^k g_1(x_l, \beta _0) + \frac{1}{n-k} \sum _{l=k+1}^n g_2(x_l, \beta _0)\\&= \frac{1}{k} o_p\left( k^{\frac{1}{2}}\right) + \frac{1}{n-k} o_p\left( (n-k)^{\frac{1}{2}}\right) \\&= o_p\left( k^{-\frac{1}{2}}\right) + o_p\left( (n-k)^{-\frac{1}{2}}\right) \\&= o_p\left( m^{-\frac{1}{2}}\right) . \end{aligned}$$
By the Taylor expansion, for any \(\beta \in \partial B(\beta _0, a_m)\), we have
$$\begin{aligned} l_E(\beta ) = \sum _l \lambda ^{\prime }(\beta )\theta _l^{-1} g(x_l, \beta ) -\frac{1}{2} \sum _l \left[ \lambda ^{\prime }(\beta )\theta _l^{-1} g(x_l, \beta )\right] ^2+ o_p(1). \end{aligned}$$
(A.4)
The first term of (A.4) is:
$$\begin{aligned} \sum _l \lambda ^{\prime }(\beta )\theta _l^{-1} g(x_l, \beta )&= n\left[ \frac{1}{n} \sum _{l=1}^n \theta _l^{-1} g(x_l, \beta )\right] ^{\prime } \left[ \frac{1}{n} \sum _{l=1}^n \theta _l^{-2} g(x_l, \beta ) g^{\prime }(x_l, \beta ) \right] ^{-1}\nonumber \\&\quad \left[ \frac{1}{n}\sum _{l=1}^n \theta _l^{-1} g(x_l, \beta )\right] \nonumber \\&\quad + o_p(1). \end{aligned}$$
(A.4.1)
The second term of (A.4) is:
$$\begin{aligned}&\frac{1}{2} \sum _l \left[ \lambda ^{\prime }(\beta ) \theta _l^{-1} g(x_l, \beta ) \right] ^2\nonumber \\&\quad = \frac{1}{2} \sum _l \lambda ^{\prime }(\beta ) \theta _l^{-2} g(x_l, \beta )g^{\prime }(x_l, \beta ) \lambda (\beta )\nonumber \\&\quad = \frac{n}{2}\left[ \frac{1}{n} \sum _{l=1}^n \theta _l^{-1} g(x_l, \beta )\right] ^{\prime } \left[ \frac{1}{n} \sum _{l=1}^n \theta _l^{-2} g(x_l, \beta ) g^{\prime }(x_l, \beta ) \right] ^{-1}\nonumber \\&\qquad \left[ \frac{1}{n} \sum _{l=1}^n \theta _l^{-2} g(x_l, \beta ) g^{\prime }(x_l, \beta ) \right] \left[ \frac{1}{n} \sum _{l=1}^n \theta _l^{-2} g(x_l, \beta ) g^{\prime }(x_l, \beta ) \right] ^{-1}\left[ \frac{1}{n} \sum _{l=1}^n \theta _l^{-1} g(x_l, \beta )\right] + o_p(1)\nonumber \\&\quad = \frac{n}{2}\left[ \frac{1}{n} \sum _{l=1}^n \theta _l^{-1} g(x_l, \beta )\right] ^{\prime } \left[ \frac{1}{n} \sum _{l=1}^n \theta _l^{-2} g(x_l, \beta ) g^{\prime }(x_l, \beta ) \right] ^{-1} \left[ \frac{1}{n} \sum _{l=1}^n \theta _l^{-1} g(x_l, \beta )\right] + o_p(1). \end{aligned}$$
(A.4.2)
Now,
$$\begin{aligned}&(\mathrm{A}.4.1)\,\text {--}\,(\mathrm{A}.4.2)\\&\quad =\frac{n}{2} \left( \frac{1}{n}\sum _l \theta _l^{-1} g(x_l, \beta ) \right) ^{\prime } \left( \frac{1}{n}\sum _l \theta _l^{-2} g(x_l, \beta )g^{\prime }(x_l, \beta ) \right) ^{-1} \left( \frac{1}{n}\sum _l \theta _l^{-1} g(x_l, \beta ) \right) + o_p(1). \end{aligned}$$
So we can rewrite (A.4) as,
$$\begin{aligned} l_E(\beta )&= \frac{n}{2} \left( \frac{1}{n}\sum _l \theta _l^{-1} g(x_l, \beta ) \right) ^{\prime } \left( \frac{1}{n}\sum _l \theta _l^{-2} g(x_l, \beta )g^{\prime }(x_l, \beta ) \right) ^{-1} \left( \frac{1}{n} \sum _l \theta _l^{-1} g(x_l, \beta ) \right) + o_p(1)\\&= \frac{n}{2} \bar{g}^{\prime } (\beta ) (V_n(\beta ))^{-1}\bar{g}(\beta ) + o_p(1) \\&= \frac{n}{2} \left\{ \bar{g}(\beta _0) + \frac{1}{n} \sum _l \theta _l^{-1} \frac{\partial g(x_l, \beta _0)}{\partial \beta ^{\prime }}\phi a_m m^{-\frac{1}{2}} + O\left[ \left( a_m m^{-\frac{1}{2}}\right) ^2\right] \right\} ^{\prime } \times \left( V_n(\beta )\right) ^{-1} \\&\quad \times \left\{ \bar{g}(\beta _0) + \frac{1}{n} \sum _l \theta _l^{-1} \frac{\partial g(x_l, \beta _0)}{\partial \beta ^{\prime }}\phi a_m m^{-\frac{1}{2}} + O\left[ \left( a_m m^{-\frac{1}{2}}\right) ^2\right] \right\} +o_p(1)\\&\quad \qquad (\text {By Taylor expansion of each term}.)\\&\ge \frac{n \epsilon _k}{2} \left\{ \bar{g}(\beta _0) +\frac{1}{n} \sum _l \theta _l^{-1} \frac{\partial g(x_l, \beta _0)}{\partial \beta ^{\prime }}\phi a_m m^{-\frac{1}{2}} + O\left[ \left( a_m m^{-\frac{1}{2}}\right) ^2\right] \right\} ^{\prime } \times \left( V_n(\beta )\right) ^{-1} \\&\quad \times \left\{ \bar{g}(\beta _0) + \frac{1}{n} \sum _l \theta _l^{-1} \frac{\partial g(x_l, \beta _0)}{\partial \beta ^{\prime }}\phi a_m m^{-\frac{1}{2}} +O\left[ \left( a_m m^{-\frac{1}{2}}\right) ^2\right] \right\} + o_p(1). \end{aligned}$$
As \(n\rightarrow \infty\), \(l_E(\beta ) \rightarrow \infty\).
Similarly,
$$\begin{aligned}&l_E(\beta _0) = \frac{n}{2} \bar{g}^{\prime }(\beta _0) V_n(\beta _0)^{-1} \bar{g}(\beta _0) + o_p(1),\\&V_n(\beta _0) = \frac{n}{k} E g_1(x_l, \beta _0)g_1^{\prime } (x_l, \beta _0) + \frac{n}{n-k} E g_2(x_l, \beta _0)g_2^{\prime } (x_l, \beta _0) + o_p(\epsilon _k^{-1}). \end{aligned}$$
Thus, \(l_E(\beta _0) = O_p(1)\) implies that for any \(\beta \in \partial B(\beta _0, a_m)\), \(l_E(\beta )\) cannot attain its minimum value with probability approaching 1. Since \(l_E(\beta )\) is a continuous function of \(\beta\) for \(\beta \in B(\beta _0, a_m)\), \(l_E(\beta )\) has a minimum value in the interior of this ball satisfying,
$$\begin{aligned} 0 = \left. \frac{\partial l_E(\beta )}{\partial \beta }\right| _{\beta =\tilde{\beta }}&= \sum _l \left. \frac{\left( \frac{\partial \lambda ^{\prime }(\beta )}{\partial \beta }\right) \theta _l^{-1}g(x_l, \beta ) + \theta _l^{-1}\left( \frac{\partial g(x_l, \beta )}{\partial \beta }\right) ^{\prime } \lambda (\beta )}{1 + \lambda ^{\prime }(\beta ) \theta _l^{-1}g(x_l, \beta )} \right| _{\beta =\tilde{\beta }}\\&= \left. \frac{\partial \lambda ^{\prime }(\beta )}{\partial \beta } \sum _l \frac{\theta _l^{-1} g(x_l, \beta )}{1 + \lambda ^{\prime }(\beta ) \theta _l^{-1}g(x_l, \beta )} \right| _{\beta =\tilde{\beta }} + \sum _l \frac{\theta _l^{-1} \left( \frac{\partial g(x_l, \beta )}{\partial \beta }\right) ^{\prime } \lambda (\beta )}{1 + \lambda ^{\prime }(\beta )\theta _l^{-1} g(x_l, \beta )}\\&= \sum _l \frac{\theta _l^{-1}\left( \frac{\partial g(x_l, \beta )}{\partial \beta }\right) ^{\prime } \lambda (\beta )}{1 + \lambda ^{\prime }(\beta ) \theta _l^{-1}g(x_l, \beta )}\\&\qquad \left( \text {Since}\ \sum _l \left. \frac{\theta _l^{-1}g(x_l, \beta )}{1 + \lambda ^{\prime }(\beta ) \theta _l^{-1}g(x_l, \beta )} \right| _{\beta =\tilde{\beta }}= Q_{1n}(\tilde{\beta }, \tilde{\lambda }) = 0\right) \\&= Q_{2n}(\tilde{\beta }, \tilde{\lambda }). \end{aligned}$$
Hence, \(Q_{1n}(\tilde{\beta }, \tilde{\lambda }) = 0\) and \(Q_{2n}(\tilde{\beta }, \tilde{\lambda }) = 0\), with \(\parallel \tilde{\beta }-\beta _0 \parallel = O_p(a_m m^{-\frac{1}{2}}).\) Since \(a_m\) is arbitrary, \(\parallel \tilde{\beta }-\beta _0 \parallel = O_p(m^{-\frac{1}{2}})\). \(\square\)
Remark 1
If the partitioned matrix
$$\begin{aligned} \begin{pmatrix} A &{} B \\ B^{\prime } &{} 0 \end{pmatrix} \end{aligned}$$
is non-singular, then
$$\begin{aligned} \begin{pmatrix} A &{} B \\ B^{\prime } &{} 0 \end{pmatrix}^{-1} =\begin{pmatrix} P &{} Q \\ Q^{\prime } &{} R \end{pmatrix} \end{aligned}$$
where
$$\begin{aligned}&P=A^{-1} -A^{-1}B(B^{\prime } A^{-1} B)^{-1} B^{\prime } A^{-1}, \qquad Q= A^{-1}B(B^{\prime } A^{-1} B)^{-1},\\&Q^{\prime } = (B^{\prime } A^{-1} B)^{-1}B^{\prime } A^{-1}, \qquad R=-(B^{\prime } A^{-1} B)^{-1} \end{aligned}$$
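These partitioned-inverse formulas can be verified numerically. The sketch below builds an arbitrary well-conditioned example (our own choice of dimensions and matrices, not the paper's \(S\)) and checks that the assembled blocks invert the bordered matrix:

```python
import numpy as np

# Numeric check of Remark 1: the inverse of [[A, B], [B', 0]] in block form.
rng = np.random.default_rng(1)
p, q = 4, 2
A = rng.normal(size=(p, p))
A = A @ A.T + p * np.eye(p)        # symmetric positive definite A
B = rng.normal(size=(p, q))        # full column rank (almost surely)

Ai = np.linalg.inv(A)
W = np.linalg.inv(B.T @ Ai @ B)    # (B' A^{-1} B)^{-1}
P = Ai - Ai @ B @ W @ B.T @ Ai     # top-left block
Q = Ai @ B @ W                     # top-right block
R = -W                             # bottom-right block

M = np.block([[A, B], [B.T, np.zeros((q, q))]])
Minv = np.block([[P, Q], [Q.T, R]])
print(np.allclose(M @ Minv, np.eye(p + q)))  # → True
```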
Remark 2
If \(\begin{pmatrix} A &{} B \\ C &{} D \end{pmatrix}\) is an \(n\times n\) symmetric positive definite matrix with blocks \(A\in {\mathbb {R}}^{m\times m}\), \(B\in {\mathbb {R}}^{m\times (n-m)}\), and \(D\in {\mathbb {R}}^{(n-m)\times (n-m)}\), then
1. the matrix \((D-CA^{-1}B)\) is symmetric and positive definite;
2.
$$\begin{aligned} \begin{pmatrix} A &{} B \\ C &{} D \end{pmatrix}^{-1} \ge \begin{pmatrix} A^{-1} &{} 0 \\ 0 &{} 0 \end{pmatrix}. \end{aligned}$$
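Both parts of Remark 2 admit a quick numeric illustration. The sketch below (with a random symmetric positive definite matrix of our own choosing) checks the Schur complement's positive definiteness and the Loewner-order inequality via eigenvalues:

```python
import numpy as np

# Numeric check of Remark 2 on a random symmetric positive definite matrix.
rng = np.random.default_rng(2)
n, m = 6, 3
M = rng.normal(size=(n, n))
M = M @ M.T + n * np.eye(n)        # SPD, so the (2,1) block C equals B'
A, B = M[:m, :m], M[:m, m:]
C, D = M[m:, :m], M[m:, m:]

S = D - C @ np.linalg.inv(A) @ B   # Schur complement of A
print(np.all(np.linalg.eigvalsh(S) > 0))  # part 1: symmetric positive definite

# part 2: M^{-1} - diag(A^{-1}, 0) is positive semidefinite
diff = np.linalg.inv(M) - np.block(
    [[np.linalg.inv(A), np.zeros((m, n - m))],
     [np.zeros((n - m, m)), np.zeros((n - m, n - m))]])
print(np.linalg.eigvalsh(diff).min() >= -1e-10)
```

Both checks print `True`; the small tolerance in the second absorbs floating-point noise around the zero eigenvalues.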
Remark 3
\(\beta ^{\prime } = ((\beta ^{\prime }, \mu ^{\prime }), \delta ^{\prime })\).
$$\begin{aligned}&\frac{\partial Q_{1n}(\beta , 0)}{\partial \lambda ^{\prime }} =-\frac{1}{n}\sum _l \theta _l^{-2} g(x_l, \beta )g^{\prime }(x_l, \beta ), \quad \frac{\partial Q_{1n}(\beta , 0)}{\partial \beta ^{\prime }} =\frac{1}{n}\sum _l \theta _l^{-1} \frac{\partial g(x_l, \beta )}{\partial \beta ^{\prime }}\\&\frac{\partial Q_{2n}(\beta , 0)}{\partial \lambda ^{\prime }} =\frac{1}{n}\sum _l \left( \theta _l^{-1} \frac{\partial g(x_l, \beta )}{\partial \beta ^{\prime }}\right) ^{\prime }, \quad \frac{\partial Q_{2n}(\beta , 0)}{\partial \beta ^{\prime }} = 0\\&\begin{pmatrix} \frac{\partial Q_{1n}}{\partial \lambda ^{\prime }} &{} \frac{\partial Q_{1n}}{\partial \beta ^{\prime }} \\ \frac{\partial Q_{2n}}{\partial \lambda ^{\prime }} &{} 0 \end{pmatrix} \longrightarrow \begin{pmatrix} S_{11} &{} S_{12} \\ S_{21} &{} 0 \end{pmatrix} = S(\beta ) \equiv S \end{aligned}$$
where
\(S_{11}(\beta ) = -\theta _l^{-1} E\left[ g_1(x_l, \beta )g_1^{\prime }(x_l, \beta ) \right] - (1-\theta _l)^{-1} E\left[ g_2(x_l, \beta )g_2^{\prime }(x_l, \beta )\right]\),
\(S_{12}(\beta ) = \theta ^{-1}_lE\left[ \frac{\partial g_1(x_l, \beta _0)}{\partial \beta ^{\prime }} \right] + (1-\theta _l)^{-1}E \left[ \frac{\partial g_2(x_l, \beta _0)}{\partial \beta ^{\prime }} \right]\),
\(S_{21}(\beta ) = S_{12}^{\prime }(\beta )\),
\(S_{12,i}(\beta ) = \theta ^{-1}_lE\left[ \frac{\partial g_1(x_l, \beta _0)}{\partial \beta _i^{\prime }} \right] + (1-\theta _l)^{-1}E \left[ \frac{\partial g_2(x_l, \beta _0)}{\partial \beta _i^{\prime }} \right]\), \(i=1, 2\).
By Remark 1,
$$\begin{aligned} S^{-1} =\begin{pmatrix} S_{11} &{} S_{12} \\ S_{21} &{} 0 \end{pmatrix} ^{-1} =\begin{pmatrix} P &{} Q \\ Q^{\prime } &{} R \end{pmatrix} \end{aligned}$$
where
$$\begin{aligned}&P = S_{11}^{-1} - S_{11}^{-1} S_{12}(S_{21}S_{11}^{-1}S_{12})^{-1} S_{21}S_{11}^{-1} = S_{11}^{-1} + S_{11}^{-1} S_{12} \Sigma S_{21}S_{11}^{-1};\\&\Sigma = (S_{21}(-S_{11}^{-1})S_{12})^{-1},\\&Q = - S_{11}^{-1} S_{12}(S_{21}S_{11}^{-1}S_{12})^{-1} = - S_{11}^{-1} S_{12} \Sigma , \\&Q^{\prime } = -\Sigma S_{21}S_{11}^{-1}, \quad R = -(S_{21}S_{11}^{-1}S_{12})^{-1} = \Sigma . \end{aligned}$$
Lemma 2
Under the conditions in Lemma 1 and \(H_0\), as \(n\rightarrow \infty\) we have
$$\begin{aligned} \sqrt{n}\Sigma ^{-\frac{1}{2}}(\tilde{\beta }-\beta _0) \rightarrow N(0, I_{2p+q}), \end{aligned}$$
where \(\Sigma = [S_{21}(-S_{11})^{-1}S_{12}]^{-1}\).
Proof
Expanding \(Q_{1n}(\tilde{\beta }, \tilde{\lambda })\) and \(Q_{2n}(\tilde{\beta }, \tilde{\lambda })\) at \((\beta _0,0)\), under \(H_0\) and by Lemma 1, we have,
$$\begin{aligned} 0&= Q_{1n}(\tilde{\beta }, \tilde{\lambda })\\&= Q_{1n}(\beta _0, 0) + \frac{\partial Q_{1n}(\beta _0, 0)}{\partial \beta ^{\prime }} (\tilde{\beta } - \beta _0) +\frac{\partial Q_{1n}(\beta _0, 0)}{\partial \lambda ^{\prime }} (\tilde{\lambda } - 0) + O_p(m^{-1}),\\ 0&= Q_{2n}(\tilde{\beta }, \tilde{\lambda })\\&= Q_{2n}(\beta _0, 0) + \frac{\partial Q_{2n}(\beta _0, 0)}{\partial \beta ^{\prime }} (\tilde{\beta } - \beta _0) + \frac{\partial Q_{2n}(\beta _0, 0)}{\partial \lambda ^{\prime }} (\tilde{\lambda } - 0) + O_p(m^{-1}), \\&\quad \begin{pmatrix} -Q_{1n}(\beta _0, 0) + O_p(m^{-1}) \\ \epsilon _k O_p(m^{-1}) \end{pmatrix} =\begin{pmatrix} \frac{\partial Q_{1n}}{\partial \lambda ^{\prime }} &{} \frac{\partial Q_{1n}}{\partial \beta ^{\prime }} \\ \frac{\partial Q_{2n}}{\partial \lambda ^{\prime }} &{} 0 \end{pmatrix} \begin{pmatrix} \tilde{\lambda } \\ \tilde{\beta } -\beta _0 \end{pmatrix}. \end{aligned}$$
By LLN,
$$\begin{aligned} \begin{pmatrix} \tilde{\lambda } \\ \tilde{\beta } -\beta _0 \end{pmatrix} \longrightarrow S^{-1}(\beta _0) \begin{pmatrix} -Q_{1n}(\beta _0, 0) + O_p(m^{-1}) \\ \epsilon _k O_p(m^{-1}) \end{pmatrix}. \end{aligned}$$
By Remark 1,
$$\begin{aligned} \tilde{\beta } -\beta _0 = (0 \,I) S^{-1} \begin{pmatrix} -Q_{1n}(\beta _0, 0) + O_p(m^{-1}) \\ \epsilon _k O_p(m^{-1}) \end{pmatrix}. \end{aligned}$$
Therefore,
$$\begin{aligned}&\begin{pmatrix} \tilde{\lambda } \\ \tilde{\beta } -\beta _0 \end{pmatrix} \longrightarrow \begin{pmatrix} S_{11}^{-1} + S_{11}^{-1} S_{12} \Sigma S_{21}S_{11}^{-1} &{} - S_{11}^{-1} S_{12} \Sigma \\ -\Sigma S_{21}S_{11}^{-1} &{} \Sigma \end{pmatrix} \begin{pmatrix} -Q_{1n}(\beta _0, 0) + O_p(m^{-1}) \\ \epsilon _k O_p(m^{-1}) \end{pmatrix}. \\&\tilde{\beta } -\beta _0 \rightarrow -\Sigma S_{21}S_{11}^{-1} \left( -Q_{1n}(\beta _0, 0) + O_p(m^{-1}) \right) + \Sigma \epsilon _k O_p(m^{-1})\\&\quad = (S_{21}S_{11}^{-1}S_{12})^{-1} S_{21}S_{11}^{-1}Q_{1n}(\beta _0, 0) - \Sigma S_{21}S_{11}^{-1} O_p(m^{-1}) + \Sigma \epsilon _k O_p(m^{-1})\\&\quad = \frac{1}{\sqrt{n}} (S_{21}S_{11}^{-1}S_{12})^{-1} S_{21} (-S_{11})^{-1/2}(-S_{11})^{-1/2}\sqrt{n}Q_{1n}(\beta _0, 0) +\epsilon _k O_p\left( m^{-\frac{1}{2}}\right) . \end{aligned}$$
Since \((-S_{11})^{-1/2}\sqrt{n}Q_{1n}(\beta _0, 0) \rightarrow N(0, I_{2(p+q)})\), it follows that \(\sqrt{n}\Sigma ^{-\frac{1}{2}}(\tilde{\beta } - \beta _0) \rightarrow N(0, I_{2p+q})\). \(\square\)
Lemma 3
$$\begin{aligned} -2\log \Lambda _k = 2l_E(\tilde{\beta }_1^0, 0) -2l_E(\tilde{\beta }_1^0, \tilde{\beta }_2^0), \end{aligned}$$
where \(\tilde{\beta }_1^0\) minimizes \(l_E(\beta , 0)\) with respect to \(\beta _1\) under \(H_0\),
$$\begin{aligned} -2\log \Lambda _k = \left[ (-S_{11})^{-1/2}\sqrt{n}Q_{1n}(\beta _0, 0) \right] ^{\prime } \Delta \left[ (-S_{11})^{-1/2}\sqrt{n}Q_{1n}(\beta _0, 0) \right] + O_p\left( m^{-\frac{1}{2}}\right) \end{aligned}$$
where
$$\begin{aligned} \Delta = (-S_{11})^{-1/2} \left\{ S_{12} [S_{21}(-S_{11})^{-1}S_{12}]^{-1} S_{21} - S_{12,1} [S_{21,1}(-S_{11})^{-1}S_{12,1}]^{-1} S_{21,1} \right\} (-S_{11})^{-1/2} \ge 0. \end{aligned}$$
Proof
Similar to Qin and Lawless (1994), we can derive,
$$\begin{aligned} l_E(\tilde{\beta }_1^0, \tilde{\beta }_2^0) = -\frac{n}{2} Q_{1n}^{\prime } (\beta _0, 0) B Q_{1n} (\beta _0, 0) + O_p\left( m^{-\frac{1}{2}}\right) , \end{aligned}$$
where \(B = S_{11}^{-1} + S_{11}^{-1} S_{12} \Sigma S_{21}S_{11}^{-1}\), and
$$\begin{aligned} l_E(\tilde{\beta }_1^0, 0) = -\frac{n}{2} Q_{1n}^{\prime } (\beta _0, 0) A Q_{1n} (\beta _0, 0) + O_p\left( m^{-\frac{1}{2}}\right) , \end{aligned}$$
where \(A = S_{11}^{-1} + S_{11}^{-1} S_{12,1} (S_{21,1}S_{11}^{-1}S_{12,1})^{-1} S_{21,1}S_{11}^{-1}\). Then,
$$\begin{aligned}&2 \left[ l_E(\tilde{\beta }_1^0, 0) - l_E(\tilde{\beta }_1^0, \tilde{\beta }_2^0)\right] \\&\quad =\left[ -n Q_{1n}^{\prime } (\beta _0, 0) A Q_{1n} (\beta _0, 0) +O_p\left( m^{-\frac{1}{2}}\right) \right] \\&\qquad +\left[ n Q_{1n}^{\prime } (\beta _0, 0) B Q_{1n} (\beta _0, 0) + O_p\left( m^{-\frac{1}{2}}\right) \right] \\&\quad = n Q_{1n}^{\prime } (\beta _0, 0) (B-A) Q_{1n} (\beta _0, 0) + O_p\left( m^{-\frac{1}{2}}\right) \\&\quad = n Q_{1n}^{\prime } (\beta _0, 0) S_{11}^{-1} \left[ S_{12} \Sigma S_{21} - S_{12,1} \Sigma ^* S_{21,1}\right] S_{11}^{-1} Q_{1n} (\beta _0, 0) + O_p\left( m^{-\frac{1}{2}}\right) \\&\quad \qquad (B-A = S_{11}^{-1} + S_{11}^{-1} S_{12} \Sigma S_{21}S_{11}^{-1} -S_{11}^{-1} - S_{11}^{-1} S_{12,1} (S_{21,1}S_{11}^{-1}S_{12,1})^{-1} S_{21,1}S_{11}^{-1}.)\\&\quad \qquad (\text {So},\ \Sigma ^*=(S_{21,1}S_{11}^{-1}S_{12,1})^{-1})\\&\quad = \left[ (-S_{11})^{-1/2}\sqrt{n}Q_{1n}(\beta _0, 0)\right] ^{\prime } (-S_{11})^{-1/2} \left[ S_{12} \Sigma S_{21} - S_{12,1} \Sigma ^* S_{21,1}\right] \\&\quad \qquad (-S_{11})^{-1/2} \left[ (-S_{11})^{-1/2}\sqrt{n}Q_{1n}(\beta _0, 0)\right] +O_p\left( m^{-\frac{1}{2}}\right) . \end{aligned}$$
Take \(\Delta = (-S_{11})^{-1/2} \left[ S_{12} \Sigma S_{21} - S_{12,1} \Sigma ^* S_{21,1}\right] (-S_{11})^{-1/2}\). Now,
$$\begin{aligned} \Delta&= (-S_{11})^{-1/2} \left[ S_{12} \left( S_{21}(-S_{11}^{-1})S_{12}\right) ^{-1} S_{21} - S_{12,1} \left( S_{21,1}S_{11}^{-1}S_{12,1}\right) ^{-1} S_{21,1}\right] (-S_{11})^{-1/2}\\&= (-S_{11})^{-1/2} (S_{12,1},\, S_{12,2}) \left\{ [S_{21}(-S_{11})^{-1}S_{12}]^{-1} -\begin{pmatrix} \left( S_{21,1}S_{11}^{-1}S_{12,1}\right) ^{-1} &{} 0 \\ 0 &{} 0 \end{pmatrix} \right\} \\&\quad \times \begin{pmatrix} S_{21,1}\\ S_{21,2} \end{pmatrix} (-S_{11})^{-1/2} \\&\ge 0. \\&\qquad (\text {By Remark 2}.) \end{aligned}$$
\(\square\)
Lemma 4
Under the conditions of Theorem 1 and the null hypothesis, denote \(U_{n_k} = \left\{ \frac{k}{n}: \frac{T}{n} \le \frac{k}{n} \le 1-\frac{T}{n}\right\}\). For all \(\delta >0\), we can find \(C=C(\delta )\), \(T=T(\delta )\) and \(N=N(\delta )\) such that
$$\begin{aligned}&P\left( {\displaystyle \max _{\frac{k}{n} \in U_{n_k}}} \left( \frac{m}{\log \log m}\right) ^{1/2} \left\| \frac{\tilde{\lambda }}{\epsilon _k}\right\|> C \right) \le \delta , \quad P\left( n^{-1/2} {\displaystyle \max _{\frac{k}{n} \in U_{n_k}}} m \left\| \frac{\tilde{\lambda }}{\epsilon _k}\right\|> C \right) \le \delta ,\\&P\left( {\displaystyle \max _{\frac{k}{n} \in U_{n_k}}} \left( \frac{m}{\log \log m}\right) ^{1/2} \parallel \tilde{\theta } -\theta _0 \parallel> C \right) \le \delta , \quad P\left( n^{-1/2} {\displaystyle \max _{\frac{k}{n} \in U_{n_k}}} m \parallel \tilde{\theta } - \theta _0 \parallel > C \right) \le \delta . \end{aligned}$$
Proof
The proof is similar to that of Lemma 1.2.2 of Csörgo and Horváth (1997). \(\square\)
Lemma 5
Under the conditions of Theorem 1 and \(H_0\), for all \(0\le \alpha <\frac{1}{2}\) we have:
$$\begin{aligned}&n^\alpha {\displaystyle \max _{\frac{k}{n} \in U_{n_k}}} \left[ \theta _k (1-\theta _k) \right] ^\alpha | -2\log \Lambda - R_k| = O_p(1),\\&{\displaystyle \max _{\frac{k}{n} \in U_{n_k}}} \left[ \theta _k (1-\theta _k)\right] ^\alpha | -2\log \Lambda - R_k| = O_p\left( n^{-\frac{1}{2}} (\log \log n)^{\frac{3}{2}}\right) , \end{aligned}$$
where \(\Theta _{nk}=\{k:\delta _1 \le k \le n-\delta _2\}.\)
Proof of Theorem 1
The proof of Theorem 1 is similar to the proof of Theorem 1.3.1 (Theorem A.3.4) of Csörgo and Horváth (1997) which derives the null distribution of the trimmed test statistic. \(\square\)
Proof of Theorem 2
The ELR test statistic is,
$$\begin{aligned} -2 \log \Lambda _k = Z_{H_0,k_0} - Z_{H_1,k_0}. \end{aligned}$$
Under \(H_1\), \(Z_{H_1,k_0}\) also follows an asymptotic \(\chi ^2\) distribution; therefore, \(Z_{H_1,k_0} = O_p(1)\). We only need to prove that \(P(Z_{H_0,k_0}>cn) \rightarrow 1\) for some positive constant \(c\) under \(H_1\). For any fixed \(\varepsilon\), we can obtain
$$\begin{aligned} \frac{1}{2n}Z_{H_0,k_0}&= {\displaystyle \sup _{\lambda }}\frac{1}{n} \sum _{l=1}^n \log \left[ 1 + \theta _l^{-1} \lambda ^{\prime } g(x_l, \varepsilon ) \right] \\&= {\displaystyle \sup _{\lambda _1}}\frac{1}{n}\sum _{l=1}^{k_0} \log \left[ 1 + \theta _{k_0}^{-1} \lambda _1^{\prime } g_1(x_l, \varepsilon ) \right] \\&\quad + {\displaystyle \sup _{\lambda _2}}\frac{1}{n}\sum _{l=k_0+1}^{n} \log \left[ 1 + (1- \theta _{k_0})^{-1} \lambda _2^{\prime } g_2(x_l, \varepsilon ) \right] \\&\xrightarrow {\text {a.s.}}{\displaystyle \sup _{\lambda _1}} \theta _0 E \log \left( 1 + \theta _0^{-1} \lambda _1^{\prime } g_1(x_l, \varepsilon ) \right) \\&\quad + {\displaystyle \sup _{\lambda _2}} (1-\theta _0) E \left[ \log \left( 1 + (1-\theta _0)^{-1} \lambda _2^{\prime } g_2(x_l, \varepsilon )\right) \right] \end{aligned}$$
By Jensen’s inequality,
$$\begin{aligned}&E \log \left( 1 + \theta _0^{-1} \lambda _1^{\prime } g_1(x_l, \varepsilon ) \right) \le \log \left[ E \left( 1 + \theta _0^{-1} \lambda _1^{\prime } g_1(x_l, \varepsilon ) \right) \right] = 0\\&\quad \Longrightarrow {\displaystyle \sup _{\lambda _1}} \theta _0 E \log \left( 1 + \theta _0^{-1} \lambda _1^{\prime } g_1(x_l, \varepsilon ) \right) = 0. \end{aligned}$$
Thus,
$$\begin{aligned} \frac{1}{2n}Z_{H_0,k_0}&\xrightarrow {\text {a.s.}} {\displaystyle \sup _{\lambda _2}} (1-\theta _0) E \left[ \log \left( 1 + (1-\theta _0)^{-1} \lambda _2^{\prime } g_2(x_l, \varepsilon )\right) \right] \\&\ge (1-\theta _0) c_0. \end{aligned}$$
Hence, \(P\left( Z_{H_0, k_0} \ge (1-\theta _0) c_0\, n\right) \rightarrow 1\), which completes the proof. \(\square\)
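The Jensen step used above can be illustrated by Monte Carlo. The sketch below (a toy mean-zero moment function and constants of our own choosing) verifies that \(E \log (1 + \theta _0^{-1}\lambda ^{\prime } g) \le \log E(1 + \theta _0^{-1}\lambda ^{\prime } g) = 0\) when \(E[g]=0\):

```python
import numpy as np

# Monte Carlo illustration of the Jensen bound: if E[g] = 0, then
#   E[log(1 + theta0^{-1} * lam * g)] <= log(E[1 + theta0^{-1} * lam * g]) = 0.
rng = np.random.default_rng(3)
g = rng.normal(size=200_000)
g -= g.mean()                   # enforce a mean-zero moment function
theta0, lam = 0.4, 0.05         # small lam keeps 1 + lam*g/theta0 > 0 for all draws
w = 1.0 + lam * g / theta0
lhs = np.log(w).mean()          # sample version of E log(1 + theta0^{-1} lam g)
rhs = np.log(w.mean())          # log of the mean, equal to 0 here
print(lhs < rhs, abs(rhs) < 1e-6)  # → True True
```

Strict concavity of the logarithm makes the left side strictly negative for any nonzero \(\lambda\), which is what forces the supremum in the first segment to be attained at \(\lambda _1 = 0\).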
Proof of Theorem 3
We show that for arbitrarily small \(\eta\) with \(0< \eta < \frac{\theta _0}{2}\), if \(|\frac{k_0-k}{n}|\ge \eta\), then \(-2\log \Lambda _k\) cannot attain its maximum with probability approaching 1.
Without loss of generality, suppose \(k<k_0\) and \(\frac{k_0-k}{n} \ge \eta\). Then we have,
$$\begin{aligned} -2 \log \Lambda _{k_0} - (-2 \log \Lambda _k) = (Z_{H_0,k_0} -Z_{H_1,k_0}) - (Z_{H_0,k} - Z_{H_1,k}). \end{aligned}$$
Since \(Z_{H_1,k_0} = O_p(1)\), we consider
$$\begin{aligned}&\frac{1}{2n}(Z_{H_0,k_0} -Z_{H_0,k} + Z_{H_1,k}) \\&\quad = {\displaystyle \sup _{\lambda _1}}\frac{1}{n}\sum _{l=1}^{k_0} \log \left[ 1 + \theta _{k_0}^{-1} \lambda _1^{\prime } g_1(x_l, \beta _0) \right] + {\displaystyle \sup _{\lambda _2}}\frac{1}{n}\sum _{l=k_0+1}^{n} \log \left[ 1 + (1- \theta _{k_0})^{-1} \lambda _2^{\prime } g_2(x_l, \beta _0) \right] \\&\qquad - {\displaystyle \sup _{\lambda _1}}\frac{1}{n}\sum _{l=1}^{k} \log \left[ 1 + \theta _{k}^{-1} \lambda _1^{\prime } g_1(x_l, \beta _0) \right] - {\displaystyle \sup _{\lambda _2}}\frac{1}{n}\sum _{l=k+1}^{n} \log \left[ 1 + (1- \theta _{k})^{-1} \lambda _2^{\prime } g_2(x_l, \beta _0) \right] \\&\qquad + {\displaystyle \sup _{\lambda _1}}\frac{1}{n}\sum _{l=1}^{k} \log \left[ 1 + \theta _{k}^{-1} \lambda _1^{\prime } g_1(x_l, \beta _0) \right] + {\displaystyle \sup _{\lambda _2}}\frac{1}{n}\sum _{l=k+1}^{n} \log \left[ 1 + (1- \theta _{k})^{-1} \lambda _2^{\prime } g_2(x_l, \beta _1) \right] \\&\quad \ge {\displaystyle \sup _{\lambda _2}}\frac{1}{n}\sum _{l=k_0+1}^{n} \log \left[ 1 + (1- \theta _{k_0})^{-1} \lambda _2^{\prime } g_2(x_l, \beta _0) \right] \\&\qquad - {\displaystyle \sup _{\lambda _2}}\frac{1}{n}\sum _{l=k+1}^{n} \log \left[ 1 + (1- \theta _{k})^{-1} \lambda _2^{\prime } g_2(x_l, \beta _0) \right] \\&= {\displaystyle \sup _{\lambda _2}}\frac{1}{n}\sum _{l=k_0+1}^{n} \log \left[ 1 + (1- \theta _{k_0})^{-1} \lambda _2^{\prime } g_2(x_l, \beta _0) \right] \\&\qquad - {\displaystyle \sup _{\lambda _2}}\frac{1}{n}\sum _{l=k+1}^{n} \log \left[ 1 + \rho _k(1- \theta _{k_0})^{-1} \lambda _2^{\prime } g_2(x_l, \beta _0) \right] \quad \qquad \left( \rho _k=\frac{n-k_0}{n-k}\right) \\&\quad = {\displaystyle \sup _{\lambda _2}}\frac{1}{n}\sum _{l=k_0+1}^{n} \log \left[ 1 + (1- \theta _{k_0})^{-1} \lambda _2^{\prime } g_2(x_l, \beta _0) \right] - {\displaystyle \sup _{\lambda _2}}\frac{1}{n}\sum _{l=k+1}^{n} \log \left[ 1 + (1- \theta _{k_0})^{-1} \lambda _2^{\prime } g_2(x_l, \beta _0) \right] \\& \quad \qquad \left( \text {Since}\ \frac{n-k_0}{n}\le \rho _k \le \frac{n-k_0}{n-k_0+n\eta }, \ \text {we have}\ 1-\theta _0\le \varliminf \rho _k \le \varlimsup \rho _k \le \frac{1-\theta _0}{1-\theta _0+\eta }.\right) \\&\quad \xrightarrow {\text {a.s.}}{\displaystyle \sup _{\lambda _2}} (1-\theta _0) E \left[ \log \left( 1 + (1-\theta _0)^{-1} \lambda _2^{\prime } g_2(x_l, \beta _0)\right) \right] \\&\quad - {\displaystyle \sup _{\lambda _2}} \left\{ \frac{k_0-k}{n} E \left[ \log \left( 1 + (1-\theta _0)^{-1} \lambda _2^{\prime } g_2(x_l, \beta _0)\right) \right] \right. \\&\quad \left. + (1-\theta _0) E \left[ \log \left( 1 + (1-\theta _0)^{-1} \lambda _2^{\prime } g_2(x_l, \beta _0)\right) \right] \right\} \\&\quad \ge {\displaystyle \sup _{\lambda _2}} (1-\theta _0) E \left[ \log \left( 1 + (1-\theta _0)^{-1} \lambda _2^{\prime } g_2(x_l, \beta _0)\right) \right] \\&\qquad - {\displaystyle \sup _{\lambda _2}} \left\{ \eta E \left[ \log \left( 1 + (1-\theta _0)^{-1} \lambda _2^{\prime } g_2(x_l, \beta _0)\right) \right] \right. \\&\qquad \left. + (1-\theta _0) E \left[ \log \left( 1 + (1-\theta _0)^{-1} \lambda _2^{\prime } g_2(x_l, \beta _0)\right) \right] \right\} \\&\qquad \quad \left( \text {By Jensen's inequality and}\ \frac{k_0-k}{n}\ge \eta .\right) \end{aligned}$$
Assume that \({\displaystyle \sup _{\lambda _2}} \left\{ \eta E \left[ \log \left( 1 + (1-\theta _0)^{-1} \lambda _2^{\prime } g_2(x_l, \beta _0)\right) \right] + (1-\theta _0) E \left[ \log \left( 1+(1-\theta _0)^{-1} \lambda _2^{\prime } g_2(x_l, \beta _0)\right) \right] \right\}\) attains its supremum at \(\lambda _2^*\). Then we have,
$$\begin{aligned}&\frac{1}{2n}(Z_{H_0,k_0} -Z_{H_0,k} + Z_{H_1,k}) \\&\quad \ge {\left\{ \begin{array}{ll} {\displaystyle \sup _{\lambda _2}} (1-\theta _0) E \left[ \log \left( 1 + (1-\theta _0)^{-1} \lambda _2^{\prime } g_2(x_l, \beta _0)\right) \right] , \text { if}\ \lambda _2^*=0,\\ -\eta E \left[ \log \left( 1 + (1-\theta _0)^{-1} \lambda _2^{*\prime } g_2(x_l, \beta _0)\right) \right] , \text { if}\ \lambda _2^*\ne 0. \end{array}\right. } \end{aligned}$$
Therefore, by the condition that for every fixed parameter \(\delta =\beta ^*-\beta \ne 0\) there exists a positive constant \(c_0>0\) such that \(\infty> {\displaystyle \inf _{\delta \ne 0}} {\displaystyle \sup _{\lambda }} E\log \left[ 1 + \lambda ^{\prime } x(x^{\prime } \delta + e) \right] \ge c_0 >0\), together with Jensen's inequality, there exists a constant \(c_0>0\) such that \(P\left( \frac{1}{2n} (Z_{H_0,k_0} -Z_{H_0,k} + Z_{H_1,k}) > c_0\right) \rightarrow 1\) as \(n\rightarrow \infty\). Since \(Z_{H_1,k_0} = O_p(1)\), it follows that \(P\left[ \left( -2 \log \Lambda _{k_0} - (-2 \log \Lambda _k)\right) >cn\right] \rightarrow 1\). So \(-2\log \Lambda _k\) cannot attain its maximum, with probability approaching 1. By the definition of \(\hat{k}\), we have \(|\frac{k_0-\hat{k}}{n}|\le \eta\) with probability approaching 1. Since \(\eta\) is arbitrary, the proof is complete. \(\square\)