Conditional screening for ultrahigh-dimensional survival data in case-cohort studies

Lifetime Data Analysis

Abstract

The case-cohort design has been widely used to reduce the cost of covariate measurements in large cohort studies. In many such studies, the number of covariates is very large, and the goal of the research is to identify active covariates that have a strong influence on the response. Since the introduction of sure independence screening, screening procedures have achieved great success in effectively reducing the dimensionality and identifying active covariates. However, commonly used screening methods are based on marginal correlation or its variants, so they may fail to identify hidden active variables that are jointly important but only weakly correlated with the response. Moreover, these screening methods are mainly proposed for data obtained under simple random sampling and cannot be directly applied to case-cohort data. In this paper, we consider ultrahigh-dimensional survival data under the case-cohort design and propose a conditional screening method that incorporates prior information on known active variables. This method can effectively detect hidden active variables. Furthermore, it possesses the sure screening property under some mild regularity conditions and does not require any complicated numerical optimization. We evaluate the finite sample performance of the proposed method via extensive simulation studies and further illustrate the new approach through a real data set from patients with breast cancer.

References

  • Andersen PK, Gill RD (1982) Cox’s regression model for counting processes: a large sample study. Ann Stat 10:1100–1120

  • Barlow WE (1994) Robust variance estimation for the case-cohort design. Biometrics 50:1064–1072

  • Barut E, Fan J, Verhasselt A (2016) Conditional sure independence screening. J Am Stat Assoc 111:1266–1277

  • Borgan O, Langholz B, Samuelsen SO, Goldstein L, Pogoda J (2000) Exposure stratified case-cohort designs. Lifetime Data Anal 6:39–58

  • Breslow NE, Wellner JA (2007) Weighted likelihood for semiparametric models and two-phase stratified samples, with application to Cox regression. Scand J Stat 34:86–102

  • Candes E, Tao T (2007) The Dantzig selector: Statistical estimation when $p$ is much larger than $n$. Ann Stat 35:2313–2351

  • Chang J, Tang CY, Wu Y (2013) Marginal empirical likelihood and sure independence feature screening. Ann Stat 41:2123–2148

  • Chen K (2001) Generalized case-cohort sampling. J R Stat Soc B 63:791–809

  • Chen K, Lo SH (1999) Case-cohort and case-control analysis with Cox’s model. Biometrika 86:755–764

  • Cox DR (1972) Regression models and life-tables. J R Stat Soc B 34:187–220

  • Cui H, Li R, Zhong W (2015) Model-free feature screening for ultrahigh dimensional discriminant analysis. J Am Stat Assoc 110:630–641

  • Fan J, Feng Y, Song R (2011) Nonparametric independence screening in sparse ultra-high-dimensional additive models. J Am Stat Assoc 106:544–557

  • Fan J, Feng Y, Wu Y (2010) High-dimensional variable selection for Cox’s proportional hazards model. In: Borrowing strength: theory powering applications: a Festschrift for Lawrence D. Brown, Institute of Mathematical Statistics 6:70–86

  • Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360

  • Fan J, Lv J (2008) Sure independence screening for ultrahigh dimensional feature space. J R Stat Soc B 70:849–911

  • Fan J, Ma Y, Dai W (2014) Nonparametric independence screening in sparse ultra-high-dimensional varying coefficient models. J Am Stat Assoc 109:1270–1284

  • Fan J, Samworth R, Wu Y (2009) Ultrahigh dimensional feature selection: beyond the linear model. J Mach Learn Res 10:2013–2038

  • Fan J, Song R (2010) Sure independence screening in generalized linear models with NP-dimensionality. Ann Stat 38:3567–3604

  • Fleming TR, Harrington DP (1991) Counting processes and survival analysis. Wiley, New York

  • Gorst-Rasmussen A, Scheike T (2013) Independent screening for single-index hazard rate models with ultrahigh dimensional features. J R Stat Soc B 75:217–245

  • He X, Wang L, Hong HG (2013) Quantile-adaptive model-free variable screening for high-dimensional heterogeneous data. Ann Stat 41:342–369

  • Hong HG, Kang J, Li Y (2018) Conditional screening for ultra-high dimensional covariates with survival outcomes. Lifetime Data Anal 24:45–71

  • Hong HG, Wang L, He X (2016) A data-driven approach to conditional screening of high-dimensional variables. Stat 5:200–212

  • Hu Q, Lin L (2017) Conditional sure independence screening by conditional marginal empirical likelihood. Ann Inst Stat Math 69:63–96

  • Kalbfleisch JD, Lawless JF (1988) Likelihood analysis of multi-state models for disease incidence and mortality. Stat Med 7:149–160

  • Kang S, Cai J (2009) Marginal hazards model for case-cohort studies with multiple disease outcomes. Biometrika 96:887–901

  • Keogh RH, White IR (2013) Using full-cohort data in nested case-control and case-cohort studies by multiple imputation. Stat Med 32:4021–4043

  • Kim S, Ahn WK (2019) Bi-level variable selection for case-cohort studies with group variables. Stat Methods Med Res 28:3404–3414

  • Kim S, Cai J, Lu W (2013) More efficient estimators for case-cohort studies. Biometrika 100:695–708

  • Kulich M, Lin D (2004) Improving the efficiency of relative-risk estimation in case-cohort studies. J Am Stat Assoc 99:832–844

  • Li G, Peng H, Zhang J, Zhu L (2012a) Robust rank correlation based screening. Ann Stat 40:1846–1877

  • Li R, Zhong W, Zhu L (2012b) Feature screening via distance correlation learning. J Am Stat Assoc 107:1129–1139

  • Lin DY, Wei LJ (1989) The robust inference for the Cox proportional hazards model. J Am Stat Assoc 84:1074–1078

  • Lin Y, Liu X, Hao M (2018) Model-free feature screening for high-dimensional survival data. Sci China Math 61:1617–1636

  • Liu Y, Chen XL (2018) Quantile screening for ultra-high-dimensional heterogeneous data conditional on some variables. J Stat Comput Sim 88:329–342

  • Liu J, Li R, Wu R (2014) Feature selection for varying coefficient models with ultrahigh-dimensional covariates. J Am Stat Assoc 109:266–274

  • Liu Y, Wang Q (2018) Model-free feature screening for ultrahigh-dimensional data conditional on some variables. Ann Inst Stat Math 70:283–301

  • Liu Y, Zhang J, Zhao X (2018) A new nonparametric screening method for ultrahigh-dimensional survival data. Comput Stat Data Anal 119:74–85

  • Lu J, Lin L (2020) Model-free conditional screening via conditional distance correlation. Stat Pap 61:225–244

  • Mai Q, Zou H (2015) The fused Kolmogorov filter: a nonparametric model-free screening method. Ann Stat 43:1471–1497

  • Marti H, Chavance M (2011) Multiple imputation analysis of case-cohort studies. Stat Med 30:1595–1607

  • Ni A, Cai J, Zeng D (2016) Variable selection for case-cohort studies with failure time outcome. Biometrika 103:547–562

  • Pan W, Wang X, Xiao W, Zhu H (2019) A generic sure independence screening procedure. J Am Stat Assoc 114:928–937

  • Prentice RL (1986) A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 73:1–11

  • Scheike TH, Martinussen T (2004) Maximum likelihood estimation for Cox’s regression model under case-cohort sampling. Scand J Stat 31:283–293

  • Self SG, Prentice R (1988) Asymptotic distribution theory and efficiency results for case-cohort studies. Ann Stat 16:64–81

  • Song R, Lu W, Ma S, Jeng XJ (2014) Censored rank independence screening for high-dimensional survival data. Biometrika 101:799–814

  • Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc B 58:267–288

  • Tibshirani R (2009) Univariate shrinkage in the Cox model for high dimensional data. Stat Appl Genet Mol 8:1–18

  • Uno H, Cai T, Pencina MJ, D’Agostino RB, Wei LJ (2011) On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat Med 30:1105–1117

  • van de Vijver MJ, He YD, van ’t Veer LJ, Dai H, Hart AA, Voskuil DW, Schreiber GJ, Peterse JL, Roberts C, Marton MJ (2002) A gene-expression signature as a predictor of survival in breast cancer. New Engl J Med 347:1999–2009

  • van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes. Springer, New York

  • van ’t Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, Schreiber GJ, Kerkhoven RM, Roberts C, Linsley PS, Bernards R, Friend SH (2002) Gene expression profiling predicts clinical outcome of breast cancer. Nature 415:530–536

  • Wu Y, Yin G (2015) Conditional quantile screening in ultrahigh-dimensional heterogeneous data. Biometrika 102:65–76

  • Yeung KY, Bumgarner RE, Raftery AE (2005) Bayesian model averaging: development of an improved multi-class, gene selection and classification tool for microarray data. Bioinformatics 21:2394–2402

  • Zeng D, Lin DY (2014) Efficient estimation of semiparametric transformation models for two-phase cohort studies. J Am Stat Assoc 109:371–383

  • Zhang CH (2010) Nearly unbiased variable selection under minimax concave penalty. Ann Stat 38:894–942

  • Zhang J, Liu Y, Wu Y (2017) Correlation rank screening for ultrahigh-dimensional survival data. Comput Stat Data Anal 108:121–132

  • Zhang J, Yin G, Liu Y, Wu Y (2018) Censored cumulative residual independent screening for ultrahigh-dimensional survival data. Lifetime Data Anal 24:273–292

  • Zhao SD, Li Y (2012) Principled sure independence screening for Cox models with ultra-high-dimensional covariates. J Mult Anal 105:397–411

  • Zhou T, Zhu L (2017) Model-free feature screening for ultrahigh dimensional censored regression. Stat Comput 27:947–961

  • Zhu LP, Li L, Li R, Zhu LX (2011) Model-free feature screening for ultrahigh-dimensional data. J Am Stat Assoc 106:1464–1475

  • Zou H (2006) The adaptive Lasso and its oracle properties. J Am Stat Assoc 101:1418–1429

Acknowledgements

This work is funded in part by the U.S. National Institutes of Health (Grants P01CA142538, P42ES031007 and P30ES010126) and the National Natural Science Foundation of China (Grant Nos. 11971362, 11901581 and 11771366).

Author information

Corresponding author

Correspondence to Jianwen Cai.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 34 KB)

Appendix

1.1 Appendix A: regularity conditions

Let \(S_T(t|\mathbf{Z}_i)=\exp \{-\varLambda _0(t)\exp (\varvec{\alpha }^{\mathrm{T}}\mathbf{Z}_i)\}\) and \(S_C(t|\mathbf{Z}_i)=P(C_i>t|\mathbf{Z}_i)\) denote the survival functions of \(T_i\) and \(C_i\), respectively, let \(F_T(t|\mathbf{Z}_i)=1-S_T(t|\mathbf{Z}_i)\), and let \(\varLambda _0(t)=\int _0^t\lambda _0(s)\mathrm{d}s\) denote the cumulative baseline hazard function. For any vector \({\varvec{\nu }}=(\nu _1,\ldots ,\nu _p)\in R^p\), let \(\Vert {\varvec{\nu }}\Vert _d=\root d \of {\sum _{j=1}^p|\nu _j|^d}\) be the \(L_d\) norm. For any random variables \(\zeta : \varOmega \rightarrow R^d\), \(\zeta _1: \varOmega \rightarrow R^{d_1}\), \(\zeta _2: \varOmega \rightarrow R^{d_2}\) and \(\eta : \varOmega \rightarrow R^p\), the conditional linear expectation of \(\zeta \) given \(\eta \) is defined as \(E^{*}(\zeta |\eta )=E(\zeta )+B^{\mathrm{T}}\{\eta -E(\eta )\}\), where \(B=\mathrm{argmin}_{D\in R^{p\times d}}E[\{\zeta -E(\zeta )-D^{\mathrm{T}}(\eta -E(\eta ))\}^2|\eta ]\). The conditional linear covariance between \(\zeta _1\) and \(\zeta _2\) given \(\eta \) is defined as \(Cov^{*}(\zeta _1,\zeta _2|\eta )=E^{*}[\{\zeta _1-E^{*}(\zeta _1|\eta )\} \{\zeta _2-E^{*}(\zeta _2|\eta )\}|\eta ]\). The properties of \(E^{*}(\zeta |\eta )\) and \(Cov^{*}(\zeta _1,\zeta _2|\eta )\) are presented in “Appendix B”. The regularity conditions listed below are imposed throughout our discussion.

  C1. For each \(j\notin {\mathcal {C}}\) and \(k\in {\mathcal {C}}\bigcup \{j\}\), there exists a neighborhood \({\mathcal {B}}_j\) of \(({\varvec{\beta }}_{{\mathcal {C}},j}^{0},{\beta }_j^{0})^{\mathrm{T}}\) such that

    $$\begin{aligned} \sup _{t\in [0,\tau ],({\varvec{\beta }}_{{\mathcal {C}},j},{\beta _j})^{\mathrm{T}}\in {\mathcal {B}}_j}\Vert S_{j,k}^{(l)}(\varvec{\beta }_{{\mathcal {C}},j},\beta _j,t)- s_{j,k}^{(l)}(\varvec{\beta }_{{\mathcal {C}},j},\beta _j,t)\Vert _2\rightarrow 0 \end{aligned}$$

    in probability as \(n\rightarrow \infty \) (\(l=0,1\)). Moreover, \(s_{j,k}^{(0)}(\varvec{\beta }_{{\mathcal {C}},j},\beta _j,t)\) is bounded away from zero on \({\mathcal {B}}_j\times [0,\tau ]\), and \(s_{j,k}^{(l)}(\varvec{\beta }_{{\mathcal {C}},j},\beta _j,t)\) are bounded on \({\mathcal {B}}_j\times [0,\tau ]\).

  C2. For all \(j=1,\ldots ,p\), \(\int _0^{\tau }\lambda _{j,0}(t)\mathrm {d}t< \infty \) and \(E\{Y(\tau )\}>0\).

  C3. The covariates \(Z_{j}\) (\(j=1,\ldots , p\)) are independent of time and bounded by a constant \(L_0\). Furthermore, \(E(Z_j)=0\) for all \(j\in \{1,\ldots , p\}\).

  C4. All \(Z_{j}\) with \(j\in {\mathcal {A}}_{-{\mathcal {C}}}\) are independent of all \(Z_{j}\) with \(j\notin {\mathcal {A}}_{-{\mathcal {C}}}\), given \(\mathbf{Z}_{{\mathcal {C}}}\).

  C5. There exists a constant \(L_1\) such that \(\Vert \varvec{\alpha }\Vert _1<L_1\) and \(\Vert ({\varvec{\beta }}_{{\mathcal {C}},j},{\beta }_j)^{\mathrm{T}}\Vert _1<L_1\).

  C6. There exist constants \(c_1>0\) and \(0<\kappa <1/2\) such that \(\min _{j\in {\mathcal {A}}_{-{\mathcal {C}}}}|E[Cov^{*}(Z_{j},P(\delta =1|\mathbf{Z})|\mathbf{Z}_{{\mathcal {C}}})]| \ge c_1 n^{-\kappa }\).

  C7. There exists a constant \(L>0\) such that \(n^{-1}\Vert \mathbf{U}_j(\widehat{\varvec{\beta }}_{{\mathcal {C}},j},\widehat{\beta }_j)-\mathbf{U}_j(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})\Vert _2 \ge L\Vert (\widehat{\varvec{\beta }}_{{\mathcal {C}},j},\widehat{\beta }_j)^{\mathrm{T}} -({\varvec{\beta }}_{{\mathcal {C}},j}^{0},{\beta }_j^{0})^{\mathrm{T}}\Vert _2\) for all \(j\notin {\mathcal {C}}\).

  C8. Let \({\tilde{n}}=\sum _{i=1}^n\xi _i\) denote the size of the subcohort. Then \({\tilde{n}}/n\) converges to a constant \(\pi \in (0,1)\).

Conditions C1 and C2 are common assumptions in survival analysis (Andersen and Gill 1982; Fleming and Harrington 1991). Condition C3 assumes that the covariates are bounded; a similar condition is also used in Hong et al. (2018). Condition C4 is similar to the partial orthogonality assumption on the covariates. Condition C5 controls the total effect size of the covariates, which is reasonable under the sparsity principle. Condition C6 is a typical assumption that has been widely used in the feature screening literature, such as condition 3 in Fan and Lv (2008), condition 2 in Li et al. (2012b), condition 2 in Song et al. (2014), and conditions 2 and 5 in Wu and Yin (2015). Condition C7 is a mild assumption that holds in many situations. Condition C8 is a common assumption on the case-cohort design.
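
To make the role of Condition C6 concrete, the following sketch builds a toy example of a hidden active variable. It is an illustration under simplifying assumptions, not the procedure of this paper: a linear response `y` stands in for \(P(\delta =1|\mathbf{Z})\), there is no censoring and no case-cohort sampling, the conditioning set contains the single covariate `z1`, and the helper names (`cov`, `partial_cov`, `simulate`) are hypothetical. The coefficient of `z2` is chosen so that its marginal covariance with `y` vanishes while its covariance after linear adjustment for `z1` does not.

```python
import math
import random


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Sample covariance (divisor n) of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def partial_cov(z, y, cond):
    """Covariance of z and y after linearly adjusting both for a scalar cond."""
    return cov(z, y) - cov(z, cond) * cov(cond, y) / cov(cond, cond)


def simulate(n, rho, rng):
    """Toy data (assumed model): corr(z1, z2) = rho, y = z1 - rho * z2 + noise."""
    z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z2 = [rho * a + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0) for a in z1]
    y = [a - rho * b + 0.5 * rng.gauss(0.0, 1.0) for a, b in zip(z1, z2)]
    return z1, z2, y


z1, z2, y = simulate(n=20_000, rho=0.5, rng=random.Random(0))
print("marginal covariance of z2 and y:", round(cov(z2, y), 3))
print("covariance of z2 and y given z1:", round(partial_cov(z2, y, z1), 3))
```

In population terms the marginal covariance is 0 and the adjusted covariance is \(-\rho (1-\rho ^2)=-0.375\) for \(\rho =0.5\), so a marginal ranking would tend to miss `z2` whereas a ranking conditional on `z1` would not.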

1.2 Appendix B: lemmas and theoretic proofs

Let \({\varvec{\beta }}_{{\mathcal {C}},0}\) be the solution of the equations \(\mathbf{u}_{{\mathcal {C}}}(\varvec{\beta }_{{\mathcal {C}}})=[u_{j,k}(\varvec{\beta }_{{\mathcal {C}}},0), k\in {\mathcal {C}}]^{\mathrm{T}}=\mathbf{0}_{q}\). Define \(\mathbf{v}_{j}(\varvec{\beta }_{{\mathcal {C}},j},\beta _j)=u_{j,j}(\varvec{\beta }_{{\mathcal {C}},j},\beta _j)- \sum _{k\in {\mathcal {C}}}b_k u_{j,k}(\varvec{\beta }_{{\mathcal {C}},j},\beta _j)\), where the vector \(\mathbf{b}_{{\mathcal {C}}}=[b_k, k\in {\mathcal {C}}]^{\mathrm{T}}\) is such that \(E^{*}[Z_j|\mathbf{Z}_{{\mathcal {C}}}]=\sum _{k\in {\mathcal {C}}} b_kZ_k\). In preparation, we first introduce some lemmas.

Lemma 4

Let \(\varvec{\xi }=(\xi _1,\ldots ,\xi _n)\) be a random vector containing \({\tilde{n}}\) ones and \(n-{\tilde{n}}\) zeros, with each permutation equally likely. Let \(B_i (t)\) \((i = 1, \ldots , n)\) be independent and identically distributed real-valued random processes on \([0,\tau ]\) with \(E\{B_i(t)\}=\mu _B(t)\) and \(var(B_i(\tau ))< \infty \). Let \(B(t)=\{B_1(t),\ldots ,B_n(t)\}\) be independent of \(\varvec{\xi }\). Suppose that almost all paths of \(B_i(t)\) have finite variation. Then \(n^{-1/2}\sum _{i=1}^n\xi _i\{B_i(t)-\mu _B(t)\}\) converges weakly in \(l^\infty [0,\tau ]\) to a zero-mean Gaussian process, and therefore \(n^{-1}\sum _{i=1}^n\xi _i\{B_i(t)-\mu _B(t)\}\) converges in probability to zero uniformly in \(t\).

This lemma is the same as Lemma A1 of Kang and Cai (2009).
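
As a rough numerical illustration of the setting of Lemma 4, the sketch below draws a subcohort indicator vector with a fixed number of ones in uniformly random order and compares two scalings of \(\sum _{i=1}^n\xi _i\{B_i-\mu _B\}\). It assumes the simplest possible case, in which each \(B_i\) is constant in \(t\) and Uniform(0, 1), so that \(\mu _B=0.5\); the function names are hypothetical and the example is not part of the original proof.

```python
import random


def subcohort_indicators(n, n_tilde, rng):
    """Return xi with exactly n_tilde ones and n - n_tilde zeros in random order."""
    xi = [1] * n_tilde + [0] * (n - n_tilde)
    rng.shuffle(xi)  # every arrangement of the ones is equally likely
    return xi


def scaled_sums(n, pi, rng):
    """One draw of sum_i xi_i (B_i - mu_B), scaled by n^{-1/2} and by n^{-1}."""
    n_tilde = round(pi * n)
    xi = subcohort_indicators(n, n_tilde, rng)
    # Assumed toy process: B_i ~ Uniform(0, 1), independent of xi, with mean 0.5.
    total = sum(x * (rng.random() - 0.5) for x in xi)
    return total / n ** 0.5, total / n


rng = random.Random(0)
for n in (100, 10_000, 100_000):
    root_n_scaled, averaged = scaled_sums(n, pi=0.3, rng=rng)
    print(n, round(root_n_scaled, 4), round(averaged, 6))
```

Across increasing \(n\), the \(n^{-1/2}\)-scaled sum should remain of order one, whereas the averaged sum should shrink roughly at the rate \(n^{-1/2}\).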

Lemma 5

Given that \(\xi \) is independent of \(\varDelta \) and Y(t), \(n^{1/2}\{\widehat{\pi }^{-1}(t)-\pi ^{-1}\}\) converges weakly to a zero-mean Gaussian process.

This lemma is extracted from Lemma A3 of Ni et al. (2016).

Lemma 6

For independent random variables \(Y_1,\ldots , Y_n\) with bounded ranges \([-M,M]\) and zero mean,

$$\begin{aligned} P\left( |Y_1+\ldots +Y_n|>y\right) \le 2 \exp \left( -\frac{1}{2}\frac{y^2}{V+My/3}\right) \end{aligned}$$

for \(V\ge Var(Y_1+\ldots +Y_n)\).

This lemma is extracted from Lemma 2.2.9 of van der Vaart and Wellner (1996).
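
The bound in Lemma 6 can be checked numerically in a simple case. The sketch below assumes \(Y_i\sim \) Uniform\((-M,M)\), for which \(Var(Y_i)=M^2/3\), and compares a Monte Carlo estimate of the tail probability with the right-hand side of the inequality; the function names and the parameter values are illustrative choices, not taken from the paper.

```python
import math
import random


def bernstein_bound(y, variance_bound, m):
    """Right-hand side of Lemma 6: 2 exp{-(1/2) y^2 / (V + M y / 3)}."""
    return 2.0 * math.exp(-0.5 * y * y / (variance_bound + m * y / 3.0))


def empirical_tail(n, m, y, reps, rng):
    """Monte Carlo estimate of P(|Y_1 + ... + Y_n| > y), Y_i ~ Uniform(-m, m)."""
    hits = 0
    for _ in range(reps):
        s = sum(rng.uniform(-m, m) for _ in range(n))
        if abs(s) > y:
            hits += 1
    return hits / reps


n, m, y = 50, 1.0, 8.0
v = n * m * m / 3.0  # exact variance of the sum, so V >= Var(Y_1 + ... + Y_n)
tail = empirical_tail(n, m, y, reps=20_000, rng=random.Random(0))
bound = bernstein_bound(y, v, m)
print("estimated tail probability:", tail)
print("Bernstein bound:", round(bound, 4))
```

With these values the bound evaluates to about 0.382, and the simulated tail probability should fall well below it, which is consistent with the inequality being conservative.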

Lemma 7

Let \(\zeta \), \(\zeta _1\), \(\zeta _2\) and \(\eta \) be any four random variables on the probability space \((\varOmega , {\mathcal {F}},P)\). Then the following properties hold for the conditional linear expectation \(E^*(\cdot |\eta )\) given \(\eta \):

  1. \(E^*(\zeta |\eta )=E(\zeta )+Cov(\zeta ,\eta )Var(\eta )^{-1}\{\eta -E(\eta )\}\);

  2. \(E^*(\eta |\eta )=\eta \);

  3. For any matrices \(A_1\) and \(A_2\), \(E^*(A_1\zeta _1+A_2\zeta _2|\eta )=A_1E^*(\zeta _1|\eta )+A_2E^*(\zeta _2|\eta )\);

  4. \(E^*[E^*(\zeta |\eta )]=E[E^*(\zeta |\eta )]=E[\zeta ]\).

This lemma is extracted from Proposition 2 of Hong et al. (2018).
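
The properties in Lemma 7 can be illustrated with sample moments. The sketch below assumes scalar \(\zeta \) and \(\eta \), replaces expectations by sample averages, and uses property 1 as the definition of the projection; the function name `linear_cond_expectation` is hypothetical. With sample moments, properties 2 and 4 hold up to floating-point rounding even when \(\zeta \) depends on \(\eta \) nonlinearly.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Sample covariance (divisor n) of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def linear_cond_expectation(zeta, eta):
    """Sample analogue of E*(zeta|eta) = E(zeta) + Cov(zeta,eta) Var(eta)^{-1} {eta - E(eta)}."""
    slope = cov(zeta, eta) / cov(eta, eta)
    m_zeta, m_eta = mean(zeta), mean(eta)
    return [m_zeta + slope * (e - m_eta) for e in eta]


rng = random.Random(0)
eta = [rng.gauss(0.0, 1.0) for _ in range(5_000)]
# zeta is deliberately nonlinear in eta; E* only captures the linear part.
zeta = [e ** 2 + 0.5 * e + rng.gauss(0.0, 1.0) for e in eta]

proj = linear_cond_expectation(zeta, eta)
gap_property_4 = abs(mean(proj) - mean(zeta))
gap_property_2 = max(
    abs(p - e) for p, e in zip(linear_cond_expectation(eta, eta), eta)
)
print("property 4 gap:", gap_property_4)
print("property 2 gap:", gap_property_2)
```

Both gaps should be at the level of rounding error, since the sample versions of properties 2 and 4 are algebraic identities.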

Lemma 8

The conditional linear covariance has the following properties:

  1. \(Cov^*(\zeta _1,\zeta _2|\eta )=0\Longleftrightarrow E^*(\zeta _1\zeta _2|\eta )=E^*(\zeta _1|\eta )E^*(\zeta _2|\eta )\);

  2. \(E[Cov^*(\zeta _1,\zeta _2|\eta )]=Cov(\zeta _1,\zeta _2)- Cov(\zeta _1,\eta )Var(\eta )^{-1}Cov(\eta ,\zeta _2)\);

  3. For any increasing function \(h(\cdot ): R \rightarrow R\) and random variable \(\xi : \varOmega \rightarrow R\), we have \(Cov^*(h(\xi ),\xi |\eta )\ge 0\).

This lemma is extracted from Proposition 3 of Hong et al. (2018).
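
Property 2 of Lemma 8 can be checked with sample moments in the scalar case. By property 4 of Lemma 7, \(E[Cov^*(\zeta _1,\zeta _2|\eta )]\) equals the expected product of the residuals \(\zeta _1-E^*(\zeta _1|\eta )\) and \(\zeta _2-E^*(\zeta _2|\eta )\); the sketch below computes that residual product and compares it with the right-hand side of property 2. The simulated model and the function names are assumptions made for illustration only.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Sample covariance (divisor n) of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def residual(zeta, eta):
    """Sample analogue of zeta - E*(zeta|eta) for scalar zeta and eta."""
    slope = cov(zeta, eta) / cov(eta, eta)
    m_zeta, m_eta = mean(zeta), mean(eta)
    return [z - m_zeta - slope * (e - m_eta) for z, e in zip(zeta, eta)]


def expected_cond_linear_cov(z1, z2, eta):
    """Left-hand side of property 2: average product of the two residuals."""
    r1, r2 = residual(z1, eta), residual(z2, eta)
    return mean([a * b for a, b in zip(r1, r2)])


def partial_cov_formula(z1, z2, eta):
    """Right-hand side of property 2 with sample moments."""
    return cov(z1, z2) - cov(z1, eta) * cov(eta, z2) / cov(eta, eta)


rng = random.Random(0)
eta = [rng.gauss(0.0, 1.0) for _ in range(5_000)]
shared = [rng.gauss(0.0, 1.0) for _ in eta]  # common noise linking z1 and z2
z1 = [e + s for e, s in zip(eta, shared)]
z2 = [0.5 * e + 0.8 * s + rng.gauss(0.0, 1.0) for e, s in zip(eta, shared)]

lhs = expected_cond_linear_cov(z1, z2, eta)
rhs = partial_cov_formula(z1, z2, eta)
print(round(lhs, 6), round(rhs, 6))
```

The two printed numbers should agree up to rounding, because with sample moments the identity is exact; in this assumed model the population value is 0.8, the covariance contributed by the shared noise.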

1.2.1 Proof of Lemma 1

Proof

We first relate \(\beta _j^{0}\) to \(E[Cov^{*}\{Z_{j},P(\delta =1|\mathbf{Z})|\mathbf{Z}_{{\mathcal {C}}}\}]\) and then, by condition C6, relate it to \(\alpha _j\). For any \(j\notin {\mathcal {C}}\) and \(k\in {\mathcal {C}}\), straightforward calculations show that \(s_k^{(l)}(t)=E\{Z_k^l\lambda _0(t)\exp (\varvec{\alpha }^{\mathrm{T}}\mathbf{Z})S_TS_C\}\) and \(s_{j,k}^{(l)}(\varvec{\beta }_{{\mathcal {C}},j},\beta _j,t)=E\{Z_k^l\exp (\mathbf{Z}_{{\mathcal {C}}}^{\mathrm{T}}\varvec{\beta }_{{\mathcal {C}},j}+Z_j\beta _j )S_TS_C\}\) \((l=0,1,2)\). Then

$$\begin{aligned}&u_{j,k}(\varvec{\beta }_{{\mathcal {C}},j},\beta _j)\\&\quad =\int _0^{\tau }E\left\{ \left[ Z_k-\frac{E\{Z_k\exp (\mathbf{Z}_{{\mathcal {C}}}^{\mathrm{T}}\varvec{\beta }_{{\mathcal {C}},j}+Z_j\beta _j )S_TS_C\}}{E\{\exp (\mathbf{Z}_{{\mathcal {C}}}^{\mathrm{T}}\varvec{\beta }_{{\mathcal {C}},j}+Z_j\beta _j )S_TS_C\}}\right] \lambda _0(t)\exp (\varvec{\alpha }^{\mathrm{T}}\mathbf{Z})S_TS_C\right\} \mathrm{d}t. \end{aligned}$$

By the definition, we have

$$\begin{aligned} \mathbf{v}_{j}(\varvec{\beta }_{{\mathcal {C}},j},\beta _j)=u_{j,j}(\varvec{\beta }_{{\mathcal {C}},j},\beta _j)- \sum _{k\in {\mathcal {C}}}b_k u_{j,k}(\varvec{\beta }_{{\mathcal {C}},j},\beta _j) \equiv F_{1j}(\varvec{\beta }_{{\mathcal {C}},j},\beta _j)-F_{2j}(\varvec{\beta }_{{\mathcal {C}},j},\beta _j), \end{aligned}$$

where

$$\begin{aligned} F_{1j}(\varvec{\beta }_{{\mathcal {C}},j},\beta _j)= & {} \int _0^{\tau }E\{(Z_j-\sum _{k\in {\mathcal {C}}}b_kZ_k)\lambda _0(t)\exp (\varvec{\alpha }^{\mathrm{T}}\mathbf{Z})S_TS_C\}\mathrm{d}t\\= & {} \int _0^{\tau }E[\{Z_j-E^*(Z_j|\mathbf{Z}_{\mathcal {C}})\}\lambda _0(t)\exp (\varvec{\alpha }^{\mathrm{T}}\mathbf{Z})S_TS_C]\mathrm{d}t\\= & {} E[Cov^{*}\{Z_{j},P(\delta =1|\mathbf{Z})|\mathbf{Z}_{{\mathcal {C}}}\}], \end{aligned}$$

and

$$\begin{aligned}&F_{2j}(\varvec{\beta }_{{\mathcal {C}},j},\beta _j)\\&=\int _0^{\tau }\left[ \frac{E\{Z_j\exp (\mathbf{Z}_{{\mathcal {C}}}^{\mathrm{T}}\varvec{\beta }_{{\mathcal {C}},j}+Z_j\beta _j )S_TS_C\}}{E\{\exp (\mathbf{Z}_{{\mathcal {C}}}^{\mathrm{T}}\varvec{\beta }_{{\mathcal {C}},j}+Z_j\beta _j )S_TS_C\}}\right. \\&\qquad \left. - \sum _{k\in {\mathcal {C}}}b_k \frac{E\{Z_k\exp (\mathbf{Z}_{{\mathcal {C}}}^{\mathrm{T}}\varvec{\beta }_{{\mathcal {C}},j}+Z_j\beta _j )S_TS_C\}}{E\{\exp (\mathbf{Z}_{{\mathcal {C}}}^{\mathrm{T}}\varvec{\beta }_{{\mathcal {C}},j}+Z_j\beta _j )S_TS_C\}}\right] \\&\qquad \times E\left\{ \lambda _0(t)\exp (\varvec{\alpha }^{\mathrm{T}}\mathbf{Z})S_TS_C\right\} \mathrm{d}t\\&\quad =\int _0^{\tau }\frac{E[\{Z_j-E^*(Z_j|\mathbf{Z}_{\mathcal {C}})\}\exp (\mathbf{Z}_{{\mathcal {C}}}^{\mathrm{T}}\varvec{\beta }_{{\mathcal {C}},j}+Z_j\beta _j )S_TS_C]}{E\{\exp (\mathbf{Z}_{{\mathcal {C}}}^{\mathrm{T}}\varvec{\beta }_{{\mathcal {C}},j}+Z_j\beta _j )S_TS_C\}}\\&\qquad \times E\left\{ \lambda _0(t)\exp (\varvec{\alpha }^{\mathrm{T}}\mathbf{Z})S_TS_C\right\} \mathrm{d}t. \end{aligned}$$

By the definition of \((\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})\), we have \(\mathbf{u}_j(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})=\mathbf{0}_{q+1}\), so \(u_{j,k}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})=0\) for any \(k\in {\mathcal {C}}\cup \{j\}\). Hence \(\mathbf{v}_{j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})=u_{j,j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})- \sum _{k\in {\mathcal {C}}}b_k u_{j,k}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})=0\) and \(F_{2j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})=F_{1j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0}) =E[Cov^{*}\{Z_{j},P(\delta =1|\mathbf{Z})|\mathbf{Z}_{{\mathcal {C}}}\}]\). When \(\alpha _j=0\), \(E[Cov^{*}\{Z_{j},P(\delta =1|\mathbf{Z})|\mathbf{Z}_{{\mathcal {C}}}\}]=0\), and thus \(F_{2j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})=0\). Because \(F_{2j}(\varvec{\beta }_{{\mathcal {C}},0},0)=0\), we also have \(\mathbf{v}_{j}(\varvec{\beta }_{{\mathcal {C}},0},0)=E[Cov^{*}\{Z_{j},P(\delta =1|\mathbf{Z})|\mathbf{Z}_{{\mathcal {C}}}\}] -F_{2j}(\varvec{\beta }_{{\mathcal {C}},0},0)=0\). By the uniqueness of the solution of \(\mathbf{v}_{j}(\varvec{\beta }_{{\mathcal {C}}},\beta )=0\), we have \(\beta _j^{0}=0\).

When \(\alpha _j\ne 0\), by condition C6, we have \(|F_{2j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})| =|E[Cov^{*}\{Z_{j},P(\delta =1|\mathbf{Z})|\mathbf{Z}_{{\mathcal {C}}}\}]|\ge c_1n^{-\kappa }\). This implies that \(F_{2j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})\) and \(E[Cov^{*}\{Z_{j},P(\delta =1|\mathbf{Z})|\mathbf{Z}_{{\mathcal {C}}}\}]\) are both nonzero and, being equal, have the same sign. Specifically, \(P(\delta =1|\mathbf{Z})\) is the probability that the event occurs, and \(S_TS_C=P(X>t|\mathbf{Z})\) is the probability of being at risk at time t. For any t, we have

$$\begin{aligned} \frac{\partial P(\delta =1|\mathbf{Z})}{\partial Z_j}\times \frac{\partial P(X>t|\mathbf{Z})}{\partial Z_j} \le 0. \end{aligned}$$

By Lemma 8, \(Cov^*\{Z_j,P(\delta =1|\mathbf{Z})|\mathbf{Z}_{{\mathcal {C}}}\}\) and \(Cov^*(Z_j,S_TS_C|\mathbf{Z}_{{\mathcal {C}}})\) have opposite signs unless they are zero. This further implies that

$$\begin{aligned} F_{2j}(\varvec{\beta }_{{\mathcal {C}},0},0)&= \int _0^{\tau }\frac{E\{\exp (\mathbf{Z}_{{\mathcal {C}}}^{\mathrm{T}}\varvec{\beta }_{{\mathcal {C}},0})Cov^{*}(Z_j,S _TS_C|\mathbf{Z}_{{\mathcal {C}}})\}}{E\{\exp (\mathbf{Z}_{{\mathcal {C}}}^{\mathrm{T}}\varvec{\beta }_{{\mathcal {C}},0})S_TS_C\}}\\&\times E\left\{ \lambda _0(t)\exp (\varvec{\alpha }^{\mathrm{T}}\mathbf{Z})S_TS_C\right\} \mathrm{d}t, \end{aligned}$$

and \(E[Cov^{*}\{Z_{j},P(\delta =1|\mathbf{Z})|\mathbf{Z}_{{\mathcal {C}}}\}]\) have opposite signs unless they are equal to zero. Hence \(F_{2j}(\varvec{\beta }_{{\mathcal {C}},0},0)\ne F_{2j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})\), and therefore \(\beta _j^{0}\ne 0\). \(\square \)

1.2.2 Proof of Lemma 2

Proof

By Lemma 1, for any \(j\in {\mathcal {A}}_{-{\mathcal {C}}}\), we have \(\beta _j^{0}\ne 0\). By a first-order Taylor expansion (the mean value theorem), there exists \(\widetilde{\beta }_j\) between 0 and \(\beta _j^{0}\) such that

$$\begin{aligned} |\mathbf{v}_{j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},0)|= |\mathbf{v}_{j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})-\mathbf{v}_{j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},0)| =\left| \frac{\partial \mathbf{v}_{j} }{\partial \beta _j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\widetilde{\beta }_j)\right| |\beta _j^{0}|. \end{aligned}$$

By the proof of Lemma 1, \(\mathbf{v}_{j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})=E[Cov^{*}\{Z_{j},P(\delta =1|\mathbf{Z})|\mathbf{Z}_{{\mathcal {C}}}\}] -F_{2j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0}).\) Given \(\varvec{\beta }_{{\mathcal {C}},j}^{0}\), consider \(F_{2j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j)\) as a function of \(\beta _j\). Then

$$\begin{aligned} \frac{\partial {F_{2j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j)}}{\partial \beta _j}= & {} \int _0^{\tau }H_{j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j,t)E\left\{ \lambda _0(t)\exp (\varvec{\alpha }^{\mathrm{T}}\mathbf{Z})S_TS_C\right\} \mathrm{d}t\\= & {} E\left\{ \int _0^{\tau }H_{j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j,t)S_C \mathrm{d}F_T(t|\mathbf{Z})\right\} , \end{aligned}$$

where

$$\begin{aligned}&H_{j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j,t)\\&\quad =\frac{E[Z_j\{Z_j-E^*(Z_j|\mathbf{Z}_{\mathcal {C}})\}\exp (\mathbf{Z}_{{\mathcal {C}}}^{\mathrm{T}}\varvec{\beta }_{{\mathcal {C}},j}^{0}+Z_j\beta _j )S_TS_C]}{E\{\exp (\mathbf{Z}_{{\mathcal {C}}}^{\mathrm{T}}\varvec{\beta }_{{\mathcal {C}},j}^{0}+Z_j\beta _j )S_TS_C\}}\\&\qquad -\frac{E[\{Z_j-E^*(Z_j|\mathbf{Z}_{\mathcal {C}})\}\exp (\mathbf{Z}_{{\mathcal {C}}}^{\mathrm{T}}\varvec{\beta }_{{\mathcal {C}},j}^{0}+Z_j\beta _j )S_TS_C]E\{Z_j\exp (\mathbf{Z}_{{\mathcal {C}}}^{\mathrm{T}}\varvec{\beta }_{{\mathcal {C}},j}^{0}+Z_j\beta _j )S_TS_C\}}{[E\{\exp (\mathbf{Z}_{{\mathcal {C}}}^{\mathrm{T}}\varvec{\beta }_{{\mathcal {C}},j}^{0}+Z_j\beta _j )S_TS_C\}]^2}. \end{aligned}$$

By condition C3, \(|Z_j|\le L_0\), then \(\sup _{\beta _j}|H_{j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j,t)| \le 2L_0^2\). So

$$\begin{aligned} \left| \frac{\partial \mathbf{v}_{j} }{\partial \beta _j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\widetilde{\beta }_j)\right| \le \sup _{\beta _j}\left| \frac{\partial {F_{2j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j)}}{\partial \beta _j}\right| \le 2L_0^2\left| E[E\{S_C(T)|\mathbf{Z}\}]\right| \le 2L_0^2. \end{aligned}$$

By the proof of Lemma 1, \(F_{2j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},0)\) and \(E[Cov^{*}\{Z_{j},P(\delta =1|\mathbf{Z})|\mathbf{Z}_{{\mathcal {C}}}\}]\) have opposite signs. Combining this with condition C6 gives

$$\begin{aligned} |\mathbf{v}_{j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},0)|=|E[Cov^{*}\{Z_{j},P(\delta =1|\mathbf{Z})|\mathbf{Z}_{{\mathcal {C}}}\}]| +|F_{2j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},0)|\ge c_1n^{-\kappa }. \end{aligned}$$

So

$$\begin{aligned} \left| \beta _j^{0}\right| =\left| \frac{\partial \mathbf{v}_{j} }{\partial \beta _j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\widetilde{\beta }_j)\right| ^{-1}\left| \mathbf{v}_{j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},0)\right| \ge (2L_0^2)^{-1}c_1n^{-\kappa }. \end{aligned}$$

Taking \(c_2=0.5L_0^{-2}c_1\), we have

$$\begin{aligned} \min _{j\in {\mathcal {A}}_{-{\mathcal {C}}}}|\beta _j^{0}|\ge c_2n^{-\kappa }, \end{aligned}$$

which completes the proof. \(\square \)

1.2.3 Proof of Lemma 3

Proof

Denote \(\bar{\mathbf{U}}_j(\varvec{\beta }_{{\mathcal {C}},j},\beta _j)=n^{-1}\mathbf{U}_j(\varvec{\beta }_{{\mathcal {C}},j},\beta _j)\). By the definition of \((\widehat{\varvec{\beta }}_{{\mathcal {C}},j},\widehat{\beta }_j)^{\mathrm{T}}\), we have

$$\begin{aligned} \left\| \bar{\mathbf{U}}_j(\widehat{\varvec{\beta }}_{{\mathcal {C}},j},\widehat{\beta }_j)- \bar{\mathbf{U}}_j({\varvec{\beta }}_{{\mathcal {C}},j}^{0},{\beta }_j^{0})\right\| = \left\| \bar{\mathbf{U}}_j({\varvec{\beta }}_{{\mathcal {C}},j}^{0},{\beta }_j^{0})\right\| . \end{aligned}$$

For any \(j\notin {\mathcal {C}}\) and \(k\in {\mathcal {C}}\cup \{j\}\), an argument similar to that of Lin and Wei (1989), together with lemmas 4 and 5, yields

$$\begin{aligned} \bar{\mathbf{U}}_j(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})=n^{-1}\sum _{i=1}^n\mathbf{W}_{i,j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})+o_p(1), \end{aligned}$$

where \(\mathbf{W}_{i,j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})\) \((i=1,\ldots , n)\) are independent, \(E\{\mathbf{W}_{i,j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})\}=\mathbf{0}\) and \(\mathbf{W}_{i,j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})= [{W}_{i,j,k}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0}), k\in {\mathcal {C}}\cup \{j\}]^{\mathrm{T}}\) with

$$\begin{aligned}&{W}_{i,j,k}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})\\&\quad =\int _{0}^{\tau }\Big [ Z_{ik}-\frac{E\{Z_{ik} \exp (\varvec{\beta }_{{\mathcal {C}},j}^{0}\mathbf{Z}_{i,{\mathcal {C}}}+ \beta _j^{0} Z_{ij})S_TS_C\}}{E\{ \exp (\varvec{\beta }_{{\mathcal {C}},j}^{0}\mathbf{Z}_{i,{\mathcal {C}}}+ \beta _j^{0} Z_{ij})S_TS_C\}}\Big ]\mathrm{d}N_i(t)\\&\qquad -\int _{0}^{\tau }\frac{Y_i(t)\exp (\varvec{\beta }_{{\mathcal {C}},j}^{0}\mathbf{Z}_{i,{\mathcal {C}}}+ \beta _j^{0} Z_{ij})}{E\{ \exp (\varvec{\beta }_{{\mathcal {C}},j}^{0}\mathbf{Z}_{i,{\mathcal {C}}}+ \beta _j^{0} Z_{ij})S_TS_C\}} \Big [ Z_{ik}\\&\qquad -\frac{E\{Z_{ik} \exp (\varvec{\beta }_{{\mathcal {C}},j}^{0}\mathbf{Z}_{i,{\mathcal {C}}}+ \beta _j^{0} Z_{ij})S_TS_C\}}{E\{ \exp (\varvec{\beta }_{{\mathcal {C}},j}^{0}\mathbf{Z}_{i,{\mathcal {C}}}+ \beta _j^{0} Z_{ij})S_TS_C\}}\Big ] E\{\mathrm{d}N_i(t)\}. \end{aligned}$$

Letting \(E_n\) denote the empirical measure, we can write

$$\begin{aligned} \bar{\mathbf{U}}_j(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})=E_n[\mathbf{W}_{i,j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})]+o_p(1). \end{aligned}$$

For any given \(i\), \(j\) and \(k\), conditions C1, C3 and C5 imply that there exists a constant \(L_2\) such that \(|{W}_{i,j,k}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})|\le L_2\). Since \(E[{W}_{i,j,k}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})]=0\), we have \(Var[{W}_{i,j,k}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})]= E[|{W}_{i,j,k}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})|^2]\le L_2^2\). By lemma 6, for any \(t>0\), \(j\notin {\mathcal {C}}\) and \(k\in {\mathcal {C}}\cup \{j\}\), we have

$$\begin{aligned} P\left( \left| E_n({W}_{i,j,k}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0}))\right| >\frac{t}{n}\right) \le 2\exp \left( -\frac{1}{2}\frac{t^2}{nL_2^2+L_2t/3}\right) . \end{aligned}$$

By the Bonferroni inequality, we have

$$\begin{aligned} P\left( \Vert E_n(\mathbf{W}_{i,j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0}))\Vert _2>\frac{t(q+1)}{n}\right) \le 2(q+1)\exp \left( -\frac{1}{2}\frac{t^2}{nL_2^2+L_2t/3}\right) . \end{aligned}$$
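The two tail bounds above have the form of Bernstein's inequality for bounded, mean-zero summands, followed by a union bound over the coordinates \(k\in {\mathcal {C}}\cup \{j\}\). As an informal numerical illustration, and not as part of the proof, the following Python sketch compares the scalar bound with a Monte Carlo frequency. The uniform distribution of the summands, the values \(n=200\), \(L_2=1\), \(t=30\), and the function names are assumptions chosen only for this example. Note that \(|E_n(W_{i,j,k})|>t/n\) is the same event as \(|\sum _{i}W_{i,j,k}|>t\).

```python
import math
import random


def bernstein_bound(t, n, bound):
    """Right-hand side 2 * exp(-(1/2) * t^2 / (n * bound^2 + bound * t / 3))."""
    return 2.0 * math.exp(-0.5 * t * t / (n * bound * bound + bound * t / 3.0))


def empirical_tail(t, n, bound, reps, rng):
    """Monte Carlo estimate of P(|sum_i W_i| > t).

    Assumption for illustration: W_i ~ Uniform(-bound, bound), which is
    mean-zero, bounded by `bound`, and has variance bound^2 / 3 <= bound^2.
    """
    hits = 0
    for _ in range(reps):
        total = sum(rng.uniform(-bound, bound) for _ in range(n))
        if abs(total) > t:
            hits += 1
    return hits / reps


rng = random.Random(0)  # fixed seed so the run is reproducible
n, L2, t = 200, 1.0, 30.0  # hypothetical values, not taken from the paper
bound_value = bernstein_bound(t, n, L2)
freq = empirical_tail(t, n, L2, reps=2000, rng=rng)
print(f"empirical tail = {freq:.4f}, Bernstein-type bound = {bound_value:.4f}")
```

The bound is typically loose for a specific distribution such as the uniform one used here; what matters for the proof is only that it holds uniformly over all bounded, mean-zero summands.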

As \(\Vert \bar{\mathbf{U}}_j(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})-E_n[\mathbf{W}_{i,j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})]\Vert _2=o_p(1)\), for any \(\epsilon _1>0\) and \(\epsilon _2>0\), there exists \(N_1\) such that for any \(n>N_1\), we have

$$\begin{aligned} P\left( \left\| \bar{\mathbf{U}}_j(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})-E_n[\mathbf{W}_{i,j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})]\right\| _2>Lc_2\epsilon _1/2\right) <\epsilon _2. \end{aligned}$$

Take \(t=\frac{c_2Ln^{1-\kappa }}{2(q+1)}>0\), so that \(\frac{t(q+1)}{n}=\frac{c_2Ln^{-\kappa }}{2}\). By the triangle inequality and the Bonferroni inequality, we have

$$\begin{aligned}&P\left( \left\| \bar{\mathbf{U}}_j(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})\right\| _2>Lc_2(n^{-\kappa }+\epsilon _1)/2\right) \\&\le P\left( \left\| E_n\{\mathbf{W}_{i,j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})\}\right\| _2>Lc_2n^{-\kappa }/2\right) \\&+ P\left( \left\| \bar{\mathbf{U}}_j(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})-E_n\{\mathbf{W}_{i,j}(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})\}\right\| _2>Lc_2\epsilon _1/2\right) \\&\le 2(q+1)\exp \left( -\frac{1}{2}\frac{c_2^2L^2n^{2-2\kappa }/4(q+1)^2}{nL_2^2+L_2c_2Ln^{1-\kappa }/6(q+1)}\right) +\epsilon _2. \end{aligned}$$

Take \(N=\max \{(L_2/3)^{1/\kappa },N_1\}\). Then for any \(n>N\) we have \(n^{-\kappa }<3/L_2\), and therefore

$$\begin{aligned} P\left( \Vert \bar{\mathbf{U}}_j(\varvec{\beta }_{{\mathcal {C}},j}^{0},\beta _j^{0})\Vert _2>Lc_2(n^{-\kappa }+\epsilon _1)/2\right) \le 2(q+1)\exp \left( -c_3n^{1-2\kappa }\right) +\epsilon _2, \end{aligned}$$

where \(c_3=\frac{c_2^2L^2}{8L_2^2(q+1)^2+4c_2L(q+1)}\). By condition C7, we have

$$\begin{aligned}&P\left( \left| \widehat{\beta }_j-\beta _j^{0}\right|>c_2(n^{-\kappa }+\epsilon _1)/2\right) \\&\le P\left( \left\| (\widehat{\varvec{\beta }}_{{\mathcal {C}},j},\widehat{\beta }_j)^{\mathrm{T}}-({\varvec{\beta }}_{{\mathcal {C}},j}^{0},{\beta }_j^{0})^{\mathrm{T}}\right\| _2>c_2(n^{-\kappa }+\epsilon _1)/2\right) \\&\le P\left( \left\| \bar{\mathbf{U}}_j(\widehat{\varvec{\beta }}_{{\mathcal {C}},j},\widehat{\beta }_j)-\bar{\mathbf{U}}_j({\varvec{\beta }}_{{\mathcal {C}},j}^{0},{\beta }_j^{0})\right\| _2>Lc_2(n^{-\kappa }+\epsilon _1)/2\right) \\&= P\left( \left\| \bar{\mathbf{U}}_j({\varvec{\beta }}_{{\mathcal {C}},j}^{0},{\beta }_j^{0})\right\| _2>Lc_2(n^{-\kappa }+\epsilon _1)/2\right) \\&\le 2(q+1)\exp \left( -c_3n^{1-2\kappa }\right) +\epsilon _2. \end{aligned}$$

Then we have

$$\begin{aligned} P\left( \max _{j\in {\mathcal {A}}_{-{\mathcal {C}}}}\left| \widehat{\beta }_j-\beta _j^{0}\right| >c_2(n^{-\kappa }+\epsilon _1)/2\right) \le 2a(q+1)\exp (-c_3n^{1-2\kappa })+a\epsilon _2, \end{aligned}$$

where \(a=|{\mathcal {A}}_{-{\mathcal {C}}}|=\sum _{j\notin {\mathcal {C}}}I(\alpha _j\ne 0)\) is the size of \({\mathcal {A}}_{-{\mathcal {C}}}\). \(\square \)
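For completeness, the constant \(c_3\) used in the proof above can be recovered as follows. This is only a worked check of the arithmetic under the stated choices \(t=c_2Ln^{1-\kappa }/\{2(q+1)\}\) and \(n>N\), so that \(n^{-\kappa }<3/L_2\); it is not an additional result.

$$\begin{aligned} nL_2^2+\frac{L_2t}{3}&=nL_2^2+\frac{L_2c_2Ln^{1-\kappa }}{6(q+1)}<n\left\{ L_2^2+\frac{c_2L}{2(q+1)}\right\} ,\\ \frac{1}{2}\frac{t^2}{nL_2^2+L_2t/3}&>\frac{c_2^2L^2n^{2-2\kappa }}{8(q+1)^2}\cdot \frac{1}{n\{L_2^2+c_2L/(2(q+1))\}}=\frac{c_2^2L^2}{8L_2^2(q+1)^2+4c_2L(q+1)}n^{1-2\kappa }=c_3n^{1-2\kappa }. \end{aligned}$$

Since the exponent enters with a negative sign, the exponential term is bounded above by \(\exp (-c_3n^{1-2\kappa })\).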

1.2.4 Proof of Theorem 1

Proof

By the definition of \(\widehat{{\mathcal {A}}}_{-{\mathcal {C}}}\) and condition C7, there exists a positive constant \(c_4\) such that

$$\begin{aligned} P\left( {\mathcal {A}}_{-{\mathcal {C}}}\subseteq \widehat{{\mathcal {A}}}_{-{\mathcal {C}}}\right) =P\left( \min _{j\in {\mathcal {A}}_{-{\mathcal {C}}}}|\widehat{\beta }_j|/\widehat{\sigma }_j\ge \gamma \right) \ge 1-P\left( \min _{j\in {\mathcal {A}}_{-{\mathcal {C}}}}|\widehat{\beta }_j|<n^{-1/2}c_4\gamma \right) . \end{aligned}$$

By lemma 2, for any \(j\in {\mathcal {A}}_{-{\mathcal {C}}}\), we have \(|\beta _j^{0}-\widehat{\beta }_j|\ge |\beta _j^{0}|-|\widehat{\beta }_j|\ge c_2n^{-\kappa }-|\widehat{\beta }_j|\). Suppose that \(\min _{j\in {\mathcal {A}}_{-{\mathcal {C}}}}|\widehat{\beta }_j|<n^{-1/2}c_4\gamma \); then \(\max _{j\in {\mathcal {A}}_{-{\mathcal {C}}}}|\beta _j^{0}-\widehat{\beta }_j|\ge c_2n^{-\kappa }-n^{-1/2}c_4\gamma \). If \(\gamma <c_2(n^{-\kappa }-\epsilon _1)n^{1/2}/(2c_4)\), we obtain

$$\begin{aligned} P\left( \min _{j\in {\mathcal {A}}_{-{\mathcal {C}}}}\left| \widehat{\beta }_j\right|<n^{-1/2}c_4\gamma \right) \le P\left( \max _{j\in {\mathcal {A}}_{-{\mathcal {C}}}}\left| \beta _j^{0}-\widehat{\beta }_j\right| \ge c_2(n^{-\kappa }+\epsilon _1)/2\right) . \end{aligned}$$
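To see why the condition on \(\gamma \) suffices for the comparison of the two events in the display above, note that \(\gamma <c_2(n^{-\kappa }-\epsilon _1)n^{1/2}/(2c_4)\) is equivalent to \(n^{-1/2}c_4\gamma <c_2(n^{-\kappa }-\epsilon _1)/2\). A short worked check, using only quantities already defined, is

$$\begin{aligned} c_2n^{-\kappa }-n^{-1/2}c_4\gamma >c_2n^{-\kappa }-\frac{c_2(n^{-\kappa }-\epsilon _1)}{2}=\frac{c_2(n^{-\kappa }+\epsilon _1)}{2}. \end{aligned}$$

Hence \(\min _{j\in {\mathcal {A}}_{-{\mathcal {C}}}}|\widehat{\beta }_j|<n^{-1/2}c_4\gamma \) implies \(\max _{j\in {\mathcal {A}}_{-{\mathcal {C}}}}|\beta _j^{0}-\widehat{\beta }_j|>c_2(n^{-\kappa }+\epsilon _1)/2\), which is the event bounded in lemma 3.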

Then \(P({\mathcal {A}}_{-{\mathcal {C}}}\subseteq \widehat{{\mathcal {A}}}_{-{\mathcal {C}}}) \ge 1-2a(q+1)\exp (-c_3n^{1-2\kappa })-a\epsilon _2\). Letting \(n\rightarrow \infty \), for any \(\epsilon _2>0\) we have \(\lim _{n\rightarrow \infty }P({\mathcal {A}}_{-{\mathcal {C}}}\subseteq \widehat{{\mathcal {A}}}_{-{\mathcal {C}}}) \ge 1-a\epsilon _2\), where the right-hand side of the inequality no longer depends on n. Letting \(\epsilon _2\rightarrow 0\), we obtain \(\lim _{n\rightarrow \infty }P({\mathcal {A}}_{-{\mathcal {C}}}\subseteq \widehat{{\mathcal {A}}}_{-{\mathcal {C}}})=1\). \(\square \)
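The limit above rests on the term \(2a(q+1)\exp (-c_3n^{1-2\kappa })\) vanishing as \(n\) grows, which requires \(\kappa <1/2\). The following Python sketch simply evaluates this term for a few sample sizes. All numerical values of \(c_2\), \(L\), \(L_2\), \(q\), \(a\) and \(\kappa \) are hypothetical, chosen only for illustration, \(a\) is held fixed as an assumption of the example, and the function names are not from the paper.

```python
import math


def c3_constant(c2, L, L2, q):
    """c_3 = c_2^2 L^2 / (8 L_2^2 (q+1)^2 + 4 c_2 L (q+1)), as in the proof of lemma 3."""
    return (c2 * L) ** 2 / (8 * L2 ** 2 * (q + 1) ** 2 + 4 * c2 * L * (q + 1))


def exponential_term(n, a, q, c3, kappa):
    """The term 2 a (q+1) exp(-c_3 n^(1 - 2 kappa)) from the bound in theorem 1."""
    return 2 * a * (q + 1) * math.exp(-c3 * n ** (1 - 2 * kappa))


# Hypothetical constants, for illustration only.
c3 = c3_constant(c2=1.0, L=1.0, L2=1.0, q=2)
sample_sizes = (10 ** 3, 10 ** 4, 10 ** 5)
terms = [exponential_term(n, a=5, q=2, c3=c3, kappa=0.1) for n in sample_sizes]

for n, value in zip(sample_sizes, terms):
    print(f"n = {n:>6d}: 2a(q+1)exp(-c3 n^(1-2k)) = {value:.3e}")
```

For small \(n\) the term can exceed one, in which case the bound is uninformative; the statement of the theorem is asymptotic.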

1.2.5 Proof of Theorem 2

Proof

For any \(j\in {\mathcal {A}}_{-{\mathcal {C}}}\), we have \(\alpha _j\ne 0\). From lemma 1, we know that \(|\beta _j^{0}|> 0\). Similarly, we have \(|\beta _j^{0}|=0\) if \(j\notin {\mathcal {A}}_{-{\mathcal {C}}}\). As \(\widehat{\beta }_j\) is a consistent estimator of \(\beta _j^{0}\) and \( M_{{\mathcal {C}},j}=|\widehat{\beta }_j|/\widehat{\sigma }_j\), we conclude that \(P(\max _{j\notin {\mathcal {A}}_{-{\mathcal {C}}}}M_{{\mathcal {C}},j}<\min _{j\in {\mathcal {A}}_{-{\mathcal {C}}}}M_{{\mathcal {C}},j})\rightarrow 1\) as \(n\rightarrow \infty \), which completes the proof of theorem 2. \(\square \)
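The two theorems concern the statistic \(M_{{\mathcal {C}},j}=|\widehat{\beta }_j|/\widehat{\sigma }_j\): theorem 1 thresholds it at \(\gamma \), and theorem 2 ranks covariates by it. A minimal Python sketch of these two operations is given below. It assumes the estimates \(\widehat{\beta }_j\) and \(\widehat{\sigma }_j\) (as defined in the main text) have already been computed for each candidate \(j\notin {\mathcal {C}}\); the function name and all numbers are hypothetical and do not come from the paper's data.

```python
def conditional_screen(beta_hat, sigma_hat, gamma):
    """Compute M_j = |beta_hat_j| / sigma_hat_j and keep indices with M_j >= gamma.

    Indices refer to positions in the list of candidate covariates outside
    the conditioning set; the inputs are assumed to be precomputed.
    """
    utilities = [abs(b) / s for b, s in zip(beta_hat, sigma_hat)]
    selected = [j for j, m in enumerate(utilities) if m >= gamma]
    return selected, utilities


# Hypothetical estimates for five candidate covariates.
beta_hat = [0.80, 0.02, -0.65, 0.05, -0.01]
sigma_hat = [0.10, 0.10, 0.12, 0.09, 0.11]

selected, utilities = conditional_screen(beta_hat, sigma_hat, gamma=2.0)

# Ranking by M_j, largest first, as used in the ranking statement of theorem 2.
ranking = sorted(range(len(utilities)), key=lambda j: utilities[j], reverse=True)

print("selected indices:", selected)
print("ranking by M_j:", ranking)
```

In practice the threshold \(\gamma \) is often replaced by keeping a fixed number of top-ranked covariates, which uses the same ranking.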

Cite this article

Zhang, J., Zhou, H., Liu, Y. et al. Conditional screening for ultrahigh-dimensional survival data in case-cohort studies. Lifetime Data Anal 27, 632–661 (2021). https://doi.org/10.1007/s10985-021-09531-7
