Analysis of the time-varying Cox model for the cause-specific hazard functions with missing causes

Lifetime Data Analysis

Abstract

This paper studies the Cox model with time-varying coefficients for cause-specific hazard functions when the causes of failure are subject to missingness. Inverse probability weighted and augmented inverse probability weighted estimators are investigated. The latter is a two-stage estimator that builds directly on the inverse probability weighted estimator and models the available auxiliary variables to improve efficiency. The asymptotic properties of the two estimators are investigated. Hypothesis testing procedures are developed to test the null hypotheses that the covariate effects are zero and that the covariate effects are constant. We conduct simulation studies to examine the finite sample properties of the proposed estimation and hypothesis testing procedures under various settings of the auxiliary variables and the percentages of the failure causes that are missing. These simulation results demonstrate that the augmented inverse probability weighted estimators are more efficient than the inverse probability weighted estimators and that the proposed testing procedures have satisfactory size and power. The proposed methods are illustrated using the Mashi clinical trial data to investigate the effect of randomization to formula-feeding versus breastfeeding plus extended infant zidovudine prophylaxis on death due to mother-to-child HIV transmission in Botswana.

References

  • Aalen OO, Johansen S (1978) An empirical transition matrix for non-homogeneous Markov chains based on censored observations. Scand J Stat 5:141–150

  • Cai Z, Sun Y (2003) Local linear estimation for time-dependent coefficients in Cox’s regression models. Scand J Stat 30:93–111

  • Clemens JD, Sack DA, Harris JR et al (1990) Field trial of oral cholera vaccines in Bangladesh: results from three-year follow-up. Lancet 335:270–273

  • Efromovich S (2010) Dimension reduction and adaptation in conditional density estimation. J Am Stat Assoc 105:761–774

  • Fan J, Gijbels I (1996) Local polynomial modelling and its applications: monographs on statistics and applied probability 66, 1st edn. Chapman and Hall/CRC, New York

  • Gao G, Tsiatis AA (2005) Semiparametric estimators for the regression coefficients in the linear transformation competing risks model with missing cause of failure. Biometrika 92:875–891

  • Gilbert P, McKeague I, Sun Y (2008) The 2-sample problem for failure rates depending on a continuous mark: an application to vaccine efficacy. Biostatistics 9(2):263–276

  • Gilbert P, Sun Y (2015) Inferences on relative failure rates in stratified mark-specific proportional hazards models with missing marks, with application to human immunodeficiency virus vaccine efficacy trials. J R Stat Soc Ser C (Appl Stat) 64(1):49–73

  • Goetghebeur E, Ryan L (1995) Analysis of competing risks survival data when some failure types are missing. Biometrika 82(4):821–833

  • Hall P, Racine JS, Li Q (2004) Cross-validation and the estimation of conditional probability densities. J Am Stat Assoc 99:1015–1026

  • Horvitz D, Thompson D (1952) A generalization of sampling without replacement from a finite universe. J Am Stat Assoc 47:663–685

  • Hyun S, Lee J, Sun Y (2012) Proportional hazards model for competing risks data with missing cause of failure. J Stat Plann Inference 142:1767–1779

  • Izbicki R, Lee AB (2016) Nonparametric conditional density estimation in a high-dimensional regression setting. J Comput Gr Stat 25(4):1297–1316

  • Lin DY, Wei LJ, Ying Z (1993) Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika 80:557–572

  • Liu L, Nevo D, Nishihara R, Cao Y, Song M, Twombly T, Chan A, Giovannucci E, VanderWeele T, Wang M, Ogino S (2018) Utility of inverse probability weighting in molecular pathological epidemiology. Eur J Epidemiol 33(4):381–392

  • Lu W, Liang Y (2008) Analysis of competing risks data with missing cause of failure under additive hazards model. Stat Sin 19:219–234

  • Lu K, Tsiatis A (2001) Multiple imputation methods for estimating regression coefficients in the competing risks model with missing cause of failure. Biometrics 57(4):1191–1197

  • Lu K, Tsiatis A (2005) Comparison between two partial likelihood approaches for the competing risks model with missing cause of failure. Lifetime Data Anal 11:29–40

  • Martinussen T, Scheike TH, Skovgaard IM (2002) Efficient estimation of fixed and time-varying covariates effects in multiplicative intensity models. Scand J Stat 29:59–77

  • Martinussen T, Scheike T (2006) Dynamic regression models for survival data. Springer, New York

  • Murphy SA, Sen PK (1991) Time-dependent coefficients in a Cox-type regression model. Stoch Process Appl 39:153–180

  • Nevo D, Nishihara R, Ogino S, Wang M (2018) The competing risks Cox model with auxiliary case covariates under weaker missing-at-random cause of failure. Lifetime Data Anal 24:425–442

  • Rice JA, Silverman BW (1991) Estimating the mean and covariance structure nonparametrically when the data are curves. J R Stat Soc Ser B 53:233–243

  • Robins J, Rotnitzky A, Zhao L (1994) Estimation of regression coefficients when some regressors are not always observed. J Am Stat Assoc 89:846–866

  • Rubin D (1976) Inference and missing data. Biometrika 63:581–592

  • Scharfstein DO, Rotnitzky A, Robins JM (1999) Adjusting for nonignorable drop-out using semiparametric nonresponse models: rejoinder. J Am Stat Assoc 94:1135–1146

  • Sun Y, Gilbert PB (2012) Estimation of stratified mark-specific proportional hazards models with missing marks. Scand J Stat 39:34–52

  • Sun Y, Hyun S, Gilbert PB (2008) Testing and estimation of time-varying cause-specific hazard ratios with covariate adjustment. Biometrics 64:1070–1079

  • Sun Y, Qian X, Shou Q, Gilbert P (2017) Analysis of two-phase sampling data with semiparametric additive hazards models. Lifetime Data Anal 23:377–399

  • Sun Y, Sundaram R, Zhao Y (2009) Empirical likelihood inference for the Cox model with time-dependent coefficients via local partial likelihood. Scand J Stat 36:444–462

  • Sun Y, Wang H, Gilbert PB (2012) Quantile regression for competing risks data with missing cause of failure. Stat Sin 22:703–728

  • Sun Y, Wu H (2005) Semiparametric time-varying coefficients regression model for longitudinal data. Scand J Stat 32:21–47

  • Thior I, Lockman S, Smeaton LM, Shapiro RL, Wester C, Heymann SJ, Gilbert PB, Stevens L, Peter T, Kim S, van Widenfelt E, Moffat C, Ndase P, Arimi P, Kebaabetswe P, Mazonde P, Makhema J, McIntosh K, Novitsky V, Lee TH, Marlink R, Lagakos S, Essex M, the Mashi Study Team (2006) Breastfeeding plus infant zidovudine prophylaxis for 6 months vs formula feeding plus infant zidovudine for 1 month to reduce mother-to-child HIV transmission in Botswana: a randomized trial: the Mashi study. J Am Med Assoc 296:794–805

  • Tian L, Zucker D, Wei LJ (2005) On the Cox model with time-varying regression coefficients. J Am Stat Assoc 100:172–183

  • van der Vaart AW (1998) Asymptotic statistics. Cambridge University Press, Cambridge

  • Zucker DM, Karr AF (1990) Nonparametric survival analysis with time-dependent covariate effects: a penalized partial likelihood approach. Ann Stat 18:329–353

Acknowledgements

We thank the reviewers for their constructive comments, which have improved the content and presentation of the paper. The authors also thank the Mashi study team (led by Dr. Max Essex) and the Mashi study participants. Research reported in this publication was partially supported by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under Award Number R37 AI054165. The research of Yanqing Sun was also partially supported by National Science Foundation grants DMS-1513072 and DMS-1915829. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Author information

Corresponding author

Correspondence to Yanqing Sun.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 341 KB)

Appendix

This Appendix introduces the notation and states the conditions for the asymptotic results presented in Theorems 1–5.

Let \(\mathcal {F}_t\) be the right-continuous filtration generated by the data processes \(\{N_{ik}(s), Y_i(s), Z_i(s); i=1,\dots ,n, k=1,2,\dots ,L, 0 \le s \le t\}\). Assume \(P(dN_{ik}(t)=1|\mathcal {F}_{t^-})=P(dN_{ik}(t)=1|Y_i(t), Z_i(t))=Y_i(t)\lambda _{k}(t|Z_i(t))dt\). It follows that \(M_{ik}(t) = N_{ik}(t) -\int _0^t Y_i(u)\lambda _k(u|Z_i(u))du\), \(i=1,\dots ,n, k=1,2,\dots ,L\), are multivariate orthogonal martingales with respect to \(\mathcal {F}_t\) (Aalen and Johansen 1978). To accommodate the additional information introduced by the missing data, we define the augmented filtration \(\mathcal {F}_t^*\) generated by the data processes \(\{N_{ik}(s), Y_i(s), Z_i(s), R_i, \delta _i A_i; i=1,\dots ,n, k=1,2,\dots ,L, 0 \le s \le t\}\). Let \(\lambda _{ik}^*(t)\,dt=P\{T_i\in [t,t+dt),V_i=k|X_i \ge t, Z_i(t), R_i, \delta _i A_i\}\). Then \(Y_i(t)\lambda _{ik}^*(t)\) is the intensity of \(N_{ik}(t)\) with respect to \(\mathcal {F}_t^*\), and \(M_{ik}^*(t) = N_{ik}(t) -\int _0^t Y_i(u)\lambda _{ik}^*(u)du\), \(i=1,\dots ,n, k=1,2,\dots ,L\), are multivariate orthogonal martingales with respect to \(\mathcal {F}_t^*\).

Let \(S^{(j)}(t,\beta _k)=n^{-1}\sum _{i=1}^{n} Y_i(t)\exp \left( \beta _k(t)^{{\textsf {T}}}Z_i(t)\right) Z_i(t)^{\otimes j}\) and \(S_I^{*(j)}(t,\beta _k,\psi )=n^{-1}\sum _{i=1}^n q_i Y_i(t)\exp \left( \beta _k(t)^{{\textsf {T}}}Z_i(t)\right) Z_i(t)^{\otimes j}\), for \(k=1,\dots ,L\) and \(j=0,1,2\). Let \(s^{(j)}(t,\beta _k)=ES^{(j)}(t,\beta _k)\) and \(s_I^{*(j)}(t,\beta _k,\psi )=ES_I^{*(j)}(t,\beta _k,\psi )\). If the model \(r(\zeta _i, A_i, \psi )\) is correctly specified, then \(s^{(j)}(t,\beta _k) = s_I^{*(j)}(t,\beta _k,\psi )\). Define \(\Sigma _k(t) = \big [s^{(2)}(t,\beta _k) - \big (s^{(1)}(t,\beta _k)\big )^{\otimes 2}/s^{(0)}(t,\beta _k)\big ] \lambda _{k0}(t)\) and \(\Sigma _k^*(t)=E\big [ \big (Z_i(t)-s^{(1)}(t,\beta _k)/s^{(0)}(t,\beta _k) \big )^{\otimes 2} R_i \pi ^{-2}(Q_i) Y_{i}(t) \lambda _{ik}^*(t) \big ]\).
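The empirical quantities \(S^{(j)}\) and \(S_I^{*(j)}\) defined above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it evaluates the three moments at a single time point, and the function name `s_moments` and its argument names are hypothetical. Passing the weights \(q_i\) gives the weighted version; omitting them gives the unweighted one.

```python
import numpy as np

def s_moments(at_risk, Z, beta_t, weights=None):
    """Empirical moments S^(0), S^(1), S^(2) at one fixed time t.

    at_risk : (n,) array of at-risk indicators Y_i(t)
    Z       : (n, p) array of covariate values Z_i(t)
    beta_t  : (p,) array, the coefficient vector beta_k(t)
    weights : optional (n,) array playing the role of q_i (assumed given)
    """
    at_risk = np.asarray(at_risk, dtype=float)
    Z = np.asarray(Z, dtype=float)
    beta_t = np.asarray(beta_t, dtype=float)
    n = at_risk.shape[0]
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)

    # q_i * Y_i(t) * exp(beta_k(t)^T Z_i(t)) for each subject
    risk = w * at_risk * np.exp(Z @ beta_t)

    S0 = risk.sum() / n                              # scalar (j = 0)
    S1 = (risk[:, None] * Z).sum(axis=0) / n         # p-vector (j = 1)
    S2 = np.einsum("i,ij,ik->jk", risk, Z, Z) / n    # p x p matrix (j = 2)
    return S0, S1, S2

# Usage: the bracketed term in Sigma_k(t), before multiplying by lambda_{k0}(t)
Z_demo = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Y_demo = np.array([1, 1, 0])
S0, S1, S2 = s_moments(Y_demo, Z_demo, np.zeros(2))
bracket = S2 - np.outer(S1, S1) / S0
```

The bracketed term is a weighted covariance of the covariates among subjects at risk, scaled by \(S^{(0)}\), which is why condition (C.3) can require it to be positive definite.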

Let \(S_i^{\psi }\) and \(I^{\psi }\) be the score vector and information matrix for \({{\widehat{\psi }}}\) under (4). Then,

$$\begin{aligned}&S_i^{\psi } = \frac{\delta _i(R_i-r(\zeta _i, A_i,\psi _0))}{r(\zeta _i, A_i,\psi _0)(1-r(\zeta _i, A_i,\psi _0))} \frac{\partial r(\zeta _i, A_i,\psi _0)}{\partial \psi },\\&I^{\psi } = E\Bigg \{ \frac{\delta _i}{r(\zeta _i, A_i,\psi _0)(1-r(\zeta _i, A_i,\psi _0))} \frac{\partial r(\zeta _i, A_i,\psi _0)}{\partial \psi } \Bigg (\frac{\partial r(\zeta _i, A_i,\psi _0)}{\partial \psi } \Bigg )^{{\textsf {T}}} \Bigg \}, \end{aligned}$$

and \({{\widehat{\psi }}} - \psi _0 = n^{-1}\sum _{i=1}^n(I^{\psi })^{-1}S_i^{\psi }+o_p(n^{-1/2})\), where \(\psi _0\) is the true value of \(\psi \). We also define the following notation:

$$\begin{aligned}&{\mathcal {A}}_{i}(t,\beta _k)=\int _0^{\tau } K_h(u-t) H^{-1}\Bigg (Z_i(u)-\frac{s^{(1)}(u,\beta _k)}{s^{(0)}(u,\beta _k)} \Bigg ) q_{i0}\ dM_{ik}(u),\\&{\mathcal {B}}_{i}(t,\beta _k)=\int _0^{\tau } K_h(u-t) H^{-1}\Bigg (Z_i(u)-\frac{s^{(1)}(u,\beta _k)}{s^{(0)}(u,\beta _k)} \Bigg )(1-q_{i0})\ E(dM_{ik}(u)|Q_{i}),\\&\mathcal {D}^n(t,\beta _k) = n^{-1}\sum _{i=1}^n \int _0^{\tau }K_h(u-t)\Bigg (Z_i(u)-\frac{s^{(1)}(u,\beta _k)}{s^{(0)}(u,\beta _k)}\Bigg ) \dfrac{-R_i}{(\pi (Q_i,\psi _0))^2}\\&\quad \Bigg (\dfrac{\partial \pi (Q_i,\psi _0)}{\partial \psi }\Bigg )^{\textsf {T}}dM_{ik}(u),\\&\mathcal {O}_i(t,\beta _k) = \mathcal {D}^n(t,\beta _k)(I^{\psi })^{-1}S_i^{\psi }. \end{aligned}$$
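To make the score \(S_i^{\psi }\) and information \(I^{\psi }\) above concrete, the sketch below works them out under one possible choice of \(r\). It assumes, purely for illustration, a logistic model \(r = 1/(1+\exp (-\psi ^{{\textsf {T}}}x_i))\), where \(x_i\) is a hypothetical regressor vector built from \((\zeta _i, A_i)\); the form of model (4) is not shown in this Appendix, so this choice is an assumption. Under it, \(\partial r/\partial \psi = r(1-r)x_i\), so the score reduces to \(\delta _i(R_i-r_i)x_i\) and the information to \(E\{\delta _i r_i(1-r_i)x_i x_i^{{\textsf {T}}}\}\).

```python
import numpy as np

def logistic_score_and_information(X, R, delta, psi):
    """Per-subject scores and empirical information under a logistic r.

    X     : (n, d) hypothetical regressors built from (zeta_i, A_i)
    R     : (n,) indicators that the cause of failure is observed
    delta : (n,) failure indicators (only failures contribute)
    psi   : (d,) parameter value at which to evaluate
    """
    X = np.asarray(X, dtype=float)
    R = np.asarray(R, dtype=float)
    delta = np.asarray(delta, dtype=float)
    psi = np.asarray(psi, dtype=float)

    r = 1.0 / (1.0 + np.exp(-(X @ psi)))   # assumed logistic form of r

    # With dr/dpsi = r(1-r)x, the general score simplifies to delta*(R - r)*x
    scores = (delta * (R - r))[:, None] * X

    # Sample average standing in for the expectation that defines I^psi
    info = np.einsum("i,ij,ik->jk", delta * r * (1.0 - r), X, X) / X.shape[0]
    return scores, info

# Usage: the influence terms (I^psi)^{-1} S_i^psi appearing in the expansion
X_demo = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
scores, info = logistic_score_and_information(
    X_demo, R=[1, 0, 0], delta=[1, 1, 0], psi=np.zeros(2)
)
influence = scores @ np.linalg.inv(info).T
```

Subjects with \(\delta _i = 0\) contribute a zero score, which matches the factor \(\delta _i\) in the definitions above.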

The theorems are proved under the following conditions:

(C.1) For \(k=1,\dots ,L\), \(\beta _k(t)\) has componentwise second derivatives on \([0,\tau ]\). The sample path of the covariate process \(Z_i(t)\) is left continuous and of bounded variation, and satisfies the moment condition \(E[||Z_i(t)||^4\exp (2M||Z_i(t)||)]<\infty \), where M is a constant such that \((t,\beta _k(t))\in [0,\tau ]\times [-M,M]^p\) for all t and \(||A||=\max _{k,l}|a_{kl}|\) for a matrix \(A=(a_{kl})\).

(C.2) The kernel function \(K(\cdot )\) is bounded and symmetric with bounded support \([-1,1]\). The bandwidth h satisfies \(nh^2\rightarrow \infty \) and \(nh^5\) is bounded as \(n \rightarrow \infty \).

(C.3) The matrix \(\Sigma _k(t)\) is positive definite for all \(t \in [0, \tau ]\).

(C.4) For \(k=1,\dots ,L\) and for \(j=0,1,2\), the functions \(s^{(j)}(t,\beta _k)\) and \(s_I^{*(j)}(t,\beta _k,\psi )\) are componentwise continuous on \(t\in [0,\tau ],\beta _k\in [-M,M]^p,\psi \in \varTheta _{\psi }\), where \(\varTheta _{\psi }\) is a compact set. Moreover, \(\sup _{t\in [0,\tau ],\beta _k\in [-M,M]^p}||S^{(j)}(t,\beta _k)-s^{(j)}(t,\beta _k)||=O_p(n^{-1/2})\) and \(\sup _{t\in [0,\tau ],\beta _k\in [-M,M]^p,\psi \in \varTheta _{\psi }}||S_I^{*(j)}(t,\beta _k,\psi )-s_I^{*(j)}(t,\beta _k,\psi )||=O_p(n^{-1/2})\).

(C.5) The function \(r(\zeta _i,A_i,\psi )\) is twice differentiable with respect to \(\psi \) on a compact set \(\varTheta _{\psi }\), \(r^\prime (\zeta _i,A_i,\psi )=\partial r(\zeta _i,A_i,\psi )/\partial \psi \) is uniformly bounded, and there is an \(\varepsilon > 0\) such that \(r(\zeta _i,A_i,\psi ) \ge \varepsilon \) for all i. The function \(f(A_i |k,T_i,Z_i,\varphi _k)\) is also twice differentiable with respect to \(\varphi _k\) on a compact set \(\varTheta _{\varphi _k}\) for \(k=1,\dots ,L\).
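Condition (C.2) can be checked against a concrete kernel and bandwidth. The sketch below is an illustration under stated assumptions rather than the choice made in the paper: it uses the Epanechnikov kernel, which is bounded and symmetric with support \([-1,1]\), takes the usual scaling convention \(K_h(u)=K(u/h)/h\), and uses the rate \(h=cn^{-1/5}\), for which \(nh^5=c^5\) is bounded and \(nh^2=c^2n^{3/5}\rightarrow \infty \). The function names and the constant `c` are hypothetical.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def scaled_kernel(u, h):
    """K_h(u) = K(u / h) / h, the assumed scaling convention."""
    return epanechnikov(np.asarray(u, dtype=float) / h) / h

def bandwidth(n, c=1.0):
    """Bandwidth h = c * n^(-1/5); c is an illustrative tuning constant."""
    return c * n ** (-1.0 / 5.0)

# Usage: the two rate quantities that (C.2) constrains, for growing n
for n in (100, 1_000, 10_000):
    h = bandwidth(n)
    print(n, h, n * h**2, n * h**5)
```

With this rate the third printed column grows with n while the fourth stays fixed at \(c^5\), which is the behaviour (C.2) asks for.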

Supplementary materials

The Web-based Supplementary Materials referenced in the manuscript are available with this paper at the journal’s online website.

Cite this article

Heng, F., Sun, Y., Hyun, S. et al. Analysis of the time-varying Cox model for the cause-specific hazard functions with missing causes. Lifetime Data Anal 26, 731–760 (2020). https://doi.org/10.1007/s10985-020-09497-y

  • DOI: https://doi.org/10.1007/s10985-020-09497-y
