Factor copula models for right-censored clustered survival data

Abstract

In this article we extend the factor copula model to right-censored event time data grouped in clusters. The new methodology allows clusters to vary in size from small to large, and the intracluster dependence to be flexibly modeled by any parametric family of bivariate copulas, thus encompassing a wide range of dependence structures. Incorporation of (possibly time-dependent) covariates in the margins is also supported. Three estimation procedures are proposed: one- and two-stage parametric methods, and a two-stage semiparametric method in which the marginal survival functions are estimated with a Cox proportional hazards model. We prove that the estimators are consistent and asymptotically normally distributed, and assess their finite sample behavior with simulation studies. Furthermore, we illustrate the proposed methods on a data set containing the time to first insemination after calving in dairy cattle clustered in herds of different sizes.


References

  • Andersen EW (2005) Two-stage estimation in copula models used in family studies. Lifetime Data Anal 11(3):333–350

  • Barthel N, Geerdens C, Killiches M, Janssen P, Czado C (2018) Vine copula based likelihood estimation of dependence patterns in multivariate event time data. Comput Stat Data Anal 117:109–127

  • Barthel N, Geerdens C, Czado C, Janssen P (2019) Dependence modeling for recurrent event times subject to right-censoring with D-vine copulas. Biometrics 75(2):439–451

  • Cox DR (1972) Regression models and life-tables. J Roy Stat Soc: Ser B (Methodol) 34(2):187–202

  • Cox DR, Hinkley D (1974) Theoretical statistics. Chapman and Hall, London

  • Duchateau L, Janssen P (2004) Penalized partial likelihood for frailties and smoothing splines in time to first insemination models for dairy cows. Biometrics 60(3):608–614

  • Duchateau L, Janssen P (2008) The frailty model. Springer, Berlin

  • Emura T, Nakatochi M, Murotani K, Rondeau V (2017) A joint frailty-copula model between tumour progression and death for meta-analysis. Stat Methods Med Res 26(6):2649–2666

  • Glidden DV (2000) A two-stage estimator of the dependence parameter for the Clayton-Oakes model. Lifetime Data Anal 6(2):141–156

  • Goethals K, Janssen P, Duchateau L (2008) Frailty models and copulas: similarities and differences. J Appl Stat 35(9):1071–1079

  • Hougaard P (2000) Analysis of multivariate survival data. Springer, New York

  • Joe H (2005) Asymptotic efficiency of the two-stage estimation method for copula-based models. J Multiv Anal 94(2):401–419

  • Joe H (2014) Dependence modeling with copulas. Chapman & Hall/CRC, Boca Raton

  • Krupskii P, Joe H (2013) Factor copula models for multivariate data. J Multiv Anal 120:85–101

  • Lehmann EL, Casella G (1998) Theory of point estimation. Springer, New York

  • Massonnet G, Janssen P, Duchateau L (2009) Modelling udder infection data using copula models for quadruples. J Stat Plan Inference 139(11):3865–3877

  • Nelsen RB (2007) An introduction to copulas. Springer Science & Business Media, New York

  • Othus M, Li Y (2010) A Gaussian copula model for multivariate survival data. Stat Biosci 2(2):154–179

  • Prenen L, Braekers R, Duchateau L (2017a) Extending the Archimedean copula methodology to model multivariate survival data grouped in clusters of variable size. J R Stat Soc Ser B 79(2):483–505

  • Prenen L, Braekers R, Duchateau L, De Troyer E (2017b) Sunclarco: survival analysis using copulas. R package version 1

  • R Core Team (2018) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria

  • Romeo JS, Meyer R, Gallardo DI (2018) Bayesian bivariate survival analysis using the power variance function copula. Lifetime Data Anal 24(2):355–383

  • Schneider S, Demarqui FN, Colosimo EA, Mayrink VD (2020) An approach to model clustered survival data with dependent censoring. Biom J 62(1):157–174

  • Shih JH, Louis TA (1995) Inferences on the association parameter in copula models for bivariate survival data. Biometrics 51:1384–1399

  • Sklar A (1959) Fonctions de répartition à n dimensions et leurs marges. Publ Inst Statist Univ Paris 8:229–231

  • Spiekerman CF, Lin D (1998) Marginal regression models for multivariate failure time data. J Am Stat Assoc 93(443):1164–1175

  • Therneau TM (2015) A package for survival analysis in S. R package version 2.38

  • Van der Vaart AW (2000) Asymptotic statistics, vol 3. Cambridge University Press, Cambridge

  • Wienke A (2011) Frailty models in survival analysis. Chapman and Hall, Boca Raton

  • Xu JJ (1996) Statistical modelling and inference for multivariate and longitudinal discrete response data. PhD thesis, University of British Columbia

  • Yan J (2007) Enjoy the joy of copulas: with a package copula. J Stat Softw 21(4):1–21

Acknowledgements

This research was supported by the Brazilian Federal Agency for Support and Evaluation of Graduate Education (CAPES - PDSE; Process 88881.190074/2018-01). We would also like to express our deep appreciation to the editor, the associate editor and the reviewers, whose comments have allowed us to greatly improve our work.

Author information


Correspondence to Eleanderson Campos.


Appendix

1.1 Proof of Proposition 1

Assuming that \({G_\theta }(x)\) is differentiable, with \({g_\theta }(x) = \frac{d}{{dx}}{G_\theta }(x)\), we substitute \(x = G_\theta ^{ - 1}(v)\), so that \(dx = \frac{{dv}}{{{g_\theta }(G_\theta ^{ - 1}(v))}} = \frac{{dv}}{{{g_\theta }(x)}}\). This allows us to rewrite Eq. (6) as

$$\begin{aligned} \int _0^1 {\prod \limits _{j = 1}^n {{{\mathrm {C}}_{\varvec{\cdot }|\,V}}({u_j}|v)\,dv} }=\int _0^1 {\prod \limits _{j = 1}^n {\exp \left\{ { - G_\theta ^{ - 1}(v){\varphi ^{-1} _\theta }({u_j})} \right\} dv} }. \end{aligned}$$
(13)

By subtracting \(\int _0^1 {\prod \limits _{j = 1}^n {\exp \left\{ { - G_\theta ^{ - 1}(v){\varphi ^{-1} _\theta }({u_j})} \right\} dv} }\) from both sides of Eq. (13), we get

$$\begin{aligned}&\int _0^1 {\prod \limits _{j = 1}^n {{{\mathrm {C}}_{\varvec{\cdot }|\,V}}({u_j}|v)\,dv} } - \int _0^1 {\prod \limits _{j = 1}^n {\exp \left\{ { - G_\theta ^{ - 1}(v){\varphi ^{-1} _\theta }({u_j})} \right\} dv} } = 0\\&\quad \Longleftrightarrow \int _0^1 {\left( {\prod \limits _{j = 1}^n {{{\mathrm {C}}_{\varvec{\cdot }|\,V}}({u_j}|v) - \prod \limits _{j = 1}^n {\exp \left\{ { - G_\theta ^{ - 1}(v){\varphi ^{-1} _\theta }({u_j})} \right\} } } } \right) } \,dv = 0. \end{aligned}$$

Hence, for the above condition to hold, it is sufficient that

$$\begin{aligned} \prod \limits _{j = 1}^n {{{\mathrm {C}}_{\varvec{\cdot }|\,V}}({u_j}|v) - \prod \limits _{j = 1}^n {\exp \left\{ { - G_\theta ^{ - 1}(v){\varphi ^{-1} _\theta }({u_j})} \right\} } } = 0, \end{aligned}$$

or, equivalently,

$$\begin{aligned} {{\mathrm {C}}_{\varvec{\cdot }|\,V}}({u_j}|v) = \exp \left\{ { - G_\theta ^{ - 1}(v){\varphi ^{-1} _\theta }({u_j})} \right\} , \end{aligned}$$

which is the conditional distribution derived from

$$\begin{aligned} {{\mathrm {C}}_{\varvec{\cdot }V}}({u_j},v)&= \int _0^v {\exp \left\{ { - G_\theta ^{ - 1}(t){\varphi ^{-1} _\theta }({u_j})} \right\} dt}\nonumber \\&=\int _0^{G_\theta ^{ - 1}(v)} {\exp \left\{ { - s{\varphi ^{-1} _\theta }({u_j})} \right\} d{G_\theta }(s)}, \end{aligned}$$

a bivariate function with the following properties:

1) \({\mathrm {C}}_{\varvec{\cdot } V}(u_j,v)\) is grounded

$$\begin{aligned}\begin{aligned} {{\mathrm {C}}_{\varvec{\cdot } V}}(0,v) = \int _0^v {\exp \left\{ { - G_\theta ^{ - 1}(t)\varphi _\theta ^{ - 1}(0)} \right\} dt = } \int _0^v {0\,dt = 0} \\ {{\mathrm {C}}_{\varvec{\cdot } V}}({u_j},0) = \int _0^0 {\exp \left\{ { - G_\theta ^{ - 1}(t)\varphi _\theta ^{ - 1}({u_j})} \right\} dt} = 0. \\ \end{aligned} \end{aligned}$$

2) \({\mathrm {C}}_{\varvec{\cdot } V}(u_j,v)\) has margins \(u_j\) and v

$$\begin{aligned} {{\mathrm {C}}_{\varvec{\cdot } V}}(1,v)&= \int _0^v {\exp \left\{ { - G_\theta ^{ - 1}(t)\varphi _\theta ^{ - 1}(1)} \right\} dt = } \int _0^v {dt} = v\\ {{\mathrm {C}}_{\varvec{\cdot } V}}({u_j},1)&= \int _0^{G_\theta ^{ - 1}(1)} {\exp \left\{ { - s\varphi _\theta ^{ - 1}({u_j})} \right\} d{G_\theta }(s)} = \int _0^{ + \infty } {\exp \left\{ { - s\varphi _\theta ^{ - 1}({u_j})} \right\} d{G_\theta }(s)}\\&= {\varphi _\theta }\left( {\varphi _\theta ^{ - 1}({u_j})} \right) = {u_j}. \end{aligned}$$

3) \({\mathrm {C}}_{\varvec{\cdot } V}(u_j,v)\) is 2-increasing, i.e., \(\forall u_j,u^{*}_j,v,v^* \in [0,1]\) with \(u_j\le u^{*}_j\), \(v\le v^*\), it follows that \({{\mathrm {C}}_{\varvec{\cdot } V}}(u_j^*,{v^*}) - {{\mathrm {C}}_{\varvec{\cdot } V}}({u_j},{v^*}) - {{\mathrm {C}}_{\varvec{\cdot } V}}(u_j^*,v) + {{\mathrm {C}}_{\varvec{\cdot } V}}({u_j},v) \geqslant 0\)

$$\begin{aligned}&\int _0^{{v^*}} {\left[ {\exp \left\{ { - G_\theta ^{ - 1}(t){\varphi _\theta ^{ - 1}}(u_j^*)} \right\} - \exp \left\{ { - G_\theta ^{ - 1}(t){\varphi _\theta ^{ - 1}}({u_j})} \right\} } \right] } \,dt\\&\qquad - \int _0^v {\left[ {\exp \left\{ { - G_\theta ^{ - 1}(t){\varphi _\theta ^{ - 1}}(u_j^*)} \right\} - \exp \left\{ { - G_\theta ^{ - 1}(t){\varphi _\theta ^{ - 1}}({u_j})} \right\} } \right] } \,dt \geqslant 0 \\&\quad \Longleftrightarrow \int _v^{{v^*}} {\left[ {\exp \left\{ { - G_\theta ^{ - 1}(t){\varphi _\theta ^{ - 1}}(u_j^*)} \right\} - \exp \left\{ { - G_\theta ^{ - 1}(t){\varphi _\theta ^{ - 1}}({u_j})} \right\} } \right] } \,dt \geqslant 0 . \end{aligned}$$

Therefore, \({\mathrm {C}}_{\varvec{\cdot } V}(u_j,v)\) is a bivariate copula, proving that the Archimedean copula model of Prenen et al. (2017a) is a subclass of our model. \(\square \)
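The correspondence can also be checked numerically. Below is a small R sketch (our own illustration, not part of the original proof): for the Clayton family, the generator \(\varphi _\theta (s)=(1+s)^{-1/\theta }\) is the Laplace transform of a Gamma\((1/\theta ,1)\) distribution \(G_\theta \), so the one-factor integral with \({{\mathrm {C}}_{\varvec{\cdot }|\,V}}(u|v) = \exp \{ -G_\theta ^{-1}(v)\varphi _\theta ^{-1}(u)\} \) should reproduce the closed-form Archimedean copula \((\sum _j u_j^{-\theta } - n + 1)^{-1/\theta }\).

theta <- 2
u <- c(0.3, 0.6, 0.8)                        # arbitrary uniform margins
phi_inv <- function(u) u^(-theta) - 1        # inverse Clayton generator
# one-factor form: integral over v of prod_j C_{.|V}(u_j | v),
# with G_theta^{-1}(v) = qgamma(v, shape = 1/theta)
integrand <- function(v) exp(-qgamma(v, shape = 1/theta) * sum(phi_inv(u)))
factor_form <- integrate(integrand, 0, 1)$value
closed_form <- (sum(u^(-theta)) - length(u) + 1)^(-1/theta)
c(factor_form, closed_form)                  # both are approximately 0.273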

1.2 Proof of Proposition 2

Let \({\tilde{T}}_{i0}=b_i\) and \(\epsilon _{ij}\sim N(0,1)\), such that \({\tilde{T}}_{ij} = \sqrt{\sigma }b_i + \sqrt{1-\sigma } \epsilon _{ij}\). Since \(b_i, \epsilon _{i1},\ldots ,\epsilon _{in_i}\) are independent standard normal random variables, their joint density is \(f_{b_i, \epsilon _{i1},\ldots ,\epsilon _{in_i}}=\phi (b_i)\phi (\epsilon _{i1})\ldots \phi (\epsilon _{in_i})\). Given the set of transformation functions

$$\begin{aligned} {{\tilde{T}}_{i0}}&= {g_{0}}(b_i,\epsilon _{i1},\ldots ,{\epsilon _{in_i}}) = b_i\\ {{\tilde{T}}_{i1}}&= {g_1}(b_i,\epsilon _{i1},\ldots ,{\epsilon _{in_i}}) = \sqrt{\sigma }b_i + \sqrt{1 - \sigma } {\epsilon _{i1}}\\&\,\,\,\, \vdots \\ {{\tilde{T}}_{in_i}}&= {g_{n_i}}(b_i,\epsilon _{i1},\ldots ,{\epsilon _{in_i}}) = \sqrt{\sigma }b_i + \sqrt{1 - \sigma } {\epsilon _{in_i}}, \end{aligned}$$

take the set of inverse transformation functions \(g_0^{ - 1},g_1^{ - 1},\ldots ,g^{ - 1}_{n_i}\)

$$\begin{aligned} b_i&= g_0^{-1}({\tilde{T}}_{i0},{\tilde{T}}_{i1},\ldots ,{\tilde{T}}_{in_i}) = {\tilde{T}}_{i0}\\ {\epsilon _{i1}}&= {g_1^{-1}}({\tilde{T}}_{i0},{\tilde{T}}_{i1},\ldots ,{\tilde{T}}_{in_i}) = \frac{{{{\tilde{T}}_{i1}} - \sqrt{\sigma }{{\tilde{T}}_{i0}}}}{{\sqrt{1 - \sigma } }} \\&\vdots \\ {\epsilon _{in_i}}&= g^{ - 1}_{n_i}({\tilde{T}}_{i0},{\tilde{T}}_{i1},\ldots ,{\tilde{T}}_{in_i}) = \frac{{{{\tilde{T}}_{in_i}} - \sqrt{\sigma }{{\tilde{T}}_{i0}}}}{{\sqrt{1 - \sigma } }}. \end{aligned}$$

It can be shown that the Jacobian of the transformations above is given by

$$\begin{aligned} J=\prod \limits _{j = 1}^{n_i} {\frac{1}{{\sqrt{1 - \sigma } }}}. \end{aligned}$$

Hence, the joint cumulative distribution function of \({\tilde{T}}_{i1},\ldots ,{\tilde{T}}_{in_i}\), after integrating out \({\tilde{T}}_{i0}=b_i\) and applying the change of variables \(v_i = \varPhi (b_i)\), is given by

$$\begin{aligned} {F_{{\tilde{T}}_{i1},\ldots ,{\tilde{T}}_{in_i}}}&:= {\mathrm {C}}(\varPhi ({{\tilde{t}}_{i1}}),\ldots ,\varPhi ({{\tilde{t}}_{in_i}})) = \int _{ - \infty }^{ + \infty } {\left[ {\prod \limits _{j = 1}^{n_i} {\varPhi \left( {\frac{{{{\tilde{t}}_{ij}} - \sqrt{\sigma }{b_i}}}{{\sqrt{1 - \sigma } }}} \right) } } \right] } \,\phi ({b_i})d{b_i}\\&=\int _{ 0 }^{ 1 } { {\prod \limits _{j = 1}^{n_i} {\varPhi \left( {\frac{{{{\tilde{t}}_{ij}} - \sqrt{\sigma }{\varPhi ^{-1}(v_i) }}}{{\sqrt{1 - \sigma } }}} \right) } } } \,d{v_i}, \end{aligned}$$

a one-factor copula model where the bivariate linking copulas are all Gaussian with correlation parameter \(\theta =\sqrt{\sigma }\).

On the original time scale:

$$\begin{aligned} S(t_{i1},\ldots ,t_{in_i}|{\mathbf {Z}}_{i1},\ldots ,{\mathbf {Z}}_{in_i})=\int _{ 0 }^{ 1 } { {\prod \limits _{j = 1}^{n_i} {\varPhi \left( {\frac{{{\varPhi ^{-1}(S(t_{ij}|{\mathbf {Z}}_{ij}))} - \sqrt{\sigma }{\varPhi ^{-1}(v_i) }}}{{\sqrt{1 - \sigma } }}} \right) } } } \,d{v_i}, \end{aligned}$$

where \(S(t_{ij}|{\mathbf {Z}}_{ij})\) is the marginal survival function associated with a Cox proportional hazards model. \(\square \)
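As a concrete illustration of Proposition 2 (our own sketch; the variance share \(\sigma \) and the marginal survival values below are hypothetical), the joint survival function can be evaluated by one-dimensional numerical integration in R:

sigma  <- 0.5                      # intracluster variance share; theta = sqrt(sigma)
S_marg <- c(0.9, 0.7, 0.5, 0.4)    # hypothetical marginal survival values S(t_ij | Z_ij)
# integrand of the one-factor Gaussian copula representation above
integrand <- function(v)
  sapply(v, function(vi)
    prod(pnorm((qnorm(S_marg) - sqrt(sigma) * qnorm(vi)) / sqrt(1 - sigma))))
S_joint <- integrate(integrand, 0, 1)$value
# cross-check: the same value as the equicorrelated multivariate normal CDF,
# mvtnorm::pmvnorm(upper = qnorm(S_marg),
#                  corr = matrix(sigma, 4, 4) + diag(1 - sigma, 4))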

Before continuing with the remaining proofs, we introduce some notation and definitions adapted from Prenen et al. (2017a) and Othus and Li (2010).

$$\begin{aligned} {Y_{ij}}(t)= & {} {I_{\left\{ {{X_{ij}} \ge t} \right\} }}\\ {\check{\varLambda }} (t)= & {} \int _0^t \frac{{d\sum \limits _{i = 1}^K {\sum \limits _{j = 1}^{{n_i}} {{\delta _{ij}}{I_{\left\{ {{X_{ij}} \le u} \right\} }}} } }}{{\sum \limits _{i = 1}^K {\sum \limits _{j = 1}^{{n_i}} {{Y_{ij}}(u)\exp \left\{ {\check{{\varvec{\beta }}}'{\mathbf{{Z}}_{ij}}(u)} \right\} } } }} \\= & {} \sum \limits _{i = 1}^K {\sum \limits _{j = 1}^{{n_i}} {\frac{{{\delta _{ij}}{I_{\left\{ {{X_{ij}} \le t} \right\} }}}}{{\sum \limits _{k = 1}^K {\sum \limits _{l = 1}^{{n_k}} {{I_{\left\{ {{X_{kl}} \ge {X_{ij}}} \right\} }}\exp \left\{ {\check{{\varvec{\beta }}}'{\mathbf{{Z}}_{kl}}(X_{ij})} \right\} } } }}} } ,\\ {H_{ij}}= & {} \exp \left[ { - \int _0^\tau {{Y_{ij}}(u)\exp \left\{ {{\varvec{\beta }} '{\mathbf{{Z}}_{ij}}(u)} \right\} d\varLambda (u)} } \right] ,\\ H_{ij}^0= & {} \exp \left[ { - \int _0^\tau {{Y_{ij}}(u)\exp \left\{ {{{{\varvec{\beta }'_0}}}{\mathbf{{Z}}_{ij}}(u)} \right\} d{\varLambda _0}(u)} } \right] ,\\ {{\check{H} }_{ij}}= & {} \exp \left[ { - \int _0^\tau {{Y_{ij}}(u)\exp \left\{ {\check{{\varvec{\beta }}}' {\mathbf{{Z}}_{ij}}(u)} \right\} d{\check{\varLambda }} (u)} } \right] ,\\ {H_{ij}}(t)= & {} \exp \left[ { - \int _0^\tau {{Y_{ij}}(u)\exp \left\{ {{\varvec{\beta }} '{\mathbf{{Z}}_{ij}}(u)} \right\} d\left\{ {\varLambda + t\left( {\varGamma - \varLambda } \right) } \right\} (u)} } \right] . \end{aligned}$$

Note that \({H_{ij}} = {H_{ij}}(0)\).
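In the two-stage semiparametric procedure, \(\check{{\varvec{\beta }}}\), \({\check{\varLambda }}\) and hence \({{\check{H} }_{ij}}\) come from a first-stage Cox fit. The following R sketch shows this stage with a time-fixed covariate for simplicity (our own illustration; the data frame dat with columns time, status and a single covariate z is hypothetical, while coxph and basehaz are from the survival package of Therneau 2015):

library(survival)
# Stage 1 (sketch): Cox PH fit with Breslow ties, Breslow-type baseline
# cumulative hazard Lambda-check, and plug-in values
# H-check_ij = exp(-exp(beta-check' Z_ij) * Lambda-check(X_ij))
fit <- coxph(Surv(time, status) ~ z, data = dat, ties = "breslow")
bh  <- basehaz(fit, centered = FALSE)        # Lambda-check at the event times
Lam <- stepfun(bh$time, c(0, bh$hazard))     # right-continuous step function
H_check <- exp(-exp(coef(fit) * dat$z) * Lam(dat$time))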

$$\begin{aligned} L\left( {\theta ,{\varvec{\beta }} ,\varLambda } \right)&= \prod \limits _{i = 1}^K {{L_i}(\theta ,\varvec{{\varvec{\beta }}} ,\varLambda )}\\&= \prod \limits _{i = 1}^K {\int _0^1 {\prod \limits _{j = 1}^{{n_i}} {{{\mathrm {c}}_{{\varvec{\cdot }}{V}}}{{({H_{ij}},{v_i})}^{{\delta _{ij}}}} \times } } \,{{{\mathrm {C}}}_{{\varvec{\cdot }}|{V}}}{{({H_{ij}}|{v_i})}^{1 - {\delta _{ij}}}}d{v_i}}\\&= \prod \limits _{i = 1}^K {\int _0^1 {\exp \left\{ {\sum \limits _{j = 1}^{{n_i}} {\log \left( {{{\mathrm {c}}_{{\varvec{\cdot }}{V}}}{{({H_{ij}},{v_i})}^{{\delta _{ij}}}} \times {{{\mathrm {C}}}_{{\varvec{\cdot }}|{V}}}{{({H_{ij}}|{v_i})}^{1 - {\delta _{ij}}}}} \right) } } \right\} d{v_i}} \,}\\&= \prod \limits _{i = 1}^K {\int _0^1 {\exp \left\{ {\sum \limits _{j = 1}^{{n_i}} {\log {\pmb {{{\mathrm {C}}}}}}(H_{ij},v_i;\theta ) } \right\} d{v_i}} }, \end{aligned}$$

where \(\pmb {{{\mathrm {C}}}}(H_{ij},v_i,\theta ) = {{{\mathrm {c}}_{{\varvec{\cdot }}{V}}}{{({H_{ij}},{v_i};\theta )}^{{\delta _{ij}}}} \times {{{\mathrm {C}}}_{{\varvec{\cdot }}|{V}}}{{({H_{ij}}|{v_i};\theta )}^{1 - {\delta _{ij}}}}}\).
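To fix ideas, a single likelihood contribution \(L_i\) can be coded directly when the linking copulas are Gaussian (a sketch under our own naming; the h-function and density below are the standard bivariate Gaussian copula expressions with parameter \(\theta \)):

# C_{.|V}(u | v): conditional distribution (h-function) of the Gaussian copula
h_gauss <- function(u, v, theta)
  pnorm((qnorm(u) - theta * qnorm(v)) / sqrt(1 - theta^2))
# c_{.V}(u, v): bivariate Gaussian copula density
c_gauss <- function(u, v, theta) {
  x <- qnorm(u); y <- qnorm(v)
  exp((2 * theta * x * y - theta^2 * (x^2 + y^2)) / (2 * (1 - theta^2))) /
    sqrt(1 - theta^2)
}
# log L_i: events (delta = 1) contribute the copula density, censored
# observations (delta = 0) the conditional distribution
loglik_cluster <- function(theta, H, delta) {
  integrand <- function(v) sapply(v, function(vi)
    prod(c_gauss(H, vi, theta)^delta * h_gauss(H, vi, theta)^(1 - delta)))
  log(integrate(integrand, 0, 1)$value)
}
loglik_cluster(0.6, H = c(0.8, 0.5, 0.3), delta = c(1, 0, 1))  # toy cluster

Summing loglik_cluster over clusters and maximizing in \(\theta \) (e.g. with optimize) would give the two-stage estimator \({\check{\theta }}\) once the \(H_{ij}\) are replaced by their stage-one estimates \({{\check{H} }_{ij}}\).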

$$\begin{aligned}&{l_K}(\theta ) = {K^{ - 1}}\log \left\{ {L\left( {\theta ,{\varvec{\beta }} ,\varLambda } \right) } \right\} ,\\&{l_{K0}}(\theta ) = {K^{ - 1}}\log \left\{ {L\left( {\theta ,{{\varvec{\beta }} _0},{\varLambda _0}} \right) } \right\} ,\\&{{\check{l} }_K}(\theta ) = {K^{ - 1}}\log \left\{ {L\left( {\theta ,\check{{\varvec{\beta }}} ,{\check{\varLambda }} } \right) } \right\} ,\\&{U_K}(\theta ) = \frac{{\partial {l_K}(\theta )}}{{\partial \theta }} = {K^{ - 1}}\frac{{\partial \log \left\{ {L\left( {\theta ,{\varvec{\beta }} ,\varLambda } \right) } \right\} }}{{\partial \theta }}\\&\quad = {K^{ - 1}}\sum \limits _{i = 1}^K {\frac{{\displaystyle \int _0^1 {\exp \left\{ {\sum \limits _{j = 1}^{{n_i}} {\log \pmb {{{\mathrm {C}}}}(H_{ij},v_i,\theta )} } \right\} \left\{ {\sum \limits _{j = 1}^{{n_i}} {\frac{\partial }{{\partial \theta }}\log \pmb {{{\mathrm {C}}}}(H_{ij},v_i,\theta )} } \right\} d{v_i}} }}{{\displaystyle \int _0^1 {\exp \left\{ {\sum \limits _{j = 1}^{{n_i}} {\log \pmb {{{\mathrm {C}}}}(H_{ij},v_i,\theta )} } \right\} d{v_i}} }}},\\&{U_{K0}}(\theta ) = \frac{{\partial {l_{K0}}(\theta )}}{{\partial \theta }} = {K^{ - 1}}\frac{{\partial \log \left\{ {L\left( {\theta ,{{\varvec{\beta }} _0},{\varLambda _0}} \right) } \right\} }}{{\partial \theta }},\\&{{{\check{U}} }_K}(\theta ) = \frac{{\partial {{{\check{l}} }_K}(\theta )}}{{\partial \theta }} = {K^{ - 1}}\frac{{\partial \log \left\{ {L\left( {\theta ,{\check{\varvec{\beta }}} ,{{\check{\varLambda }}} } \right) } \right\} }}{{\partial \theta }}. \end{aligned}$$

The following notation is adopted from Spiekerman and Lin (1998). Let \({\mathbf{{a}}^{ \otimes 0}} = 1,\,{\mathbf{{a}}^{ \otimes 1}} = \mathbf{{a}},\,{\mathbf{{a}}^{ \otimes 2}} = \mathbf{{a}}\mathbf{{a}}'\) and \(r=0,1,2\):

$$\begin{aligned} {\mathbf{{S}}^{(r)}}\left( {{\varvec{\beta }} ,t} \right)= & {} {K^{ - 1}}\sum \limits _{i = 1}^K {\sum \limits _{j = 1}^{{n_i}} {{Y_{ij}}(t)\exp \left\{ {{\varvec{\beta }} '{\mathbf{{Z}}_{ij}}(t)} \right\} {\mathbf{{Z}}_{ij}}{{(t)}^{ \otimes r}}} },\qquad \\ {\mathbf{{s}}^{(r)}}= & {} E\left[ {{\mathbf{{S}}^{(r)}}\left( {{\varvec{\beta }} ,t} \right) } \right] ,\\ \mathbf{{E}}({\varvec{\beta }} ,t)= & {} \frac{{{\mathbf{{S}}^{(1)}}\left( {{\varvec{\beta }} ,t} \right) }}{{{S^{(0)}}\left( {{\varvec{\beta }} ,t} \right) }},\\ \mathbf{{e}}({\varvec{\beta }} ,t)= & {} \frac{{{\mathbf{{s}}^{(1)}}\left( {{\varvec{\beta }} ,t} \right) }}{{{s^{(0)}}\left( {{\varvec{\beta }} ,t} \right) }},\\ \mathbf{{V}}({\varvec{\beta }} ,t)= & {} \frac{{{\mathbf{{S}}^{(2)}}\left( {{\varvec{\beta }} ,t} \right) }}{{{S^{(0)}}\left( {{\varvec{\beta }} ,t} \right) }} - \mathbf{{E}}{({\varvec{\beta }} ,t)^{ \otimes 2}},\\ \mathbf{{v}}({\varvec{\beta }} ,t)= & {} \frac{{{\mathbf{{s}}^{(2)}}\left( {{\varvec{\beta }} ,t} \right) }}{{{s^{(0)}}\left( {{\varvec{\beta }} ,t} \right) }} - \mathbf{{e}}{({\varvec{\beta }} ,t)^{ \otimes 2}}. \end{aligned}$$

Assume the following regularity conditions, where \(\tau >0\) is a constant denoting the last survival time of the uncensored subjects:

Condition 1

\({\varvec{\beta }}\) is in a compact subset of \({\mathbb {R}}^p\).

Condition 2

\(\varLambda (t) < \infty \).

Condition 3

\(\theta \in \nu \), where \(\nu \) is a compact subset of \(\varTheta \).

Condition 4

\(P({ {{\mathcal {C}}}_{ij}} \ge t,\,\forall t \in [0,\tau ])> {\delta _c} > 0\) for \(i=1,\ldots ,K\) and \(j=1,\ldots ,n_i\).

Condition 5

Let \({\mathbf{{Z}}_{ij}}(t) = \left\{ {{Z_{ij1}}(t),\ldots ,{Z_{ijp}}(t)} \right\} \). For \(i=1,\ldots ,K\), \(j=1,\ldots ,n_i\) and \(k=1,\ldots ,p\),

$$\begin{aligned} \left| {{Z_{ijk}}(0)} \right| + \int _0^\tau {\left| {d{Z_{ijk}}(t)} \right| } \le {B_Z} < \infty \qquad \text {almost surely for some constant }B_Z. \end{aligned}$$

Condition 6

\(E[\log \left\{ {{L_i}({\theta _1},{\varvec{\beta }} ,\varLambda )/{L_i}({\theta _2},{\varvec{\beta }} ,\varLambda )} \right\} ]\) exists for all \({\theta _1},{\theta _2} \in \varTheta \), \(i=1,\ldots ,K\).

Condition 7

\(\mathbf{{A}} = \int _0^\tau {\mathbf{{v}}({{\varvec{\beta }} _0},u){s^{(0)}}({{\varvec{\beta }} _0},u)d{\varLambda _0}(u)}\) is positive definite.

Condition 8

The bivariate copula \({\mathrm {C}}_{\varvec{\cdot } V}(u_{ij},v_i;\theta )\) is absolutely continuous.

1.3 Proof of Theorem 2

Since the consistency of \(\check{{\varvec{\beta }}}\) and \({\check{\varLambda }}\) was already proved in Spiekerman and Lin (1998), we only show the consistency of \({\check{\theta }}\). This is done by extending the results in Prenen et al. (2017a) and Othus and Li (2010).

Accounting for the fact that we use plug-in estimators for \({\varvec{\beta }}\) and \(\varLambda \), we proceed by taking a Taylor series expansion of the log-likelihood of \(\theta \) in the neighbourhood of \({\varvec{\beta }}\) and \(\varLambda \). Since \(\varLambda \) is an unspecified function, we need to include a functional expansion term, and the concept of Hadamard differentiability is suitable here. To use this approach, we must first verify that the log-likelihood \(l(\theta )\) is Hadamard differentiable with respect to \(\varLambda \). By condition 5, the term \({\int _0^\tau {{Y_{ij}}(u)\exp \left\{ {{\varvec{\beta }} '{\mathbf{{Z}}_{ij}}(u)} \right\} d\varLambda (u)} }\) in \(H_{ij}\) is Hadamard differentiable. Furthermore, by the chain rule for Hadamard derivatives (Van der Vaart 2000), we conclude that \(l(\theta )\) is Hadamard differentiable with respect to \(\varLambda \).

Let \({BV}[0,\tau ]\) denote the class of functions with bounded total variation on \([0,\tau ]\). The Hadamard derivative of \(l(\theta )\) with respect to \(\varLambda \) at \(\varGamma - \varLambda \in {{BV}}[0,\tau ]\) can be obtained by taking the derivative of \({K^{ - 1}}\log \left[ {L\left\{ {\theta ,{\varvec{\beta }} ,\varLambda + t\left( {\varGamma - \varLambda } \right) } \right\} } \right] \) with respect to t and then setting \(t=0\):

$$\begin{aligned}\frac{d}{{dt}}\left( {{K^{ - 1}}\log \left[ {L\left\{ {\theta ,{\varvec{\beta }} ,\varLambda + t\left( {\varGamma - \varLambda } \right) } \right\} } \right] } \right) \bigg |_{t=0} = \int _0^\tau {{\zeta _K}(\theta ,\varLambda )(u)d(\varGamma - \varLambda )(u)}, \end{aligned}$$

where \({\zeta _K}\left( {\theta ,\varLambda } \right) \left( u \right) \) is equal to

$$\begin{aligned}&{K^{ - 1}}\sum \limits _{i = 1}^K {\frac{{\displaystyle \int _0^1 {\exp \left\{ {\sum \limits _{j = 1}^{{n_i}} {\log \pmb {{{\mathrm {C}}}}(H_{ij},v_i,\theta )} } \right\} \left[ {\sum \limits _{j = 1}^{{n_i}} {\left\{ {\left( {\frac{\partial }{{\partial {H_{ij}}}}\log \pmb {{{\mathrm {C}}}}(H_{ij},v_i,\theta )} \right) D_{ij}^\varLambda } \right\} } } \right] d{v_i}} }}{{\displaystyle \int _0^1 {\exp \left\{ {\sum \limits _{j = 1}^{{n_i}} {\log \pmb {{{\mathrm {C}}}}(H_{ij},v_i,\theta )} } \right\} d{v_i}} }}}\nonumber \\&\quad ={K^{ - 1}}\sum \limits _{i = 1}^K \displaystyle \int _0^1 P(v_i|H_{i\varvec{\cdot }},\theta )\left[ {\sum \limits _{j = 1}^{{n_i}} {\left\{ {\left( {\frac{\partial }{{\partial {H_{ij}}}}\log {{\pmb {{\mathrm {C}}}}}(H_{ij},v_i,\theta )} \right) D_{ij}^\varLambda } \right\} } } \right] d{v_i}\nonumber \\&\quad = {K^{ - 1}}\sum \limits _{i = 1}^K \sum \limits _{j = 1}^{n_i} D_{ij}^\varLambda \, E\left[ \frac{\partial }{{\partial {H_{ij}}}}\log {{\pmb {{\mathrm {C}}}}}(H_{ij},v_i,\theta )\right] , \end{aligned}$$
$$\begin{aligned}D_{ij}^\varLambda = ( - {H_{ij}}){Y_{ij}}(u)\exp \left\{ {{\varvec{\beta }} '{\mathbf{{Z}}_{ij}}(u)} \right\} ,\end{aligned}$$

and

$$\begin{aligned} P(v_i|H_{i\varvec{\cdot }},\theta ) = \frac{\exp \left\{ {\sum \limits _{j = 1}^{{n_i}} {\log {\pmb {{\mathrm {C}}}}(H_{ij},v_i,\theta )} } \right\} }{\displaystyle \int _0^1 \exp \left\{ {\sum \limits _{j = 1}^{{n_i}} {\log {\pmb {{\mathrm {C}}}}(H_{ij},t,\theta )} } \right\} d{t}} \end{aligned}$$
(14)

is a probability density function of a random variable \(V_i\) assuming values in [0, 1]. Similarly, the derivative of \(l(\theta )\) with respect to \({\varvec{\beta }}\) is

$$\begin{aligned} \zeta _K(\theta ,{\varvec{\beta }} )&={K^{ - 1}}\sum \limits _{i = 1}^K \displaystyle \int _0^1 P(v_i|H_{i\varvec{\cdot }},\theta )\left[ {\sum \limits _{j = 1}^{{n_i}} {\left\{ {\left( {\frac{\partial }{{\partial {H_{ij}}}}\log {{\pmb {{\mathrm {C}}}}}(H_{ij},v_i,\theta )} \right) D_{ij}^{\varvec{\beta }} } \right\} } } \right] d{v_i} \\&={K^{ - 1}}\sum \limits _{i = 1}^K \sum \limits _{j = 1}^{n_i} D_{ij}^{\varvec{\beta }} \, E\left[ \frac{\partial }{{\partial {H_{ij}}}}\log {{\pmb {{\mathrm {C}}}}}(H_{ij},v_i,\theta )\right] , \end{aligned}$$

where

$$\begin{aligned}D_{ij}^{\varvec{\beta }} = ( - {H_{ij}})\int _0^\tau {{Y_{ij}}(u){\mathbf{{Z}}_{ij}}(u)\exp \left\{ {{\varvec{\beta }} '{\mathbf{{Z}}_{ij}}(u)} \right\} d\varLambda (u)}. \end{aligned}$$

Let \(\left\| {\, \cdot \,} \right\| \) denote the Euclidean norm and let \({\left\| {\, \cdot \,} \right\| _\infty }\) denote the supremum norm on \([0,\tau ]\). To prove the consistency of \({\check{\theta }}\), we need \({\left\| {{\zeta _K}(\theta ,\varLambda )} \right\| _\infty }\) and \(\left\| {{\zeta _K}(\theta ,{\varvec{\beta }} )} \right\| \) to be bounded. Note that, by the definition of \(H_{ij}\) and conditions 2 and 5, the terms \({\left\| {\, D_{ij}^\varLambda \,} \right\| _\infty }\) and \(\left\| {\, D_{ij}^{\varvec{\beta }} \,} \right\| \) are bounded. Therefore, in order to satisfy the boundedness condition of \({\left\| {{\zeta _K}(\theta ,\varLambda )} \right\| _\infty }\) and \(\left\| {{\zeta _K}(\theta ,{\varvec{\beta }} )} \right\| \), we shall require the expectations in their formulae to be finite.

We now continue with the proof by taking an expansion of \({{{\check{l}} }_K}(\theta )\) around \({{\varvec{\beta }} _0}\) and \({\varLambda _0}\), given by

$$\begin{aligned}{{{\check{l}} }_K}(\theta ) = {l_{K0}}(\theta ) + {\zeta _K}(\theta ,{{\varvec{\beta }} _0})({\check{\varvec{\beta }}} - {{\varvec{\beta }} _0}) + \int _0^\tau {{\zeta _K}(\theta ,{\varLambda _0})(t)d({{\check{\varLambda }}} - {\varLambda _0})(t)} + R.\end{aligned}$$

In a more explicit notation, this expansion reads

$$\begin{aligned} {l_{K,\theta }}({\check{\varvec{\beta }}} ,{{\check{\varLambda }}} )= & {} {l_{K,\theta }}({{\varvec{\beta }} _0},{\varLambda _0}) + \frac{\partial }{{\partial {\varvec{\beta }} }}{l_{K,\theta }}({{\varvec{\beta }} _0},{\varLambda _0})({\check{\varvec{\beta }}} - {{\varvec{\beta }} _0}) \\&+ \frac{\partial }{{\partial \varLambda }}{l_{K,\theta }}({{\varvec{\beta }} _0},{\varLambda _0})({{\check{\varLambda }}} - {\varLambda _0}) + R.\end{aligned}$$

The remainder term R is of order \({o_p}\left( {\max \left\{ {\left\| {{\check{\varvec{\beta }}} - {{\varvec{\beta }} _0}} \right\| ,{{\left\| {{{\check{\varLambda }}} - {\varLambda _0}} \right\| }_\infty }} \right\} } \right) \). This can be seen from the definition of Hadamard differentiability, since

$$\begin{aligned}&{\left\| {\frac{{{l_{K,\theta }}\left\{ {{\varvec{\beta }} ,{\varLambda _0} + t({{\check{\varLambda }}} - {\varLambda _0})} \right\} - {l_{K,\theta }}({\varvec{\beta }} ,{\varLambda _0} )}}{t} - \frac{\partial }{{\partial \varLambda }}{l_{K,\theta }}({{\varvec{\beta }}},{\varLambda _0})({{\check{\varLambda }}} - {\varLambda _0})} \right\| _\infty } \\&\quad \rightarrow 0,\quad \text {as}\,\,t \downarrow 0 \end{aligned}$$

uniformly in \({\check{\varLambda }}-\varLambda _0\) over all compact subsets of \({\mathbb {D}}\), the space of cumulative hazard functions. Since \(\check{{\varvec{\beta }}}\) is consistent and \({\check{\varLambda }}\) is uniformly consistent (Spiekerman and Lin 1998), \(R={o_p}(1)\). To prove that \({\check{\theta }}\) is consistent, we need to verify that the log-likelihood with the plug-in estimates converges uniformly to the expected value of the log-likelihood evaluated at the true values of \({\varvec{\beta }}\) and \(\varLambda \) (denoted \(l_{K0}(\theta )\)):

$$\begin{aligned} \mathop {\sup }\limits _{\theta \in \nu } \left| {{{{\check{l}} }_K}(\theta ) - E[{l_{K0}}(\theta )]} \right| = {o_P}(1). \end{aligned}$$
(15)

This can be shown as follows:

$$\begin{aligned} {{{\check{l}} }_K}(\theta ) - E[{l_{K0}}(\theta )]&= {l_{K0}}(\theta ) - E[{l_{K0}}(\theta )] + {\zeta _K}(\theta ,{{\varvec{\beta }} _0})({\check{\varvec{\beta }}} - {{\varvec{\beta }} _0})\\&\quad + \int _0^\tau {{\zeta _K}(\theta ,{\varLambda _0})(t)d({{\check{\varLambda }}} - {\varLambda _0})(t)} + R. \end{aligned}$$

By the law of large numbers, for fixed \(\theta \)

$$\begin{aligned} {l_{K0}}(\theta ) - E[{l_{K0}}(\theta )]\mathop \rightarrow \limits ^p 0. \end{aligned}$$
(16)

Since \(\left\| {{\zeta _K}(\theta ,{\varvec{\beta }} )} \right\| \) and \({\left\| {{\zeta _K}(\theta ,\varLambda )}(u) \right\| _\infty }\) are bounded, say \(\left\| {{\zeta _K}(\theta ,{\varvec{\beta }} )} \right\| \le M_1\) and \({\left\| {{\zeta _K}(\theta ,\varLambda )}(u) \right\| }_\infty \le M_2\), we have

$$\begin{aligned}&\mathop {\sup }\limits _{\theta \in \nu } \left| {{\zeta _K}(\theta ,{{\varvec{\beta }} _0})({\check{\varvec{\beta }}} - {{\varvec{\beta }} _0})} \right| \le {M_1}\left\| {{\check{\varvec{\beta }}} - {{\varvec{\beta }} _0}} \right\| ,\\&\mathop {\sup }\limits _{\theta \in \nu } \left| {\int _0^\tau {{\zeta _K}(\theta ,{\varLambda _0})(t)d({{\check{\varLambda }}} - {\varLambda _0})(t)} } \right| \le {M_2}{\left\| {{{\check{\varLambda }}} - {\varLambda _0}} \right\| _\infty }. \end{aligned}$$

For this reason,

$$\begin{aligned} \mathop {\sup }\limits _{\theta \in \nu } \left| {{{{\check{l}} }_K}(\theta ) - E[{l_{K0}}(\theta )]} \right|&\le \mathop {\sup }\limits _{\theta \in \nu } \left| {{l_{K0}}(\theta ) - E[{l_{K0}}(\theta )]} \right| + {M_1}\left\| {{\check{\varvec{\beta }}} - {{\varvec{\beta }} _0}} \right\| \\&\quad + {M_2}{\left\| {{{\check{\varLambda }}} - {\varLambda _0}} \right\| _\infty } + R. \end{aligned}$$

Using result (16), the consistency of \(\check{{\varvec{\beta }}}\) and the uniform consistency of \({\check{\varLambda }}\) and the fact that \(R={o_p}(1)\), we obtain

$$\begin{aligned}\mathop {\sup }\limits _{\theta \in \nu } \left| {{{{\check{l}} }_K}(\theta ) - E[{l_{K0}}(\theta )]} \right| = {o_p}(1).\end{aligned}$$

Finally, to verify that \({\check{\theta }}\) is consistent, we need to show that the expected log-likelihood is maximized at the true value, i.e., for \(\theta \ne \theta _0\),

$$\begin{aligned} E[{l_{K0}}(\theta )] - E[{l_{K0}}({\theta _0})] < 0. \end{aligned}$$
(17)

Since clusters are independent, the log-likelihood \(l_K(\theta )\) can be written as an average of independent and identically distributed terms:

$$\begin{aligned}{K^{ - 1}}\sum \limits _{i = 1}^K {\log \left\{ {{L_i}(\theta ,{\varvec{\beta }} ,\varLambda )} \right\} }, \end{aligned}$$

with

$$\begin{aligned} {L_i}&= {( - 1)^{{d_i}}}\frac{{{\partial ^{{d_i}}}}}{{{{\left( {\partial {x_{i1}}} \right) }^{{\delta _{i1}}}}\ldots \,\,{{\left( {\partial {x_{i{n_i}}}} \right) }^{{\delta _{i{n_i}}}}}}}{S}({x_{i1}},\ldots ,{x_{i{n_i}}})\\&= \int _0^1 \exp \left\{ \sum \limits _{j = 1}^{{n_i}} \log \left( { \mathrm {c}_{ \varvec{\cdot }V }}{{(\exp \{ - \varLambda ({x_{ij}})\} ,{v_i})}^{{\delta _{ij}}}} \right. \right. \\&\left. \left. \,\,\,\,\, \times {{{\mathrm {C}}}_{ \varvec{\cdot }|V }}{{(\exp \{ - \varLambda ({x_{ij}})\} |{v_i})}^{1 - {\delta _{ij}}}} \right) \right\} d{v_i}, \end{aligned}$$

where \(d_i = \sum \nolimits _{j=1}^{n_i}\delta _{ij}\) is the number of events in cluster i.

Take \(\theta \ne \theta _0 \). The law of large numbers, Jensen’s inequality and condition 6 imply that

$$\begin{aligned}&\mathop {\lim }\limits _{K \rightarrow \infty } {l_{K0}}(\theta ) - {l_{K0}}({\theta _0}) = E[{l_{K0}}(\theta )] - E[{l_{K0}}({\theta _0})]\\&\quad = E\left[ {{K^{ - 1}}\sum \limits _{i = 1}^K {\log \left\{ {{L_i}(\theta ,{{\varvec{\beta }} _0},{\varLambda _0})} \right\} } } \right] - E\left[ {{K^{ - 1}}\sum \limits _{i = 1}^K {\log \left\{ {{L_i}({\theta _0},{{\varvec{\beta }} _0},{\varLambda _0})} \right\} } } \right] \\&\quad = E\left[ {\log \left\{ {{L_i}(\theta ,{{\varvec{\beta }} _0},{\varLambda _0})} \right\} - \log \left\{ {{L_i}({\theta _0},{{\varvec{\beta }} _0},{\varLambda _0})} \right\} } \right] \\&\quad = E\left[ {\log \left\{ {{L_i}(\theta ,{{\varvec{\beta }} _0},{\varLambda _0})/{L_i}({\theta _0},{{\varvec{\beta }} _0},{\varLambda _0})} \right\} } \right] \\&\quad \le \log \left( {E\left[ {{L_i}(\theta ,{{\varvec{\beta }} _0},{\varLambda _0})/{L_i}({\theta _0},{{\varvec{\beta }} _0},{\varLambda _0})} \right] } \right) \\&\quad = \log (1) = 0. \end{aligned}$$

The inequality is strict for \(\theta \ne \theta _0\), since Jensen's inequality is strict unless the likelihood ratio is degenerate, which would contradict the identifiability of \(\theta \); this yields inequality (17).

Since \({\check{\theta }}\) maximizes \({{{\check{l}} }_K}(\theta )\), Eq. (15) implies that

$$\begin{aligned} 0&\le {{{\check{l}} }_K}\left( {{\check{\theta }} } \right) - {{{\check{l}} }_K}\left( {{\theta _0}} \right) = {{{\check{l}} }_K}\left( {{\check{\theta }} } \right) - E\left[ {{l_{K0}}\left( {{\theta _0}} \right) } \right] + \left\{ {E\left[ {{l_{K0}}\left( {{\theta _0}} \right) } \right] - {{{\check{l}} }_K}\left( {{\theta _0}} \right) } \right\} \\&= {{{\check{l}} }_K}\left( {{\check{\theta }} } \right) - E\left[ {{l_{K0}}\left( {{\theta _0}} \right) } \right] + {o_p}(1)\\&\implies E\left[ {{l_{K0}}\left( {{\theta _0}} \right) } \right] \le {{{\check{l}} }_K}\left( {{\check{\theta }} } \right) + {o_p}(1). \end{aligned}$$

Subtracting \(E\left[ {{l_{K0}}\left( {{\check{\theta }} } \right) } \right] \) from each side of the inequality we get

$$\begin{aligned}&E\left[ {{l_{K0}}\left( {{\theta _0}} \right) } \right] - E\left[ {{l_{K0}}\left( {{\check{\theta }} } \right) } \right] \le {{{\check{l}} }_K}\left( {{\check{\theta }} } \right) - E\left[ {{l_{K0}}\left( {{\check{\theta }} } \right) } \right] + {o_p}(1)\nonumber \\&\quad \le \mathop {\sup }\limits _{\theta \in \varTheta } \left| {{{{\check{l}} }_K}\left( \theta \right) - E\left[ {{l_{K0}}\left( \theta \right) } \right] } \right| + {o_p}(1) = {o_p}(1). \end{aligned}$$
(18)

Now take \(\theta \) such that \(|\theta -\theta _0|\ge \varepsilon \) for some fixed \(\varepsilon >0\). By inequality (17), there must be some \({\gamma _\varepsilon }>0\) such that, for all such \(\theta \),

$$\begin{aligned}E\left[ {{l_{K0}}\left( \theta \right) } \right] + {\gamma _\varepsilon } < E\left[ {{l_{K0}}\left( {{\theta _0}} \right) } \right] .\end{aligned}$$

It follows that

$$\begin{aligned}P(|{\check{\theta }} - {\theta _0}| \ge \varepsilon ) \le P\left\{ {E\left[ {{l_{K0}}\left( {{\check{\theta }} } \right) } \right] + {\gamma _\varepsilon } < E\left[ {{l_{K0}}\left( {{\theta _0}} \right) } \right] } \right\} .\end{aligned}$$

Equation (18) implies that

$$\begin{aligned}P\left\{ {E\left[ {{l_{K0}}\left( {{\check{\theta }} } \right) } \right] + {\gamma _\varepsilon } < E\left[ {{l_{K0}}\left( {{\theta _0}} \right) } \right] } \right\} \rightarrow 0 \qquad \text {as}\,\, K\rightarrow \infty . \end{aligned}$$

Therefore

$$\begin{aligned}P(|{\check{\theta }} - {\theta _0}| \ge \varepsilon ) \rightarrow 0 \qquad \text {as}\,\, K\rightarrow \infty . \end{aligned}$$

1.4 Proof of Theorem 3

Take a first order Taylor series expansion of \({{{\check{U}} }_K}({\check{\theta }} )\) around \(\theta _0\):

$$\begin{aligned}{{{\check{U}} }_K}({\check{\theta }} ) = {{{\check{U}} }_K}({\theta _0}) + \left( {{\check{\theta }} - {\theta _0}} \right) \frac{{\partial {{{\check{U}} }_K}}}{{\partial \theta }}\bigg |_{\theta =\theta ^*},\end{aligned}$$

where \(\theta ^*\) lies between \({\check{\theta }}\) and \(\theta _0\). It must be that \(\check{U}_K({\check{\theta }})=0\), since \({\check{\theta }}\) was taken to be the maximizer of \(L(\theta ,\check{{\varvec{\beta }}},{\check{\varLambda }})\). For this reason

$$\begin{aligned} \sqrt{K} \left( {{\check{\theta }} - {\theta _0}} \right) = \frac{{\sqrt{K} {{{\check{U}} }_K}\left( {{\theta _0}} \right) }}{{ - \partial {{{\check{U}} }_K}/\partial \theta \big |_{\theta =\theta ^*}}}. \end{aligned}$$
(19)

We already showed that \({\check{\theta }}\) consistently estimates \(\theta _0\), so the law of large numbers implies that

$$\begin{aligned}\frac{{\partial {{{\check{U}} }_K}}}{{\partial \theta }}\bigg |_{\theta =\theta ^*} \rightarrow W\left( {{\theta _0}} \right) = \mathop {\lim }\limits _{K \rightarrow \infty } \frac{{\partial {{{\check{U}} }_K}}}{{\partial \theta }}\bigg |_{\theta =\theta _0}.\end{aligned}$$

We shall show that the pseudoscore \(\check{U}_K({\theta }_0)\) in the numerator of Eq. (19) is asymptotically normally distributed. Hereto, we need a Taylor series expansion of \(\check{U}_K({\theta _0})\) in the neighbourhood of \({\varvec{\beta }}_0\) and \(\varLambda _0\). Because \(\varLambda _0\) is an unspecified function, we shall use the Hadamard derivative of \(U_K(\theta _0)\) with respect to \(\varLambda \) at \(\varGamma - \varLambda \in {{BV}}[0,\tau ]\):

$$\begin{aligned}\frac{d}{{dt}}\left( {{K^{ - 1}}\frac{{\partial \log \left[ {L\left\{ {\theta ,{\varvec{\beta }} ,\varLambda + t(\varGamma - \varLambda )} \right\} } \right] }}{{\partial \theta }}} \right) \bigg |_{t=0} = \int _0^\tau {{\xi _K}(\theta ,\varLambda )(u)d(\varGamma - \varLambda )(u)}, \end{aligned}$$

where \({\xi _K}(\theta ,\varLambda )(u)\) is equal to

$$\begin{aligned}&{K^{ - 1}}\sum \limits _{i = 1}^K \left[ \displaystyle \int _0^1 P(v_i|H_{i\varvec{\cdot }},\theta ) \left\{ {\sum \limits _{j = 1}^{{n_i}} {\frac{{{\partial ^2}\log {\pmb {{\mathrm {C}}}}(H_{ij},v_i,\theta )}}{{\partial \theta \partial {H_{ij}}}}D_{ij}^\varLambda } } + \sum \limits _{j = 1}^{{n_i}} \frac{\partial \log \pmb {{\mathrm {C}}}(H_{ij},v_i,\theta )}{\partial \theta }\right. \right. \\&\,\,\quad \left. \times \sum \limits _{k = 1}^{{n_i}} \frac{\partial \log \pmb {{\mathrm {C}}}(H_{ik},v_i,\theta )}{\partial H_{ik}}D_{ik}^\varLambda \right\} dv_i \\&\qquad -\displaystyle \int _0^1 P(v_i|H_{i\varvec{\cdot }},\theta )\left\{ \sum \limits _{j = 1}^{{n_i}} \frac{\partial \log \pmb {{\mathrm {C}}}(H_{ij},v_i,\theta )}{\partial \theta }\right\} dv_i\\&\,\,\quad \left. \times \displaystyle \int _0^1 P(v_i|H_{i\varvec{\cdot }},\theta )\left\{ \sum \limits _{k = 1}^{{n_i}} \frac{\partial \log \pmb {{\mathrm {C}}}(H_{ik},v_i,\theta )}{\partial H_{ik}}D_{ik}^{\varLambda }\right\} dv_i\right] \\&\quad ={K^{ - 1}}\sum \limits _{i = 1}^K\sum \limits _{j = 1}^{n_i}D_{ij}^{\varLambda }\left\{ E\left[ \frac{{{\partial ^2}\log {\pmb {{\mathrm {C}}}}(H_{ij},v_i,\theta )}}{{\partial \theta \partial {H_{ij}}}} \right] \right. \\&\,\,\quad \left. + \sum \limits _{k = 1}^{n_i}\mathrm {Cov}\left[ \frac{{{\partial }\log {\pmb {{\mathrm {C}}}}(H_{ik},v_i,\theta )}}{{\partial \theta }},\frac{{{\partial }\log {\pmb {{\mathrm {C}}}}(H_{ij},v_i,\theta )}}{{\partial H_{ij}}} \right] \right\} ,\\&D_{ij}^\varLambda = ( - {H_{ij}}){Y_{ij}}(u)\exp \left\{ {{\varvec{\beta }} '{\mathbf{{Z}}_{ij}}(u)} \right\} , \end{aligned}$$

and \(P(v_i|H_{i\varvec{\cdot }},\theta )\) is defined as in expression (14). The derivative of \(U_K(\theta _0)\) with respect to \({\varvec{\beta }}\) is given by the same expression as \({\xi _K}(\theta ,\varLambda )(u)\), with \(D_{ij}^\varLambda \) replaced by

$$\begin{aligned}D_{ij}^{\varvec{\beta }} = ( - {H_{ij}})\int _0^\tau {{Y_{ij}}(u){\mathbf{{Z}}_{ij}}(u)\exp \left\{ {{\varvec{\beta }} '{\mathbf{{Z}}_{ij}}(u)} \right\} d\varLambda (u)}. \end{aligned}$$

By the same arguments used to show the consistency of \({\check{\theta }}\), we also need \({\left\| {{\xi _K}(\theta ,\varLambda )} \right\| _\infty }\) and \(\left\| {{\xi _K}(\theta ,{\varvec{\beta }} )} \right\| \) to be bounded. For this reason, we shall require the expectation and covariance in their formulae to be finite. Hence, we proceed by taking a Taylor series expansion of \(\check{U}_K({\theta _0})\) in the neighbourhood of \({\varvec{\beta }}_0\) and \(\varLambda _0\) which gives

$$\begin{aligned} {{{\check{U}} }_K}({\theta _0})= & {} {U_{K0}}({\theta _0}) + {\xi _K}({\theta _0},{{\varvec{\beta }} _0})({\check{\varvec{\beta }}} - {{\varvec{\beta }} _0}) \\&+ \int _0^\tau {{\xi _K}({\theta _0},{\varLambda _0})(t)d\left\{ {{{\check{\varLambda }}} (t) - {\varLambda _0}(t)} \right\} } + {G_K}, \end{aligned}$$

where \({G_K}\) is the remainder term of the Taylor series. Since \({\check{\varLambda }}\) is \(\sqrt{K}\)-consistent, it can be shown that \({G_K}=o_p(K^{-1/2})\). Define the pointwise limit of \({{\xi _K}(\theta ,\varLambda )}(t)\) as \({{\xi }(\theta ,\varLambda )}(t)\) and denote \({{\xi }(\theta ,{\varvec{\beta }} )}=E[{{\xi _K}(\theta ,{\varvec{\beta }} )}]\). Since \({\left\| {{\xi _K}(\theta ,\varLambda )} \right\| _\infty }\) and \(\left\| {{\xi _K}(\theta ,{\varvec{\beta }} )} \right\| \) are bounded, \({\left\| {{\xi }(\theta ,\varLambda )} \right\| _\infty }\) and \(\left\| {{\xi }(\theta ,{\varvec{\beta }} )} \right\| \) are bounded as well. Therefore

$$\begin{aligned} \sqrt{K} {{{\check{U}} }_K}({\theta _0})&= \sqrt{K} \left[ {U_{K0}}({\theta _0}) + {\xi _K}({\theta _0},{{\varvec{\beta }} _0})({\check{\varvec{\beta }}} - {{\varvec{\beta }} _0}) \right. \\&\quad \left. + \int _0^\tau {{\xi _K}({\theta _0},{\varLambda _0})(t)d\left\{ {{{\check{\varLambda }}} (t) - {\varLambda _0}(t)} \right\} } \right] + {o_p}(1). \end{aligned}$$

By Spiekerman and Lin (1998)

$$\begin{aligned}\sqrt{K} ({\check{\varvec{\beta }}} - {{\varvec{\beta }} _0}) = {\mathbf{{A}}^{ - 1}}{K^{ - 1/2}}\sum \limits _{i = 1}^K {{\mathbf{{w}}_{i.}}} + {o_p}(1), \end{aligned}$$

where \({{\mathbf{{w}}_{i.}}}\) is the ith component of the score function for \({\varvec{\beta }}\) under the independence working assumption, evaluated at \({\varvec{\beta }}_0\):

$$\begin{aligned}{\mathbf{{w}}_{i.}} = \sum \limits _{j = 1}^{{n_i}} {\int _0^\tau {\left\{ {{\mathbf{{Z}}_{ij}}(u) - \mathbf{{E}}\left( {{{\varvec{\beta }} _0},u} \right) } \right\} d{M_{ij}}(u)} }, \end{aligned}$$

with

$$\begin{aligned}{M_{ij}}(t) = {\delta _{ij}}{I_{\left\{ {{X_{ij}} \le t} \right\} }} - \int _0^t {{Y_{ij}}(u)\exp \left\{ {{{{\varvec{\beta }} '}_0}{\mathbf{{Z}}_{ij}}(u)} \right\} d{\varLambda _0}(u)}. \end{aligned}$$

They also showed that

$$\begin{aligned}\sqrt{K} \left\{ {{{\check{\varLambda }}} (t) - {\varLambda _0}(t)} \right\} = {K^{ - 1/2}}\sum \limits _{i = 1}^K {{\varPsi _i}(t)} + {o_p}(1) \rightarrow {\mathcal {W}}(t), \end{aligned}$$

where \({\mathcal {W}}(t)\) is a zero mean Gaussian process with variance function

$$\begin{aligned}E\left[ {{\varPsi _i}{{(t)}^2}} \right] ,\end{aligned}$$

with

$$\begin{aligned}{\varPsi _i}(t) = \int _0^t {\frac{{d{M_{i.}}(u)}}{{{s^{(0)}}\left( {{{\varvec{\beta }} _0},u} \right) }} + {\mathbf{{h}}^T}(t){\mathbf{{A}}^{ - 1}}{\mathbf{{w}}_{i.}}} \end{aligned}$$

and

$$\begin{aligned}{} \mathbf{{h}}(t) = - \int _0^t {\mathbf{{e}}\left( {{{\varvec{\beta }} _0},u} \right) d{\varLambda _0}(u)}. \end{aligned}$$

Combining these results, we obtain

$$\begin{aligned}&\sqrt{K} \left[ {{U_{K0}}({\theta _0}) + {\xi _K}({\theta _0},{{\varvec{\beta }} _0})({\check{\varvec{\beta }}} - {{\varvec{\beta }} _0}) + \int _0^\tau {{\xi _K}({\theta _0},{\varLambda _0})(t)d\left\{ {{{\check{\varLambda }}} (t) - {\varLambda _0}(t)} \right\} } } \right] \\&\quad =\sqrt{K} \left[ {K^{ - 1}}\sum \limits _{i = 1}^K {{\phi _i}({\theta _0})} + {\xi _K}({\theta _0},{{\varvec{\beta }} _0}){K^{ - 1}}{\mathbf{{A}}^{ - 1}}\sum \limits _{i = 1}^K {{\mathbf{{w}}_{i.}}}\right. \\&\qquad \left. \,\,\,\,+ {K^{ - 1}}\int _0^\tau {{\xi _K}({\theta _0},{\varLambda _0})(t)d\left\{ {\sum \limits _{i = 1}^K {{\varPsi _i}(t)} } \right\} } \right] \\&\quad ={K^{ - 1/2}}\sum \limits _{i = 1}^K {\left[ {{\phi _i}({\theta _0}) + {\xi _K}({\theta _0},{{\varvec{\beta }} _0}){\mathbf{{A}}^{ - 1}}{\mathbf{{w}}_{i.}} + \int _0^\tau {{\xi _K}({\theta _0},{\varLambda _0})(t)d{\varPsi _i}(t)} } \right] } \\&\quad ={K^{ - 1/2}}\sum \limits _{i = 1}^K {{\varXi _i}}. \end{aligned}$$

Here \({\phi _i}({\theta _0})\) denotes the contribution of cluster i to the score \({U_{K0}}({\theta _0})\). The central limit theorem implies that \(\sqrt{K}\check{U}_K(\theta _0)\) converges to a normally distributed random variable with mean 0 and variance equal to the variance of \(\varXi \). Thus we have

$$\begin{aligned}\sqrt{K} \left( {{\check{\theta }} - {\theta _0}} \right) = \frac{{\sqrt{K} {{{\check{U}} }_K}({\theta _0})}}{{ - \partial {{{\check{U}} }_K}/\partial \theta \big |_{\theta =\theta ^*}}},\end{aligned}$$

where

$$\begin{aligned}\sqrt{K} {{{\check{U}} }_K}({\theta _0})\mathop \rightarrow \limits ^D N\left\{ {0,{\mathop {\mathrm {var}}} \left( \varXi \right) } \right\} \end{aligned}$$

and

$$\begin{aligned}\partial {{{\check{U}} }_K}/\partial \theta \big |_{\theta =\theta ^*}\mathop \rightarrow \limits ^P W\left( {{\theta _0}} \right) .\end{aligned}$$

By Slutsky’s theorem, \(\sqrt{K} \left( {{\check{\theta }} - {\theta _0}} \right) \) converges to a normal distribution with mean 0 and variance

$$\begin{aligned}{\mathop {\mathrm {var}}} \left( \varXi \right) /W{\left( {{\theta _0}} \right) ^2}. \end{aligned}$$

The variance of \(\varXi \) (note that \(\mathrm {var}[\varXi ]=E[\varXi ^2]\), since \(E[\varXi ]=0\)) can be estimated by \({K^{ - 1}}\sum \nolimits _{i = 1}^K {{{\check{\varXi }}_i ^2}}\), where \({\check{\varXi }}_i\) is obtained from \({\varXi _i} \) by replacing the parameter values by their estimates. \(W(\theta _0)\) can be estimated by minus the derivative of the pseudoscore function \(\check{U}_K(\theta )\) evaluated at \({\check{\theta }}\). Therefore, the standard error of \({\check{\theta }}\) is given by the square root of

$$\begin{aligned} \mathrm {var}({\check{\theta }})=\frac{K^{-2}\sum \nolimits _{i = 1}^K {{{\check{\varXi }}_i ^2}}}{\left( \partial {{{\check{U}} }_K}/\partial \theta \big |_{\theta ={\check{\theta }}}\right) ^2}. \end{aligned}$$
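The following R sketch mirrors this variance formula with finite differences (our own illustration). It assumes a user-supplied function cluster_ll(theta) returning the K-vector of per-cluster log-likelihood contributions, e.g. built from loglik_cluster above; note that the per-cluster scores used here approximate \({\check{\varXi }}_i\) only up to the correction terms involving \(\xi _K\), which the full estimator would add:

se_theta <- function(cluster_ll, theta_hat, eps = 1e-4) {
  K <- length(cluster_ll(theta_hat))
  # W(theta_0): minus the second derivative of l_K = K^{-1} sum_i log L_i
  W <- -(sum(cluster_ll(theta_hat + eps)) - 2 * sum(cluster_ll(theta_hat)) +
           sum(cluster_ll(theta_hat - eps))) / (K * eps^2)
  # central-difference per-cluster scores, standing in for Xi-check_i
  score_i <- (cluster_ll(theta_hat + eps) - cluster_ll(theta_hat - eps)) / (2 * eps)
  # var(theta-check) = K^{-2} sum_i Xi_i^2 / W^2 = mean(Xi_i^2) / (K * W^2)
  sqrt(mean(score_i^2) / (K * W^2))
}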


Cite this article

Campos, E., Braekers, R., de Souza, D.J. et al. Factor copula models for right-censored clustered survival data. Lifetime Data Anal 27, 499–535 (2021). https://doi.org/10.1007/s10985-021-09525-5
