Abstract
Cox’s proportional hazards regression model is the standard method for modelling censored life-time data with covariates. In its standard form, this method relies on a semiparametric proportional hazards structure, leaving the baseline unspecified. Naturally, specifying a parametric model also for the baseline hazard, leading to fully parametric Cox models, will be more efficient when the parametric model is correct, or close to correct. The aim of this paper is two-fold. (a) We compare parametric and semiparametric models in terms of their asymptotic relative efficiencies when estimating different quantities. We find that for some quantities the gain of restricting the model space is substantial, while it is negligible for others. (b) To deal with such selection in practice we develop certain focused and averaged focused information criteria (FIC and AFIC). These aim at selecting the most appropriate proportional hazards models for given purposes. Our methodology applies also to the simpler case without covariates, when comparing Kaplan–Meier and Nelson–Aalen estimators to parametric counterparts. Applications to real data are also provided, along with analyses of theoretical behavioural aspects of our methods.
Notes
Slightly adjusted estimators not influencing the theory may typically be applied when there are tied events, see e.g. Aalen et al. (2008, Ch. 3.1.3).
If \(\gamma \) influences the censoring mechanism and covariate distribution, then (11) is only a ‘partial’ likelihood, and not a true one. This has no consequences for inference, however.
We have avoided introducing the notation of Hadamard differentiability tangentially to a subset of \(\mathbb {D}\), as these notions are better stated explicitly in our concrete cases.
References
Aalen OO, Gjessing HK (2001) Understanding the shape of the hazard rate: a process point of view [with discussion and a rejoinder]. Stat Sci 16:1–22
Aalen OO, Borgan Ø, Gjessing HK (2008) Survival and event history analysis: a process point of view. Springer, Berlin
Andersen PK, Borgan Ø, Gill RD, Keiding N (1993) Statistical models based on counting processes. Springer, Berlin
Borgan Ø (1984) Maximum likelihood estimation in parametric counting process models, with applications to censored failure time data. Scand J Stat 11:1–16
Breslow NE (1972) Contribution to the discussion of the paper by D.R. Cox. J R Stat Soc Ser B 34:216–217
Claeskens G, Hjort NL (2003) The focused information criterion [with discussion and a rejoinder]. J Am Stat Assoc 98:900–916
Claeskens G, Hjort NL (2008) Model selection and model averaging. Cambridge University Press, Cambridge
Cox DR (1972) Regression models and life-tables [with discussion and a rejoinder]. J R Stat Soc Ser B 34:187–220
Efron B (1977) The efficiency of Cox’s likelihood function for censored data. J Am Stat Assoc 72:557–565
Hjort NL (1985) Bootstrapping Cox’s regression model. Tech. rep., Department of Statistics, Stanford University
Hjort NL (1990) Goodness of fit tests in models for life history data based on cumulative hazard rates. Ann Stat 18:1221–1258
Hjort NL (1992) On inference in parametric survival data models. Int Stat Rev 60:355–387
Hjort NL (2008) Focused information criteria for the linear hazard regression model. In: Vonta F, Nikulin M, Limnios N, Huber-Carol C (eds) Statistical models and methods for biomedical and technical systems. Birkhäuser, Boston, pp 487–502
Hjort NL, Claeskens G (2003) Frequentist model average estimators [with discussion and a rejoinder]. J Am Stat Assoc 98:879–899
Hjort NL, Claeskens G (2006) Focused information criteria and model averaging for the Cox hazard regression model. J Am Stat Assoc 101:1449–1464
Hjort NL, Pollard DB (1993) Asymptotics for minimisers of convex processes. Tech. rep., Department of Mathematics, University of Oslo
Jeong JH, Oakes D (2003) On the asymptotic relative efficiency of estimates from Cox’s model. Sankhya 65:422–439
Jeong JH, Oakes D (2005) Effects of different hazard ratios on asymptotic relative efficiency estimates from Cox’s model. Commun Stat Theory Methods 34:429–448
Jullum M, Hjort NL (2017) Parametric or nonparametric: The FIC approach. Stat Sin 27:951–981
Kalbfleisch JD, Prentice RL (2002) The statistical analysis of failure time data, 2nd edn. Wiley, New York
Meier P, Karrison T, Chappell R, Xie H (2004) The price of Kaplan-Meier. J Am Stat Assoc 99:890–896
Miller R (1983) What price Kaplan-Meier? Biometrics 39:1077–1081
Oakes D (1977) The asymptotic information in censored survival data. Biometrika 64:441–448
van der Vaart A (2000) Asymptotic statistics. Cambridge University Press, Cambridge
Acknowledgements
Our efforts have been supported in part by the Norwegian Research Council, through the project FocuStat (Focus Driven Statistical Inference With Complex Data) and the research-based innovation centre Statistics for Innovation (sfi)\(^2\). We are also grateful to the reviewers and editor Mei-Ling T. Lee for constructive comments which led to an improved presentation.
Appendix
Estimating variances and covariances
For FIC and AFIC applications we need not only the focus parameter estimators \({{\widehat{\mu }}}_\mathrm{cox}\) and \({{\widehat{\mu }}}_\mathrm{pm}\) themselves (yielding also \({\widehat{b}}={{\widehat{\mu }}}_\mathrm{pm}-{{\widehat{\mu }}}_\mathrm{cox}\)), but also (consistent) recipes for estimating the quantities \(v_\mathrm{cox}\), \(v_c\), \(v_\mathrm{pm}\) making up the covariance matrix \(\Sigma _\mu \) in (31). The main ingredient in \(\Sigma _\mu \) is \(\Sigma (s,t)\), with blocks as in (27), consisting of the quantities in (39).
In this appendix we provide explicit consistent estimators for these quantities, in addition to a simple consistent estimation strategy for other quantities typically involved in \(\Sigma _\mu \).
The principle we essentially follow is to insert the empirical analogues of all unknown quantities. This amounts firstly to estimating \(\beta _\mathrm{true}\), \(\beta _0\), \(\theta _0\), \(A_\mathrm{true}(\cdot )\), by respectively \({\widehat{\beta }}_\mathrm{cox}\), \({\widehat{\beta }}_\mathrm{pm}\), \(\widehat{\theta }\), \({\widehat{A}}_\mathrm{cox}(\cdot )\). Secondly, \(r^{(k)}(s;h(\beta _\mathrm{true},\beta _0))\) is estimated by \(n^{-1}R^{(k)}_n(s;h({\widehat{\beta }}_\mathrm{cox},{\widehat{\beta }}_\mathrm{pm}))\) for \(k=0,1,2\), and h some simple continuous function combining \(\beta \) and \(\beta _0\). For f some vector function involving unknown quantities, integrals of the form \(\int _0^t f\alpha _\mathrm{true}\,\mathrm{d}s=\int _0^t f\, \mathrm{d}A_\mathrm{true}\) are then estimated by \(\int _0^t {\widehat{f}}\,\mathrm{d}{\widehat{A}}_\mathrm{cox}= \sum _{T_i \le t} {\widehat{f}}(T_i)D_i/R^{(0)}_n(T_i;{\widehat{\beta }}_\mathrm{cox})\). Note also that integrals \(\int _0^t f(s)r^{(k)}(s;h(\beta _\mathrm{true},\beta _0))\,\mathrm{d}s\) are estimated by \(n^{-1}\int _0^t {\widehat{f}}(s) R_n^{(k)}(s;h({\widehat{\beta }}_\mathrm{cox},{\widehat{\beta }}_\mathrm{pm}))\, \mathrm{d}s\), which may be expressed as the sum
where \(R_{(i)}^{(k)}(h(\cdot ))=R_{(i)}^{(k)}(0;h(\cdot ))\) equals \(\exp \{X_i^{\mathrm{t}}h(\cdot )\}\), \(X_i \exp \{X_i^{\mathrm{t}}h(\cdot )\}\) and \(X_i X_i^{\mathrm{t}}\exp \{X_i^{\mathrm{t}}h(\cdot )\}\) for \(k=0,1,2\), respectively. Thus, estimators of the form \(\int _0^t f(s)g^{(k)}(s;\beta )\, \mathrm{d}s\) may be expressed by
with \({\widehat{\beta }}\) inserted to estimate \(\beta \). The f-function is sometimes partly estimated by a step-function, as when f(s) equals \(A(s)f_1(s)\), \(\sigma ^2(\min (s,t))f_1(s)\) or \(F(s)f_1(s)\) for some function \(f_1\). In such cases, integrals like \(\int _0^{t} f(s)r^{(k)}(s;h(\beta ,\beta _0))\,\mathrm{d}s\) are decomposed even further. To see this, assume \(f(s)=f_0(s)f_1(s)\) is estimated by \({\widehat{f}}(s)={\widehat{f}}_0(s){\widehat{f}}_1(s)\), where \({\widehat{f}}_0(s)\) is a step function of the form \({\widehat{f}}_0(s)=\sum _{j=1}^n \text {step}_{j} \mathbf {1}_{\{ T_j \le s \}}=\sum _{j:T_j\le s} \text {step}_j\). Then (40) decomposes further into the ‘triangle sum’
As a consequence, also \(\int _0^t f(s)g^{(k)}(s;\beta )\, \mathrm{d}s\) decomposes further, such that the subtrahend in (41) equals
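The general plug-in recipe above is easy to carry out numerically. The following sketch (all data, names and choices are ours for illustration, not the authors' code) implements the risk-set quantities \(R_n^{(k)}(s;\beta )\) and the basic integral estimator \(\int _0^t {\widehat{f}}\,\mathrm{d}{\widehat{A}}_\mathrm{cox}= \sum _{T_i \le t} {\widehat{f}}(T_i)D_i/R^{(0)}_n(T_i;{\widehat{\beta }}_\mathrm{cox})\) on simulated right-censored data, treating a fixed vector as a stand-in for the fitted \({\widehat{\beta }}_\mathrm{cox}\):

```python
import numpy as np

# Hypothetical right-censored sample: times T, event indicators D,
# covariate rows X; beta_hat plays the role of the fitted Cox estimate.
rng = np.random.default_rng(0)
n, p = 50, 2
X = rng.normal(size=(n, p))
beta_hat = np.array([0.5, -0.3])
T = rng.exponential(scale=np.exp(-X @ beta_hat))
D = (rng.uniform(size=n) < 0.7).astype(float)

def R_n(s, beta, k):
    """Risk-set quantity R_n^{(k)}(s; beta): sum over i with T_i >= s of
    x_i^{(k)} exp(x_i' beta), where x^{(0)} = 1, x^{(1)} = x, x^{(2)} = x x'."""
    w = np.exp(X @ beta) * (T >= s)
    if k == 0:
        return float(w.sum())
    if k == 1:
        return X.T @ w
    return (X * w[:, None]).T @ X  # sum of w_i * x_i x_i'

def plugin_integral(f_hat, t, beta):
    """Estimate int_0^t f dA_true by the sum over event times T_i <= t of
    f_hat(T_i) D_i / R_n^{(0)}(T_i; beta)."""
    return sum(f_hat(Ti) / R_n(Ti, beta, 0)
               for Ti in T[(T <= t) & (D == 1)])

# With f == 1 this is exactly the Breslow estimator A_hat_cox(t).
A_hat = plugin_integral(lambda s: 1.0, float(np.median(T)), beta_hat)
```

Every estimator in this appendix is some combination of such sums over event times, possibly with an extra inner sum when a step-function factor enters.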
Let us now turn to the actual estimation of the quantities in (39).
1.
First, consider \(\sigma ^2(t)\) as given in (9). The estimation strategy outlined above gives the estimator
$$\begin{aligned} {\widehat{\sigma }}^2(t)=\int _0^t \frac{\mathrm{d}{{\widehat{A}}}_\mathrm{cox}(s)}{ n^{-1}R_n^{(0)}(s;{{\widehat{\beta }}}_\mathrm{cox})} =\sum _{T_i\le t}\frac{nD_i}{ \{R^{(0)}_n(T_i;{{\widehat{\beta }}}_\mathrm{cox})\}^2}. \end{aligned}$$
2.
Next consider F(t) as given in (9). Writing \(E_n(s;\beta )\) for \(R_n^{(1)}(s;\beta )/R_n^{(0)}(s;\beta )\), this function is similarly estimated by
$$\begin{aligned} {\widehat{F}}(t)=\int _0^t E_n(s;{\widehat{\beta }}_\mathrm{cox})\,\mathrm{d}{{\widehat{A}}}_\mathrm{cox}(s) =\sum _{T_i\le t} \frac{D_i E_n(T_i;{\widehat{\beta }}_\mathrm{cox})}{R_n^{(0)}(T_i;{\widehat{\beta }}_\mathrm{cox})}. \end{aligned}$$
3.
Consider now \(J_\mathrm{cox}\) as given in (7). Following the plug-in procedure, we get
$$\begin{aligned} \widehat{J}_\mathrm{cox}=\frac{1}{n} \sum _{T_i \le \tau } \left\{ \frac{R^{(2)}_n(T_i;{\widehat{\beta }}_\mathrm{cox})}{R^{(0)}_n(T_i; {\widehat{\beta }}_\mathrm{cox})} - E_n(T_i;{\widehat{\beta }}_\mathrm{cox})E_n(T_i;{\widehat{\beta }}_\mathrm{cox})^{\mathrm{t}}\right\} D_i. \end{aligned}$$
Alternatively, \(J_\mathrm{cox}\) may be estimated by \(n^{-1}\) times minus the Hessian matrix of the log-partial likelihood in (4).
4.
Consider J as given in (14) with blocks as in (16). Following the plug-in procedure, we estimate J by \(\widehat{J}\) having blocks
$$\begin{aligned} \widehat{J}_{11}&= \frac{1}{n}\sum _{i=1}^nR_{(i)}^{(0)}({\widehat{\beta }}_\mathrm{pm}) \int _0^{T_i} \{\psi (s;\widehat{\theta })\psi (s;\widehat{\theta })^{\mathrm{t}} +\psi ^\mathrm{d}(s;\widehat{\theta })\}\alpha _{\mathrm{pm}}(s;\widehat{\theta })\, \mathrm{d}s\\&\quad -\frac{1}{n} \sum _{i=1}^n\psi ^\mathrm{d}(T_i;\widehat{\theta })D_i , \\ \widehat{J}_{12}&= \widehat{J}_{21}^{\mathrm{t}} = \frac{1}{n}\sum _{i=1}^n\int _0^{T_i} \psi (s;\widehat{\theta })\alpha _{\mathrm{pm}}(s;\widehat{\theta }) \, \mathrm{d}s R_{(i)}^{(1)}({\widehat{\beta }}_\mathrm{pm})^{\mathrm{t}}\\&= \frac{1}{n}\sum _{i=1}^nA^{\mathrm{d}}_\mathrm{pm}(T_i;\widehat{\theta })R_{(i)}^{(1)} ({\widehat{\beta }}_\mathrm{pm})^{\mathrm{t}}, \\ \widehat{J}_{22}&= \frac{1}{n}\sum _{i=1}^nR_{(i)}^{(2)} ({\widehat{\beta }}_\mathrm{pm}) \int _0^{T_i} \alpha _{\mathrm{pm}} (s;\widehat{\theta }) \, \mathrm{d}s =\frac{1}{n}\sum _{i=1}^nR_{(i)}^{(2)} ({\widehat{\beta }}_\mathrm{pm}) A_{\mathrm{pm}}(T_i;\widehat{\theta }). \end{aligned}$$
Similarly to \(J_\mathrm{cox}\), J may be estimated by \(n^{-1}\) times minus the Hessian of the parametric log-likelihood in (11).
5.
We continue with K as given in (14). The plug-in procedure applied to the formulae in (17) results in K being estimated by \(\widehat{K}\) having blocks
$$\begin{aligned} \widehat{K}_{11}&= \frac{1}{n}\sum _{i=1}^n\bigg [ \psi (T_i;\widehat{\theta })\psi (T_i;\widehat{\theta })^{\mathrm{t}}\\&\quad - \{A^{\mathrm{d}}_\mathrm{pm}(T_i;\widehat{\theta })\psi (T_i;\widehat{\theta })^{\mathrm{t}} + \psi (T_i;\widehat{\theta })A^{\mathrm{d}}_\mathrm{pm}(T_i;\widehat{\theta })^{\mathrm{t}}\} \frac{R_n^{(0)}(T_i;{\widehat{\beta }}_\mathrm{cox}+{\widehat{\beta }}_\mathrm{pm})}{R_n^{(0)}(T_i;{\widehat{\beta }}_\mathrm{cox})}\bigg ]D_i\\&\quad + \frac{1}{n}\sum _{i=1}^nR_{(i)}^{(0)}(2{\widehat{\beta }}_\mathrm{pm}) \int _0^{T_i} [A^{\mathrm{d}}_\mathrm{pm}(s;\widehat{\theta })\psi (s;\widehat{\theta })^{\mathrm{t}}\\&\quad + \psi (s;\widehat{\theta })A^{\mathrm{d}}_\mathrm{pm}(s;\widehat{\theta })^{\mathrm{t}}] \alpha _{\mathrm{pm}}(s;\widehat{\theta })\, \mathrm{d}s,\\ \widehat{K}_{12}&=\widehat{K}_{21}^{\mathrm{t}}= \frac{1}{n}\sum _{i=1}^n\bigg [\psi (T_i;\widehat{\theta }) E_n(T_i;{\widehat{\beta }}_\mathrm{cox})^{\mathrm{t}}\\&\quad - \{A^{\mathrm{d}}_\mathrm{pm}(T_i;\widehat{\theta })+\psi (T_i;\widehat{\theta }) A_\mathrm{pm}(T_i;\widehat{\theta })\}\frac{R_n^{(1)}(T_i; {\widehat{\beta }}_\mathrm{cox}+{\widehat{\beta }}_\mathrm{pm})^{\mathrm{t}}}{R_n^{(0)} (T_i;{\widehat{\beta }}_\mathrm{cox})} \bigg ]D_i \\&\quad + \frac{1}{n}\sum _{i=1}^n\left[ \int _0^{T_i} \{A^{\mathrm{d}}_\mathrm{pm}(s;\widehat{\theta })+\psi (s; \widehat{\theta })A_\mathrm{pm}(s;\widehat{\theta })\} \alpha _{\mathrm{pm}} (s;\widehat{\theta })\, \mathrm{d}s \right] R_{(i)}^{(1)} (2{\widehat{\beta }}_\mathrm{pm})^{\mathrm{t}}, \\ \widehat{K}_{22}&=\frac{1}{n}\sum _{i=1}^n\frac{R_n^{(2)}(T_i; {\widehat{\beta }}_{\mathrm{cox}})-2R_n^{(2)}(T_i;{\widehat{\beta }}_\mathrm{cox}+{\widehat{\beta }}_\mathrm{pm})A_\mathrm{pm}(T_i;\widehat{\theta })}{R_n^{(0)}(T_i;{\widehat{\beta }}_{\mathrm{cox}})} D_i \\&\quad + \frac{2}{n} \sum _{i=1}^nR_{(i)}^{(2)}(2{\widehat{\beta }}_\mathrm{pm}) \int _0^{T_i} \alpha _{\mathrm{pm}}(s;\widehat{\theta })A_\mathrm{pm}(s;\widehat{\theta })\, \mathrm{d}s. \end{aligned}$$
6.
We go on to the covariance \(\nu (t)=\mathrm{Cov}(W(t),U^{\mathrm{t}})\) as given in (29). This covariance formula may be estimated by
$$\begin{aligned} \widehat{\nu }(t)&= \begin{pmatrix} \sum _{T_i \le t} D_i\psi (T_i;\widehat{\theta })/R_n^{(0)} (T_i;{\widehat{\beta }}_\mathrm{cox}) \\ \widehat{F}(t) \end{pmatrix}^{\mathrm{t}}\\&\quad - \frac{1}{n}\sum _{i=1}^n\frac{D_i \widehat{\sigma }^2(\min (T_i,t))}{R_n^{(0)}(T_i;{\widehat{\beta }}_\mathrm{cox})} \begin{pmatrix} R_n^{(0)}(T_i;2{\widehat{\beta }}_\mathrm{cox}) \psi (T_i;\widehat{\theta }) \\ R_n^{(1)}(T_i;2{\widehat{\beta }}_\mathrm{cox}) \end{pmatrix}^{\mathrm{t}} \\&\quad + \sum _{i=1}^n\sum _{j:T_j < \min (T_i,t)} \frac{D_j}{R^{(0)}_n(T_j;{\widehat{\beta }}_\mathrm{cox})^2} \begin{pmatrix} R_{(i)}^{(0)}({\widehat{\beta }}_\mathrm{pm}+{\widehat{\beta }}_\mathrm{cox}) \{A^{\mathrm{d}}_{\mathrm{pm}}(T_i;\widehat{\theta })-A^{\mathrm{d}}_\mathrm{pm}(T_j;\widehat{\theta })\} \\ R_{(i)}^{(1)}({\widehat{\beta }}_\mathrm{pm}+{\widehat{\beta }}_\mathrm{cox}) \{A_\mathrm{pm}(T_i;\widehat{\theta })-A_\mathrm{pm}(T_j;\widehat{\theta })\} \end{pmatrix}^{\mathrm{t}}. \end{aligned}$$
7.
Finally, we estimate the covariance \(G=\mathrm{Cov}(U_\mathrm{cox},U^{\mathrm{t}})\) as given in (28). We use
$$\begin{aligned} \widehat{G}&= - \frac{1}{n}\sum _{i=1}^n\frac{D_i}{R^{(0)}_n(T_i;{\widehat{\beta }}_\mathrm{cox})} \begin{pmatrix} \psi (T_i;\widehat{\theta })\{{\widehat{A}}_\mathrm{cox}(T_i) R^{(1)}_n(T_i;2{\widehat{\beta }}_\mathrm{cox})^{\mathrm{t}} - R^{(0)}_n(T_i; 2{\widehat{\beta }}_\mathrm{cox})\widehat{F}(T_i)^{\mathrm{t}}\} \\ {\widehat{A}}_ \mathrm{cox}(T_i)R^{(2)}_n(T_i;2{\widehat{\beta }}_\mathrm{cox}) - R^{(1)}_n(T_i; 2{\widehat{\beta }}_\mathrm{cox})\widehat{F}(T_i)^{\mathrm{t}} \end{pmatrix}^{\mathrm{t}} \\&\quad - \frac{1}{n} \sum _{i=1}^n\sum _{j:T_j \le T_i} \frac{D_j E_n(T_j;{\widehat{\beta }}_\mathrm{cox})}{R_n^{(0)}(T_j;{\widehat{\beta }}_\mathrm{cox})} \begin{pmatrix} R^{(0)}_{(i)}({\widehat{\beta }}_\mathrm{cox}+{\widehat{\beta }}_\mathrm{pm}) \{A^{\mathrm{d}}_\mathrm{pm}(T_i;\widehat{\theta }) - A^{\mathrm{d}}_\mathrm{pm}(T_j;\widehat{\theta })\} \\ R^{(1)}_{(i)}({\widehat{\beta }}_\mathrm{cox}+{\widehat{\beta }}_\mathrm{pm}) \{A_\mathrm{pm}(T_i;\widehat{\theta })-A_\mathrm{pm}(T_j;\widehat{\theta })\} \end{pmatrix}^{\mathrm{t}} \\&\quad + \frac{1}{n} \sum _{i=1}^n\sum _{j:T_j \le T_i} \frac{D_j}{R^{(0)}_n(T_j;{\widehat{\beta }}_\mathrm{cox})} \begin{pmatrix} \{A^{\mathrm{d}}_\mathrm{pm}(T_i;\widehat{\theta }) - A^{\mathrm{d}}_\mathrm{pm}(T_j;\widehat{\theta })\} R^{(1)}_{(i)}({\widehat{\beta }}_\mathrm{cox}+{\widehat{\beta }}_\mathrm{pm})^{\mathrm{t}} \\ \{A_\mathrm{pm}(T_i;\widehat{\theta })-A_\mathrm{pm}(T_j;\widehat{\theta })\} R^{(2)}_{(i)}({\widehat{\beta }}_\mathrm{cox}+{\widehat{\beta }}_\mathrm{pm}) \end{pmatrix}^{\mathrm{t}} \\&\quad + \begin{pmatrix} 0_{p \times q} \\ \widehat{J}_{\mathrm{cox}} \end{pmatrix}^{\mathrm{t}}. \end{aligned}$$
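To make the simpler of these recipes concrete, here is a self-contained numerical sketch of items 1–3 (\({\widehat{\sigma }}^2(t)\), \({\widehat{F}}(t)\) and \(\widehat{J}_\mathrm{cox}\)) on simulated data. All names and data are illustrative; in particular, a fixed `beta` stands in for the fitted \({\widehat{\beta }}_\mathrm{cox}\), which in practice comes from maximising the partial likelihood:

```python
import numpy as np

# Simulated right-censored data with two covariates (illustrative only).
rng = np.random.default_rng(1)
n, p = 40, 2
X = rng.normal(size=(n, p))
beta = np.array([0.4, 0.1])          # stand-in for beta_hat_cox
T = rng.exponential(scale=np.exp(-X @ beta))
D = (rng.uniform(size=n) < 0.8).astype(float)

def R0(s):
    return float(np.sum((T >= s) * np.exp(X @ beta)))

def R1(s):
    return X.T @ ((T >= s) * np.exp(X @ beta))

def R2(s):
    w = (T >= s) * np.exp(X @ beta)
    return (X * w[:, None]).T @ X

def sigma2_hat(t):
    """Item 1: sum over T_i <= t of n D_i / R_n^{(0)}(T_i)^2."""
    return float(sum(n * D[i] / R0(T[i]) ** 2 for i in range(n) if T[i] <= t))

def F_hat(t):
    """Item 2: sum over events T_i <= t of D_i E_n(T_i) / R_n^{(0)}(T_i),
    with E_n = R_n^{(1)} / R_n^{(0)}."""
    out = np.zeros(p)
    for i in range(n):
        if T[i] <= t and D[i] == 1.0:
            out += R1(T[i]) / R0(T[i]) ** 2
    return out

def J_cox_hat():
    """Item 3: (1/n) sum over events of {R^{(2)}/R^{(0)} - E_n E_n'}."""
    out = np.zeros((p, p))
    for i in range(n):
        if D[i] == 1.0:
            E = R1(T[i]) / R0(T[i])
            out += R2(T[i]) / R0(T[i]) - np.outer(E, E)
    return out / n

tmax = float(T.max())
s2 = sigma2_hat(tmax)   # nondecreasing and nonnegative in t
Jm = J_cox_hat()        # symmetric, positive semi-definite by construction
```

Each per-event term of \(\widehat{J}_\mathrm{cox}\) is a weighted covariance matrix of the covariates over the risk set, which is why the estimate is automatically symmetric and positive semi-definite. The heavier blocks \(\widehat{J}\), \(\widehat{K}\), \(\widehat{\nu }(t)\) and \(\widehat{G}\) follow the same pattern, with additional inner sums for the triangle-sum terms.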
Relying strictly on the plug-in principle has the beneficial property that all estimators are consistent. This follows from the continuous mapping theorem since the precise formulae for the quantities in (39) are all seen to be continuous in the quantities and functions (in their appropriate spaces) for which we employ the plug-in principle.
To arrive at consistent estimators for \(v_\mathrm{cox}, v_c\) and \(v_\mathrm{pm}\) for the classes of focus parameters we have investigated, one typically needs consistent estimators also for the quantities \(m'_\mathrm{pm}, m'_\mathrm{cox}, z'_\mathrm{pm}, z'_\mathrm{cox}\), \(\zeta _\mathrm{pm}(\cdot )\), \(\zeta _\mathrm{cox}(\cdot ), V_{t,\mathrm{pm}}(\cdot ), V_{t,\mathrm{cox}}(\cdot ), h_\mathrm{pm}(\phi _\mathrm{pm})\) and \(h_\mathrm{cox}(\phi _\mathrm{cox})\), as described in Section 4.1. All except the last of these are continuous when viewed as functions of the unknown quantities \(\theta _0, \beta _0, \beta _\mathrm{true}\) and \(A_\mathrm{true}(\cdot )\), and are therefore estimated consistently by plugging in empirical analogues, as above. The last quantity \(h_\mathrm{cox}(\phi _\mathrm{cox})=\alpha _\mathrm{true}(\phi _\mathrm{cox})\exp (x^{\mathrm{t}}\beta _\mathrm{true})\), with \(\phi _\mathrm{cox}=A^{-1}_\mathrm{true}(-\log (1-u)/\exp (x^{\mathrm{t}}\beta _\mathrm{true}))\), involved in estimation of a quantile (see Sect. 3 of the supplementary material, Jullum and Hjort, this work), is more delicate, as we need the estimator to be smooth or at least nonzero. The troublesome part is estimation of \(\alpha _\mathrm{true}\) at the unknown position \(\phi _\mathrm{cox}\). This position is estimated by \({\widehat{\phi }}_\mathrm{cox}={\widehat{A}}^{-1}_\mathrm{cox}(-\log (1-u)/\exp (x^{\mathrm{t}}{\widehat{\beta }}_\mathrm{cox}))\), while a smooth estimate of \(\alpha _\mathrm{true}\) is obtained e.g. via a kernel estimator \({\widehat{\alpha }}_\mathrm{cox}(t) = \int h^{-1} K^\circ ((t-s)/h) \, \mathrm{d}{\widehat{A}}_\mathrm{cox}(s)\) for some suitable kernel \(K^\circ \) and bandwidth \(h=h_n\), which is then evaluated at \({\widehat{\phi }}_\mathrm{cox}\).
As long as the bandwidth satisfies \(h_n \rightarrow 0\) and \(n h_n \rightarrow \infty \), and \(\alpha _\mathrm{true}\) is positive and twice differentiable in a neighborhood of \(\phi _\mathrm{cox}\), this strategy also yields a consistent estimator. Thus, replacing the quantities in the various forms of \(v_\mathrm{cox}\), \(v_c\), \(v_\mathrm{pm}\) towards the end of Section 4.1 by the estimators presented in this appendix yields consistent estimators \({\widehat{v}}_\mathrm{cox}\), \({\widehat{v}}_c\), \({\widehat{v}}_\mathrm{pm}\).
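The kernel smoothing step can be sketched in the covariate-free special case, where \(\mathrm{d}{\widehat{A}}\) reduces to the Nelson–Aalen increments \(D_i/Y(T_i)\); with covariates one would instead use the Breslow increments \(D_i/R_n^{(0)}(T_i;{\widehat{\beta }}_\mathrm{cox})\). The sample size, Gaussian kernel and bandwidth below are illustrative choices of ours:

```python
import numpy as np

# Uncensored Exp(1) data: the true hazard is constant, equal to 1.
rng = np.random.default_rng(2)
T = np.sort(rng.exponential(size=200))
n = len(T)
dA = 1.0 / (n - np.arange(n))   # Nelson-Aalen jumps D_i / Y(T_i), sorted times

def alpha_hat(t, h=0.3):
    """Kernel-smoothed hazard: sum_i h^{-1} K((t - T_i)/h) * dA_i,
    with a Gaussian kernel K."""
    u = (t - T) / h
    K = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return float(np.sum(K * dA) / h)

a1 = alpha_hat(1.0)   # should be close to the true hazard value 1
```

In practice the bandwidth would be chosen to shrink with \(n\) (e.g. \(h_n \propto n^{-1/5}\)) so that the consistency conditions \(h_n \rightarrow 0\), \(n h_n \rightarrow \infty \) hold.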
Jullum, M., Hjort, N.L. What price semiparametric Cox regression? Lifetime Data Anal 25, 406–438 (2019). https://doi.org/10.1007/s10985-018-9450-7