Bivariate pseudo-observations for recurrent event analysis with terminal events

Furberg, Julie K.; Andersen, Per K.; Korn, Sofie; Overgaard, Morten; Ravn, Henrik

doi:10.1007/s10985-021-09533-5

Published: 05 November 2021

Lifetime Data Analysis, Volume 29, pages 256–287 (2023)

Julie K. Furberg ORCID: orcid.org/0000-0001-8785-1462¹,
Per K. Andersen²,
Sofie Korn³,
Morten Overgaard⁴ &
Henrik Ravn¹

Abstract

The analysis of recurrent events in the presence of terminal events requires special attention. Several approaches have been suggested for such analyses either using intensity models or marginal models. When analysing treatment effects on recurrent events in controlled trials, special attention should be paid to competing deaths and their impact on interpretation. This paper proposes a method that formulates a marginal model for recurrent events and terminal events simultaneously. Estimation is based on pseudo-observations for both the expected number of events and survival probabilities. Various relevant hypothesis tests in the framework are explored. Theoretical derivations and simulation studies are conducted to investigate the behaviour of the method. The method is applied to two real data examples. The bivariate marginal pseudo-observation model carries the strength of a two-dimensional modelling procedure and performs well in comparison with available models. Finally, an extension to a three-dimensional model, which decomposes the terminal event per death cause, is proposed and exemplified.

Availability of data and materials

The bladder cancer data set is available through the R package survival. Regarding the LEADER data (Marso et al. 2016): De-identified individual participant data, study protocol and redacted Clinical Study Report will be available according to Novo Nordisk data sharing commitments. The data will be made available permanently after research completion and approval of product and product use in both EU and US. Data will be shared with bona fide researchers submitting a research proposal requesting access to data and for use as approved by the Independent Review Board according to the IRB Charter (see novonordisk-trials.com). Access request proposal form and the access criteria can be found at novonordisk-trials.com. The data will be made available on a specialised SAS data platform.

Code availability

R code for the bivariate marginal pseudo-observation model is available upon request to the corresponding author.

References

Andersen PK, Gill RD (1982) Cox’s regression model for counting processes: a large sample study. Ann Stat 10(4):1100–1120
Andersen PK, Perme MP (2010) Pseudo-observations in survival analysis. Stat Methods Med Res 19:71–99
Andersen PK, Borgan Ø, Gill RD, Keiding N (1993) Statistical models based on counting processes. Springer series in statistics. Springer
Andersen PK, Klein JP, Rosthøj S (2003) Generalised linear models for correlated pseudo-observations, with applications to multi-state models. Biometrika 90:15–27
Andersen PK, Angst J, Ravn H (2019) Modeling marginal features in studies of recurrent events in the presence of a terminal event. Lifetime Data Anal 25(4):681–695
Binder N, Gerds TA, Andersen PK (2014) Pseudo-observations for competing risks with covariate dependent censoring. Lifetime Data Anal 20(2):303–315
Byar D (1980) The veterans administration study of chemoprophylaxis for recurrent stage I bladder tumours: comparisons of placebo, pyridoxine and topical thiotepa. Springer
Cook R, Lawless JF (1997) Marginal analysis of recurrent events and a terminating event. Stat Med 16:911–924
Cox DR (1972) Regression models and life-tables. J R Stat Soc Ser B (Methodol) 34(2):187–220
Ghosh D, Lin D (2000) Nonparametric analysis of recurrent events and death. Biometrics 56:554–562
Ghosh D, Lin D (2002) Marginal regression models for recurrent and terminal events. Stat Sin 12:663–688
Graw F, Gerds TA, Schumacher M (2009) On pseudo-values for regression analysis in competing risks models. Lifetime Data Anal 15:241–255
Jacobsen M, Martinussen T (2016) A note on the large sample properties of estimators based on generalized linear models for correlated pseudo-observations. Scand J Stat 43:845–862
Liang KY, Zeger SL (1986) Longitudinal data analysis using generalized linear models. Biometrika 73(1):13–22
Lin D, Wei L, Yang I, Ying Z (2000) Semiparametric regression for the mean and rate functions of recurrent events. J R Stat Soc Ser B 62(4):711–730
Liu L, Wolfe RA, Huang X (2004) Shared frailty models for recurrent events and a terminal event. Biometrics 60:747–756
Marso SP, Daniels GH, Brown-Frandsen K, Kristensen P, Mann JFE, Nauck MA, Nissen SE, Pocock S, Poulter NR, Ravn LS, Steinberg WM, Stockner M et al (2016) Liraglutide and cardiovascular outcomes in type 2 diabetes. N Engl J Med 375(4):311–322
Overgaard M (2019) Counting processes in p-variation with application to recurrent events. https://arxiv.org/pdf/1903.04296.pdf
Overgaard M, Parner ET, Pedersen J (2017) Asymptotic theory of generalized estimating equations based on jack-knife pseudo-observations. Ann Stat 45(5):1988–2015
Pavlič K, Martinussen T, Andersen PK (2019) Goodness of fit tests for estimating equations based on pseudo-observations. Lifetime Data Anal 25:189–205
Wei LJ, Lin DY, Weissfeld L (1989) Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J Am Stat Assoc 84:1065–1073

Funding

This research was carried out as part of xxx’s PhD education, for which she received funding from Novo Nordisk A/S and Innovation Fund Denmark. xxx is supported by the Novo Nordisk Foundation Grant NNF17OC0028276.

Author information

Authors and Affiliations

Biostatistics GLP-1 and CV 1, Novo Nordisk A/S, Vandtårnsvej 114, Søborg, Denmark
Julie K. Furberg & Henrik Ravn
Section of Biostatistics, University of Copenhagen, Copenhagen, Denmark
Per K. Andersen
Biostatistics 1, LEO Pharma A/S, Ballerup, Denmark
Sofie Korn
Research unit for Biostatistics, Department of Public Health, Aarhus University, Aarhus, Denmark
Morten Overgaard

Corresponding author

Correspondence to Julie K. Furberg.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 Theoretical details on bivariate normality of $(\hat{\beta }, \hat{\gamma })$

According to Overgaard et al. (2017), the pseudo-observation approach of this paper produces consistent and asymptotically normal parameter estimates under essentially two conditions. The first condition is that the estimate $\hat{\theta }$ of $\theta = E(f(W))$ can be seen as a functional, $\phi $, of the empirical distribution, $F_n$, in a Banach space setting such that $\phi $ is twice (Fréchet) differentiable with a Lipschitz continuous second order derivative and such that $F_n$ converges in norm at a certain rate. This condition ensures that the pseudo-observations admit the close approximation $\hat{\theta }_i = \theta + \dot{\theta }(X_i) + \frac{1}{n-1}\sum _{j \ne i} \ddot{\theta }(X_i, X_j) + o_P(n^{-\frac{1}{2}})$ (uniformly in $i$) in terms of the estimator’s first and second order influence functions, $\dot{\theta }$ and $\ddot{\theta }$. This, in turn, implies that the coarser approximation $\hat{\theta }_i = \theta + \dot{\theta }(X_i) + o_P(1)$ also holds. Given this, the second condition is that $E(\dot{\theta }(X) \mid Z) = E(f(W) \mid Z) - \theta $, which means that the pseudo-observations carry the right information and ensures that the estimating equation is unbiased under the model. The result of Overgaard et al. (2017) is formulated for one-dimensional pseudo-observations, but generalizes to multi-dimensional outcomes. In a multi-dimensional setting, the requirements then need to hold for each outcome separately.
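To make the notion of a pseudo-observation concrete, the following minimal sketch uses the usual jack-knife definition $\hat{\theta }_i = n\hat{\theta } - (n-1)\hat{\theta }^{(-i)}$, where $\hat{\theta }^{(-i)}$ is the estimate computed without subject $i$. The function name `jackknife_pseudo_obs` and the toy data are illustrative assumptions and are not taken from the paper. For the sample mean the estimator is linear, so the second order term vanishes and each pseudo-observation reproduces the subject's own value.

```python
import numpy as np

def jackknife_pseudo_obs(data, estimator):
    """Jack-knife pseudo-observations n*theta_hat - (n-1)*theta_hat^(-i).

    `estimator` is any function mapping a sample to a scalar estimate.
    (Hypothetical helper for illustration only.)
    """
    data = np.asarray(data)
    n = data.shape[0]
    theta_full = estimator(data)
    pseudo = np.empty(n)
    for i in range(n):
        # Re-estimate with subject i left out.
        theta_loo = estimator(np.delete(data, i, axis=0))
        pseudo[i] = n * theta_full - (n - 1) * theta_loo
    return pseudo

# Toy data (assumed): 50 exponential draws.
rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=50)

# For the sample mean, the pseudo-observation of subject i is x[i] itself.
pseudo_mean = jackknife_pseudo_obs(x, np.mean)
```

For non-linear estimators such as the Kaplan–Meier estimate, the pseudo-observations no longer coincide with an observable quantity, which is where the influence function expansion above becomes relevant.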

For pseudo-observations of the Kaplan–Meier estimate $\hat{S}(t_l)$, the conditions above hold under the assumption of positivity, i.e. $P(C> t_l) > 0$, and completely independent censoring, i.e. that $C$ is independent of $(D^*, Z)$, as described by Overgaard et al. (2017) based on the work of Graw et al. (2009) and Jacobsen and Martinussen (2016). For pseudo-observations of $\hat{\mu }(t_l)$, the conditions were established by Overgaard (2019, Example 8) under similar assumptions: positivity, completely independent censoring, here meaning that $C$ is independent of $(N^*, D^*, Z)$, and additionally that $N^*(t_l)$ has slightly more than a finite fourth moment.
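As an illustration of the two kinds of pseudo-observations discussed here, the sketch below computes jack-knife pseudo-observations for both $\hat{\mu }(t)$ and $\hat{S}(t)$ at a single time point. It assumes that $\hat{\mu }(t)$ is the nonparametric estimator of Cook and Lawless (1997) and Ghosh and Lin (2000), i.e. the sum over recurrent event times $u \le t$ of $\hat{S}(u-)\,dN(u)/Y(u)$; this section does not restate the estimator, so that choice is an assumption. All function names and the toy data are hypothetical.

```python
import numpy as np

def km_survival(time, event, t, left_limit=False):
    """Kaplan-Meier estimate of S(t), or of S(t-) when left_limit is True.

    time: follow-up times; event: 1 = terminal event observed, 0 = censored.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    in_window = (time < t) if left_limit else (time <= t)
    surv = 1.0
    for u in np.unique(time[(event == 1) & in_window]):
        at_risk = np.sum(time >= u)
        deaths = np.sum((time == u) & (event == 1))
        surv *= 1.0 - deaths / at_risk
    return surv

def mean_count(time, event, rec_times, t):
    """Assumed estimator of mu(t): sum over event times u <= t of S(u-) dN(u) / Y(u).

    rec_times: one sequence of recurrent event times per subject.
    """
    time = np.asarray(time, dtype=float)
    events = np.concatenate([np.asarray(r, dtype=float) for r in rec_times])
    mu = 0.0
    for u in np.unique(events[events <= t]):
        at_risk = np.sum(time >= u)
        s_minus = km_survival(time, event, u, left_limit=True)
        mu += s_minus * np.sum(events == u) / at_risk
    return mu

def bivariate_pseudo_obs(time, event, rec_times, t):
    """Return an (n, 2) array: column 0 for mu(t), column 1 for S(t)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    n = len(time)
    mu_full = mean_count(time, event, rec_times, t)
    s_full = km_survival(time, event, t)
    out = np.empty((n, 2))
    for i in range(n):
        keep = np.arange(n) != i
        rec_loo = [r for j, r in enumerate(rec_times) if j != i]
        out[i, 0] = n * mu_full - (n - 1) * mean_count(time[keep], event[keep], rec_loo, t)
        out[i, 1] = n * s_full - (n - 1) * km_survival(time[keep], event[keep], t)
    return out

# Toy data (assumed): five subjects, all followed until death (no censoring).
time = [1.0, 2.5, 3.0, 4.0, 5.0]
event = [1, 1, 1, 1, 1]
rec_times = [[0.5], [1.0, 2.0], [], [0.7, 1.5, 3.5], [2.2]]
pseudo = bivariate_pseudo_obs(time, event, rec_times, t=3.0)
```

Without censoring, both estimators reduce to simple averages, so the pseudo-observations equal the observed event count up to $t$ and the indicator of being alive after $t$. With censoring they differ from these unobservable quantities, and the positivity and independent censoring assumptions above are what justify using them as outcomes.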

The result of Overgaard et al. (2017) is that, under regularity conditions and with high probability for large $n$, estimates $\hat{\xi } = \hat{\xi }_n$ exist that solve (4), and that

$$\begin{aligned} \sqrt{n}(\hat{\xi }_n - \xi ) \end{aligned}$$

is asymptotically normal with mean 0 and variance

$$\begin{aligned} M^{-1} \Psi M^{-1}, \end{aligned}$$

where

$$\begin{aligned} M = E\left( \left( \frac{\partial m_i}{\partial \xi }\right) ^T V_i^{-1} \frac{\partial m_i}{\partial \xi } \right) \end{aligned}$$

and

$$\begin{aligned} \Psi = {\text {Var}}\left( \left( \frac{\partial m_i}{\partial \xi }\right) ^T V_i^{-1} (\theta + \dot{\theta }(X_i) - m(\xi ; Z_i)) + h(X_i)\right) \end{aligned}$$

with

$$\begin{aligned} h(x) = E\left( \left( \frac{\partial m_i}{\partial \xi }\right) ^T V_i^{-1} \ddot{\theta }(x, X_i) \right) . \end{aligned}$$

In summary, the suggested pseudo-observation approach produces consistent and asymptotically normal parameter estimates under the assumptions

1. positivity, $P(C> t_k) > 0$,
2. completely independent censoring, i.e. $C$ is independent of $(N^*, D^*, Z)$,
3. a little more than finite fourth moment of $N^*(t_k)$.

It is worth noting that the suggested estimate of $\Psi $ can be expected to consistently estimate ${\text {Var}}\Big (\big (\frac{\partial m_i}{\partial \xi }\big )^T V_i^{-1} (\theta + \dot{\theta }(X_i) - m(\xi ; Z_i))\Big )$ but not ${\text {Var}}\Big (\big (\frac{\partial m_i}{\partial \xi }\big )^T V_i^{-1} (\theta + \dot{\theta }(X_i) - m(\xi ; Z_i)) + h(X_i)\Big )$. In other words, the contribution from the second order term $h$ is not included, so the estimate of $\Psi $, and thereby the standard errors from the sandwich variance estimator, can be expected to be biased.
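The sandwich form $M^{-1} \Psi M^{-1}$ can be sketched numerically in a deliberately simplified setting: a scalar pseudo-observation per subject, an identity link $m(\xi ; Z_i) = Z_i^T \xi $ and working variance $V_i = 1$. These simplifications, the function name `sandwich_fit` and the simulated inputs are assumptions made for illustration and do not reproduce the paper's bivariate model. The estimate of $\Psi $ below uses only the first order residual term and omits $h$, in line with the remark above.

```python
import numpy as np

def sandwich_fit(pseudo, Z):
    """Solve sum_i Z_i (pseudo_i - Z_i^T xi) = 0 and return (xi_hat, cov_hat).

    cov_hat approximates Var(xi_hat) by M^{-1} Psi M^{-1} / n, where Psi is
    estimated from first order residual terms only (h is omitted).
    """
    pseudo = np.asarray(pseudo, dtype=float)
    Z = np.asarray(Z, dtype=float)
    n = Z.shape[0]
    xi_hat = np.linalg.solve(Z.T @ Z, Z.T @ pseudo)
    resid = pseudo - Z @ xi_hat
    M_hat = Z.T @ Z / n                    # dm_i/dxi = Z_i and V_i = 1
    U = Z * resid[:, None]                 # per-subject estimating function terms
    Psi_hat = U.T @ U / n                  # empirical variance (the terms sum to zero)
    M_inv = np.linalg.inv(M_hat)
    cov_hat = M_inv @ Psi_hat @ M_inv / n
    return xi_hat, cov_hat

# Simulated stand-in for pseudo-observations (assumed, not real data).
rng = np.random.default_rng(1)
n = 200
treatment = rng.integers(0, 2, size=n)
Z = np.column_stack([np.ones(n), treatment])
pseudo = 1.0 + 0.5 * treatment + rng.normal(scale=0.8, size=n)

xi_hat, cov_hat = sandwich_fit(pseudo, Z)
std_err = np.sqrt(np.diag(cov_hat))
```

In this special case the result coincides with the familiar heteroscedasticity-robust variance for least squares. The multi-dimensional case replaces $Z_i$ by the derivative matrix $\partial m_i / \partial \xi $ and the scalar weight by $V_i^{-1}$.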

Plots from simulation of bivariate normality of $(\hat{\beta }, \hat{\gamma })$

This appendix displays additional plots visualizing the bivariate normal distribution of $(\hat{\beta }, \hat{\gamma })$ for different parameter settings and $k$.

1.1 $(n,\lambda _0^D, \beta ,\gamma _D, \rho ) = (100, 0.25, 0.5, 0.2, 1)$ and $t=2$

See Appendix Fig. 8.

1.2 $(n,\lambda _0^D, \beta ,\gamma _D, \rho ) = (100, 0.25, 0.5, -0.2, 1)$ and $t=2$

See Appendix Fig. 9.

1.3 $(n,\lambda _0^D, \beta ,\gamma _D, \rho ) = (100, 0.25, 0.5, 0.2, 0.75)$ and $t=(1,2,3)$

See Appendix Fig. 10.

1.4 $(n,\lambda _0^D, \beta ,\gamma _D, \rho ) = (100, 0.25, 0.5, 0.2, 1)$ and $t=(1,2,3)$

See Appendix Fig. 11.

1.5 $(n,\lambda _0^D, \beta ,\gamma _D, \rho ) = (100, 0.25, 0.5, -0.2, 1)$ and $t=(1,2,3)$

See Appendix Fig. 12.

About this article

Cite this article

Furberg, J.K., Andersen, P.K., Korn, S. et al. Bivariate pseudo-observations for recurrent event analysis with terminal events. Lifetime Data Anal 29, 256–287 (2023). https://doi.org/10.1007/s10985-021-09533-5

Received: 25 March 2021
Accepted: 04 September 2021
Published: 05 November 2021
Issue Date: April 2023
DOI: https://doi.org/10.1007/s10985-021-09533-5
