A general piecewise multi-state survival model: application to breast cancer

Ruiz-Castro, Juan Eloy; Zenga, Mariangela

doi:10.1007/s10260-019-00505-6

Original Paper
Published: 17 December 2019

Volume 29, pages 813–843, (2020)
Statistical Methods & Applications

Abstract

Multi-state models are used in survival analysis to model illnesses that evolve through several stages over time. They can be developed with several techniques, such as non-parametric and semi-parametric methods and stochastic processes, particularly Markov processes. When the development of an illness is analysed, its progression is tracked periodically: medical reviews take place at discrete times, which yields panel data. In this paper, a discrete-time piecewise non-homogeneous Markov process is constructed for modelling and analysing a multi-state illness with a general number of states. The model is built, and relevant measures, such as the survival function, transition probabilities, mean total times spent in a group of states and the conditional probability of state change, are determined. A likelihood function is built to estimate the parameters and the general number of cut-points included in the model. Time-dependent covariates are introduced, the results are obtained in matrix-algebraic form and the algorithms are shown. The model is applied to analyse the behaviour of breast cancer through a study of the relapse and survival times of 300 breast cancer patients who have undergone mastectomy. The results of this paper are implemented computationally with MATLAB and R.

Notes

The degrees of freedom are determined by the 7 possible transitions (1 → 1, 1 → 2, 1 → 3, 1 → C, 2 → 2, 2 → 3, 2 → C), 3 periods, 8 groups of patients divided by treatment regimen and 35 estimated parameters: (7 − 1) × (3 − 1) × (8 − 1) − 35 = 84 − 35 = 49.
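The arithmetic in this note can be checked with a few lines of Python. This is only a restatement of the count given above; the variable names are illustrative and not taken from the paper.

```python
# Counts quoted in the note (variable names are illustrative).
n_transitions = 7   # 1→1, 1→2, 1→3, 1→C, 2→2, 2→3, 2→C
n_periods = 3
n_groups = 8        # patient groups defined by treatment regimen
n_parameters = 35   # estimated parameters

cells = (n_transitions - 1) * (n_periods - 1) * (n_groups - 1)
degrees_of_freedom = cells - n_parameters
print(cells, degrees_of_freedom)
```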

Acknowledgements

Funding was provided by Ministerio de Economía y Competitividad (Grant No. FQM-307), European Regional Development Fund (ERDF) (Grant No. MTM2017-88708-P), University of Milano-Bicocca (Grant No. 2014-ATE-0228).

Author information

Authors and Affiliations

Department of Statistics and Operational Research and IEMath-GR, Faculty of Science, University of Granada, Campus Fuentenueva s/n. 18071, Granada, Spain
Juan Eloy Ruiz-Castro
Department of Statistics and Quantitative Methods, University of Milano-Bicocca, Via Bicocca degli Arcimboldi, 8, 20126, Milan, Italy
Mariangela Zenga

Corresponding author

Correspondence to Mariangela Zenga.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A

The parameters of the model are estimated by maximum likelihood. These parameters are the matrices $\mathbf{T}_u$ (or the parameters inside these matrices), the regression parameter vectors $\boldsymbol{\beta}^u$, for $u = 1, \ldots, k$, and the cut-points, all of them estimated jointly. We assume that n items are observed, all beginning in state 1, and that item i is observed at $m_i$ change times, the last one being death or censoring. Because each item is observed at its change times, the value of the covariate vector and the corresponding state are known at each of those times. Therefore, a sequence of times, states and covariate vectors is obtained for each item i: $0 = t_{i,1} < t_{i,2} < \cdots < t_{i,m_i}$, $1 = x_1^i, \ldots, x_{m_i}^i$ and $\mathbf{z}_{l_1}^i, \ldots, \mathbf{z}_{l_{m_i}}^i$, respectively. Here $\mathbf{z}_{l_s}^i$ is the covariate vector for the interval that contains the time $t_{i,s}$ for item i, for $s = 1, \ldots, m_i$.
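The observed history of one item can be pictured as three parallel sequences. The following is a minimal Python sketch of such a record. The class name `ItemHistory`, its field names and the integer coding of states are assumptions made for illustration; they do not come from the paper's MATLAB or R code.

```python
from dataclasses import dataclass
from typing import Iterator, List, Sequence, Tuple


@dataclass
class ItemHistory:
    """Observed history of one item (hypothetical container)."""
    times: List[int]                   # t_{i,1} = 0 < t_{i,2} < ... < t_{i,m_i}
    states: List[int]                  # x^i_1 = 1, ..., x^i_{m_i}
    covariates: List[Sequence[float]]  # z^i_{l_1}, ..., z^i_{l_{m_i}}

    def __post_init__(self) -> None:
        if not (len(self.times) == len(self.states) == len(self.covariates)):
            raise ValueError("times, states and covariates must have equal length")
        if self.times[0] != 0 or self.states[0] != 1:
            raise ValueError("every item starts in state 1 at time 0")
        if any(b <= a for a, b in zip(self.times, self.times[1:])):
            raise ValueError("observation times must be strictly increasing")

    def transitions(self) -> Iterator[Tuple[int, int, int, int]]:
        """Yield (t_prev, t_next, x_prev, x_next) for s = 2, ..., m_i."""
        for s in range(1, len(self.times)):
            yield (self.times[s - 1], self.times[s],
                   self.states[s - 1], self.states[s])


# Example with made-up values: relapse at time 14, death at time 30.
history = ItemHistory(times=[0, 14, 30], states=[1, 2, 3],
                      covariates=[(0.0,), (1.0,), (1.0,)])
print(list(history.transitions()))
```

Each tuple produced by `transitions()` corresponds to one factor of the likelihood for that item.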

We assume k − 1 unknown positive integer cut-points, $c_0 = 0 < c_1 < \cdots < c_{k-1} < c_k = \infty$. The likelihood function for estimating the parameters is given by

$$L\left(c_1, \ldots, c_{k-1}, \mathbf{T}_u, \boldsymbol{\beta}^u, u = 1, \ldots, k\right) = \prod_{i=1}^{n} \prod_{s=2}^{m_i} h_{x_{s-1}^i, x_s^i}\left(\left. \mathbf{T}_u, \boldsymbol{\beta}^u, u = 1, \ldots, k \,\right|\, t_{i,s-1}, t_{i,s}, \mathbf{z}_{l_{s-1}}^i, \ldots, \mathbf{z}_{l_s}^i\right).$$
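The appendix does not state it, but a product of this form is usually handled through its logarithm. Because the logarithm is increasing, the maximizer is unchanged, and the double product becomes a double sum, with the arguments of each factor abbreviated by a dot:

$$\log L = \sum_{i=1}^{n} \sum_{s=2}^{m_i} \log h_{x_{s-1}^i, x_s^i}\left(\,\cdot\,\right).$$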

For the calculations, we define the intervals $I_q = [c_{q-1}, c_q[$ and $J_q = ]c_{q-1}, c_q]$ for $q = 1, \ldots, k$. Let $f_x^q\left(t, \mathbf{z}_q^i; \mathbf{T}_q, \boldsymbol{\beta}^q\right)$ be the sojourn time probability in state x at time t, calculated by using the matrix $\mathbf{P}_q\left(\mathbf{z}_q^i\right)$. Given that the state at any cut-point is known, the factors in the likelihood function have the following expressions:

1. If $t_{i,s-1}$ and $t_{i,s}$ belong to the intervals $I_j$ and $J_j$, respectively,
$$h_{x_{s-1}^i, x_s^i}\left(\left. \mathbf{T}_j, \boldsymbol{\beta}^j \,\right|\, t_{i,s-1}, t_{i,s}, \mathbf{z}_{l_{s-1}}^i, \ldots, \mathbf{z}_{l_s}^i\right) = f_{x_{s-1}^i}^{j}\left(t_{i,s} - t_{i,s-1} - 1, \mathbf{z}_j^i; \mathbf{T}_j, \boldsymbol{\beta}^j\right) T_{x_{s-1}^i, x_s^i}^{j}\left(\mathbf{z}_j^i\right).$$
2. If $t_{i,s-1}$ and $t_{i,s}$ belong to the intervals $I_{j-1}$ and $J_j$, respectively,
$$\begin{aligned} h_{x_{s-1}^i, x_s^i}\left(\left. \mathbf{T}_u, \boldsymbol{\beta}^u, u = j-1, j \,\right|\, t_{i,s-1}, t_{i,s}, \mathbf{z}_{l_{s-1}}^i, \ldots, \mathbf{z}_{l_s}^i\right) = {} & f_{x_{s-1}^i}^{j-1}\left(c_{j-1} - t_{i,s-1}, \mathbf{z}_{j-1}^i; \mathbf{T}_{j-1}, \boldsymbol{\beta}^{j-1}\right) \\ & \times f_{x_{s-1}^i}^{j}\left(t_{i,s} - c_{j-1} - 1, \mathbf{z}_j^i; \mathbf{T}_j, \boldsymbol{\beta}^j\right) T_{x_{s-1}^i, x_s^i}^{j}\left(\mathbf{z}_j^i\right). \end{aligned}$$
3. If $t_{i,s-1} \in I_j$ and $t_{i,s} \in J_q$ with $q - j \ge 2$,
$$\begin{aligned} h_{x_{s-1}^i, x_s^i}\left(\left. \mathbf{T}_u, \boldsymbol{\beta}^u, u = j, \ldots, q \,\right|\, t_{i,s-1}, t_{i,s}, \mathbf{z}_{l_{s-1}}^i, \ldots, \mathbf{z}_{l_s}^i\right) = {} & f_{x_{s-1}^i}^{j}\left(c_j - t_{i,s-1}, \mathbf{z}_j^i; \mathbf{T}_j, \boldsymbol{\beta}^j\right) \\ & \times \prod_{u=j+1}^{q-1} f_{x_{s-1}^i}^{u}\left(c_u - c_{u-1}, \mathbf{z}_u^i; \mathbf{T}_u, \boldsymbol{\beta}^u\right) \\ & \times f_{x_{s-1}^i}^{q}\left(t_{i,s} - c_{q-1} - 1, \mathbf{z}_q^i; \mathbf{T}_q, \boldsymbol{\beta}^q\right) T_{x_{s-1}^i, x_s^i}^{q}\left(\mathbf{z}_q^i\right). \end{aligned}$$

In each case the sojourn lengths add up to $t_{i,s} - t_{i,s-1} - 1$, the number of time units spent in state $x_{s-1}^i$ before the change observed at $t_{i,s}$.
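The three cases can be sketched in Python under strong simplifying assumptions that are not taken from the paper: covariates are ignored, and $f_x^q(t)$ is taken to be the t-th power of the diagonal entry of the period-q matrix (a geometric sojourn). The last sojourn is measured from $c_{q-1}$, so that the sojourn lengths sum to $t_{i,s} - t_{i,s-1} - 1$. The function names and the nested-list representation are illustrative. A step-by-step product is included as a cross-check of the closed form.

```python
import bisect


def likelihood_factor(t_prev, t_next, x_prev, x_next, cuts, T):
    """Factor h for one observed change, following cases 1 to 3.

    cuts: [c_1, ..., c_{k-1}], increasing positive integers.
    T:    list of k square matrices (nested lists); T[u - 1] is used in period u.
    States are 1-based, as in the text. Covariates are ignored (assumption).
    """
    j = bisect.bisect_right(cuts, t_prev) + 1   # t_prev in I_j = [c_{j-1}, c_j[
    q = bisect.bisect_left(cuts, t_next) + 1    # t_next in J_q = ]c_{q-1}, c_q]

    def stay(u, t):
        # Assumed form of f_x^u(t): remain in x_prev for t steps in period u.
        return T[u - 1][x_prev - 1][x_prev - 1] ** t

    jump = T[q - 1][x_prev - 1][x_next - 1]
    if q == j:                                   # case 1
        return stay(j, t_next - t_prev - 1) * jump
    bounds = [0] + list(cuts)                    # bounds[u] = c_u
    value = stay(j, bounds[j] - t_prev)          # from t_prev up to c_j
    for u in range(j + 1, q):                    # whole periods (case 3 only)
        value *= stay(u, bounds[u] - bounds[u - 1])
    value *= stay(q, t_next - bounds[q - 1] - 1)  # start of period q
    return value * jump


def likelihood_factor_stepwise(t_prev, t_next, x_prev, x_next, cuts, T):
    """Same quantity, multiplied one time step at a time (cross-check)."""
    value = 1.0
    for tau in range(t_prev, t_next - 1):        # steps spent staying
        u = bisect.bisect_right(cuts, tau) + 1
        value *= T[u - 1][x_prev - 1][x_prev - 1]
    u = bisect.bisect_right(cuts, t_next - 1) + 1  # step with the change
    return value * T[u - 1][x_prev - 1][x_next - 1]


# Made-up matrices for three periods with cut-points c_1 = 5 and c_2 = 10.
cuts = [5, 10]
T = [[[0.9, 0.05], [0.0, 0.8]],
     [[0.8, 0.10], [0.0, 0.7]],
     [[0.7, 0.20], [0.0, 0.6]]]
print(likelihood_factor(0, 3, 1, 2, cuts, T))    # case 1
print(likelihood_factor(3, 8, 1, 2, cuts, T))    # case 2
print(likelihood_factor(3, 12, 1, 2, cuts, T))   # case 3
```

With covariates, each matrix would additionally depend on the covariate vector of its period; that dependence is left out here because the appendix does not give its functional form.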

The likelihood function is maximized subject to several restrictions. The matrices $\mathbf{P}_q$ and $\mathbf{P}_q\left(\mathbf{z}_q^i\right)$ associated with the model must be stochastic matrices for any covariate vector $\mathbf{z}_q^i$. This restriction rules out probabilities less than zero or greater than one for any values of the parameters.
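The stochastic-matrix restriction is easy to test numerically for a candidate matrix. The helper below is an illustrative sketch; the name `is_stochastic` and the tolerance are assumptions, and a constrained optimizer would typically enforce the same conditions through bounds and row-sum constraints rather than a check after the fact.

```python
import math


def is_stochastic(P, tol=1e-9):
    """True if P (nested lists) is square, has entries in [0, 1] and rows summing to 1."""
    n = len(P)
    for row in P:
        if len(row) != n:
            return False
        if any(p < -tol or p > 1.0 + tol for p in row):
            return False
        if not math.isclose(sum(row), 1.0, abs_tol=tol):
            return False
    return True


print(is_stochastic([[0.9, 0.1], [0.0, 1.0]]))    # valid transition matrix
print(is_stochastic([[1.1, -0.1], [0.0, 1.0]]))   # entries outside [0, 1]
```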

Then, the cut-points are estimated, and the optimum values $c_1, \ldots, c_{k-1}$ are the values that satisfy

$$c_1, \ldots, c_{k-1} \in \mathbb{N} \text{ such that } L\left(c_1, \ldots, c_{k-1}, \hat{\mathbf{T}}_u^{c_1, \ldots, c_{k-1}}, \hat{\boldsymbol{\beta}}_u^{c_1, \ldots, c_{k-1}}, u = 1, \ldots, k\right) = \max_{v_j} \left\{ L\left(v_1, \ldots, v_{k-1}, \hat{\mathbf{T}}_u^{v_1, \ldots, v_{k-1}}, \hat{\boldsymbol{\beta}}_u^{v_1, \ldots, v_{k-1}}, u = 1, \ldots, k\right) \right\},$$

subject to $0 < v_j < v_{j+1}$ for $j = 1, \ldots, k-2$ and $v_{k-1} < \max_i \left\{t_{i,m_i}\right\}$, where each $v_j$ is a natural number satisfying these restrictions. Here $\left(\hat{\mathbf{T}}_u^{v_1, \ldots, v_{k-1}}, \hat{\boldsymbol{\beta}}_u^{v_1, \ldots, v_{k-1}}, u = 1, \ldots, k\right)$ are the maximum likelihood estimates of $\left(\mathbf{T}_u, \boldsymbol{\beta}^u, u = 1, \ldots, k\right)$ for the cut-points $v_1, \ldots, v_{k-1}$.
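Because the cut-points are integers on a bounded range, one way to carry out this step is an exhaustive search over all admissible increasing tuples. The sketch below assumes a user-supplied callable `profile_loglik`, a hypothetical name, that returns the log-likelihood maximized over the remaining parameters for fixed cut-points. Whether the authors' implementation enumerates candidates this way is not stated in the appendix.

```python
from itertools import combinations


def best_cut_points(k, t_max, profile_loglik):
    """Search all integer cut-points 0 < v_1 < ... < v_{k-1} < t_max.

    profile_loglik(cuts) is a hypothetical callable giving the maximized
    log-likelihood for the fixed tuple `cuts`.
    """
    best_cuts, best_value = None, float("-inf")
    # combinations() yields strictly increasing tuples drawn from 1..t_max-1.
    for cuts in combinations(range(1, t_max), k - 1):
        value = profile_loglik(cuts)
        if value > best_value:
            best_cuts, best_value = cuts, value
    return best_cuts, best_value


# Toy objective (made up) that peaks at cut-points (4, 9).
def toy_profile(cuts):
    return -((cuts[0] - 4) ** 2 + (cuts[1] - 9) ** 2)


found_cuts, found_value = best_cut_points(k=3, t_max=12, profile_loglik=toy_profile)
print(found_cuts, found_value)
```

The number of candidate tuples is the binomial coefficient of `t_max - 1` over `k - 1`, so this brute-force approach is only practical for a small number of cut-points.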

The likelihood function has been implemented in MATLAB and is maximized with the function fmincon. This function finds the minimum of a constrained nonlinear multivariable function by using the interior-point algorithm, so maximization amounts to minimizing the negative of the objective.
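The idea of maximizing a likelihood by minimizing its negative under bound constraints can be illustrated on a one-parameter toy problem in Python. This is not fmincon and not the interior-point algorithm: a golden-section search is swapped in because it needs only the standard library. The data and the single stay probability p are made up. Each observation is a sojourn of t steps followed by an exit, with likelihood $p^t (1 - p)$, for which the maximizer has a closed form that the search can be compared against.

```python
import math


def neg_log_lik(p, stays):
    """Negative log-likelihood: stay t steps (probability p each), then leave."""
    return -sum(t * math.log(p) + math.log(1.0 - p) for t in stays)


def golden_section_min(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):          # minimum lies in [a, d]
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:                    # minimum lies in [c, b]
            a, c = c, d
            d = a + inv_phi * (b - a)
    return (a + b) / 2.0


stays = [3, 7, 2, 5]             # made-up sojourn lengths
eps = 1e-9                       # keep p strictly inside (0, 1)
p_hat = golden_section_min(lambda p: neg_log_lik(p, stays), eps, 1.0 - eps)
closed_form = sum(stays) / (sum(stays) + len(stays))
print(p_hat, closed_form)
```

In the multi-parameter model the same pattern applies, with the bounds replaced by the stochastic-matrix restrictions and the one-dimensional search replaced by a constrained multivariable solver.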

Appendix B

See Tables 11 and 12.

Table 11 Contingency table of observed and expected counts for the homogeneous model

Table 12 Contingency table of observed and expected counts for the piecewise model

About this article

Cite this article

Ruiz-Castro, J.E., Zenga, M. A general piecewise multi-state survival model: application to breast cancer. Stat Methods Appl 29, 813–843 (2020). https://doi.org/10.1007/s10260-019-00505-6

Accepted: 08 December 2019
Published: 17 December 2019
Issue Date: December 2020
DOI: https://doi.org/10.1007/s10260-019-00505-6
