Abstract
In this work, we propose nonparametric two-sample tests for population-averaged transition and state occupation probabilities of continuous-time processes with a finite state space, based on clustered, right-censored, and/or left-truncated data. We consider settings where the two groups under comparison are independent or dependent, with or without complete cluster structure. The proposed tests impose no assumptions on the structure of the within-cluster dependence and are applicable to settings with informative cluster size and/or non-Markov processes. The asymptotic properties of the tests are rigorously established using empirical process theory. Simulation studies show that the proposed tests work well even with a small number of clusters, and that they can be substantially more powerful than what is, to the best of our knowledge, the only previously proposed nonparametric test for this problem. The tests are illustrated using data from a multicenter randomized controlled trial on metastatic squamous-cell carcinoma of the head and neck.
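To make the idea of a cluster-aware two-sample comparison concrete, the following is a minimal sketch and not the tests proposed in the article. It assumes fully observed data (no censoring or truncation), a single fixed time point, and two independent groups of clusters, and it uses a cluster-level permutation test in place of the asymptotic theory described above. The function names `pooled_occupation_prob` and `cluster_permutation_test`, and the choice to pool all observations across clusters, are illustrative assumptions.

```python
import random

def pooled_occupation_prob(clusters, state):
    """Proportion of all observations, pooled over clusters, that are in `state`.

    `clusters` is a list of lists; each inner list holds the state labels of
    one cluster's members at a fixed time point (hypothetical data layout).
    """
    total = sum(len(c) for c in clusters)
    in_state = sum(1 for c in clusters for s in c if s == state)
    return in_state / total

def cluster_permutation_test(group_a, group_b, state, n_perm=2000, seed=0):
    """Two-sided permutation test that reassigns whole clusters to groups.

    Permuting clusters rather than individuals keeps each cluster intact,
    so no model for the within-cluster dependence is needed.
    """
    rng = random.Random(seed)
    clusters = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = (pooled_occupation_prob(group_a, state)
                - pooled_occupation_prob(group_b, state))
    count = 0
    for _ in range(n_perm):
        shuffled = clusters[:]
        rng.shuffle(shuffled)
        diff = (pooled_occupation_prob(shuffled[:n_a], state)
                - pooled_occupation_prob(shuffled[n_a:], state))
        # Small tolerance guards against floating-point ties.
        if abs(diff) >= abs(observed) - 1e-12:
            count += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value

# Toy data: state 1 = "in response", state 0 = otherwise (made-up values).
group_a = [[1, 1, 0], [1, 1]]
group_b = [[0, 0], [0, 1, 0, 0]]
diff, p = cluster_permutation_test(group_a, group_b, state=1, n_perm=500)
print(f"difference = {diff:.3f}, permutation p-value = {p:.3f}")
```

Shuffling at the cluster level is the design choice that matters here: shuffling individuals would break the within-cluster dependence and could understate the variability of the test statistic. Handling right censoring, left truncation, dependent groups, or whole time intervals requires the estimators and theory developed in the article itself.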
Acknowledgements
We thank the Associate Editor and the two anonymous reviewers for their insightful comments, which helped us significantly improve this manuscript. This article is based on research using data obtained from www.projectdatasphere.org, which is maintained by Project Data Sphere. Neither Project Data Sphere nor the owner(s) of any information from the web site have contributed to, approved, or are in any way responsible for the contents of this article. Bakoyannis acknowledges funding support from Grants R21AI145662 and R01AI140854 from the National Institutes of Health. Bandyopadhyay acknowledges funding support from Grant P30CA016059 from the National Institutes of Health.
About this article
Cite this article
Bakoyannis, G., Bandyopadhyay, D. Nonparametric tests for multistate processes with clustered data. Ann Inst Stat Math 74, 837–867 (2022). https://doi.org/10.1007/s10463-021-00819-x
DOI: https://doi.org/10.1007/s10463-021-00819-x