Abstract
Despite recent contributions to the literature, informative cluster size settings are neither well known nor well understood. In this paper, we give a formal definition of the problem and describe it from different viewpoints. Data generating mechanisms and parametric and nonparametric models are considered in light of examples. Our emphasis is on nonparametric and robust approaches to inference on the marginal distribution. Descriptive statistics and parameters of interest are defined as functionals, and they are accompanied by a generally applicable testing procedure. The theory is illustrated with an example on patients with incomplete spinal cord injuries.
Acknowledgments
The authors are grateful for the use of data from the NeuroRecovery Network, and thank the directors of centers participating in the NRN: Steve Ahr (Frazier Rehab Institute, Louisville, KY), Steve Williams, MD (Boston Medical Center, Boston, MA), Daniel Graves, PhD (Memorial Hermann/The Institute of Rehabilitation and Research, Houston, TX), Keith Tansey, MD, PhD (Shepherd Center, Atlanta, GA), Gail Forrest, PhD (Kessler Medical Rehabilitation Research and Education Corporation, West Orange, NJ), D. Michele Basso, PT, EdD (The Ohio State University Medical Center, Columbus, OH) and Mary Schmidt Read, PT, DPT, MS (Magee Rehabilitation, Philadelphia, PA). This research was supported by the Academy of Finland and by NIH Grants 1R03DE020839-01A1, 5R03DE020839-02 and 1R03DE022538-01.
About this article
Cite this article
Nevalainen, J., Datta, S. & Oja, H. Inference on the marginal distribution of clustered data with informative cluster size. Stat Papers 55, 71–92 (2014). https://doi.org/10.1007/s00362-013-0504-3
DOI: https://doi.org/10.1007/s00362-013-0504-3