One-Step Generalized Estimating Equations With Large Cluster Sizes
Journal of Computational and Graphical Statistics ( IF 2.4 ) Pub Date : 2017-07-03 , DOI: 10.1080/10618600.2017.1321552
Stuart Lipsitz 1 , Garrett Fitzmaurice 2 , Debajyoti Sinha 3 , Nathanael Hevelone 1 , Jim Hu 4 , Louis L Nguyen 1

ABSTRACT Medical studies increasingly involve a large sample of independent clusters, where the cluster sizes are also large. Our motivating example from the 2010 Nationwide Inpatient Sample (NIS) has 8,001,068 patients and 1,049 clusters, with an average cluster size of 7,627. Consistent parameter estimates can be obtained by naively assuming independence, but these estimates are inefficient when the intra-cluster correlation (ICC) is high. Efficient generalized estimating equations (GEE) incorporate the ICC and sum over all pairs of observations within a cluster when estimating the ICC. For the 2010 NIS, there are 92.6 billion pairs of observations, making summation over pairs computationally prohibitive. We propose a one-step GEE estimator that (1) matches the asymptotic efficiency of the fully iterated GEE; (2) uses a simpler formula to estimate the ICC that avoids summing over all pairs; and (3) completely avoids matrix multiplications and inversions. These three features make the proposed estimator much less computationally intensive, especially with large cluster sizes. A unique contribution of this article is that it expresses the GEE estimating equations incorporating the ICC as a simple sum of vectors and scalars.
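
To illustrate why summing over all pairs can be avoided, the sketch below uses a standard algebraic identity: the sum of e_j * e_k over all pairs j < k within a cluster equals ((sum of e_j)^2 - (sum of e_j^2)) / 2, which needs one pass over the cluster instead of a pass over every pair. This is only an illustration of the general idea and is not taken from the article; the function names, the assumption that residuals are already standardized, and the simple moment estimator without a degrees-of-freedom correction are all assumptions made here.

```python
from itertools import combinations


def pair_sum_bruteforce(residuals):
    """Sum of e_j * e_k over all pairs j < k; work grows as n^2."""
    return sum(a * b for a, b in combinations(residuals, 2))


def pair_sum_fast(residuals):
    """Same quantity from two running totals; work grows as n."""
    total = sum(residuals)
    total_sq = sum(e * e for e in residuals)
    return (total * total - total_sq) / 2.0


def icc_moment_estimate(clusters):
    """Hypothetical moment estimate of an exchangeable ICC.

    Assumes each inner list holds standardized residuals (mean 0,
    variance 1) for one cluster, and that at least one cluster has
    two or more observations. No degrees-of-freedom correction.
    """
    numerator = 0.0
    n_pairs = 0
    for res in clusters:
        n = len(res)
        numerator += pair_sum_fast(res)
        n_pairs += n * (n - 1) // 2
    return numerator / n_pairs


if __name__ == "__main__":
    toy_clusters = [[0.5, -1.0, 1.5], [2.0, 0.25]]
    for res in toy_clusters:
        print(pair_sum_bruteforce(res), pair_sum_fast(res))
    print(icc_moment_estimate(toy_clusters))
```

For a cluster of size n, the brute-force version touches n(n-1)/2 products while the fast version touches n values, which is the kind of saving that matters when clusters average several thousand observations.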

Updated: 2017-07-03