Bag of little bootstraps for massive and distributed longitudinal data
Statistical Analysis and Data Mining (IF 2.1), Pub Date: 2021-11-22, DOI: 10.1002/sam.11563
Xinkai Zhou, Jin J Zhou, Hua Zhou

Linear mixed models are widely used for analyzing longitudinal datasets, and inference for variance component parameters relies on the bootstrap method. However, health systems and technology companies routinely generate massive longitudinal datasets that make the traditional bootstrap method infeasible. To solve this problem, we extend the highly scalable bag of little bootstraps method for independent data to longitudinal data and develop a highly efficient Julia package, MixedModelsBLB.jl. Simulation experiments and real data analysis demonstrate the favorable statistical performance and computational advantages of our method compared to the traditional bootstrap method. For the statistical inference of variance components, our method achieves a 200-fold speedup at the scale of 1 million subjects (20 million total observations), and it is the only currently available tool that can handle more than 10 million subjects (200 million total observations) on a desktop computer.
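
For readers unfamiliar with the bag of little bootstraps (BLB) that the abstract builds on, the following is a minimal Python sketch of the original independent-data version, applied to a confidence interval for a mean. It is not the paper's longitudinal extension and not the MixedModelsBLB.jl API. The function name `blb_mean_ci`, its parameters, and the simulated data are all hypothetical and chosen only for illustration.

```python
import random
import statistics
from collections import Counter


def blb_mean_ci(data, n_subsets=5, gamma=0.7, n_resamples=50, alpha=0.05, seed=0):
    """Illustrative BLB percentile interval for the mean of i.i.d. data.

    Hypothetical helper for this sketch; it does not mirror any package API.
    """
    rng = random.Random(seed)
    n = len(data)
    b = max(2, int(n ** gamma))  # subset size b = n^gamma, much smaller than n
    lowers, uppers = [], []
    for _ in range(n_subsets):
        subset = rng.sample(data, b)  # b distinct points, drawn without replacement
        estimates = []
        for _ in range(n_resamples):
            # Resample n points from the b-point subset. Only the per-point
            # counts are needed, so each resample touches at most b distinct values.
            counts = Counter(rng.choices(range(b), k=n))
            weighted_sum = sum(subset[i] * c for i, c in counts.items())
            estimates.append(weighted_sum / n)
        estimates.sort()
        lo_idx = int((alpha / 2) * (n_resamples - 1))
        hi_idx = int((1 - alpha / 2) * (n_resamples - 1))
        lowers.append(estimates[lo_idx])
        uppers.append(estimates[hi_idx])
    # Average the per-subset interval endpoints to get the final interval.
    return statistics.mean(lowers), statistics.mean(uppers)


# Simulated data (an assumption of this sketch): 2000 draws from N(10, 2^2).
data_rng = random.Random(1)
sample = [data_rng.gauss(10.0, 2.0) for _ in range(2000)]
low, high = blb_mean_ci(sample)
print(f"BLB 95% interval for the mean: ({low:.3f}, {high:.3f})")
```

The point of the design is that each resample has nominal size n but only ever involves the b points of one small subset, so subsets can be processed independently, for example on separate machines. How the paper adapts the subsetting and resampling steps to longitudinal data is not described in the abstract and is not reproduced here.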

Updated: 2021-11-22