A Massive Data Framework for M-Estimators with Cubic-Rate, Journal of the American Statistical Association

Journal of the American Statistical Association (IF 3.0), Pub Date: 2018-06-19, DOI: 10.1080/01621459.2017.1360779
Chengchun Shi, Wenbin Lu, Rui Song

ABSTRACT The divide and conquer method is a common strategy for handling massive data. In this article, we study the divide and conquer method for cubic-rate estimators under the massive data framework. We develop a general theory for establishing the asymptotic distribution of the aggregated M-estimators using a weighted average with weights depending on the subgroup sample sizes. Under a certain condition on the growth rate of the number of subgroups, the resulting aggregated estimators are shown to have a faster convergence rate and an asymptotically normal distribution, which makes them more tractable in both computation and inference than the original M-estimators based on pooled data. Our theory applies to a wide class of M-estimators with cube root convergence rate, including the location estimator, maximum score estimator, and value search estimator. Empirical results from simulations and a real data application also support our theoretical findings. Supplementary materials for this article are available online.
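
The divide and conquer scheme described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption and is not taken from the paper: the "location estimator" is taken to be a Chernoff-style mode estimator (the centre of the fixed-width window containing the most points), the weights are taken proportional to n_j^(2/3) (the inverse-variance choice that a cube-root rate would suggest), and the function names `location_estimate` and `aggregate` are hypothetical.

```python
import bisect
import random


def location_estimate(xs, half_width=1.0):
    """Centre of a window of width 2*half_width that covers the most points.

    The maximiser is generally an interval rather than a single point; this
    sketch returns one maximiser by only trying windows whose left edge is a
    data point.
    """
    pts = sorted(xs)
    best_count, best_theta = -1, pts[0]
    for i, left in enumerate(pts):
        # Number of points falling in the closed window [left, left + 2h].
        j = bisect.bisect_right(pts, left + 2 * half_width)
        count = j - i
        if count > best_count:
            best_count, best_theta = count, left + half_width
    return best_theta


def aggregate(groups, half_width=1.0):
    """Weighted average of subgroup estimates.

    Assumption: weights proportional to n_j ** (2/3), where n_j is the size
    of subgroup j. The abstract only says the weights depend on the subgroup
    sample sizes.
    """
    estimates = [location_estimate(g, half_width) for g in groups]
    raw_weights = [len(g) ** (2.0 / 3.0) for g in groups]
    total = sum(raw_weights)
    return sum(w * e for w, e in zip(raw_weights, estimates)) / total


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
    # Split into subgroups of unequal size to exercise the weighting.
    sizes = [400] * 10 + [600] * 10
    groups, start = [], 0
    for n in sizes:
        groups.append(data[start:start + n])
        start += n
    print("pooled estimate:    ", location_estimate(data))
    print("aggregated estimate:", aggregate(groups))
```

Each subgroup estimate needs only its own data, so the subgroup step can run in parallel, and the final step is a plain weighted mean. How many subgroups are permissible relative to the total sample size is governed by the condition the abstract refers to, which this sketch does not check.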

Updated: 2018-06-19
