Composite quasi-likelihood for single-index models with massive datasets
Communications in Statistics - Simulation and Computation (IF 0.8), Pub Date: 2020-04-16, DOI: 10.1080/03610918.2020.1753074
Rong Jiang, Meng-Fan Guo, Xin Liu

Abstract

Single-index models (SIMs) provide an efficient way of coping with high-dimensional nonparametric estimation problems while avoiding the “curse of dimensionality.” Many existing estimation procedures for SIMs are built on the least squares loss, which is popular for its mathematical beauty but is not robust to non-normal errors and outliers. This article addresses both the robustness and the efficiency of estimation for SIMs by replacing the quadratic loss alone with a new data-driven weighted linear combination of convex loss functions. The weights can be chosen to maximize efficiency, and these optimal weights can be estimated from the data. As a specific example, we introduce a robust method that combines the least squares and least absolute deviation methods. Moreover, we extend the proposed method to the analysis of massive datasets via a divide-and-conquer strategy. The proposed approach significantly reduces the required primary memory, and the resulting estimate is as efficient as if the entire dataset were analyzed simultaneously. The asymptotic normality of the proposed estimators is established. Simulation studies and real data applications illustrate the finite-sample performance of the proposed methods.
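The two ideas in the abstract, a weighted combination of least squares and least absolute deviation losses and a divide-and-conquer estimate, can be illustrated with a minimal sketch. This is not the article's estimator: it assumes a known linear link in place of the unknown single-index link function, uses fixed weights `w_ls` and `w_lad` rather than the data-driven optimal weights, and fits each block by iteratively reweighted least squares. All function and parameter names are hypothetical.

```python
import numpy as np


def composite_fit(X, y, w_ls=0.5, w_lad=0.5, n_iter=50, eps=1e-6):
    """Minimise sum_i [w_ls * r_i**2 + w_lad * |r_i|], r = y - X @ beta.

    Assumption: linear model and fixed weights, solved by iteratively
    reweighted least squares (IRLS); not the paper's procedure.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # least squares start
    for _ in range(n_iter):
        r = y - X @ beta
        # IRLS weight psi(r) / r for the composite loss, with |r|
        # floored at eps to avoid division by zero.
        w = 2.0 * w_ls + w_lad / np.maximum(np.abs(r), eps)
        XtW = X.T * w  # scales column i of X.T by w[i]
        beta = np.linalg.solve(XtW @ X, XtW @ y)
    return beta


def divide_and_conquer_fit(X, y, n_blocks=10, **kwargs):
    """Fit each block separately and average the block estimates."""
    blocks = np.array_split(np.arange(len(y)), n_blocks)
    estimates = [composite_fit(X[idx], y[idx], **kwargs) for idx in blocks]
    return np.mean(estimates, axis=0)


# Simulated data with heavy-tailed (Student t, 3 d.f.) errors.
rng = np.random.default_rng(0)
n = 20000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([1.0, 2.0, -1.0])
y = X @ beta_true + rng.standard_t(df=3, size=n)

beta_dc = divide_and_conquer_fit(X, y, n_blocks=10)  # one block at a time
beta_full = composite_fit(X, y)                      # whole dataset at once
print(beta_dc, beta_full)
```

Only one block needs to be held in memory while it is being fitted, which is the memory saving the abstract refers to; in this toy setting the averaged estimate should land close to the full-data fit.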




Updated: 2020-04-16