Prediction-based Termination Rule for Greedy Learning with Massive Data
Statistica Sinica (IF 1.5). Pub Date: 2016-01-01. DOI: 10.5705/ss.202014.0068
Chen Xu, Shaobo Lin, Jian Fang, Runze Li

The appearance of massive data has become increasingly common in contemporary scientific research. When the sample size n is huge, classical learning methods become computationally costly for regression. Recently, the orthogonal greedy algorithm (OGA) has been revitalized as an efficient alternative in the context of kernel-based statistical learning. In a learning problem, accurate and fast prediction is often of primary interest, which makes an appropriate termination rule crucial for OGA. In this paper, we propose a new termination rule for OGA by investigating its predictive performance. The proposed rule is conceptually simple and convenient to implement; it suggests an [Formula: see text] number of essential updates in the OGA process, and it therefore provides an appealing route to efficient learning with massive data. With a sample-dependent kernel dictionary, we show that the proposed method is strongly consistent, with an [Formula: see text] convergence rate to the oracle prediction. The promising performance of the method is supported by both simulation and real-data examples.
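To make the setting concrete, the following is a minimal sketch of an OGA regression loop with a generic prediction-based stopping rule: dictionary atoms are added greedily, the fit is re-orthogonalized by least squares at each step, and the loop terminates once held-out prediction error stops improving. This is an illustration under stated assumptions, not the paper's exact rule; the function name oga_fit, the tolerance tol, and the validation split are all hypothetical choices.

    import numpy as np

    def oga_fit(X, y, X_val, y_val, max_iter=None, tol=1e-4):
        """Orthogonal greedy algorithm (OGA) for least-squares regression,
        stopped by a generic prediction-based rule: terminate when the
        held-out mean squared error stops improving by more than `tol`.
        X and X_val hold the dictionary atoms as columns (e.g., kernel
        columns for a kernel dictionary). Illustrative sketch only."""
        n, p = X.shape
        if max_iter is None:
            max_iter = p
        selected = []
        residual = y.copy()
        prev_err = float(np.mean(y_val ** 2))  # error of the zero predictor
        coef = np.zeros(0)
        for _ in range(max_iter):
            # Greedy step: pick the atom most correlated with the residual.
            norms = np.linalg.norm(X, axis=0) + 1e-12
            j = int(np.argmax(np.abs(X.T @ residual) / norms))
            if j in selected:  # nothing new to add
                break
            selected.append(j)
            # Orthogonal step: refit least squares on all selected atoms.
            coef, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
            residual = y - X[:, selected] @ coef
            # Termination: stop once validation error no longer improves.
            val_err = float(np.mean((y_val - X_val[:, selected] @ coef) ** 2))
            if prev_err - val_err < tol:
                break
            prev_err = val_err
        return selected, coef

For a sample-dependent kernel dictionary as in the paper, the columns of X would be kernel evaluations K(x_i, z_j) at candidate centers z_j drawn from the sample; the validation split used above is only one convenient stand-in for measuring predictive performance.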
