Penalized Quantile Regression for Distributed Big Data Using the Slack Variable Representation
Journal of Computational and Graphical Statistics (IF 2.4), Pub Date: 2020-12-11, DOI: 10.1080/10618600.2020.1840996
Ye Fan, Nan Lin, Xianjun Yin

Abstract

Penalized quantile regression is a widely used tool for analyzing high-dimensional data with heterogeneity. Although its estimation theory has been well studied in the literature, its computation remains a challenge in big data because of the nonsmoothness of the check loss function and the possible nonconvexity of the penalty term. In this article, we propose the QPADM-slack method, a parallel algorithm formulated via the alternating direction method of multipliers (ADMM) that supports penalized quantile regression in big data. Unlike the recent QPADM algorithm, our proposal uses the slack variable representation of the quantile regression problem. Simulation studies demonstrate that this new formulation is significantly faster than QPADM, especially when the data volume n or the dimension p is large, and that it attains favorable estimation accuracy in both nondistributed and distributed environments. We further illustrate the practical performance of QPADM-slack by analyzing a news popularity dataset.
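For readers less familiar with the slack variable representation the abstract refers to, the standard reformulation of the check-loss objective is sketched below; this is the textbook formulation, and the exact parameterization used inside QPADM-slack may differ. With check loss \rho_\tau(r) = r\{\tau - I(r < 0)\}, penalized quantile regression solves

    \min_{\beta} \; \sum_{i=1}^{n} \rho_\tau(y_i - x_i^\top \beta) + P_\lambda(\beta).

Writing each residual as a difference of nonnegative slack variables, y_i - x_i^\top \beta = u_i - v_i with u_i, v_i \ge 0, turns the nonsmooth loss into a linear objective with linear equality constraints:

    \min_{\beta,\, u \ge 0,\, v \ge 0} \; \tau \mathbf{1}^\top u + (1 - \tau) \mathbf{1}^\top v + P_\lambda(\beta)
    \quad \text{s.t.} \quad X\beta + u - v = y.

As a concrete, nondistributed illustration of this representation (not the authors' ADMM algorithm), the unpenalized problem can be solved as a linear program with scipy.optimize.linprog; the function name quantile_regression_lp and the simulated data below are assumptions made only for illustration.

    import numpy as np
    from scipy.optimize import linprog

    def quantile_regression_lp(X, y, tau):
        # Unpenalized quantile regression via its slack-variable LP:
        #   min_{beta, u, v}  tau*1'u + (1-tau)*1'v
        #   s.t.  X beta + u - v = y,  u >= 0,  v >= 0
        n, p = X.shape
        c = np.concatenate([np.zeros(p),              # beta carries no cost
                            tau * np.ones(n),         # slacks for positive residuals (u)
                            (1 - tau) * np.ones(n)])  # slacks for negative residuals (v)
        A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
        bounds = [(None, None)] * p + [(0, None)] * (2 * n)
        res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
        return res.x[:p]

    # Toy example: median regression (tau = 0.5) on simulated heavy-tailed data.
    rng = np.random.default_rng(0)
    X = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])
    y = X @ np.array([1.0, 2.0, -1.0]) + rng.standard_t(df=3, size=200)
    print(quantile_regression_lp(X, y, tau=0.5))

The appeal of this representation in an ADMM setting is that the nonsmooth check loss is replaced by a linear objective plus simple nonnegativity constraints, which helps explain why the reformulated algorithm can be faster in distributed environments.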



Updated: 2020-12-11