First-Order Newton-Type Estimator in Distributed Estimation and Inference
Journal of the American Statistical Association (IF 3.0), Pub Date: 2021-04-12, DOI: 10.1080/01621459.2021.1891925. Xi Chen, Weidong Liu, Yichen Zhang
Abstract
This article studies distributed estimation and inference for a general statistical problem with a convex loss that could be nondifferentiable. For the purpose of efficient computation, we restrict ourselves to stochastic first-order optimization, which enjoys low per-iteration complexity. To motivate the proposed method, we first investigate the theoretical properties of a straightforward divide-and-conquer stochastic gradient descent approach. Our theory shows that there is a restriction on the number of machines, and this restriction becomes more stringent when the dimension p is large. To overcome this limitation, this article proposes a new multi-round distributed estimation procedure that approximates the Newton step using only stochastic subgradients. The key component in our method is a computationally efficient estimator of H^{-1}w, where H is the population Hessian matrix and w is any given vector. Instead of estimating H (or H^{-1}), which usually requires second-order differentiability of the loss, the proposed first-order Newton-type estimator (FONE) directly estimates the vector of interest H^{-1}w as a whole and is applicable to nondifferentiable losses. Our estimator also facilitates inference for the empirical risk minimizer. It turns out that the key term in the limiting covariance has the form H^{-1}w, which can be estimated by FONE.
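The abstract does not spell out the FONE recursion itself, but the core idea, estimating H^{-1}w using only stochastic first-order information rather than forming or inverting the Hessian, can be sketched on a toy problem. The sketch below is illustrative only: the synthetic Hessian H, the step size, and the iterate averaging are assumptions for the demo, not details taken from the paper. It runs SGD on the quadratic f(z) = 0.5 z^T H z - w^T z, whose minimizer is H^{-1}w; with samples x satisfying E[x x^T] = H, the quantity x(x^T z) - w is an unbiased stochastic gradient, so H is never constructed.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 5
n_steps = 100_000

# Build a known positive-definite "population Hessian" H and a target vector w.
# (Synthetic choices for illustration; not from the paper.)
A = rng.normal(size=(p, p))
H = A @ A.T / p + np.eye(p)
Lc = np.linalg.cholesky(H)
w = rng.normal(size=p)

# Stochastic first-order estimation of H^{-1} w: SGD on
# f(z) = 0.5 * z^T H z - w^T z, using samples x with E[x x^T] = H.
z = np.zeros(p)
zbar = np.zeros(p)
eta = 0.005
burn_in = n_steps // 2
for t in range(n_steps):
    x = Lc @ rng.normal(size=p)      # sample satisfying E[x x^T] = H
    z -= eta * (x * (x @ z) - w)     # first-order update; H is never formed
    if t >= burn_in:
        zbar += z
zbar /= n_steps - burn_in            # Polyak-Ruppert averaging of late iterates

target = np.linalg.solve(H, w)       # ground truth H^{-1} w, for comparison
err = np.linalg.norm(zbar - target) / np.linalg.norm(target)
print(f"relative error of the averaged iterate: {err:.3f}")
```

The averaged iterate converges to H^{-1}w at low per-iteration cost (O(p) per step here), which is the computational appeal of a first-order Newton-type scheme in the distributed setting.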