Robust Gaussian Covariance Estimation in Nearly-Matrix Multiplication Time,arXiv - CS - Data Structures and Algorithms


Robust Gaussian Covariance Estimation in Nearly-Matrix Multiplication Time
arXiv - CS - Data Structures and Algorithms, Pub Date: 2020-06-23, DOI: arxiv-2006.13312
Jerry Li, Guanghao Ye

Robust covariance estimation is the following well-studied problem in high-dimensional statistics: given $N$ samples from a $d$-dimensional Gaussian $\mathcal{N}(\boldsymbol{0}, \Sigma)$, where an $\varepsilon$-fraction of the samples have been arbitrarily corrupted, output $\widehat{\Sigma}$ minimizing the total variation distance between $\mathcal{N}(\boldsymbol{0}, \Sigma)$ and $\mathcal{N}(\boldsymbol{0}, \widehat{\Sigma})$. This corresponds to learning $\Sigma$ in a natural affine-invariant variant of the Frobenius norm known as the \emph{Mahalanobis norm}. Previous work of Cheng et al. demonstrated an algorithm that, given $N = \Omega (d^2 / \varepsilon^2)$ samples, achieved a near-optimal error of $O(\varepsilon \log (1 / \varepsilon))$; moreover, their algorithm ran in time $\widetilde{O}(T(N, d) \log \kappa / \mathrm{poly} (\varepsilon))$, where $T(N, d)$ is the time it takes to multiply a $d \times N$ matrix by its transpose, and $\kappa$ is the condition number of $\Sigma$. When $\varepsilon$ is relatively small, the polynomial dependence on $1/\varepsilon$ in their runtime is prohibitively large. In this paper, we demonstrate a novel algorithm that achieves the same statistical guarantees, but which runs in time $\widetilde{O} (T(N, d) \log \kappa)$. In particular, our runtime has no dependence on $\varepsilon$. When $\Sigma$ is reasonably conditioned, our runtime matches that of the fastest algorithm for covariance estimation without outliers, up to poly-logarithmic factors, showing that we can get robustness essentially "for free."
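To make the problem setup concrete, the following is a minimal NumPy sketch, not the authors' algorithm. It assumes the usual reading of the Mahalanobis error as $\| \Sigma^{-1/2} \widehat{\Sigma} \Sigma^{-1/2} - I \|_F$ and shows how the non-robust baseline, the empirical covariance $\frac{1}{N} X X^\top$ (a single $d \times N$ by $N \times d$ product, the cost $T(N, d)$), degrades once an $\varepsilon$-fraction of samples is replaced. The function names, the dimensions, and the particular corruption pattern are illustrative assumptions.

```python
import numpy as np


def mahalanobis_error(sigma, sigma_hat):
    """Return || sigma^{-1/2} sigma_hat sigma^{-1/2} - I ||_F for SPD sigma."""
    # Inverse square root from the eigendecomposition of the symmetric matrix.
    eigvals, eigvecs = np.linalg.eigh(sigma)
    inv_sqrt = (eigvecs / np.sqrt(eigvals)) @ eigvecs.T
    d = sigma.shape[0]
    return np.linalg.norm(inv_sqrt @ sigma_hat @ inv_sqrt - np.eye(d), ord="fro")


def empirical_covariance(x):
    """Zero-mean empirical covariance (1/N) X X^T of a d x N sample matrix."""
    return (x @ x.T) / x.shape[1]


rng = np.random.default_rng(0)
d, n, eps = 5, 20000, 0.05  # illustrative sizes, not taken from the paper

# A random well-conditioned ground-truth covariance and its Cholesky factor.
a = rng.standard_normal((d, d))
sigma = a @ a.T + d * np.eye(d)
chol = np.linalg.cholesky(sigma)

# Clean samples from N(0, sigma), stored as columns.
clean = chol @ rng.standard_normal((d, n))

# Replace an eps-fraction of the columns by one fixed, far-away point.
# A real adversary may be far subtler; this is only a crude illustration.
corrupted = clean.copy()
k = int(eps * n)
corrupted[:, :k] = 10.0 * (chol @ np.ones(d))[:, None]

err_clean = mahalanobis_error(sigma, empirical_covariance(clean))
err_corrupt = mahalanobis_error(sigma, empirical_covariance(corrupted))
print(f"clean error:     {err_clean:.3f}")
print(f"corrupted error: {err_corrupt:.3f}")
```

In this sketch the clean estimate has small Mahalanobis error while the corrupted one is off by a large margin, which is the failure mode that robust estimators with error $O(\varepsilon \log (1 / \varepsilon))$ are designed to avoid.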


Updated: 2020-06-25
