Sparse sketches with small inversion bias
arXiv - CS - Data Structures and Algorithms Pub Date : 2020-11-21 , DOI: arxiv-2011.10695 Michał Dereziński, Zhenyu Liao, Edgar Dobriban, Michael W. Mahoney
For a tall $n\times d$ matrix $A$ and a random $m\times n$ sketching matrix
$S$, the sketched estimate of the inverse covariance matrix $(A^\top A)^{-1}$
is typically biased: $E[(\tilde A^\top\tilde A)^{-1}]\ne(A^\top A)^{-1}$, where
$\tilde A=SA$. This phenomenon, which we call inversion bias, arises, e.g., in
statistics and distributed optimization, when averaging multiple independently
constructed estimates of quantities that depend on the inverse covariance. We
develop a framework for analyzing inversion bias, based on our proposed concept
of an $(\epsilon,\delta)$-unbiased estimator for random matrices. We show that
when the sketching matrix $S$ is dense and has i.i.d. sub-gaussian entries,
then after simple rescaling, the estimator $(\frac m{m-d}\tilde A^\top\tilde
A)^{-1}$ is $(\epsilon,\delta)$-unbiased for $(A^\top A)^{-1}$ with a sketch of
size $m=O(d+\sqrt d/\epsilon)$. This implies that for $m=O(d)$, the inversion
bias of this estimator is $O(1/\sqrt d)$, which is much smaller than the
$\Theta(1)$ approximation error obtained as a consequence of the subspace
embedding guarantee for sub-gaussian sketches. We then propose a new sketching
technique, called LEverage Score Sparsified (LESS) embeddings, which uses ideas
from both data-oblivious sparse embeddings and data-aware leverage-based
row sampling methods, to get $\epsilon$ inversion bias for sketch size
$m=O(d\log d+\sqrt d/\epsilon)$ in time $O(\text{nnz}(A)\log n+md^2)$, where
nnz is the number of non-zeros. The key techniques enabling our analysis
include an extension of a classical inequality of Bai and Silverstein for
random quadratic forms, which we call the Restricted Bai-Silverstein
inequality; and anti-concentration of the Binomial distribution via the
Paley-Zygmund inequality, which we use to prove a lower bound showing that
leverage score sampling sketches generally do not achieve small inversion bias.
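
The inversion bias described above, and the effect of the $\frac m{m-d}$ rescaling, can be illustrated numerically. The following is a minimal Monte Carlo sketch, not the authors' code: it assumes a dense Gaussian sketch $S$ with i.i.d. $N(0,1/m)$ entries (one instance of the sub-gaussian case), and it measures bias as the relative Frobenius distance between the averaged estimate and $(A^\top A)^{-1}$, which is a simplification of the paper's $(\epsilon,\delta)$-unbiased notion. The function name `inversion_bias` and the sizes `n`, `d`, `m`, `trials` are illustrative choices.

```python
import numpy as np

def inversion_bias(A, m, trials, rng, rescale):
    """Relative Frobenius error of the averaged sketched inverse covariance.

    Averages (c * (SA)^T (SA))^{-1} over `trials` independent Gaussian
    sketches, where c = m / (m - d) if `rescale` is True and c = 1 otherwise.
    """
    n, d = A.shape
    target = np.linalg.inv(A.T @ A)
    acc = np.zeros((d, d))
    for _ in range(trials):
        # Dense Gaussian sketch with entries N(0, 1/m), so E[S^T S] = I.
        S = rng.standard_normal((m, n)) / np.sqrt(m)
        SA = S @ A
        G = SA.T @ SA
        if rescale:
            G = (m / (m - d)) * G
        acc += np.linalg.inv(G)
    mean_est = acc / trials
    return np.linalg.norm(mean_est - target) / np.linalg.norm(target)

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5))  # tall n x d matrix (assumed test data)

raw = inversion_bias(A, m=20, trials=2000, rng=rng, rescale=False)
rescaled = inversion_bias(A, m=20, trials=2000, rng=rng, rescale=True)

print(f"relative bias without rescaling: {raw:.3f}")
print(f"relative bias with m/(m-d) rescaling: {rescaled:.3f}")
```

For the Gaussian case specifically, the sketched covariance is Wishart-distributed, so $E[(\tilde A^\top\tilde A)^{-1}]=\frac m{m-d-1}(A^\top A)^{-1}$ exactly. The unscaled average should therefore overshoot by a constant factor, while the rescaled one leaves only a factor of $\frac{m-d}{m-d-1}$, up to Monte Carlo noise.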
Updated: 2020-11-25