Statistical and computational thresholds for the planted $k$-densest sub-hypergraph problem
arXiv - CS - Information Theory Pub Date : 2020-11-23 , DOI: arxiv-2011.11500
Luca Corinzia, Paolo Penna, Wojciech Szpankowski, Joachim M. Buhmann

Recovering a planted signal perturbed by noise is a fundamental problem in machine learning. In this work, we consider the problem of recovering a planted $k$-densest sub-hypergraph on $h$-uniform hypergraphs over $n$ nodes. This fundamental problem appears in different contexts, e.g., community detection, average-case complexity, and neuroscience applications. We first observe that it can be viewed as a structural variant of tensor PCA in which the hypergraph parameters $k$ and $h$ determine the structure of the signal to be recovered when the observations are contaminated by Gaussian noise. We provide tight information-theoretic upper and lower bounds for the recovery problem, as well as the first non-trivial algorithmic bounds based on approximate message passing algorithms. The problem exhibits a typical information-to-computation gap observed in analogous settings, which widens with increasing sparsity of the problem. Interestingly, the bounds show that the structure of the signal does have an impact on the existing bounds of tensor PCA that the unstructured planted signal does not capture.
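
To make the setting concrete, the following is a minimal sketch of one plausible reading of the problem, not the authors' code. It assumes an additive observation model that the abstract does not spell out: every $h$-subset of the $n$ nodes carries a weight equal to standard Gaussian noise, plus a constant `snr` when the hyperedge lies entirely inside the planted $k$-subset. The names `sample_instance`, `densest_subset`, and `snr` are hypothetical. The exhaustive search only illustrates recovery by brute force on tiny instances; it is not the approximate message passing algorithm analyzed in the paper.

```python
import itertools
import numpy as np

def sample_instance(n, k, h, snr, rng):
    """Draw one toy instance (assumed model): plant a random k-subset of nodes
    and give every h-subset a Gaussian weight, shifted by `snr` if planted."""
    planted = frozenset(rng.choice(n, size=k, replace=False).tolist())
    weights = {}
    for edge in itertools.combinations(range(n), h):
        # Signal is present only when the whole hyperedge is inside the planted set.
        signal = snr if planted.issuperset(edge) else 0.0
        weights[edge] = signal + rng.standard_normal()
    return planted, weights

def densest_subset(n, k, h, weights):
    """Exhaustive search for the k-subset with the largest total hyperedge weight.
    Cost grows like C(n, k) * C(k, h), so this is only usable for very small n."""
    def total(subset):
        return sum(weights[e] for e in itertools.combinations(subset, h))
    return frozenset(max(itertools.combinations(range(n), k), key=total))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    planted, weights = sample_instance(n=8, k=4, h=3, snr=50.0, rng=rng)
    recovered = densest_subset(8, 4, 3, weights)
    print(sorted(planted), sorted(recovered))
```

With a very large `snr` the exhaustive search should return the planted set; lowering `snr` toward the noise level is where the statistical and computational thresholds discussed in the abstract become relevant.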

Updated: 2020-11-25