The Power of Pivoting for Exact Clique Counting

arXiv - CS - Social and Information Networks. Pub Date: 2020-01-19. DOI: arxiv-2001.06784
Shweta Jain, C. Seshadhri

Clique counting is a fundamental task in network analysis, and even the simplest setting of $3$-cliques (triangles) has been the center of much recent research. Getting the count of $k$-cliques for larger $k$ is algorithmically challenging, due to the exponential blowup in the search space of large cliques. But a number of recent applications (especially for community detection or clustering) use larger clique counts. Moreover, one often desires \textit{local} counts, that is, the number of $k$-cliques per vertex/edge. Our main result is Pivoter, an algorithm that exactly counts the number of $k$-cliques, \textit{for all values of $k$}. It is surprisingly effective in practice, and is able to get clique counts of graphs that were beyond the reach of previous work. For example, Pivoter gets all clique counts in a social network with 100M edges within two hours on a commodity machine, whereas previous parallel algorithms do not terminate in days. Pivoter can also feasibly get local per-vertex and per-edge $k$-clique counts (for all $k$) for many public data sets with tens of millions of edges. To the best of our knowledge, this is the first algorithm that achieves such results. The main insight is the construction of a Succinct Clique Tree (SCT) that stores a compressed unique representation of all cliques in an input graph. It is built using a technique called \textit{pivoting}, a classic approach by Bron and Kerbosch to reduce the recursion tree of backtracking algorithms for maximal cliques. Remarkably, the SCT can be built without actually enumerating all cliques, and provides a succinct data structure from which exact clique statistics ($k$-clique counts, local counts) can be read off efficiently.
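The abstract does not spell out the recursion, so the following Python sketch is only one plausible reading of how pivoting can yield exact $k$-clique counts without listing every clique. It is not the authors' implementation. The function name `clique_counts`, the `held`/`pivots` bookkeeping, and the binomial formula at the leaves are assumptions of this sketch. It walks the recursion tree without storing it, and it omits any vertex ordering, parallelism, or local (per-vertex/per-edge) counting that a practical version would need.

```python
from math import comb


def clique_counts(edges):
    """Return {k: number of k-cliques} for a simple undirected graph.

    Hypothetical sketch of pivot-based counting; vertices are taken
    from the edge list, so isolated vertices are not counted.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    counts = {}

    def recurse(cands, held, pivots):
        # A root-to-leaf path stands for every clique made of all `held`
        # vertices plus any subset of the `pivots` vertices on that path.
        if not cands:
            for i in range(pivots + 1):
                k = held + i
                if k > 0:  # skip the empty clique
                    counts[k] = counts.get(k, 0) + comb(pivots, i)
            return

        # Pivot: the candidate with the most neighbors among the candidates.
        pivot = max(cands, key=lambda u: len(adj[u] & cands))

        # Branch 1: cliques that use only the pivot and its neighbors.
        recurse(adj[pivot] & cands, held, pivots + 1)

        # Branch 2: cliques that contain some non-neighbor w of the pivot.
        # Each w is dropped after use so no clique is represented twice.
        remaining = set(cands)
        for w in cands - adj[pivot] - {pivot}:
            recurse(adj[w] & remaining, held + 1, pivots)
            remaining.discard(w)

    recurse(set(adj), 0, 0)
    return counts


# Usage: a triangle {0, 1, 2} with a pendant edge (2, 3).
print(clique_counts([(0, 1), (1, 2), (0, 2), (2, 3)]))
```

On a complete graph the sketch follows a single path of pivot branches and reads all counts off one leaf. That is the intuition behind the claim that clique statistics can be obtained without enumerating cliques.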

Updated: 2020-01-22
