Provably and Efficiently Approximating Near-cliques using the Turán Shadow: PEANUTS, arXiv - CS - Data Structures and Algorithms



Provably and Efficiently Approximating Near-cliques using the Turán Shadow: PEANUTS
arXiv - CS - Data Structures and Algorithms, Pub Date: 2020-06-24, DOI: arxiv-2006.13483
Shweta Jain, C. Seshadhri

Clique and near-clique counts are important graph properties with applications in graph generation, graph modeling, graph analytics, and community detection, among others. They are the archetypal examples of dense subgraphs. While there are several different definitions of near-cliques, most of them share the attribute that they are cliques that are missing a small number of edges. Clique counting is itself considered a challenging problem. Counting near-cliques is significantly harder, since the search space for near-cliques is orders of magnitude larger than that of cliques. We give a formulation of a near-clique as a clique that is missing a constant number of edges. We exploit the fact that a near-clique contains a smaller clique, and use techniques for clique sampling to count near-cliques. This method allows us to count near-cliques with 1 or 2 missing edges in graphs with tens of millions of edges. To the best of our knowledge, there was no known efficient method for this problem, and we obtain a 10x-100x speedup over existing algorithms for counting near-cliques. Our main technique is a space-efficient adaptation of the Turán Shadow sampling approach, recently introduced by Jain and Seshadhri (WWW 2017). This approach constructs a large recursion tree (called the Turán Shadow) that represents cliques in a graph. We design a novel algorithm that builds an estimator for near-cliques, using an online, compact construction of the Turán Shadow.
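To make the near-clique definition concrete, the following is a minimal Python sketch, not the sampling algorithm of the paper. It counts k-vertex sets missing exactly one edge in two exact, brute-force ways: directly from the definition, and by extending each (k-2)-clique with a non-adjacent pair of common neighbors, which illustrates the observation that a near-clique contains a smaller clique. The function names and the toy graph are assumptions chosen for illustration, and both routines are only practical on tiny graphs.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set of neighbors}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_by_definition(adj, k, missing):
    """Count k-vertex sets whose induced subgraph has C(k,2) - missing edges."""
    target = k * (k - 1) // 2 - missing
    count = 0
    for subset in combinations(sorted(adj), k):
        present = sum(1 for u, v in combinations(subset, 2) if v in adj[u])
        if present == target:
            count += 1
    return count


def count_one_missing_via_cliques(adj, k):
    """Count k-vertex sets missing exactly one edge by extending (k-2)-cliques.

    Each such set has a unique missing edge (u, v); the remaining k-2
    vertices form a clique, and u and v are both adjacent to all of it.
    """
    count = 0
    for core in combinations(sorted(adj), k - 2):
        # Skip vertex sets that are not cliques.
        if any(v not in adj[u] for u, v in combinations(core, 2)):
            continue
        # Vertices outside the core that are adjacent to every core vertex.
        common = set(adj) - set(core)
        for w in core:
            common &= adj[w]
        # Every non-adjacent pair of common neighbors closes one near-clique.
        count += sum(
            1 for u, v in combinations(sorted(common), 2) if v not in adj[u]
        )
    return count


# Toy graph (hypothetical): K4 on {0, 1, 2, 3} without edge (2, 3),
# plus a vertex 4 attached to 2 and 3.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 4), (3, 4)]
adj = build_adjacency(edges)

print(count_by_definition(adj, 4, 1))
print(count_one_missing_via_cliques(adj, 4))
```

Both calls agree on this toy graph because the missing edge of such a set is unique, so each set is found from exactly one (k-2)-clique. The enumeration over all vertex subsets is what makes exact counting infeasible at scale, which is the motivation the abstract gives for a sampling-based estimator.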


Updated: 2020-06-25
