Efficiently Finding a Maximal Clique Summary via Effective Sampling
arXiv - CS - Databases Pub Date : 2020-09-22 , DOI: arxiv-2009.10376
Xiaofan Li, Rui Zhou, Lu Chen, Chengfei Liu, Qiang He, Yun Yang

Maximal clique enumeration (MCE) is a fundamental problem in graph theory and is used in many applications, such as social network analysis, bioinformatics, intelligent agent systems, and cyber security. Most existing MCE algorithms focus on improving efficiency rather than reducing the output size. Unfortunately, the output can consist of a large number of maximal cliques. In this paper, we study how to report a summary of less-overlapping maximal cliques. The problem has been studied before; however, after examining the pioneering approach, we find it still unsatisfactory. To advance research along this line, our paper attempts to make four contributions: (a) we propose a more effective sampling strategy, which produces a much smaller summary while still ensuring that the summary can somehow witness all the maximal cliques and that the expectation of each maximal clique witnessed by the summary is above a predefined threshold; (b) we prove that the sampling strategy is optimal under certain optimality conditions; (c) we apply clique-size bounding and design a new enumeration order to approach the optimality conditions; and (d) to verify this experimentally, we test eight real benchmark datasets that have a variety of graph characteristics. The results show that our new sampling strategy consistently outperforms the state-of-the-art approach, producing smaller summaries and running faster on all the datasets.
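
To make the MCE problem itself concrete, here is a minimal sketch of maximal clique enumeration using the classic Bron-Kerbosch algorithm with pivoting. This is a standard textbook method, not the sampling strategy proposed in the paper; the function name `maximal_cliques`, the adjacency-dictionary representation, and the example graph are all assumptions chosen for illustration.

```python
from typing import Dict, Iterator, Set

# Assumed representation: an undirected graph as vertex -> set of neighbours.
Graph = Dict[int, Set[int]]


def maximal_cliques(graph: Graph) -> Iterator[Set[int]]:
    """Yield every maximal clique of `graph` (Bron-Kerbosch with pivoting)."""

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> Iterator[Set[int]]:
        # r: current clique, p: candidates that can extend r,
        # x: vertices already processed (used to reject non-maximal cliques).
        if not p and not x:
            yield set(r)
            return
        # Pick the pivot with the most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda v: len(graph[v] & p))
        for v in list(p - graph[pivot]):
            yield from expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(graph), set())


if __name__ == "__main__":
    # Hypothetical example: two triangles sharing the edge 2-3.
    example: Graph = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
    for clique in maximal_cliques(example):
        print(sorted(clique))
```

In the example graph the two maximal cliques share two of their three vertices, which is the kind of heavy overlap that motivates reporting a summary instead of the full output.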

Updated: 2020-09-23