Efficient community discovery with user engagement and similarity
The VLDB Journal (IF 4.2), Pub Date: 2019-10-26, DOI: 10.1007/s00778-019-00579-4
Fan Zhang, Xuemin Lin, Ying Zhang, Lu Qin, Wenjie Zhang

In this paper, we investigate the (k,r)-core problem, which aims to find cohesive subgraphs in social networks from the perspectives of both user engagement and similarity. In particular, we adopt the popular concept of k-core to guarantee the engagement of the users (vertices) in a group (subgraph): each vertex in a (k,r)-core connects to at least k other vertices. Meanwhile, we consider the pairwise similarity among users based on their attributes. Efficient algorithms are proposed to enumerate all maximal (k,r)-cores and to find the maximum (k,r)-core; both problems are shown to be NP-hard. Effective pruning techniques substantially reduce the search space of the two algorithms. A novel (k,k')-core-based upper bound on the (k,r)-core size enhances the performance of the maximum (k,r)-core computation. We also devise effective search orders for the two algorithms, with different search priorities for vertices. In addition, we study the diversified (k,r)-core search problem, which finds l maximal (k,r)-cores that together cover the most vertices. These maximal (k,r)-cores are distinctive and informationally rich. An efficient algorithm is proposed with a guaranteed approximation ratio. We design a tight upper bound to prune unpromising partial (k,r)-cores. A new search order is designed to speed up the search. Initial candidates of large size are generated to further enhance the pruning power. Comprehensive experiments on real-life data demonstrate that the maximal (k,r)-cores enable us to find interesting cohesive subgraphs, and that the performance of the three mining algorithms is effectively improved by all the proposed techniques.
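
To make the two constraints in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes that the similarity constraint means every pair of vertices in the group has similarity at least a threshold r, and it uses Jaccard similarity over attribute sets purely as an illustrative choice. The names `k_core`, `is_kr_core`, `jaccard`, `adj` and `attrs` are hypothetical. Any further requirement, such as connectivity or maximality, is not checked here.

```python
from itertools import combinations


def k_core(adj, k):
    """Return the vertices of the k-core, found by peeling low-degree vertices."""
    alive = set(adj)
    degree = {v: len(adj[v]) for v in alive}
    stack = [v for v in alive if degree[v] < k]
    while stack:
        v = stack.pop()
        if v not in alive:  # a vertex may be queued more than once
            continue
        alive.remove(v)
        for u in adj[v]:
            if u in alive:
                degree[u] -= 1
                if degree[u] < k:
                    stack.append(u)
    return alive


def is_kr_core(adj, vertices, k, r, similarity):
    """Check the engagement and similarity constraints on a vertex set.

    Assumption: "similar" means similarity(u, v) >= r for every pair.
    """
    vertices = set(vertices)
    if not vertices:
        return False
    # Engagement: each vertex needs at least k neighbours inside the set.
    if any(len(adj[v] & vertices) < k for v in vertices):
        return False
    # Similarity: every pair in the set must reach the threshold r.
    return all(similarity(u, v) >= r for u, v in combinations(vertices, 2))


def jaccard(x, y):
    """Jaccard similarity of two attribute sets (illustrative metric only)."""
    union = x | y
    return len(x & y) / len(union) if union else 1.0


# Toy data: a triangle a-b-c with a pendant vertex d attached to c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
attrs = {"a": {1, 2}, "b": {1, 2, 3}, "c": {2, 3}, "d": {9}}


def sim(u, v):
    return jaccard(attrs[u], attrs[v])


candidates = k_core(adj, 2)  # d is peeled away, leaving the triangle
print(sorted(candidates))
print(is_kr_core(adj, candidates, 2, 0.3, sim))
print(is_kr_core(adj, candidates, 2, 0.5, sim))
```

Computing the k-core first is cheap and can only shrink the candidate set, since any group satisfying the engagement constraint lies inside the k-core. The similarity constraint is what makes enumerating maximal groups hard, which is consistent with the NP-hardness stated in the abstract.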

Updated: 2019-10-26