Scalable Spectral Clustering for Overlapping Community Detection in Large-Scale Networks
IEEE Transactions on Knowledge and Data Engineering ( IF 8.9 ) Pub Date : 2020-04-01 , DOI: 10.1109/tkde.2019.2892096
Hadrien Van Lierde , Tommy W. S. Chow , Guanrong Chen

While the majority of methods for community detection produce disjoint communities of nodes, most real-world networks naturally involve overlapping communities. In this paper, a scalable method for the detection of overlapping communities in large networks is proposed. The method is based on an extension of the notion of normalized cut to cope with overlapping communities. A spectral clustering algorithm is formulated to solve the related cut minimization problem. When available, the algorithm may take into account prior information about the likelihood for each node to belong to several communities. This information can either be extracted from the available metadata or from node centrality measures. We also introduce a hierarchical version of the algorithm to automatically detect the number of communities. In addition, a new benchmark model extending the stochastic blockmodel for graphs with overlapping communities is formulated. Our experiments show that the proposed spectral method outperforms the state-of-the-art algorithms in terms of computational complexity and accuracy on our benchmark graph model and on five real-world networks, including a lexical network and large-scale social networks. The scalability of the proposed algorithm is also demonstrated on large synthetic graphs with millions of nodes and edges.
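The abstract builds on the notion of normalized cut but does not state the overlapping extension itself. As a point of reference only, the sketch below computes the standard normalized cut of a disjoint partition, i.e. the sum over communities of cut(S, complement) / vol(S). It is not the authors' objective or algorithm; the function name `normalized_cut`, the adjacency-dict layout, and the toy graph are assumptions made for illustration.

```python
def normalized_cut(adj, communities):
    """Standard normalized cut of a DISJOINT partition (illustrative only).

    adj: dict mapping node -> dict of neighbor -> edge weight (undirected graph).
    communities: list of disjoint node sets covering the graph.
    This is the classical definition, not the paper's overlapping extension.
    """
    total = 0.0
    for members in communities:
        # vol(S): sum of the weighted degrees of the nodes in S
        volume = sum(sum(adj[u].values()) for u in members)
        # cut(S, complement): total weight of edges leaving S
        cut = sum(w for u in members for v, w in adj[u].items() if v not in members)
        if volume > 0:
            total += cut / volume
    return total


# Hypothetical toy graph: two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = {n: {} for n in range(6)}
for u, v in edges:
    adj[u][v] = 1.0
    adj[v][u] = 1.0

# A partition that follows the triangles versus one that cuts across them.
good = normalized_cut(adj, [{0, 1, 2}, {3, 4, 5}])
bad = normalized_cut(adj, [{0, 3}, {1, 2, 4, 5}])
print(good, bad)
```

A lower value indicates a partition with fewer edges crossing community boundaries relative to community volume, which is why the triangle-aligned partition scores better here. How membership in several communities enters the objective is specified in the paper, not in this sketch.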

Updated: 2020-04-01