Community Detection Algorithm for Big Social Networks Using Hybrid Architecture
Big Data Research (IF 3.5), Pub Date: 2017-10-26, DOI: 10.1016/j.bdr.2017.10.003
Rahil Sharma, Suely Oliveira

One of the most relevant and widely studied structural properties of networks is their community structure. Detecting communities is of great importance in social networks, where systems are often represented as graphs. With the advent of web-based social networks like Twitter, Facebook and LinkedIn, community detection became even more difficult due to the massive network size, which can reach hundreds of millions of vertices and edges. Such large graph-structured data cannot be processed without distributed algorithms, both because of the memory constraints of a single machine and because of the need for high performance. In this paper, we present a novel hybrid (shared + distributed memory) parallel algorithm to efficiently detect high-quality communities in massive social networks. For our simulations, we use synthetic graphs ranging from 100K to 16M vertices to show the scalability and quality performance of our algorithm. We also use two massive real-world networks: (a) a section of the Twitter-2010 network with 41M vertices and 1.4B edges, and (b) UK-2007 (.uk web domain) with 105M vertices and 3.3B edges. Simulation results on an MPI setup with 8 compute nodes of 16 cores each show that up to 6X speedup is achieved for synthetic graphs in detecting communities without compromising the quality of the results.




Updated: 2017-10-26