Finding skyline communities in multi-valued networks
The VLDB Journal (IF 4.2), Pub Date: 2020-06-08, DOI: 10.1007/s00778-020-00618-5
Rong-Hua Li, Lu Qin, Fanghua Ye, Guoren Wang, Jeffrey Xu Yu, Xiaokui Xiao, Nong Xiao, Zibin Zheng

Given a scientific collaboration network, how can we find a group of collaborators with strong research indicators (e.g., h-index) and diverse research interests? Given a social network, how can we identify the communities that have high influence (e.g., PageRank) and also have similar interests to a specified user? In such settings, the network can be modeled as a multi-valued network where each node has d (\(d \ge 1\)) numerical attributes (e.g., h-index, diversity, PageRank, or similarity score). In a multi-valued network, we want to find communities that are not dominated by other communities in terms of the d numerical attributes. Most existing community search algorithms either ignore the numerical attributes completely or consider only one numerical attribute of the nodes. To capture d numerical attributes, we propose a novel community model, called skyline community, based on the concepts of k-core and skyline. A skyline community is a maximal connected k-core that cannot be dominated by the other connected k-cores in the d-dimensional attribute space. We develop an elegant space-partition algorithm to efficiently compute the skyline communities. Two striking advantages of our algorithm are that (1) its time complexity relies mainly on the size of the answer s (i.e., the number of skyline communities), and thus it is very efficient if s is small; and (2) it can progressively output the skyline communities, which is very useful for applications that require only part of the skyline communities. In addition, we develop three efficient graph reduction techniques to further speed up the proposed algorithm. Extensive experiments on both synthetic and real-world networks demonstrate the efficiency, scalability, and effectiveness of the proposed algorithm.
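The two building blocks named in the abstract, k-core and skyline dominance, can be illustrated with a minimal Python sketch. This is not the paper's space-partition algorithm, and it does not split a k-core into connected components. All function names are hypothetical, and the abstract does not say how a community is scored in each dimension; the sketch assumes the per-dimension minimum over its members and assumes larger values are better.

```python
from collections import deque


def k_core(adj, k):
    """Return the node set of the k-core of an undirected graph.

    adj maps each node to the set of its neighbours. Nodes whose remaining
    degree drops below k are peeled off until none are left.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    queue = deque(v for v, deg in degree.items() if deg < k)
    removed = set(queue)
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in removed:
                degree[u] -= 1
                if degree[u] < k:
                    removed.add(u)
                    queue.append(u)
    return set(adj) - removed


def community_value(nodes, attrs):
    """Score a community in each dimension (ASSUMPTION: minimum over members)."""
    d = len(next(iter(attrs.values())))
    return tuple(min(attrs[v][i] for v in nodes) for i in range(d))


def dominates(a, b):
    """True if a is at least as large as b everywhere and strictly larger somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(values):
    """Keep the value vectors that no other vector dominates (quadratic scan)."""
    return [a for a in values if not any(dominates(b, a) for b in values)]


# Toy graph: a triangle {1, 2, 3} with a pendant node 4 attached to node 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
# Two hypothetical attributes per node (d = 2).
attrs = {1: (5, 2), 2: (4, 3), 3: (6, 1), 4: (9, 9)}

core = k_core(adj, 2)                 # node 4 has degree 1 and is peeled off
value = community_value(core, attrs)  # per-dimension minimum over the core
```

Given the value vectors of several candidate k-cores, `skyline` returns those that no other candidate dominates. The quadratic scan is only meant to make the definition concrete; it says nothing about the output-sensitive running time the abstract claims for the actual algorithm.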

Updated: 2020-06-08
