Fast computation of Katz index for efficient processing of link prediction queries
Data Mining and Knowledge Discovery (IF 4.8), Pub Date: 2021-04-16, DOI: 10.1007/s10618-021-00754-8
Mustafa Coşkun, Abdelkader Baggag, Mehmet Koyutürk

Network proximity computations are among the most common operations in data mining applications, including link prediction and collaborative filtering. A common measure of network proximity is the Katz index, which has been shown to be among the best-performing path-based link prediction algorithms. With the emergence of very large network databases, such proximity computations have become an important part of query processing in these databases. Consequently, significant effort has been devoted to developing algorithms for efficient computation of the Katz index between a given pair of nodes or between a query node and every other node in the network. Here, we present LRC-Katz, an algorithm based on indexing and low-rank correction to accelerate Katz-index-based network proximity queries. Using a variety of very large real-world networks, we show that LRC-Katz outperforms the fastest existing method, Conjugate Gradient, for a wide range of parameter values. Taking advantage of the acceleration in the computation of the Katz index, we propose a new link prediction algorithm that exploits the locality of networks encountered in practical applications. Our experiments show that the resulting link prediction algorithm drastically outperforms state-of-the-art link prediction methods based on the vanilla and truncated Katz index.
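For readers unfamiliar with the quantity being computed: the Katz index between nodes i and j is commonly defined as the sum over all path lengths k ≥ 1 of α^k times the number of length-k walks from i to j, which in matrix form is (I − αA)^{-1} − I for adjacency matrix A and damping factor α < 1/λ_max(A). The sketch below illustrates the baseline named in the abstract, a conjugate gradient solve for one query node, and not LRC-Katz itself. It assumes an undirected, unweighted graph given as a list of neighbor lists; the function name `katz_scores` and its parameters are hypothetical and chosen only for illustration.

```python
def katz_scores(adj, query, alpha=0.1, tol=1e-12, max_iter=1000):
    """Katz scores between `query` and every node (illustrative sketch).

    adj   : list of neighbor lists for an undirected graph, nodes 0..n-1
    alpha : damping factor, assumed to satisfy alpha < 1 / lambda_max(A)
            so that (I - alpha*A) is symmetric positive definite
    Solves (I - alpha*A) x = e_query with conjugate gradient and
    returns x - e_query, i.e. the series sum_{k>=1} alpha^k A^k e_query.
    """
    n = len(adj)

    def matvec(v):
        # Apply (I - alpha*A) to v using the adjacency lists.
        return [v[i] - alpha * sum(v[j] for j in adj[i]) for i in range(n)]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    b = [0.0] * n
    b[query] = 1.0
    x = [0.0] * n
    r = b[:]            # residual b - M x, with x = 0 initially
    p = r[:]            # first search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        Mp = matvec(p)
        step = rs / dot(p, Mp)
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * mi for ri, mi in zip(r, Mp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    x[query] -= 1.0     # remove the k = 0 (identity) term
    return x


# Usage: a 3-node path graph 0 - 1 - 2, queried from node 0.
path = [[1], [0, 2], [1]]
scores = katz_scores(path, query=0, alpha=0.1)
```

Each iteration costs one pass over the edges, which is why iterative solvers of this kind serve as the reference point for query-time Katz computation on large sparse networks.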




Updated: 2021-04-16