Efficient Pairwise Penetrating-rank Similarity Retrieval
ACM Transactions on the Web (IF 2.6), Pub Date: 2019-12-18, DOI: 10.1145/3368616
Weiren Yu, Julie McCann, Chengyuan Zhang
Many web applications demand a measure of similarity between two entities, such as collaborative filtering, web document ranking, linkage prediction, and anomaly detection. P-Rank (Penetrating-Rank) has been accepted as a promising graph-based similarity measure, as it provides a comprehensive way of encoding both incoming and outgoing links into assessment. However, the existing method to compute P-Rank is iterative in nature and rather cost-inhibitive. Moreover, the accuracy estimate and stability issues for P-Rank computation have not been addressed. In this article, we consider the optimization techniques for P-Rank search that encompasses its accuracy, stability, and computational efficiency. (1) The accuracy estimation is provided for P-Rank iterations, with the aim to find out the number of iterations, $k$, required to guarantee a desired accuracy. (2) A rigorous bound on the condition number of P-Rank is obtained for stability analysis. Based on this bound, it can be shown that P-Rank is stable and well-conditioned when the damping factors are chosen to be suitably small. (3) Two matrix-based algorithms, applicable to digraphs and undirected graphs, are, respectively, devised for efficient P-Rank computation, which improves the computational time from $O(kn^3)$ to $O(\upsilon n^2 + \upsilon^6)$ for digraphs, and to $O(\upsilon n^2)$ for undirected graphs, where $n$ is the number of vertices in the graph, and $\upsilon$ ($\ll n$) is the target rank of the graph. Moreover, our proposed algorithms can significantly reduce the memory space of P-Rank computations from $O(n^2)$ to $O(\upsilon n + \upsilon^4)$ for digraphs, and to $O(\upsilon n)$ for undirected graphs, respectively. Finally, extensive experiments on real-world and synthetic datasets demonstrate the usefulness and efficiency of the proposed techniques for P-Rank similarity assessment on various networks.
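For orientation, the iterative baseline that the abstract calls costly can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes the commonly cited P-Rank recurrence, in which the similarity of two distinct nodes is a weighted sum of the average similarity of their in-neighbors (weight `lam * c_in`) and of their out-neighbors (weight `(1 - lam) * c_out`), with every node fully similar to itself. The names `p_rank`, `iterations_for_accuracy`, `lam`, `c_in`, and `c_out` are hypothetical. The iteration count uses a simple contraction bound, error at most `(lam * c_in + (1 - lam) * c_out) ** (k + 1)`, which may differ from the exact estimate derived in the paper.

```python
import math


def p_rank(out_nbrs, lam=0.5, c_in=0.8, c_out=0.8, k=10):
    """Naive P-Rank iteration (assumed recurrence, see lead-in).

    out_nbrs maps every node to a list of its out-neighbors; each
    neighbor must itself appear as a key. Returns {(a, b): score}.
    """
    nodes = sorted(out_nbrs)
    # Derive in-neighbor lists from the out-neighbor lists.
    in_nbrs = {v: [] for v in nodes}
    for u in nodes:
        for v in out_nbrs[u]:
            in_nbrs[v].append(u)

    # Start from the identity: a node is only similar to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}

    def avg(nbrs, a, b, scores):
        # Average pairwise score over the two neighbor sets (0 if either is empty).
        na, nb = nbrs[a], nbrs[b]
        if not na or not nb:
            return 0.0
        total = sum(scores[(x, y)] for x in na for y in nb)
        return total / (len(na) * len(nb))

    for _ in range(k):
        new = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new[(a, b)] = 1.0
                else:
                    new[(a, b)] = (lam * c_in * avg(in_nbrs, a, b, sim)
                                   + (1 - lam) * c_out * avg(out_nbrs, a, b, sim))
        sim = new
    return sim


def iterations_for_accuracy(eps, lam=0.5, c_in=0.8, c_out=0.8):
    """Smallest k with c ** (k + 1) <= eps, where c = lam*c_in + (1-lam)*c_out.

    Assumes 0 < eps < 1 and 0 < c < 1 (a contraction-style bound,
    not necessarily the paper's own estimate).
    """
    c = lam * c_in + (1 - lam) * c_out
    return max(0, math.ceil(math.log(eps) / math.log(c)) - 1)


# Usage: two "hub" nodes a and b that both link to c and d.
graph = {"a": ["c", "d"], "b": ["c", "d"], "c": [], "d": []}
k = iterations_for_accuracy(1e-3)
scores = p_rank(graph, k=k)
```

Each pass of this sketch visits every node pair and every pair of their neighbors, which is why the abstract's low-rank matrix formulations are attractive when the target rank is much smaller than the number of vertices.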
Updated: 2019-12-18