No PANE, No Gain: Scaling Attributed Network Embedding in a Single Server: ACM SIGMOD Record: Vol 51, No 1
ACM SIGMOD Record (IF 0.9), Pub Date: 2022-06-01, DOI: 10.1145/3542700.3542711
Renchi Yang 1 , Jieming Shi 2 , Xiaokui Xiao 1 , Yin Yang 3 , Sourav S. Bhowmick 4 , Juncheng Liu 1

Given a graph G where each node is associated with a set of attributes, attributed network embedding (ANE) maps each node v ∈ G to a compact vector X_v, which can be used in downstream machine learning tasks in a variety of applications. Existing ANE solutions do not scale to massive graphs due to prohibitive computation costs or generation of low-quality embeddings. This paper proposes PANE, an effective and scalable approach to ANE computation for massive graphs in a single server that achieves state-of-the-art result quality on multiple benchmark datasets for two common prediction tasks: link prediction and node classification. Under the hood, PANE takes inspiration from well-established data management techniques to scale up ANE in a single server. Specifically, it exploits a carefully formulated problem based on a novel random walk model, a highly efficient solver, and non-trivial parallelization by utilizing modern multi-core CPUs. Extensive experiments demonstrate that PANE consistently outperforms all existing methods in terms of result quality, while being orders of magnitude faster.




Updated: 2022-06-02