Incremental Lossless Graph Summarization
arXiv - CS - Databases, Pub Date: 2020-06-17, DOI: arxiv-2006.09935
Jihoon Ko, Yunbum Kook, Kijung Shin

Given a fully dynamic graph, represented as a stream of edge insertions and deletions, how can we obtain and incrementally update a lossless summary of its current snapshot? As large-scale graphs are prevalent, representing them concisely is essential for efficient storage and analysis. Lossless graph summarization is an effective graph-compression technique with many desirable properties. It aims to compactly represent the input graph as (a) a summary graph consisting of supernodes (i.e., sets of nodes) and superedges (i.e., edges between supernodes), which provide a rough description, and (b) edge corrections, which fix the errors induced by that rough description. While a number of batch algorithms, suited for static graphs, have been developed for rapid and compact graph summarization, they are highly inefficient in terms of time and space for dynamic graphs, which are common in practice. In this work, we propose MoSSo, the first incremental algorithm for lossless summarization of fully dynamic graphs. In response to each change in the input graph, MoSSo updates the output representation by repeatedly moving nodes among supernodes. MoSSo chooses the nodes to move and their destinations carefully but rapidly, based on several novel ideas. Through extensive experiments on 10 real graphs, we show that MoSSo is (a) Fast and 'any time': processing each change in near-constant time (less than 0.1 millisecond), up to 7 orders of magnitude faster than running state-of-the-art batch methods, (b) Scalable: summarizing graphs with hundreds of millions of edges, requiring sub-linear memory during the process, and (c) Effective: achieving compression ratios comparable even to those of state-of-the-art batch methods.
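
To make the output representation described in the abstract concrete, here is a minimal Python sketch of how a lossless summary could be decoded back into the exact edge set. It is not MoSSo and not the authors' code. The function name `restore_edges`, the dictionary layout, and the split of corrections into edges to add and edges to delete are assumptions made for illustration, and the graph is assumed to be undirected with no self-loops.

```python
from itertools import combinations, product


def restore_edges(supernodes, superedges, add_corrections, del_corrections):
    """Rebuild the exact undirected edge set from a summary (illustrative only).

    supernodes:      dict mapping a supernode id to its set of member nodes
    superedges:      iterable of (supernode id, supernode id) pairs
    add_corrections: edges missing from the rough description (assumed layout)
    del_corrections: edges wrongly implied by the rough description (assumed layout)
    """
    edges = set()
    for a, b in superedges:
        if a == b:
            # A self-loop superedge implies every pair inside the supernode.
            pairs = combinations(sorted(supernodes[a]), 2)
        else:
            # A superedge implies every pair across the two supernodes.
            pairs = product(supernodes[a], supernodes[b])
        edges.update(frozenset(p) for p in pairs)
    # Apply the corrections so that the result is exact, not approximate.
    edges |= {frozenset(e) for e in add_corrections}
    edges -= {frozenset(e) for e in del_corrections}
    return edges


# Toy graph with four edges: 1-2, 1-3, 1-4, 2-3.
supernodes = {"A": {1, 2}, "B": {3, 4}}
superedges = [("A", "B")]      # rough description: every node of A links to every node of B
add_corrections = [(1, 2)]     # a real edge that the superedge does not cover
del_corrections = [(2, 4)]     # implied by the superedge but absent from the graph

restored = restore_edges(supernodes, superedges, add_corrections, del_corrections)
original = {frozenset(e) for e in [(1, 2), (1, 3), (1, 4), (2, 3)]}
assert restored == original
```

In this toy case the summary stores one superedge and two corrections in place of four edges. How supernodes are chosen, and how they are updated as edges arrive and leave, is the part MoSSo addresses and is not shown here.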

Updated: 2020-06-18