Mining explainable local and global subgraph patterns with surprising densities
Data Mining and Knowledge Discovery (IF 4.8), Pub Date: 2020-11-10, DOI: 10.1007/s10618-020-00721-9
Junning Deng, Bo Kang, Jefrey Lijffijt, Tijl De Bie

The connectivity structure of graphs is typically related to the attributes of the vertices. In social networks, for example, the probability of a friendship between any pair of people depends on a range of attributes, such as their age, residence location, workplace, and hobbies. The high-level structure of a graph can thus possibly be described well by means of patterns of the form ‘the subgroup of all individuals with certain properties X are often (or rarely) friends with individuals in another subgroup defined by properties Y’, ideally relative to their expected connectivity. Such rules present potentially actionable and generalizable insight into the graph. Prior work has already considered the search for dense subgraphs (‘communities’) with homogeneous attributes. The first contribution in this paper is to generalize this type of pattern to densities between a pair of subgroups, as well as between all pairs from a set of subgroups that partition the vertices. Second, we develop a novel information-theoretic approach for quantifying the subjective interestingness of such patterns, by contrasting them with prior information an analyst may have about the graph’s connectivity. We demonstrate empirically that in the special case of dense subgraphs, this approach yields results that are superior to the state-of-the-art. Finally, we propose algorithms for efficiently finding interesting patterns of these different types.
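
As a rough illustration of the pattern type the abstract describes (the density between two attribute-defined subgroups, compared with an expected density), the following Python sketch may help. It is not the paper's method: the baseline here is simply the overall edge density of the graph, a crude stand-in for the analyst's prior information that the paper models, and no interestingness score is computed. The function names, the attribute conditions, and the toy data are all hypothetical.

```python
from itertools import product


def bi_subgroup_density(edges, attrs, cond_x, cond_y):
    """Observed edge density between two attribute-defined subgroups.

    edges  : iterable of (u, v) pairs of an undirected graph
    attrs  : dict mapping each vertex to a dict of attribute values
    cond_x : predicate on an attribute dict, defines subgroup X
    cond_y : predicate on an attribute dict, defines subgroup Y
    """
    edge_set = {frozenset(e) for e in edges}
    group_x = {v for v, a in attrs.items() if cond_x(a)}
    group_y = {v for v, a in attrs.items() if cond_y(a)}
    # Unordered candidate pairs with one endpoint in X and the other in Y.
    pairs = {frozenset((u, v)) for u, v in product(group_x, group_y) if u != v}
    if not pairs:
        return 0.0
    return sum(1 for p in pairs if p in edge_set) / len(pairs)


def overall_density(edges, attrs):
    """Edge density of the whole graph, used here as a naive baseline."""
    n = len(attrs)
    possible = n * (n - 1) // 2
    return len({frozenset(e) for e in edges}) / possible if possible else 0.0


# Hypothetical toy social network (illustration only, not data from the paper).
attrs = {
    "a": {"age": 20, "city": "Ghent"},
    "b": {"age": 22, "city": "Ghent"},
    "c": {"age": 45, "city": "Bristol"},
    "d": {"age": 50, "city": "Bristol"},
    "e": {"age": 21, "city": "Bristol"},
}
edges = [("a", "b"), ("a", "e"), ("b", "e"), ("c", "d")]


def young(a):
    return a["age"] < 30


def older(a):
    return a["age"] >= 30


baseline = overall_density(edges, attrs)
within_young = bi_subgroup_density(edges, attrs, young, young)
young_to_older = bi_subgroup_density(edges, attrs, young, older)

print(f"baseline density:      {baseline:.2f}")
print(f"young-young density:   {within_young:.2f}")
print(f"young-older density:   {young_to_older:.2f}")
```

Passing the same condition for both subgroups recovers the single dense-subgraph case mentioned in the abstract, while two different conditions give the between-subgroup case. A density far above or below the baseline only suggests a candidate pattern; the paper's approach instead quantifies how surprising a pattern is relative to the analyst's prior beliefs.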




Updated: 2020-11-12