Incorporating symbolic domain knowledge into graph neural networks
Machine Learning ( IF 4.3 ) Pub Date : 2021-06-13 , DOI: 10.1007/s10994-021-05966-z
Tirtharaj Dash , Ashwin Srinivasan , Lovekesh Vig

Our interest is in scientific problems with the following characteristics: (1) data are naturally represented as graphs; (2) the amount of data available is typically small; and (3) there is significant domain knowledge, usually expressed in some symbolic form (rules, taxonomies, constraints and the like). Problems of this kind have been addressed effectively in the past by symbolic machine learning methods such as Inductive Logic Programming (ILP), by virtue of two important characteristics: (a) the use of a representation language that easily captures the relations encoded in graph-structured data, and (b) the inclusion of prior information, encoded as domain-specific relations, that can alleviate data scarcity and support the construction of new relations. Recent advances have seen the emergence of deep neural networks developed specifically for graph-structured data (graph-based neural networks, or GNNs). While GNNs have been shown to handle graph-structured data well, less has been done to investigate the inclusion of domain knowledge. Here we investigate this aspect of GNNs empirically by employing an operation we term vertex-enrichment, and denote the corresponding GNNs as VEGNNs. Using over 70 real-world datasets and substantial amounts of symbolic domain knowledge, we examine the effect of vertex-enrichment across five different GNN variants. Our results provide support for the following: (a) the inclusion of domain knowledge by vertex-enrichment can significantly improve the performance of a GNN; that is, VEGNNs perform significantly better than GNNs across all variants tested; and (b) the inclusion of domain-specific relations constructed using ILP improves the performance of VEGNNs, again across all variants. Taken together, the results provide evidence that it is possible to incorporate symbolic domain knowledge into a GNN, and that ILP can play an important role in providing high-level relationships that are not easily discovered by a GNN.
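The abstract describes vertex-enrichment as extending each vertex's features with symbolic domain knowledge before the graph is passed to a GNN. A minimal sketch of this idea, assuming the knowledge is given as named relations over vertices (the relation names and data below are hypothetical, not from the paper):

```python
# Illustrative sketch of vertex-enrichment: each vertex's feature vector
# is extended with one binary flag per domain relation, indicating
# whether that relation holds for the vertex. This is a reconstruction
# of the idea as stated in the abstract, not the authors' implementation.

def vertex_enrich(node_features, domain_relations, relation_names):
    """Append binary domain-relation indicators to each node's features.

    node_features    : dict mapping node id -> list of float features
    domain_relations : dict mapping relation name -> set of node ids
                       for which the relation holds
    relation_names   : fixed ordering of relation names, so every
                       enriched vector has the same length
    """
    enriched = {}
    for node, feats in node_features.items():
        flags = [1.0 if node in domain_relations.get(name, set()) else 0.0
                 for name in relation_names]
        enriched[node] = list(feats) + flags
    return enriched

# Toy molecular-style example with hypothetical relations.
features = {0: [0.5], 1: [0.2], 2: [0.9]}
relations = {"in_aromatic_ring": {0, 1}, "is_hetero_atom": {2}}
names = ["in_aromatic_ring", "is_hetero_atom"]
enriched = vertex_enrich(features, relations, names)
# e.g. enriched[0] == [0.5, 1.0, 0.0]
```

The enriched feature vectors can then be fed to any standard GNN unchanged, which is what allows the same enrichment to be compared across multiple GNN variants.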




Updated: 2021-06-14