DTransE: Distributed Translating Embedding for Knowledge Graph, IEEE Transactions on Parallel and Distributed Systems

IEEE Transactions on Parallel and Distributed Systems (IF 5.6), Pub Date: 2021-03-17, DOI: 10.1109/tpds.2021.3066442
Dandan Song, Feng Zhang, Meiyan Lu, Sicheng Yang, Heyan Huang

Knowledge graphs play an important role in many applications, such as link prediction and question answering. Translating embedding for knowledge graphs aims to encode structured information about entities and their rich relations in a low-dimensional embedding space. TransE is one of the most important translation-based models and uses translation invariance to implement translating embedding for knowledge graphs. In this line of work, translating embedding models represent a relation as a translation from the head entity to the tail entity and have achieved impressive results. Currently, the TransE model has been developed only for single-node machines. Unfortunately, the computing and storage capacities of a single machine can easily reach their limits as knowledge graphs become larger and more complex, which limits the application scope of TransE. To solve this problem, we propose a distributed TransE method, DTransE, which can utilize distributed computing resources to calculate knowledge graph embeddings. However, building a distributed TransE is complicated and involves challenges in knowledge graph partitioning and computation. To address these challenges, we provide a high-quality edge partitioning algorithm for power-law graphs that balances the workload by considering high-degree and low-degree vertices with adaptive weights. By using the unactivated Gather-Apply-Scatter model on TransE, the processes periodically exchange messages in a loop. The irregular data distribution among the processes is also optimized to further accelerate communication. As far as we know, this is the first work on a distributed TransE method. We use link prediction to evaluate DTransE in a distributed environment. Experiments show that, compared to the original TransE method, our proposed DTransE is, on average, 24.5 times faster with minimal loss of accuracy; compared to the state-of-the-art parallel TransE implementation, DTransE is two times faster on average.
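
The abstract describes a relation as a translation from the head entity to the tail entity and evaluates embeddings with link prediction. A minimal single-machine sketch of that idea follows, using the standard TransE score (the distance between head + relation and tail) and a margin ranking loss. It is not the distributed DTransE algorithm, whose partitioning and message exchange are not specified in the abstract. The function names, the toy entities, and the embedding values are all hypothetical and chosen only for illustration.

```python
import math


def transe_score(head, relation, tail):
    """L2 distance ||h + r - t||; a lower score means a more plausible triple."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def margin_loss(pos_score, neg_score, margin=1.0):
    """Margin ranking loss for one (true triple, corrupted triple) pair."""
    return max(0.0, margin + pos_score - neg_score)


def tail_rank(head, relation, true_tail, entities):
    """Link prediction: rank of the true tail among all candidate tails (1 = best)."""
    true_score = transe_score(head, relation, entities[true_tail])
    better = sum(
        1
        for name, vec in entities.items()
        if name != true_tail and transe_score(head, relation, vec) < true_score
    )
    return better + 1


# Toy 2-dimensional embeddings (made-up values, not trained).
entities = {
    "paris": [1.0, 0.0],
    "france": [1.0, 1.0],
    "berlin": [3.0, 0.0],
    "germany": [3.0, 1.0],
}
capital_of = [0.0, 1.0]  # hypothetical relation vector

rank = tail_rank(entities["paris"], capital_of, "france", entities)
print("rank of true tail:", rank)
```

In this toy setup, paris + capital_of lands exactly on france, so the true tail is ranked first. A real evaluation would average such ranks (or hit rates) over a held-out set of triples.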

Updated: 2021-04-20
