Elsevier

Information Fusion

Volume 90, February 2023, Pages 111-119

Full length article
DFMKE: A dual fusion multi-modal knowledge graph embedding framework for entity alignment

https://doi.org/10.1016/j.inffus.2022.09.012

Highlights

  • A dual multi-modal knowledge graph embedding framework called DFMKE is proposed.

  • A late fusion method using modality-specific low-rank factors is proposed.

  • A combination strategy for early and late fusion modules is proposed.

Abstract

Entity alignment is critical for integrating multiple knowledge graphs (KGs). Although researchers have made significant efforts to explore relational embeddings across different KGs, existing approaches may not describe multi-modal knowledge well in some tasks, e.g., entity alignment. In this paper, we propose DFMKE, a dual fusion multi-modal knowledge graph embedding framework, to address entity alignment. We first devise an early fusion method for fusing the features of multi-modal entity representations of a KG. Simultaneously, multiple representations of various types of knowledge are generated independently by various techniques and fused by a low-rank multi-modal late fusion method. Finally, the outputs of the early and late fusion methods are combined using a dual fusion scheme. DFMKE provides an ultimate fusion solution by leveraging the advantages of both early and late fusion. Extensive experiments on two public datasets show that DFMKE outperforms state-of-the-art methods by a significant margin, achieving at least a 10% improvement in the Hits@n and MRR metrics.

Introduction

Entity alignment is critical for integrating distinct knowledge graphs (KGs) by connecting entities that refer to the same real-world object. Because most KGs are created with a specific purpose, different KGs showcase different representations for the same concepts [1].

Early studies on entity alignment primarily focus on attribute similarity [2], [3], which frequently suffers from attribute heterogeneity, making entity alignment error-prone [4]. Subsequently, other approaches to aligning entities in KGs required human intervention [5] or extra resources [6], that is, descriptions of the involved entities and their relations. The methods above require many examples to work correctly in downstream tasks, but obtaining a sufficiently large number of examples in real-life applications is challenging and expensive. More recently, some authors suggested semi-supervised models to perform entity alignment in KGs; that is, they sought new algorithmic approaches that benefit from both labeled and unlabeled entities [7], [8].

For instance, [7] proposed the SEA (Semi-supervised Entity Alignment) framework. The SEA framework extends the popular TransE embedding method [9] to deal with the degree difference of entities. In particular, SEA applies adversarial training to prevent entities with a similar degree of popularity from being aggregated in the same region of the embedding space during training. After generating the embeddings of the two input KGs, SEA constructs and optimizes a suitable objective function that involves both labeled and unlabeled entities.
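The translational intuition behind TransE, which SEA builds on, is that a relation acts as a translation vector in the embedding space: for a true triple (h, r, t), training drives h + r ≈ t. The following minimal sketch illustrates the scoring function and margin-based ranking loss; the variable names, toy embeddings, and margin value are our illustrative assumptions, not SEA's actual implementation.

```python
import numpy as np

def transe_score(h, r, t):
    """TransE plausibility score ||h + r - t||: lower means more plausible."""
    return np.linalg.norm(h + r - t)

def margin_loss(pos_score, neg_score, margin=1.0):
    """Margin-based ranking loss for one positive/negative score pair."""
    return max(0.0, margin + pos_score - neg_score)

# Toy 3-d embeddings: a triple that satisfies h + r ≈ t,
# and a corrupted (negative) tail that does not.
h = np.array([0.1, 0.2, 0.3])
r = np.array([0.4, 0.1, -0.2])
t_pos = np.array([0.5, 0.3, 0.1])   # equals h + r, so score is 0
t_neg = np.array([-1.0, 1.0, 1.0])  # corrupted tail

pos = transe_score(h, r, t_pos)
neg = transe_score(h, r, t_neg)
assert pos < neg  # the true triple scores better (lower)
```

Training minimizes `margin_loss` over many such positive/negative pairs, pushing true triples below corrupted ones by at least the margin.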

A further exciting approach is KECG (Knowledge Embedding model and Cross-Graph model), proposed in [8]. The KECG approach formulates the problem of aligning entities as an optimization problem in which the objective function is the sum of two components. The first component of the objective function to optimize is related to the so-called cross-graph model, which captures inner KG structures and alignments between entities in different KGs. The cross-graph model applies an attention mechanism, and, in particular, it is built utilizing an extended GAT as an encoder: in this way, the KECG system can ignore unimportant nodes. We observe that KECG uses both labeled and unlabeled examples in the cross-graph model construction process; thus, it has to be regarded as a semi-supervised approach. The second component of the objective function to optimize uses the TransE algorithm to learn the embedding of entities and relations in different KGs, and then it aligns these representations into a unified vector space.
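The cross-graph component of KECG relies on attention: each node aggregates its neighbors with learned weights, so unimportant neighbors can be effectively ignored. The sketch below shows a single simplified attention head (score neighbors, softmax the scores, take the weighted sum); it is an illustration of the general GAT-style aggregation step, not KECG's extended encoder, and all names and dimensions are our assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())   # subtract max for numerical stability
    return e / e.sum()

def attention_aggregate(node, neighbors, a):
    """One simplified attention head: score each neighbor against the node,
    normalize the scores with softmax, and return the weighted neighbor sum."""
    scores = np.array([a @ np.concatenate([node, nb]) for nb in neighbors])
    alpha = softmax(scores)   # attention weights, sum to 1
    return (alpha[:, None] * np.array(neighbors)).sum(axis=0)

rng = np.random.default_rng(0)
node = rng.normal(size=4)
neighbors = [rng.normal(size=4) for _ in range(3)]
a = rng.normal(size=8)        # shared attention vector over [node; neighbor]

h_new = attention_aggregate(node, neighbors, a)
assert h_new.shape == (4,)
```

In KECG, the total objective is the sum of this cross-graph component's loss and a TransE-style knowledge-embedding loss, optimized jointly.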

SEA and KECG have achieved excellent performance on a range of popular datasets, but they cannot handle multi-modal KGs, i.e., KGs populated by heterogeneous data such as texts, numbers, or images describing the same piece of reality. Distinct forms of knowledge play a crucial role as auxiliary data to complete KGs and perform entity alignment; thus, we are strongly motivated to design new entity-matching algorithms that can exploit the full potential of multi-modal data.

Fig. 1 provides an example of entity alignment for multi-modal KGs, and it clarifies the peculiarities arising from the presence of multi-modal data in a KG. Here, the images associated with the entity “THU” indicate that its type is “university”. However, leveraging multi-modal knowledge to perform entity alignment is not trivial. The inevitable heterogeneity among different modalities makes the entity alignment task challenging. For example, in Fig. 1, it is difficult to conclude that the “Tsinghua University” entity in KG1 and the “THU” entity in KG2 refer to the same object using only image or text information.

Recent studies [10], [11], [12] have put forth several models that combine multi-modal data from KGs into a joint embedding and enable the alignment model to adjust modality weights automatically. However, the approaches above do not consider modal correlation at the feature level, and thus may achieve poor results when multiple modalities are highly correlated. In addition, most existing works do not perform well when seed entities (labeled entity pairs across KGs used to initialize the training process) are not broadly available.

To address the issues above, we propose a dual fusion multi-modal knowledge graph embedding framework (DFMKE) for modeling the entity associations of multi-modal KGs and locating entities referring to the same real-world identity. Specifically, we propose an early fusion strategy to perform feature fusion among different modalities, which can exploit correlations between low-level features of each modality. Then, we discriminatively generate knowledge representations for each modality and design a late fusion method based on low-rank weight decomposition to leverage knowledge from multiple modalities for the entity alignment task. This work offers three contributions:

  • To alleviate the inconsistency of original data in each modality, we propose a dual fusion multi-modal knowledge graph embedding framework called DFMKE that incorporates the advantages of both early fusion and late fusion techniques for entity alignment with joint training. The main idea of DFMKE is to integrate knowledge representations of multiple modalities from separate spaces into a shared space.

  • We present a novel late fusion method for multi-modal fusion using modality-specific low-rank factors. This method can easily combine the output features from the early fusion method and reduces the computational complexity caused by transforming the input into a tensor.

  • We demonstrate the performance of DFMKE by conducting experiments on two public multi-modal datasets and comparing it with several state-of-the-art entity alignment methods. DFMKE works well even when no seed entities are available to initialize the training process. We also offer interpretable analysis by conducting ablation studies on the contributions of the early and late fusion modules.
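The low-rank late fusion idea can be sketched concretely. A naive multi-modal fusion forms the outer-product tensor of all modality vectors and contracts it with a huge weight tensor; the low-rank decomposition instead gives each modality its own small set of projection matrices, one per rank component, and combines the projections with an elementwise product. The sketch below follows the general low-rank multi-modal fusion recipe; the dimensions, names, and parameterization are our illustrative assumptions, not the paper's actual module.

```python
import numpy as np

def low_rank_fusion(modalities, factors):
    """Low-rank multi-modal fusion sketch.

    Instead of contracting the outer-product tensor of all modality
    vectors with one large weight tensor, each modality m gets `rank`
    projection matrices W_m[i]. The fused vector is
        h = sum_i  prod_m (W_m[i] @ z_m),
    where the product is elementwise across modalities.
    """
    rank = len(factors[0])
    fused = np.zeros(factors[0][0].shape[0])
    for i in range(rank):
        term = np.ones(factors[0][i].shape[0])
        for z, W in zip(modalities, factors):
            term = term * (W[i] @ z)   # elementwise product across modalities
        fused = fused + term
    return fused

rng = np.random.default_rng(42)
d_out, rank = 6, 4
dims = [8, 5, 3]   # e.g. structure, image, attribute embedding sizes
z = [rng.normal(size=d) for d in dims]
W = [[rng.normal(size=(d_out, d)) * 0.1 for _ in range(rank)] for d in dims]

h = low_rank_fusion(z, W)
assert h.shape == (d_out,)
```

The cost grows linearly in the number of modalities and the rank, rather than exponentially with the number of modalities as the full tensor product would.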

The remainder of the paper is organized as follows. Section 2 briefly discusses the main existing works; Section 3 introduces the technical details of the work; Section 4 reports and comments on the experimental results according to various benchmarks; Section 5 concludes this paper.


Related literature

KGs are a powerful tool to efficiently organize, manage and retrieve a large body of information which is usually represented as a collection of RDF triplets of the form (head, predicate, tail) [13].

Two core problems in the KG research area are: (a) Link Prediction, i.e., we aim at completing triplets of the form (head?, predicate, tail) (in which the head is not specified) or (head, predicate, tail?) (in which the tail is not specified); (b) Entity Matching, i.e., given two KGs, we wish to

Proposed framework

In this section, we first formulate the problem and then describe the technical details of the proposed framework. A multi-modal KG can be viewed as a tuple G = (E, I, R, A), where E, I, R, A denote the sets of entities, images, relations and attributes, respectively. Given a pair of entities e_s ∈ E_s from a source KG G_s and e_t ∈ E_t from a target KG G_t, the task of entity alignment is to match entities describing the same object in different KGs.
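Once entities from both KGs are embedded in a shared space, a common inference step is to match each source entity to its nearest target entity. The sketch below uses greedy 1-nearest-neighbor matching under cosine similarity; it illustrates the task formulation only, and is not the paper's actual alignment procedure.

```python
import numpy as np

def align(src_emb, tgt_emb):
    """Match each source entity to its most similar target entity by
    cosine similarity in a shared embedding space (greedy 1-NN)."""
    s = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    t = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sim = s @ t.T              # pairwise cosine similarity matrix
    return sim.argmax(axis=1)  # index of the best target per source entity

# Toy shared space: source entity 0 should match target 1, and vice versa.
src = np.array([[1.0, 0.1], [0.1, 1.0]])
tgt = np.array([[0.1, 0.9], [0.9, 0.1]])
assert align(src, tgt).tolist() == [1, 0]
```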

To tackle the entity alignment task, we propose a framework

Experiments

In this section, we assess DFMKE on two real-world datasets and show that it attains state-of-the-art performance by utilizing multi-modal knowledge in the entity alignment task.
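The abstract reports results under Hits@n and MRR, the standard entity-alignment metrics: Hits@n is the fraction of queries whose true counterpart ranks within the top n candidates, and MRR is the mean of the reciprocal ranks. A minimal sketch of how these are computed (function name and example ranks are our own):

```python
import numpy as np

def hits_at_n_and_mrr(ranks, n=10):
    """Hits@n: fraction of queries whose true match ranks within the top n.
    MRR: mean of 1/rank over all queries (ranks are 1-based)."""
    ranks = np.asarray(ranks, dtype=float)
    hits = float((ranks <= n).mean())
    mrr = float((1.0 / ranks).mean())
    return hits, mrr

# Example: four queries whose true counterparts ranked 1, 2, 5, and 20.
hits, mrr = hits_at_n_and_mrr([1, 2, 5, 20], n=10)
assert hits == 0.75
assert abs(mrr - 0.4375) < 1e-12  # (1 + 0.5 + 0.2 + 0.05) / 4
```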

Conclusions

In this paper, we propose a dual fusion multi-modal KG embedding framework that integrates several representations of various types of information based on knowledge embedding for the entity alignment task. We first introduce an early fusion method for fusing the features of multi-modal entities. Moreover, an efficient late fusion method using modality-specific low-rank factors is designed through shared space learning with the output vectors from early fusion to migrate features under different

Declaration of Competing Interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Zhu JIA reports financial support was provided by National Natural Science Foundation of China.

Acknowledgments

This work was supported by the Key Laboratory of Intelligent Education Technology and Application of Zhejiang Province, Zhejiang Normal University, Zhejiang, China, the Key Research and Development Program of Zhejiang Province (No. 2021C03141), and the National Natural Science Foundation of China under Grant (62077015, 61877020 and 62037001).

References (58)

  • A. Holzinger et al., Towards multi-modal causability with graph neural networks enabling information fusion for explainable AI, Inf. Fusion (2021)
  • D. Wijaya, P.P. Talukdar, T. Mitchell, PIDGIN: ontology alignment using web text as interlingua, in: ACM International...
  • J. Bleiholder et al., Data fusion, ACM Comput. Surv. (2009)
  • J. Volz, C. Bizer, M. Gaedke, G. Kobilarov, Discovering and maintaining links on the web of data, in: Proceedings of...
  • B.D. Trisedya, J. Qi, R. Zhang, Entity alignment between knowledge graphs using attribute embeddings, in: Proceedings...
  • F. Mahdisoltani et al., Yago3: A knowledge base from multilingual wikipedias
  • W. Hu, J. Chen, Y. Qu, A self-training approach for resolving object coreference on the semantic web, in: Proc. of the...
  • S.C. Pei, L. Yu, R. Hoehndorf, X.L. Zhang, Semi-supervised entity alignment via knowledge graph embedding with...
  • C. Li, Y. Cao, L. Hou, J. Shi, J. Li, T.-S. Chua, Semi-supervised entity alignment via joint knowledge embedding model...
  • A. Bordes, N. Usunier, A. García-Duran, J. Weston, O. Yakhnenko, Translating embeddings for modeling multi-relational...
  • L. Chen, Z. Li, Y. Wang, T. Xu, Z. Wang, E. Chen, MMEA: Entity Alignment for Multi-modal Knowledge Graph, in:...
  • F. Liu, M. Chen, D. Roth, N. Collier, Visual Pivoting for (Unsupervised) Entity Alignment, in: Proceedings of the AAAI...
  • Q. Zhang, Z. Sun, W. Hu, M. Chen, L. Guo, Y. Qu, Multi-view Knowledge Graph Embedding for Entity Alignment, in:...
  • G. Klyne, Resource description framework (RDF): Concepts and abstract syntax (2004)
  • A. El-Roby, A. Aboulnaga, ALEX: Automatic link exploration in linked data, in: Proc. of the 2015 ACM SIGMOD...
  • Y. Raimond, C. Sutton, M. Sandler, Automatic Interlinking of Music Datasets on the Semantic Web, in: Proceedings of...
  • F.M. Suchanek et al., PARIS: probabilistic alignment of relations, instances, and schema, Proc. VLDB Endow. (2011)
  • M. Chen, Y. Tian, M. Yang, C. Zaniolo, Multilingual Knowledge Graph Embeddings for Cross-lingual Knowledge Alignment,...
  • Y. Cao et al., Multi-channel graph neural network for entity alignment
  • M. Chen, Y. Tian, K. Chang, S. Skiena, C. Zaniolo, Co-training Embeddings of Knowledge Graphs and Entity Descriptions...
  • S. Pei et al., Semi-supervised entity alignment via knowledge graph embedding with awareness of degree difference
  • Z. Sun, W. Hu, Q. Zhang, Y. Qu, Bootstrapping entity alignment with knowledge graph embedding, in: Proceedings of the...
  • Y. Wu, X. Liu, Y. Feng, Z. Wang, R. Yan, D. Zhao, Relation-Aware Entity Alignment for Heterogeneous Knowledge Graphs,...
  • H. Zhu, R. Xie, Z. Liu, M. Sun, Iterative entity alignment via joint knowledge embeddings, in: Proceedings of the...
  • C. Li, Y. Cao, L. Hou, J. Shi, J. Li, T.-S. Chua, Semi-supervised entity alignment via knowledge graph embedding with...
  • S. Pei, L. Yu, R. Hoehndorf, X. Zhang, Semi-supervised entity alignment via knowledge graph embedding with awareness of...
  • C. Li, Y. Cao, L. Hou, J. Shi, J. Li, T.-S. Chua, Semi-supervised entity alignment via joint knowledge embedding model...
  • Z. Wang et al., Knowledge graph embedding by translating on hyperplanes
  • Y. Lin et al., Learning entity and relation embeddings for knowledge graph completion