Frame-based Neural Network for Machine Reading Comprehension

https://doi.org/10.1016/j.knosys.2021.106889

Abstract

Machine Reading Comprehension (MRC) is one of the most challenging tasks in Natural Language Understanding (NLU). In particular, MRC systems typically answer a question by utilizing only the information contained in a given text passage itself, while human beings can easily understand the meaning of a passage based on their background knowledge. To bridge this gap, we propose a novel Frame-based Neural Network for Machine Reading Comprehension (FNN-MRC) method, which employs Frame semantic knowledge to facilitate question answering. Specifically, unlike existing Frame-based methods that only model lexical units (LUs), our FNN-MRC has a Frame representation model that utilizes both the LUs in a Frame and Frame-to-Frame (F-to-F) relations, and is designed to model Frames and sentences (in the passage) together with an attention schema. In addition, FNN-MRC has a Frame-based Sentence Representation (FSR) model, which is able to integrate the semantic information of multiple Frames to obtain a much better sentence representation. As such, FNN-MRC explicitly leverages the above Frame knowledge to assist its semantic understanding and representation. Extensive experiments demonstrate that our FNN-MRC method achieves better results than existing state-of-the-art techniques across multiple datasets.

Introduction

Machine Reading Comprehension (MRC) requires machines to read and understand text passages, and to answer relevant questions about them. It is regarded as an effective way to measure language understanding, and typically requires a deep understanding of the given passage in order to answer its question correctly. Clearly, human beings can easily understand the meaning of a text passage based on their background knowledge. For instance, given the sentence Katie bought some chocolate cookies, people know that Katie is a buyer, and that chocolate cookies are goods that belong to the Food class, etc. Existing machine learning approaches, however, face great challenges in addressing complicated MRC questions, as they lack such semantic background knowledge.

Nevertheless, FrameNet [1], [2], as a knowledge base, provides schematic scenario representations that can potentially be leveraged to facilitate text understanding. It has enabled the development of wide-coverage Frame parsers [3], [4], as well as various real-world applications, ranging from event recognition [5], textual entailment [6], question answering [7] and narrative schemas [8] to paraphrase identification [9]. In particular, a Frame (F) is defined as a composition of Lexical Units (LUs) and a set of Frame Elements (FEs). Given a sentence, if one of its words/phrases evokes a Frame by matching an LU, that word/phrase is called a Target (T). It is worth mentioning that FrameNet arranges relevant Frames into a network by defining Frame-to-Frame (F-to-F) relations. Fig. 1 provides an example of F, FEs, LUs, T and F-to-F, where the target word bought in the sentence Katie bought some chocolate cookies evokes the Frame Commerce_buy, as it matches the LU buy (bought is the past tense of buy). In addition, another target phrase, chocolate cookies, evokes a different Frame, Food. Finally, a group of relevant Frames, including Commerce_buy, Shopping, Seeking and Locating, form F-to-F relations.
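The Frame machinery above (F, LUs, FEs, T, F-to-F) can be pictured with a minimal Python sketch. The Frame and relation names follow the Commerce_buy example from the text, but the data structures and the find_targets helper are illustrative inventions, not FrameNet's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    """A FrameNet Frame: a schematic scenario with its lexical units,
    Frame elements, and relations to other Frames."""
    name: str
    lexical_units: set           # LUs that can evoke this Frame, e.g. {"buy.v"}
    frame_elements: list         # FE names, e.g. ["Buyer", "Goods", "Seller"]
    related_frames: list = field(default_factory=list)  # F-to-F relations

# Toy entry mirroring the paper's example; the exact LU/FE inventories
# are illustrative, not copied from FrameNet.
commerce_buy = Frame(
    name="Commerce_buy",
    lexical_units={"buy.v", "purchase.v"},
    frame_elements=["Buyer", "Goods", "Seller"],
    related_frames=["Shopping", "Seeking", "Locating"],
)

def find_targets(tokens, lemmas, frames):
    """Return (token, frame) pairs for words whose lemma matches an LU."""
    hits = []
    for tok, lemma in zip(tokens, lemmas):
        for f in frames:
            if lemma + ".v" in f.lexical_units:
                hits.append((tok, f.name))
    return hits

# "bought" (lemma "buy") matches the LU buy.v, so it is a Target (T).
tokens = ["Katie", "bought", "some", "chocolate", "cookies"]
lemmas = ["katie", "buy", "some", "chocolate", "cookie"]
print(find_targets(tokens, lemmas, [commerce_buy]))  # [('bought', 'Commerce_buy')]
```

In a real system, target identification and lemmatization would come from a Frame parser such as SEMAFOR [4] rather than a lemma lookup like this.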

There exist other semantic resources, such as WordNet [10] and PropBank [11]. In particular, WordNet is a lexicon that clusters words into sets of synonyms (synsets) and describes semantic relationships between them. In comparison, FrameNet annotates sentences/examples with both syntactic and semantic information for each Lexical Unit, which clearly provides richer information than WordNet. On the other hand, PropBank is a corpus annotated with argument role labels for verbs. Roles in PropBank are general, while roles in FrameNet are specific to each lexical unit (e.g., Buyer vs. Arg0, Goods vs. Arg1).

It is clear that FrameNet can bring in additional background semantic knowledge that can be leveraged to improve MRC performance. However, how to effectively utilize this semantic knowledge from FrameNet is an important issue. Previously, feature-based supervised learning models [12] were proposed to integrate Frame knowledge into MRC, but they require language experts to design complex features, which is typically a time-consuming and expensive process and may not be generic enough to handle different MRC tasks. Later, end-to-end neural models [13], [14], [15] achieved good performance on MRC tasks. Although such techniques can effectively incorporate contextual information from large-scale external unlabeled data into machine learning models, we still lack effective representation learning techniques that incorporate Frame knowledge into a good representation, which we could then leverage to build a successful MRC system. In addition, we observe that existing works mainly focus on LU embeddings within a Frame [16], [17], [18], without modeling a Frame as a whole. Furthermore, many sentences have more than one target word, evoking multiple semantically correlated Frames, but existing methods do not integrate these multiple Frames from FrameNet to obtain accurate and comprehensive sentence semantic representations.

To address the problems mentioned above, in this paper we propose a novel Frame-based Sentence Representation (FSR) model, which leverages rich Frame semantic knowledge, including both generalizations of LUs and F-to-F relations, to better model the sentences in a given text passage. To take full advantage of LUs and F-to-F relations, we propose three different strategies for Frame representation. Finally, we integrate the semantic information of multiple Frames to obtain a more comprehensive sentence representation based on the individual Frame representations.

In this paper, we propose a Frame-based Neural Network for MRC (FNN-MRC). Specifically, we first utilize the FSR model to capture the multi-Frame semantic information of every sentence, and a GRU [19] is used to aggregate a document-level frame-based representation. In our experiments, we evaluate the FNN-MRC method on the multiple-choice MRC task (e.g., MCTest [20]), a non-extractive form of MRC that requires choosing the right option from a set of candidate answers according to the given passage and question. This differs from relatively easy extractive MRC datasets such as SQuAD [21] and NewsQA [22], which require a model to extract an answer span for a question from the reference passage. In non-extractive MRC, however, machine learning models need to perform reasoning and inference. In addition, its difficulty is also reflected in the required background knowledge, which is not expressed in the given passage. In the experiments, we show improvements on two widely used neural models, i.e., a traditional deep learning method (with LSTM [23]) and a Transformer (with BERT [15]).

The key contributions of this work can be summarized as follows:

  1. We propose novel attention-based Frame representation models, which take full advantage of LUs and F-to-F relations to model Frames with an attention schema.

  2. We propose a new Frame-based Sentence Representation (FSR) model that integrates multi-Frame semantic information to obtain richer semantic aggregation for better sentence representation.

  3. We propose a Frame-based Neural Network for MRC (FNN-MRC), which explicitly leverages the Frame representations and Frame-based sentence representations to assist non-extractive question answering.

  4. Our experimental results demonstrate that FNN-MRC is very effective on the Machine Reading Comprehension (MRC) task, compared with state-of-the-art techniques.


Related work

In this section, we first provide a brief introduction to MRC datasets, which have played an important role in recent progress in reading comprehension, and subsequently describe in detail machine learning models applied specifically to two MRC datasets, namely MCTest and RACE.

Frame representation model

For each sentence in a given passage, we can obtain its Frame semantic annotations with the Frame annotator SEMAFOR [4], which brings additional background semantic knowledge to help us better tackle the MRC task. In this section, we present our Frame representation model, which represents the semantic information of Frames by considering Lexical Units (LUs), Frame-to-Frame (F-to-F) relations and the corresponding sentence. In particular, a Frame (F) is defined as a composition of LUs and
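The attention schema for combining LUs, F-to-F relations and the sentence is not spelled out in this snippet, so the following is only one plausible instantiation, not the paper's actual model: assuming pre-trained embeddings for a Frame's LUs and for its F-to-F neighbor Frames, the sentence vector acts as the attention query over both sets, and the two attended summaries are concatenated into the Frame representation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def frame_representation(lu_vecs, neighbor_frame_vecs, sent_vec):
    """Attend over a Frame's LU embeddings and its F-to-F neighbor
    Frame embeddings, using the sentence vector as the query, then
    concatenate the two attended summaries."""
    lu_scores = softmax(lu_vecs @ sent_vec)            # one weight per LU
    lu_summary = lu_scores @ lu_vecs                   # weighted sum of LUs
    rel_scores = softmax(neighbor_frame_vecs @ sent_vec)
    rel_summary = rel_scores @ neighbor_frame_vecs     # weighted sum of neighbors
    return np.concatenate([lu_summary, rel_summary])

# Toy embeddings; dimensions and counts are arbitrary.
rng = np.random.default_rng(0)
d = 8
lu_vecs = rng.normal(size=(3, d))    # e.g. buy.v, purchase.v, ...
neighbors = rng.normal(size=(4, d))  # e.g. Shopping, Seeking, Locating, ...
sent_vec = rng.normal(size=d)
rep = frame_representation(lu_vecs, neighbors, sent_vec)
print(rep.shape)  # (16,)
```

The concatenation is one of several reasonable fusion choices; a gated or additive combination would fit the same interface.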

Frame-based sentence representation

Given a sentence s = {x1, x2, …, xk, …}, where each xk is a word, let Tk be the k-th Frame-evoking target of s, and let Tk evoke the Frame Fk. FEk^i denotes the i-th Frame element of Fk, and Pk^i denotes the i-th span fulfilling FEk^i. We define a Frame semantic quadruple ck = ⟨Tk, Fk, FEk^n, Pk^n⟩, where ck represents the k-th quadruple of s.
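As a concrete reading of this definition, the quadruple can be sketched as a small Python structure; the field names and the token spans below are illustrative, built around the running Katie bought some chocolate cookies example.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class FrameQuadruple:
    """Frame semantic quadruple c_k = <T_k, F_k, FE_k, P_k> for the
    k-th Frame-evoking target of a sentence (field names illustrative)."""
    target: str                    # T_k: the Frame-evoking word/phrase
    frame: str                     # F_k: the evoked Frame
    frame_elements: List[str]      # FE_k^i: Frame elements of F_k
    spans: List[Tuple[int, int]]   # P_k^i: token span fulfilling each FE

sentence = ["Katie", "bought", "some", "chocolate", "cookies"]
c1 = FrameQuadruple(
    target="bought",
    frame="Commerce_buy",
    frame_elements=["Buyer", "Goods"],
    spans=[(0, 1), (2, 5)],        # "Katie"; "some chocolate cookies"
)

# Recover the text span filling each Frame element:
fillers = {fe: " ".join(sentence[a:b])
           for fe, (a, b) in zip(c1.frame_elements, c1.spans)}
print(fillers)  # {'Buyer': 'Katie', 'Goods': 'some chocolate cookies'}
```

A sentence with several targets would simply yield a list of such quadruples, one per evoked Frame.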

Frame-based Neural Network for Machine Reading Comprehension

The Frame-based Neural Network for Machine Reading Comprehension (FNN-MRC) architecture comprises three key components: raw context representation, Frame-based context representation, and answer prediction. The architecture is illustrated in Fig. 8. In this section, we provide the details of how the three components are implemented and explain how they work.
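Since Fig. 8 is not reproduced here, the following is only a schematic sketch of how three such components might fit together, with simple mean-pooling standing in for the paper's actual encoders (LSTM/BERT and the FSR model) and a dot-product scorer standing in for answer prediction; all function names and shapes are assumptions.

```python
import numpy as np

def raw_context_repr(token_vecs):
    """Stand-in for the raw context encoder (LSTM/BERT in the paper):
    here, a mean over each sentence's token embeddings."""
    return np.stack([s.mean(axis=0) for s in token_vecs])

def frame_based_context_repr(frame_vecs):
    """Stand-in for the FSR model: pool each sentence's Frame
    representations into one frame-aware sentence vector."""
    return np.stack([f.mean(axis=0) for f in frame_vecs])

def answer_prediction(ctx, frame_ctx, option_vecs):
    """Score each candidate answer against the fused context and
    return the index of the best-scoring option."""
    fused = np.concatenate([ctx.mean(axis=0), frame_ctx.mean(axis=0)])
    scores = option_vecs @ fused
    return int(scores.argmax())

# Toy inputs: 3 sentences of 5 tokens, 2 Frames per sentence, 4 options.
rng = np.random.default_rng(1)
d = 8
sent_tokens = [rng.normal(size=(5, d)) for _ in range(3)]
sent_frames = [rng.normal(size=(2, d)) for _ in range(3)]
options = rng.normal(size=(4, 2 * d))

ctx = raw_context_repr(sent_tokens)
fctx = frame_based_context_repr(sent_frames)
print(answer_prediction(ctx, fctx, options))  # index of the chosen option
```

In the paper's full model the sentence-level representations are further aggregated by a GRU before answer prediction; that step is omitted from this sketch.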

Experiments

In this section, we conduct comprehensive experiments to compare our FNN-MRC model with existing state-of-the-art techniques. To better analyze the performance of our FNN-MRC method on MRC, we consider two types of neural models: (i) a traditional deep learning method, LSTM [23], and (ii) a powerful pre-trained language model. For the pre-trained model, we use BERT as the backbone to illustrate how the proposed method works, given its superior performance in a range of MRC tasks.

Conclusion and future work

In this paper, we proposed a novel Frame-based Neural Network for Machine Reading Comprehension (FNN-MRC). Specifically, we utilize both Lexical Units (LUs) and Frame-to-Frame (F-to-F) relations to build the Frame representation model, and propose a novel Frame-based sentence representation model that integrates multi-Frame semantic information in order to facilitate sentence modeling. Our extensive experimental results across four datasets demonstrate that our proposed FNN-MRC works very well for the

CRediT authorship contribution statement

Shaoru Guo: Conceptualization, Methodology, Software, Writing - original draft. Yong Guan: Software, Investigation, Data curation. Hongye Tan: Resources, Validation. Ru Li: Supervision. Xiaoli Li: Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

We thank the anonymous reviewers for their insightful comments. This work was sponsored by the National Natural Science Foundation of China (No. 61936012, No. 61772324).

References (55)

  • Lu, J. et al. Structural property-aware multilayer network embedding for latent factor analysis, Pattern Recognit. (2018)
  • Fillmore, C.J. Frame semantics and the nature of language, Ann. New York Acad. Sci. (1976)
  • Baker, C.F. et al. The Berkeley FrameNet project
  • Gildea, D. et al. Automatic labeling of semantic roles, Comput. Linguist. (2002)
  • Das, D. et al. Frame-semantic parsing, Comput. Linguist. (2014)
  • Liu, S. et al. Leveraging FrameNet to improve automatic event detection
  • Burchardt, A. et al. Assessing the impact of frame semantics on textual entailment, Nat. Lang. Eng. (2009)
  • Ofoghi, B. et al. The impact of frame semantic annotation levels, frame-alignment techniques, and fusion methods on factoid answer processing, J. Am. Soc. Inf. Sci. Technol. (2009)
  • Chambers, N. et al. A database of narrative schemas
  • Zhang, X., Sun, X., Wang, H. Duplicate question identification by integrating FrameNet with neural networks, in: ...
  • Miller, G.A. WordNet: A lexical database for English, Commun. ACM (1995)
  • Palmer, M. et al. The Proposition Bank: An annotated corpus of semantic roles, Comput. Linguist. (2005)
  • Wang, H. et al. Machine comprehension with syntax, frames, and semantics
  • Hermann, K.M. et al. Teaching machines to read and comprehend
  • Kapashi, D. et al. Answering Reading Comprehension Using Memory Networks, Report for Stanford University Course CS224d (2015)
  • Devlin, J. et al. BERT: Pre-training of deep bidirectional transformers for language understanding (2018)
  • Hermann, K.M. et al. Multilingual models for compositional distributed semantics (2014)
  • Bojanowski, P. et al. Enriching word vectors with subword information, Trans. Assoc. Comput. Linguist. (2017)
  • Glavas, G. et al. How to (properly) evaluate cross-lingual word embeddings: On strong baselines, comparative analyses, and some misconceptions (2019)
  • Bahdanau, D. et al. Neural machine translation by jointly learning to align and translate (2014)
  • Richardson, M. et al. MCTest: A challenge dataset for the open-domain machine comprehension of text
  • Rajpurkar, P. et al. SQuAD: 100,000+ questions for machine comprehension of text (2016)
  • Trischler, A. et al. NewsQA: A machine comprehension dataset
  • Hochreiter, S. et al. Long short-term memory, Neural Comput. (1997)
  • Cui, Y. et al. Consensus attention-based neural networks for Chinese reading comprehension (2016)
  • Taylor, W.L. "Cloze procedure": A new tool for measuring readability, Journalism Q. (1953)
  • Joshi, M. et al. TriviaQA: A large scale distantly supervised challenge dataset for reading comprehension (2017)

The code (and data) in this article has been certified as Reproducible by Code Ocean: https://help.codeocean.com/en/articles/1120151-code-ocean-s-verification-process-for-computational-reproducibility. More information on the Reproducibility Badge Initiative is available at www.elsevier.com/locate/knosys.
