
PH-model: enhancing multi-passage machine reading comprehension with passage reranking and hierarchical information

Published in Applied Intelligence

Abstract

Machine reading comprehension (MRC), in which computers answer questions from given passages, is a popular research field. Natural language exhibits an inherent hierarchical structure: characters, words, phrases, sentences, paragraphs, and documents. Prior studies have demonstrated that such hierarchical information helps machines understand natural language, yet most work on MRC has focused on overall task performance without exploiting it. In addition, the noise introduced by irrelevant passages has not been adequately addressed, even though many researchers have adopted passage reranking. Thus, in this paper, focusing on noise reduction and the extraction of hierarchical information, we propose PH-Model, a model that combines a passage reranking framework (P) with a hierarchical neural network (H) for the Chinese multi-passage MRC task. PH-Model produces more precise answers by filtering noisy passages and extracting hierarchical information. Experimental results on the DuReader 2.0 dataset, a large-scale real-world Chinese MRC dataset, show that PH-Model outperforms the baseline by 18.24% in ROUGE-L and 24.17% in BLEU-4.
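To make the passage reranking idea concrete, the sketch below scores candidate passages against the question and keeps only the top-k before answer extraction, which is the noise-reduction step the abstract describes. This is a minimal illustration only: the TF-IDF scorer, the `rerank_passages` function, and the `top_k` parameter are assumptions made for this example, standing in for the paper's learned neural reranker.

```python
# Minimal sketch of question-aware passage reranking for multi-passage MRC.
# NOTE: TF-IDF cosine similarity is an illustrative stand-in; the actual
# PH-Model reranker is a learned neural component (see the paper).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def rerank_passages(question: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k passages most relevant to the question."""
    vectorizer = TfidfVectorizer()
    # Fit on the question plus all passages so they share one vocabulary.
    matrix = vectorizer.fit_transform([question] + passages)
    # Cosine similarity between the question (row 0) and each passage.
    scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()
    ranked = sorted(zip(scores, passages), key=lambda x: x[0], reverse=True)
    return [p for _, p in ranked[:top_k]]


if __name__ == "__main__":
    question = "What dataset is PH-Model evaluated on?"
    passages = [
        "PH-Model is evaluated on the DuReader 2.0 dataset.",
        "Passage reranking reduces noise in multi-passage MRC.",
        "The weather in Melbourne was pleasant in July 2018.",
    ]
    # The irrelevant third passage is dropped, reducing downstream noise.
    print(rerank_passages(question, passages))
```

Downstream answer extraction then operates only on the retained passages, which is how reranking reduces the noise problem the abstract highlights.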


Notes

  1. https://ai.baidu.com/broad/subordinate?dataset=dureader

  2. The code is available at https://github.com/Jackcong1/ECMRC


Acknowledgements

This work was partially supported by the National Key R&D Program of China under Grant No. 2016YFB1200402-020. Any opinions, discussions, and conclusions in this material are those of the authors and do not necessarily reflect the views of the National Key R&D Program of China. We also thank the reviewers for their valuable and insightful comments.

Author information


Corresponding author

Correspondence to Yimin Wu.


Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Cong, Y., Wu, Y., Liang, X. et al. PH-model: enhancing multi-passage machine reading comprehension with passage reranking and hierarchical information. Appl Intell 51, 5440–5452 (2021). https://doi.org/10.1007/s10489-020-02168-3

