
Somun: entity-centric summarization incorporating pre-trained language models

Original Article · Neural Computing and Applications

Abstract

Text summarization addresses the problem of capturing the essential information in a large volume of text. Existing methods depend either on end-to-end models or on hand-crafted preprocessing steps. In this study, we propose an entity-centric summarization method that extracts named entities and builds a small graph from the output of a dependency parser. To extract entities, we employ well-known pre-trained language models. After generating the graph, we perform summarization by ranking the entities with the harmonic centrality algorithm. Experiments show that we outperform state-of-the-art unsupervised baselines, improving ROUGE-1 scores by more than 10% and ROUGE-2 scores by more than 50%. Moreover, we achieve results comparable to recent end-to-end models.
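To make the pipeline concrete, the following is a minimal sketch of an entity-centric summarizer built from the tools cited in the notes below (Stanza for named entity recognition and dependency parsing, NetworkX for harmonic centrality). The linking rule used here (connecting entities that co-occur in a sentence) and the final sentence-selection step are illustrative assumptions, not the exact procedure of the paper.

    # Illustrative sketch of an entity-centric summarizer: NER and dependency
    # parsing with Stanza (notes 1 and 4), entity graph and harmonic centrality
    # with NetworkX (note 5). The co-occurrence linking rule and the sentence
    # selection step are assumptions for illustration only.
    import networkx as nx
    import stanza

    # stanza.download("en")  # one-time model download
    nlp = stanza.Pipeline("en", processors="tokenize,pos,lemma,depparse,ner",
                          verbose=False)

    def summarize(text, top_k_entities=5, max_sentences=3):
        doc = nlp(text)
        graph = nx.Graph()
        sentences = []
        for sent in doc.sentences:
            sent_text = " ".join(w.text for w in sent.words)
            ents = [e.text for e in sent.ents]
            sentences.append((sent_text, set(ents)))
            graph.add_nodes_from(ents)
            # Connect entities appearing in the same parsed sentence.
            for i, a in enumerate(ents):
                for b in ents[i + 1:]:
                    graph.add_edge(a, b)
        if graph.number_of_nodes() == 0:
            return ""
        # Rank entities by harmonic centrality and keep sentences that mention
        # at least one of the top-ranked entities.
        scores = nx.harmonic_centrality(graph)
        top = set(sorted(scores, key=scores.get, reverse=True)[:top_k_entities])
        picked = [s for s, sent_ents in sentences if sent_ents & top]
        return " ".join(picked[:max_sentences])

One practical reason to rank with harmonic centrality is that it remains well defined on disconnected graphs, which small entity graphs frequently are.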



Notes

  1. https://stanfordnlp.github.io/stanza/index.html.

  2. https://huggingface.co/albert-base-v2.

  3. https://huggingface.co/transformers/index.html.

  4. https://stanfordnlp.github.io/stanza/index.html.

  5. https://networkx.github.io.

  6. https://pypi.org/project/pyrouge/0.1.3/.

  7. https://huggingface.co/transformers/pretrained_models.html.

  8. https://spacy.io/.
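The ROUGE-1 and ROUGE-2 comparisons mentioned in the abstract are commonly computed with the pyrouge wrapper linked in note 6. A minimal setup is sketched below; it assumes the ROUGE-1.5.5 Perl toolkit is installed and registered with pyrouge, and the directories and file-name patterns are placeholders for illustration.

    # Hypothetical evaluation setup with the pyrouge wrapper (note 6). Assumes
    # ROUGE-1.5.5 is installed and registered via pyrouge_set_rouge_path; the
    # directories and file-name patterns below are placeholders.
    from pyrouge import Rouge155

    rouge = Rouge155()
    rouge.system_dir = "output/system"        # generated summaries: doc.1.txt, ...
    rouge.model_dir = "output/reference"      # gold summaries: doc.1.txt, ...
    rouge.system_filename_pattern = r"doc.(\d+).txt"
    rouge.model_filename_pattern = "doc.#ID#.txt"

    raw_output = rouge.convert_and_evaluate()
    scores = rouge.output_to_dict(raw_output)
    print(scores["rouge_1_f_score"], scores["rouge_2_f_score"])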


Author information


Corresponding author

Correspondence to Emrah Inan.

Ethics declarations

Conflict of interest

The author declares that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Inan, E. Somun: entity-centric summarization incorporating pre-trained language models. Neural Comput & Applic 33, 5301–5311 (2021). https://doi.org/10.1007/s00521-020-05319-2

