Does semantics aid syntax? An empirical study on named entity recognition and classification

  • S.I. : WorldCIST’20
Neural Computing and Applications

Abstract

Many researchers jointly model multiple linguistic tasks (e.g., named entity recognition with named entity classification, or syntactic parsing with semantic parsing) under the implicit assumption that the individual tasks enhance each other through joint modeling. Before conducting such research, however, researchers rarely examine whether this assumption holds. In this paper, we empirically examine whether named entity classification improves the performance of named entity recognition, as a case study of whether semantics improves the performance of a syntactic task. To this end, we first specify how to determine whether a linguistic task is a syntactic task or a semantic task according to both syntactic theory and semantic theory. We then design and conduct extensive experiments on two well-known benchmark datasets using three representative yet diverse state-of-the-art models. Experimental results demonstrate that named entity recognition does not lie at the semantic level: it is a syntactic task, not a semantic one. They also demonstrate that jointly modeling named entity recognition and classification does not improve the performance of named entity recognition. Finally, the results show that traditional handcrafted feature models can achieve state-of-the-art performance on named entity recognition in comparison with the auto-learned feature model.

Notes

  1. Term clarification: in this paper, named entity recognition (NER) denotes the task of recognizing named entities in unstructured text; named entity classification (NEC) denotes the task of classifying the recognized named entities into predefined categories; and named entity recognition and classification (NERC) denotes the task of treating NER and NEC as an end-to-end joint task.

  2. Language context contains both syntactic and semantic information, and statistical models (e.g., word embeddings [25, 40, 47]) can learn both kinds of information from context. A model that is optimized for NER aims to learn the syntactic information from context rather than the semantic information, while a model that is optimized for NEC aims to learn the semantic information from context. In this paper, we are mainly concerned with the impact on NER performance of the semantic information that is learned from context.

  3. The 18 entity types of the OntoNotes5 dataset are CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, and WORK_OF_ART.

  4. The entity types we remove from the OntoNotes5 dataset to derive the OntoNotes* dataset are CARDINAL, DATE, MONEY, ORDINAL, PERCENT, QUANTITY, and TIME.

  5. The official version is written in Perl: http://www.cnts.ua.ac.be/conll2000/chunking/conlleval.txt; an alternative version written in Python can be found at https://github.com/spyysalo/conlleval.py

  6. The syntactic information from word embeddings does not improve the NER performance, because \(\mathrm{UGTO}_{E1}\) already leverages sufficient lexical and syntactic information (including the syntactic information learned from context), which covers the syntactic information from word embeddings.

  7. In fact, the syntactic information that is carried by the POS tags is also learned from context.
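
The terms in Notes 1, 3, and 4 and the span-level scoring mentioned in Note 5 can be illustrated with a minimal Python sketch. It assumes BIO-style tags such as B-PERSON and I-PERSON, which this page does not specify. The function names and the placeholder label ENT are hypothetical and introduced only for illustration. The scoring function is a simplified stand-in for exact-match span F1, not the official conlleval script.

```python
from typing import List, Set, Tuple

# Entity types removed from OntoNotes5 to derive OntoNotes* (Note 4).
REMOVED_TYPES: Set[str] = {
    "CARDINAL", "DATE", "MONEY", "ORDINAL", "PERCENT", "QUANTITY", "TIME",
}


def drop_types(tags: List[str], removed: Set[str] = REMOVED_TYPES) -> List[str]:
    """Relabel tokens whose entity type is in `removed` as 'O'."""
    return ["O" if tag != "O" and tag.split("-", 1)[1] in removed else tag
            for tag in tags]


def to_ner_tags(tags: List[str]) -> List[str]:
    """Collapse typed NERC tags (B-PERSON) into untyped NER tags (B-ENT).

    'ENT' is a hypothetical placeholder label, not taken from the paper.
    """
    return [tag if tag == "O" else tag.split("-", 1)[0] + "-ENT" for tag in tags]


def spans(tags: List[str]) -> Set[Tuple[int, int, str]]:
    """Extract (start, end_exclusive, type) spans from a BIO tag sequence."""
    found: Set[Tuple[int, int, str]] = set()
    start, kind = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        prefix, _, label = tag.partition("-")
        continues = prefix == "I" and label == kind
        if start is not None and not continues:
            found.add((start, i, kind))
            start, kind = None, None
        # A new span starts at B-*, or at an I-* that continues nothing.
        if prefix == "B" or (prefix == "I" and start is None):
            start, kind = i, label
    return found


def span_f1(gold: List[str], pred: List[str]) -> float:
    """Exact-match span F1: a span counts only if boundaries and type agree."""
    g, p = spans(gold), spans(pred)
    correct = len(g & p)
    if correct == 0:
        return 0.0
    precision, recall = correct / len(p), correct / len(g)
    return 2 * precision * recall / (precision + recall)


# Toy example: the prediction finds every boundary but mistypes the last entity.
gold = ["B-PERSON", "I-PERSON", "O", "B-DATE", "O", "B-GPE"]
pred = ["B-PERSON", "I-PERSON", "O", "B-DATE", "O", "B-ORG"]

gold_star, pred_star = drop_types(gold), drop_types(pred)
nerc_f1 = span_f1(gold_star, pred_star)                           # typed (NERC)
ner_f1 = span_f1(to_ner_tags(gold_star), to_ner_tags(pred_star))  # untyped (NER)
print(nerc_f1, ner_f1)
```

In this toy example the typed score penalizes the GPE/ORG confusion while the untyped score does not, which is the distinction between evaluating NERC and evaluating NER alone.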

References

  1. Al-Smadi M, Al-Zboon S, Jararweh Y, Juola P (2020) Transfer learning for Arabic named entity recognition with deep neural networks. IEEE Access 99(8):37736–37745

  2. Alex B, Haddow B, Grover C (2007) Recognising nested named entities in biomedical text. In: Proceedings of the workshop on BioNLP 2007: biological, translational, and clinical language processing, pp 65–72

  3. Borthwick A, Sterling J, Agichtein E, Grishman R (1998) NYU: description of the MENE named entity system as used in MUC-7. In: Proceedings of the 7th message understanding conference

  4. Chinchor NA (1997) MUC-7 named entity task definition. In: Proceedings of the 7th message understanding conference, vol 29

  5. Chomsky N (1957) Syntactic structures. Mouton Publishers, Berlin

  6. Chomsky N (1965) Aspects of the theory of syntax. MIT Press, Cambridge

  7. Collins M, Singer Y (1999) Unsupervised models for named entity classification. In: Proceedings of the 1999 joint SIGDAT conference on empirical methods in natural language processing and very large corpora

  8. Collobert R, Weston J, Bottou L, Karlen M, Kavukcuoglu K, Kuksa PP (2011) Natural language processing (almost) from scratch. J Mach Learn Res 12:2493–2537

  9. Dashtipour K, Gogate M, Adeel A, Howard AAN, Hussain A (2017) Persian named entity recognition. In: 2017 IEEE 16th international conference on cognitive informatics and cognitive computing, pp 79–83

  10. Devlin J, Chang MW, Lee K, Toutanova K (2018) BERT: pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805

  11. Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, vol 1. Association for Computational Linguistics, Minneapolis, pp 4171–4186

  12. Doddington G, Mitchell A, Przybocki M, Ramshaw L, Strassel S, Weischedel R (2004) The automatic content extraction (ACE) program tasks, data, and evaluation. In: Proceedings of the 2004 conference on language resources and evaluation, pp 1–4

  13. Dowty DR, Wall RE, Peters S (1981) Introduction to Montague semantics. Reidel, Dordrecht

  14. Finkel JR, Manning C (2009) Nested named entity recognition. In: Proceedings of the 2009 conference on empirical methods in natural language processing, pp 141–150

  15. Finkel JR, Grenager T, Manning C (2005) Incorporating non-local information into information extraction systems by Gibbs sampling. In: Proceedings of the 43rd annual meeting of the association for computational linguistics, pp 363–370

  16. Gildea D, Palmer M (2002) The necessity of parsing for predicate argument recognition. In: Proceedings of the 40th annual meeting on association for computational linguistics, pp 239–246

  17. Giuliano C (2009) Fine-grained classification of named entities exploiting latent semantic kernels. In: CoNLL

  18. Grishman R, Sundheim B (1996) Message understanding conference-6: a brief history. In: Proceedings of the 16th international conference on computational linguistics

  19. Hajič J, Ciaramita M, Johansson R, Kawahara D, Martí MA, Màrquez L, Meyers A, Nivre J, Padó S, Štěpánek P, Surdeanu M, Xue N, Zhang Y (2009) The CoNLL-2009 shared task: syntactic and semantic dependencies in multiple languages. In: Proceedings of the 13th conference on computational natural language learning, pp 1–18

  20. Hashimoto K, Xiong C, Tsuruoka Y, Socher R (2017) A joint many-task model: growing a neural network for multiple NLP tasks. In: Proceedings of the 2017 conference on empirical methods in natural language processing, pp 1923–1933

  21. Henderson J, Merlo P, Titov I, Musillo G (2013) Multilingual joint parsing of syntactic and semantic dependencies with a latent variable model. Comput Linguist 39(4):949–998

  22. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9:1735–1780

  23. Huang Z, Xu W, Yu K (2015) Bidirectional LSTM-CRF models for sequence tagging. https://arxiv.org/abs/1508.01991v1

  24. Johansson R, Nugues P (2008) Dependency-based syntactic-semantic analysis with PropBank and NomBank. In: Proceedings of the 12th conference on computational natural language learning, pp 183–187

  25. Joulin A, Grave E, Bojanowski P, Mikolov T (2017) Bag of tricks for efficient text classification. In: Proceedings of the 15th conference of the European chapter of the association for computational linguistics, pp 427–431

  26. Katz JJ, Fodor JA (1963) The structure of a semantic theory. Language 39(2):170–210

  27. Kazama J, Torisawa K (2007) Exploiting Wikipedia as external knowledge for named entity recognition. In: Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning, pp 698–707

  28. Lafferty J, McCallum A, Pereira F (2001) Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of international conference on machine learning, pp 281–289

  29. Lample G, Ballesteros M, Subramanian S, Kawakami K, Dyer C (2016) Neural architectures for named entity recognition. In: Proceedings of the 15th annual conference of the North American chapter of the association for computational linguistics, pp 260–270

  30. Liang P (2005) Semi-supervised learning for natural language. Master’s thesis, Massachusetts Institute of Technology

  31. Ling W, Dyer C, Black AW, Trancoso I, Fermandez R, Amir S, Marujo L, Luis T (2015) Finding function in form: compositional character models for open vocabulary word representation. In: Proceedings of the 2015 conference on empirical methods in natural language processing, pp 1520–1530

  32. Ling X, Weld DS (2012) Fine-grained entity recognition. In: Proceedings of the twenty-sixth conference on artificial intelligence

  33. Liu L, Shang J, Ren X, Xu FF, Gui H, Peng J, Han J (2018) Empower sequence labeling with task-aware neural language model. In: Proceedings of the 32nd AAAI conference on artificial intelligence

  34. Liu X, Zhang S, Wei F, Zhou M (2011) Recognizing named entities in tweets. In: Proceedings of the 49th annual meeting of the association for computational linguistics, pp 359–367

  35. Lluís X, Carreras X, Màrquez L (2013) Joint arc-factored parsing of syntactic and semantic dependencies. Trans Assoc Comput Linguist 1:219–230

  36. Luo G, Huang X, Lin CY, Nie Z (2015) Joint named entity recognition and disambiguation. In: Proceedings of the 2015 conference on empirical methods in natural language processing, pp 879–888

  37. Ma X, Hovy E (2016) End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF. In: Proceedings of the 54th annual meeting of the association for computational linguistics, pp 1064–1074

  38. Maynard D, Tablan V, Ursu C, Cunningham H, Wilks Y (2001) Named entity recognition from diverse text types. In: Proceedings of 2001 recent advances in natural language processing conference, pp 257–274

  39. McCallum A, Li W (2003) Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons. In: Proceedings of the 7th conference on computational natural language learning

  40. Mikolov T, Sutskever I, Chen K, Corrado G, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Proceedings of 27th conference on neural information processing systems, pp 3111–3119

  41. Mikolov T, Grave E, Bojanowski P, Puhrsch C, Joulin A (2018) Advances in pre-training distributed word representations. In: Proceedings of the international conference on language resources and evaluation

  42. Montague R (1970) English as a formal language. Linguaggi nella Societa nella Tecnica, pp 189–224

  43. Montague R (1973) The proper treatment of quantification in ordinary English. In: Approaches to natural language, pp 221–242

  44. Moon S, Lee G, Chi S, Oh H (2021) Automated construction specification review with named entity recognition using natural language processing. J Construct Eng Manag 147(1):04020147

  45. Nadeau D, Sekine S (2007) A survey of named entity recognition and classification. Lingvisticae Investigationes 30(1):3–26

  46. Nakashole N, Tylenda T, Weikum G (2013) Fine-grained semantic typing of emerging entities. In: Proceedings of the 51st annual meeting of the association for computational linguistics, pp 1488–1497

  47. Pennington J, Socher R, Manning C (2014) GloVe: global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing, pp 1532–1543

  48. Peters ME, Ammar W, Bhagavatula C, Power R (2017) Semi-supervised sequence tagging with bidirectional language models. In: Proceedings of the 55th annual meeting of the association for computational linguistics, pp 1756–1765

  49. Poibeau T, Kosseim L (2001) Proper name extraction from non-journalistic texts. Lang Comput 37:144–157

  50. Pradhan S, Moschitti A, Xue N, Ng HT, Bjorkelund A, Uryupina O, Zhang Y, Zhong Z (2013) Towards robust linguistic analysis using OntoNotes. In: Proceedings of the 7th conference on computational natural language learning, pp 143–152

  51. Punyakanok V, Roth D, Yih W (2005) The necessity of syntactic parsing for semantic role labeling. In: Proceedings of the 19th international joint conference on artificial intelligence, pp 1117–1123

  52. Punyakanok V, Roth D, Yih W (2007) The importance of syntactic parsing and inference in semantic role labeling. Comput Linguist 6(9):1–30

  53. Pustejovsky J, Castano J, Ingria R, Sauri R, Gaizauskas R, Setzer A, Katz G, Radev D (2003a) TimeML: robust specification of event and temporal expressions in text. New Direct Question Answer 3:28–34

  54. Pustejovsky J, Hanks P, Sauri R, See A, Gaizauskas R, Setzer A, Sundheim B, Radev D, Day D, Ferro L, Lazo M (2003b) The TimeBank corpus. Corpus Linguist 2003:647–656

  55. Radford A, Narasimhan K, Salimans T, Sutskever I (2018) Improving language understanding by generative pre-training

  56. Ratinov L, Roth D (2009) Design challenges and misconceptions in named entity recognition. In: Proceedings of the thirteenth conference on computational natural language learning. Association for Computational Linguistics, Boulder, Colorado, USA, pp 147–155

  57. Ritter A, Clark S, Mausam, Etzioni O (2011) Named entity recognition in tweets: an experimental study. In: Proceedings of the 2011 conference on empirical methods in natural language processing, pp 1524–1534

  58. Sang EFTK, Meulder FD (2003) Introduction to the CoNLL-2003 shared task: language-independent named entity recognition. In: Proceedings of the 7th conference on natural language learning, pp 142–147

  59. Santos CND, Guimaraes V (2015) Boosting named entity recognition with neural character embeddings. In: Proceedings of the 5th named entities workshop, pp 25–33

  60. Shi P, Zhang Y (2017) Joint bi-affine parsing and semantic role labeling. In: Proceedings of the 2017 international conference on Asian language processing, pp 338–341

  61. Silva JFD, Kozareva Z, Lopes JGP (2004) Cluster analysis and classification of named entities. In: Proceedings of the 2004 conference on language resources and evaluation

  62. Strubell E, Verga P, Belanger D, McCallum A (2017) Fast and accurate entity recognition with iterated dilated convolutions. In: Proceedings of the 2017 conference on empirical methods in natural language processing, pp 2670–2680

  63. Surdeanu M, Johansson R, Meyers A, Màrquez L, Nivre J (2008) The CoNLL-2008 shared task on joint parsing of syntactic and semantic dependencies. In: Proceedings of the 12th conference on computational natural language learning, pp 159–177

  64. Sutton C, McCallum A (2005) Joint parsing and semantic role labeling. In: Proceedings of the 9th conference on computational natural language learning, pp 225–229

  65. Swayamdipta S, Ballesteros M, Dyer C, Smith NA (2016) Greedy, joint syntactic-semantic parsing with stack LSTMs. In: Proceedings of the 20th SIGNLL conference on computational natural language learning, pp 187–197

  66. Swayamdipta S, Thomson S, Lee K, Zettlemoyer L, Dyer C, Smith NA (2018) Syntactic scaffolds for semantic structures. In: Proceedings of the 2018 conference on empirical methods in natural language processing, pp 3772–3782

  67. UzZaman N, Llorens H, Derczynski L, Verhagen M, Allen J, Pustejovsky J (2013) SemEval-2013 task 1: TempEval-3: evaluating time expressions, events, and temporal relations. In: Proceedings of the 7th international workshop on semantic evaluation, pp 1–9

  68. Verhagen M, Gaizauskas R, Schilder F, Hepple M, Katz G, Pustejovsky J (2007) SemEval-2007 task 15: TempEval temporal relation identification. In: Proceedings of the 4th international workshop on semantic evaluation, pp 75–80

  69. Verhagen M, Sauri R, Caselli T, Pustejovsky J (2010) SemEval-2010 task 13: TempEval-2. In: Proceedings of the 5th international workshop on semantic evaluation, pp 57–62

  70. Wang LJ, Li WC, Chang CH (1992) Recognizing unregistered names for Mandarin word identification. In: Proceedings of the 14th conference on computational linguistics, vol 4, pp 1239–1243

  71. Yadav V, Bethard S (2018) A survey on recent advances in named entity recognition from deep learning models. In: Proceedings of the 27th international conference on computational linguistics, pp 2145–2158

  72. Zhong X (2020) Time expression and named entity analysis and recognition. PhD thesis, Nanyang Technological University

  73. Zhong X, Cambria E (2018) Time expression recognition using a constituent-based tagging scheme. In: Proceedings of the 2018 world wide web conference, pp 983–992

  74. Zhong X, Sun A, Cambria E (2017) Time expression analysis and recognition using syntactic token types and general heuristic rules. In: Proceedings of the 55th annual meeting of the association for computational linguistics, pp 420–429

  75. Zhong X, Cambria E, Hussain A (2020) Extracting time expressions and named entities with constituent-based tagging schemes. Cognit Comput 12(4):844–862

Acknowledgements

The authors would like to thank the anonymous reviewers for their valuable feedback, which helped us improve the quality of our manuscript. This research/project is supported by A*STAR under its Industry Alignment Fund (LOA Award I1901E0046).

Author information

Corresponding author

Correspondence to Erik Cambria.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhong, X., Cambria, E. & Hussain, A. Does semantics aid syntax? An empirical study on named entity recognition and classification. Neural Comput & Applic 34, 8373–8384 (2022). https://doi.org/10.1007/s00521-021-05949-0

  • DOI: https://doi.org/10.1007/s00521-021-05949-0
