Abstract
Many researchers jointly model multiple linguistic tasks (e.g., named entity recognition together with named entity classification, or syntactic parsing together with semantic parsing) under the implicit assumption that the individual tasks enhance each other through joint modeling. Before undertaking such research, however, researchers rarely examine whether this assumption holds. In this paper, we empirically examine whether named entity classification improves the performance of named entity recognition, as a case study of whether semantics improves the performance of a syntactic task. To this end, we first specify how to determine whether a linguistic task is syntactic or semantic according to both syntactic theory and semantic theory. We then design and conduct extensive experiments on two well-known benchmark datasets using three representative yet diverse state-of-the-art models. The experimental results demonstrate that named entity recognition is not a semantic task but a syntactic one, and that jointly modeling named entity recognition and classification does not improve the performance of named entity recognition. The results also demonstrate that traditional handcrafted feature models can achieve state-of-the-art performance on named entity recognition when compared with auto-learned feature models.
Notes
Term clarification: in this paper, named entity recognition (NER) denotes the task of recognizing named entities in unstructured text; named entity classification (NEC) denotes the task of classifying the recognized named entities into predefined categories; and named entity recognition and classification (NERC) denotes the task of treating NER and NEC as an end-to-end joint task.
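To make the three terms concrete, the following minimal sketch contrasts them on one tagged sentence. It assumes a BIO tagging scheme and uses hypothetical helper names (`typed_spans`, `ner_spans`, `nec_labels`); it illustrates the definitions only and is not the implementation used in the experiments.

```python
from typing import List, Tuple


def typed_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """NERC view: decode typed BIO tags into (start, end, type) triples.

    Offsets are token indices and `end` is exclusive.
    """
    spans = []
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        prefix, _, label = tag.partition("-")
        # An open span ends at O, at a new B, or at an I with another type.
        if start is not None and (prefix != "I" or label != etype):
            spans.append((start, i, etype))
            start, etype = None, None
        # A span opens at B, or at a stray I with no open span.
        if prefix == "B" or (prefix == "I" and start is None):
            start, etype = i, label
    return spans


def ner_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """NER view: entity boundaries only, with the types discarded."""
    return [(start, end) for start, end, _ in typed_spans(tags)]


def nec_labels(tags: List[str]) -> List[str]:
    """NEC view: one category per already-recognized entity."""
    return [etype for _, _, etype in typed_spans(tags)]


# Hypothetical example sentence: "Barack Obama visited Paris"
example = ["B-PERSON", "I-PERSON", "O", "B-GPE"]
print(typed_spans(example))  # [(0, 2, 'PERSON'), (3, 4, 'GPE')]
print(ner_spans(example))    # [(0, 2), (3, 4)]
print(nec_labels(example))   # ['PERSON', 'GPE']
```

Under this reading, a NER model is scored only on the boundaries returned by `ner_spans`, whereas a NERC model must get both the boundaries and the types right.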
Language context contains both syntactic and semantic information, and statistical models (e.g., word embeddings [25, 40, 47]) can learn both kinds of information from context. A model optimized for NER aims to learn the syntactic information from context rather than the semantic information, while a model optimized for NEC aims to learn the semantic information from context. In this paper, we are mainly concerned with the impact on NER performance of the semantic information learned from context.
The 18 entity types of the OntoNotes5 dataset are CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, and WORK_OF_ART.
The entity types we remove from the OntoNotes5 dataset to derive the OntoNotes* dataset include CARDINAL, DATE, MONEY, ORDINAL, PERCENT, QUANTITY, and TIME.
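The derivation of OntoNotes* can be sketched as a filter over typed tags. The sketch assumes BIO tags, the standard OntoNotes label spelling PERCENT and PERSON, and that removing a type means relabeling its tokens as O; the note does not specify how the removal is carried out, and `drop_removed_types` is a hypothetical name.

```python
from typing import List

ONTONOTES5_TYPES = {
    "CARDINAL", "DATE", "EVENT", "FAC", "GPE", "LANGUAGE", "LAW", "LOC",
    "MONEY", "NORP", "ORDINAL", "ORG", "PERCENT", "PERSON", "PRODUCT",
    "QUANTITY", "TIME", "WORK_OF_ART",
}

# Types listed in the note as removed to derive OntoNotes*.
REMOVED_TYPES = {
    "CARDINAL", "DATE", "MONEY", "ORDINAL", "PERCENT", "QUANTITY", "TIME",
}

KEPT_TYPES = ONTONOTES5_TYPES - REMOVED_TYPES  # the 11 remaining types


def drop_removed_types(tags: List[str]) -> List[str]:
    """Relabel tokens of removed types as O (an assumption, see lead-in)."""
    # "O".partition("-")[2] is "", so plain O tags pass through unchanged.
    return ["O" if tag.partition("-")[2] in REMOVED_TYPES else tag
            for tag in tags]


print(sorted(KEPT_TYPES))
print(drop_removed_types(["B-DATE", "I-DATE", "O", "B-ORG"]))
# ['O', 'O', 'O', 'B-ORG']
```

The removed types are all numeric or temporal expressions, so the 11 kept types are the ones that name entities in the narrower sense.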
The official version is written in Perl: http://www.cnts.ua.ac.be/conll2000/chunking/conlleval.txt; an alternative version written in Python can be found at https://github.com/spyysalo/conlleval.py
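These scripts report exact-match precision, recall, and F1 over entity chunks. The sketch below computes the same quantities from sets of spans; it is a simplified stand-in under that assumption, not a port of either script, and `span_prf1` is a hypothetical name.

```python
from typing import Hashable, Iterable, Tuple


def span_prf1(gold: Iterable[Hashable],
              pred: Iterable[Hashable]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over two collections of spans.

    A span may be (start, end) for NER or (start, end, type) for NERC;
    a predicted span counts as correct only if it equals a gold span.
    """
    gold_set, pred_set = set(gold), set(pred)
    true_pos = len(gold_set & pred_set)
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical prediction with correct boundaries but one wrong type.
gold_typed = [(0, 2, "PERSON"), (3, 4, "GPE")]
pred_typed = [(0, 2, "PERSON"), (3, 4, "ORG")]

# NERC scoring: the mistyped entity counts as an error.
print(span_prf1(gold_typed, pred_typed))  # (0.5, 0.5, 0.5)

# NER scoring: types are dropped, so both boundaries are correct.
strip = lambda spans: [(s, e) for s, e, _ in spans]
print(span_prf1(strip(gold_typed), strip(pred_typed)))  # (1.0, 1.0, 1.0)
```

The example shows why the two tasks must be scored separately: a classification error lowers the NERC score without affecting the NER score.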
The syntactic information from word embeddings does not improve the NER performance, because \(\mathrm{UGTO}_{E1}\) already leverages sufficient lexical and syntactic information (including the syntactic information learned from context), which covers the syntactic information from word embeddings.
In fact, the syntactic information that is carried by the POS tags is also learned from context.
References
Al-Smadi M, Al-Zboon S, Jararweh Y, Juola P (2020) Transfer learning for Arabic named entity recognition with deep neural networks. IEEE Access 99(8):37736–37745
Alex B, Haddow B, Grover C (2007) Recognising nested named entities in biomedical text. In: Proceedings of the workshop on BioNLP 2007: biological, translational, and clinical language processing, pp 65–72
Borthwick A, Sterling J, Agichtein E, Grishman R (1998) Nyu: description of the mene named entity system as used in muc-7. In: Proceedings of the 7th message understanding conference
Chinchor NA (1997) Muc-7 named entity task definition. In: Proceedings of the 7th message understanding conference, vol 29
Chomsky N (1957) Syntactic structures. Mouton Publishers, Berlin
Chomsky N (1965) Aspects of the theory of syntax. MIT Press, Cambridge
Collins M, Singer Y (1999) Unsupervised models for named entity classification. In: Proceedings of the 1999 joint SIGDAT conference on empirical methods in natural language processing and very large corpora
Collobert R, Weston J, Bottou L, Karlen M, Kavukcuoglu K, Kuksa PP (2011) Natural language processing (almost) from scratch. J Mach Learn Res 12:2493–2537
Dashtipour K, Gogate M, Adeel A, Howard AAN, Hussain A (2017) Persian named entity recognition. In: 2017 IEEE 16th international conference on cognitive informatics and cognitive computing, pp 79–83
Devlin J, Chang MW, Lee K, Toutanova K (2018) Bert: pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805
Devlin J, Chang MW, Lee K, Toutanova K (2019) Bert: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, vol 1. Association for Computational Linguistics, Minneapolis, pp 4171–4186
Doddington G, Mitchell A, Przybocki M, Ramshaw L, Strassel S, Weischedel R (2004) The automatic content extraction (ace) program tasks, data, and evaluation. In: Proceedings of the 2004 conference on language resources and evaluation, pp 1–4
Dowty DR, Wall RE, Peters S (1981) Introduction to montague semantics. Reidel, Dordrecht
Finkel JR, Manning C (2009) Nested named entity recognition. In: Proceedings of the 2009 conference on empirical methods in natural language processing, pp 141–150
Finkel JR, Grenager T, Manning C (2005) Incorporating non-local information into information extraction systems by Gibbs sampling. In: Proceedings of the 43rd annual meeting of the association for computational linguistics, pp 363–370
Gildea D, Palmer M (2002) The necessity of parsing for predicate argument recognition. In: Proceedings of the 40th annual meeting on association for computational linguistics, pp 239–246
Giuliano C (2009) Fine-grained classification of named entities exploiting latent semantic kernels. In: CoNLL
Grishman R, Sundheim B (1996) Message understanding conference-6: a brief history. In: Proceedings of the 16th international conference on computational linguistics
Hajič J, Ciaramita M, Johansson R, Kawahara D, Martí MA, Màrquez L, Meyers A, Nivre J, Padó S, Štěpánek P, Surdeanu M, Xue N, Zhang Y (2009) The conll-2009 shared task: syntactic and semantic dependencies in multiple languages. In: Proceedings of the 13th conference on computational natural language learning, pp 1–18
Hashimoto K, Xiong C, Tsuruoka Y, Socher R (2017) A joint many-task model: growing a neural network for multiple nlp tasks. In: Proceedings of the 2017 conference on empirical methods in natural language processing, pp 1923–1933
Henderson J, Merlo P, Titov I, Musillo G (2013) Multilingual joint parsing of syntactic and semantic dependencies with a latent variable model. Comput Linguist 39(4):949–998
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9:1735–1780
Huang Z, Xu W, Yu K (2015) Bidirectional lstm-crf models for sequence tagging. https://arxiv.org/abs/1508.01991v1
Johansson R, Nugues P (2008) Dependency-based syntactic-semantic analysis with propbank and nombank. In: Proceedings of the 12th conference on computational natural language learning, pp 183–187
Joulin A, Grave E, Bojanowski P, Mikolov T (2017) Bag of tricks for efficient text classification. In: Proceedings of the 15th conference of the european chapter of the association for computational linguistics, pp 427–431
Katz JJ, Fodor JA (1963) The structure of a semantic theory. Language 39(2):170–210
Kazama J, Torisawa K (2007) Exploiting wikipedia as external knowledge for named entity recognition. In: Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning, pp 698–707
Lafferty J, McCallum A, Pereira F (2001) Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of international conference on machine learning, pp 281–289
Lample G, Ballesteros M, Subramanian S, Kawakami K, Dyer C (2016) Neural architectures for named entity recognition. In: Proceedings of the 15th annual conference of the north american chapter of the association for computational linguistics, pp 260–270
Liang P (2005) Semi-supervised learning for natural language. Master’s thesis, Massachusetts Institute of Technology
Ling W, Dyer C, Black AW, Trancoso I, Fermandez R, Amir S, Marujo L, Luis T (2015) Finding function in form: compositional character models for open vocabulary word representation. In: Proceedings of the 2015 conference on empirical methods in natural language processing, pp 1520–1530
Ling X, Weld DS (2012) Fine-grained entity recognition. In: Proceedings of the twenty-sixth conference on artificial intelligence
Liu L, Shang J, Ren X, Xu FF, Gui H, Peng J, Han J (2018) Empower sequence labeling with task-aware neural language model. In: Proceedings of the 32nd AAAI conference on artificial intelligence
Liu X, Zhang S, Wei F, Zhou M (2011) Recognizing named entities in tweets. In: Proceedings of the 49th annual meeting of the association for computational linguistics, pp 359–367
Lluís X, Carreras X, Màrquez L (2013) Joint arc-factored parsing of syntactic and semantic dependencies. Trans Assoc Comput Linguist 1:219–230
Luo G, Huang X, Lin CY, Nie Z (2015) Joint named entity recognition and disambiguation. In: Proceedings of the 2015 conference on empirical methods in natural language processing, pp 879–888
Ma X, Hovy E (2016) End-to-end sequence labeling via bi-directional lstm-cnns-crf. In: Proceedings of the 54th annual meeting of the association for computational linguistics, pp 1064–1074
Maynard D, Tablan V, Ursu C, Cunningham H, Wilks Y (2001) Named entity recognition from diverse text types. In: Proceedings of 2001 recent advances in natural language processing conference, pp 257–274
McCallum A, Li W (2003) Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons. In: Proceedings of the 7th conference on computational natural language learning
Mikolov T, Sutskever I, Chen K, Corrado G, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Proceedings of 27th conference on neural information processing systems, pp 3111–3119
Mikolov T, Grave E, Bojanowski P, Puhrsch C, Joulin A (2018) Advances in pre-training distributed word representations. In: Proceedings of the international conference on language resources and evaluation
Montague R (1970) English as a formal language. Linguaggi nella Societa nella Tecnica, pp 189–224
Montague R (1973) The proper treatment of quantification in ordinary English. In: Approaches to natural language, pp 221–242
Moon S, Lee G, Chi S, Oh H (2021) Automated construction specification review with named entity recognition using natural language processing. J Construct Eng Manag 147(1):04020147
Nadeau D, Sekine S (2007) A survey of named entity recognition and classification. Lingvisticae Investigationes 30(1):3–26
Nakashole N, Tylenda T, Weikum G (2013) Fine-grained semantic typing of emerging entities. In: Proceedings of the 51st annual meeting of the association for computational linguistics, pp 1488–1497
Pennington J, Socher R, Manning C (2014) Glove: Global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing, pp 1532–1543
Peters ME, Ammar W, Bhagavatula C, Power R (2017) Semi-supervised sequence tagging with bidirectional language models. In: Proceedings of the 55th annual meeting of the association for computational linguistics, pp 1756–1765
Poibeau T, Kosseim L (2001) Proper name extraction from non-journalistic texts. Lang Comput 37:144–157
Pradhan S, Moschitti A, Xue N, Ng HT, Bjorkelund A, Uryupina O, Zhang Y, Zhong Z (2013) Towards robust linguistic analysis using ontonotes. In: Proceedings of the 7th conference on computational natural language learning, pp 143–152
Punyakanok V, Roth D, Yih W (2005) The necessity of syntactic parsing for semantic role labeling. In: Proceedings of the 19th international joint conference on artificial intelligence, pp 1117–1123
Punyakanok V, Roth D, Yih W (2007) The importance of syntactic parsing and inference in semantic role labeling. Comput Linguist 6(9):1–30
Pustejovsky J, Castano J, Ingria R, Sauri R, Gaizauskas R, Setzer A, Katz G, Radev D (2003a) Timeml: Robust specification of event and temporal expressions in text. New Direct Question Answer 3:28–34
Pustejovsky J, Hanks P, Sauri R, See A, Gaizauskas R, Setzer A, Sundheim B, Radev D, Day D, Ferro L, Lazo M (2003b) The timebank corpus. Corpus Linguist 2003:647–656
Radford A, Narasimhan K, Salimans T, Sutskever I (2018) Improving language understanding by generative pre-training
Ratinov L, Roth D (2009) Design challenges and misconceptions in named entity recognition. In: Proceedings of the thirteenth conference on computational natural language learning. Association for Computational Linguistics, Boulder, Colorado, USA, pp 147–155
Ritter A, Clark S, Mausam, Etzioni O (2011) Named entity recognition in tweets: an experimental study. In: Proceedings of the 2011 conference on empirical methods in natural language processing, pp 1524–1534
Sang EFTK, Meulder FD (2003) Introduction to the conll-2003 shared task: language-independent named entity recognition. In: Proceedings of the 7th conference on natural language learning, pp 142–147
Santos CND, Guimaraes V (2015) Boosting named entity recognition with neural character embeddings. In: Proceedings of the 5th named entities workshop, pp 25–33
Shi P, Zhang Y (2017) Joint bi-affine parsing and semantic role labeling. In: Proceedings of the 2017 international conference on Asian language processing, pp 338–341
Silva JFD, Kozareva Z, Lopes JGP (2004) Cluster analysis and classification of named entities. In: Proceedings of the 2004 conference on language resources and evaluation
Strubell E, Verga P, Belanger D, McCallum A (2017) Fast and accurate entity recognition with iterated dilated convolutions. In: Proceedings of the 2017 conference on empirical methods in natural language processing, pp 2670–2680
Surdeanu M, Johansson R, Meyers A, Màrquez L, Nivre J (2008) The conll-2008 shared task on joint parsing of syntactic and semantic dependencies. In: Proceedings of the 12th conference on computational natural language learning, pp 159–177
Sutton C, McCallum A (2005) Joint parsing and semantic role labeling. In: Proceedings of the 9th conference on computational natural language learning, pp 225–229
Swayamdipta S, Ballesteros M, Dyer C, Smith NA (2016) Greedy, joint syntactic-semantic parsing with stack lstms. In: Proceedings of the 20th SIGNLL conference on computational natural language learning, pp 187–197
Swayamdipta S, Thomson S, Lee K, Zettlemoyer L, Dyer C, Smith NA (2018) Syntactic scaffolds for semantic structures. In: Proceedings of the 2018 conference on empirical methods in natural language processing, pp 3772–3782
UzZaman N, Llorens H, Derczynski L, Verhagen M, Allen J, Pustejovsky J (2013) Semeval-2013 task 1: Tempeval-3: evaluating time expressions, events, and temporal relations. In: Proceedings of the 7th international workshop on semantic evaluation, pp 1–9
Verhagen M, Gaizauskas R, Schilder F, Hepple M, Katz G, Pustejovsky J (2007) Semeval-2007 task 15: tempeval temporal relation identification. In: Proceedings of the 4th international workshop on semantic evaluation, pp 75–80
Verhagen M, Sauri R, Caselli T, Pustejovsky J (2010) Semeval-2010 task 13: Tempeval-2. In: Proceedings of the 5th international workshop on semantic evaluation, pp 57–62
Wang LJ, Li WC, Chang CH (1992) Recognizing unregistered names for mandarin word identification. In: Proceedings of the 14th conference on computational linguistics, vol 4, pp 1239–1243
Yadav V, Bethard S (2018) A survey on recent advances in named entity recognition from deep learning models. In: Proceedings of the 27th international conference on computational linguistics, pp 2145–2158
Zhong X (2020) Time expression and named entity analysis and recognition. PhD thesis, Nanyang Technological University
Zhong X, Cambria E (2018) Time expression recognition using a constituent-based tagging scheme. In: Proceedings of the 2018 world wide web conference, pp 983–992
Zhong X, Sun A, Cambria E (2017) Time expression analysis and recognition using syntactic token types and general heuristic rules. In: Proceedings of the 55th annual meeting of the association for computational linguistics, pp 420–429
Zhong X, Cambria E, Hussain A (2020) Extracting time expressions and named entities with constituent-based tagging schemes. Cognit Comput 12(4):844–862
Acknowledgements
The authors would like to thank the anonymous reviewers for their valuable feedback, which helped us improve the quality of our manuscript. This research is supported by A*STAR under its Industry Alignment Fund (LOA Award I1901E0046).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Zhong, X., Cambria, E. & Hussain, A. Does semantics aid syntax? An empirical study on named entity recognition and classification. Neural Comput & Applic 34, 8373–8384 (2022). https://doi.org/10.1007/s00521-021-05949-0
DOI: https://doi.org/10.1007/s00521-021-05949-0