Named entity recognition for Polish
Poznan Studies in Contemporary Linguistics (IF 0.400), Pub Date: 2019-06-26, DOI: 10.1515/psicl-2019-0010
Michał Marcińczuk, Aleksander Wawer

Abstract: In this article we discuss the current state of the art in named entity recognition for Polish. We present publicly available resources and open-source tools for named entity recognition. The overview covers various kinds of resources: guidelines, annotated corpora (NKJP, KPWr, CEN, PST) and lexicons (NELexiconS, PNET, Gazetteer). We present the major NER tools for Polish (Sprout, NERF, Liner2, Parallel LSTM-CRFs and PolDeepNer) and discuss their performance on the reference datasets. We cover the identification of named entity mentions in running text, local and global entity categorization, fine- and coarse-grained categorization, and the lemmatization of proper names.

Updated: 2019-06-26