Recognizing Named Entities in Specific Domain
Lobachevskii Journal of Mathematics, Pub Date: 2020-10-21, DOI: 10.1134/s199508022008020x
M. M. Tikhomirov, N. V. Loukachevitch, B. V. Dobrov

Abstract

The paper presents the results of applying the BERT representation model to the named entity recognition (NER) task for the cybersecurity domain in Russian. We compare several approaches to domain-specific NER that combine BERT fine-tuning on a domain-specific text collection, general labeled data, domain-specific data augmentation, and a domain-specific annotated dataset. We show that a BERT model fine-tuned on a domain text collection and pre-trained on a combination of a general dataset and augmented data achieves the best named entity recognition results. We also study the computational performance of the BERT model in the so-called mixed-precision regime.
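As a rough illustration of what "mixed precision" refers to, the sketch below uses only the Python standard library to show why such training schemes usually keep a higher-precision copy of the weights alongside the half-precision values used for computation. It is not the authors' setup: the abstract gives no implementation details, the function names are hypothetical, and a Python float (64-bit) stands in for the FP32 master copy.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def update_in_fp16(weight: float, update: float, steps: int) -> float:
    """Apply the same small update repeatedly, storing the weight in FP16."""
    w = to_fp16(weight)
    for _ in range(steps):
        # The sum is rounded back to FP16 after every step.
        w = to_fp16(w + to_fp16(update))
    return w


def update_with_master_copy(weight: float, update: float, steps: int) -> float:
    """Apply FP16 updates to a higher-precision master weight.

    Assumption: a Python float stands in for the FP32 master copy.
    """
    w = weight
    for _ in range(steps):
        # The update is computed in FP16, but accumulated in higher precision.
        w = w + to_fp16(update)
    return w


if __name__ == "__main__":
    print(update_in_fp16(1.0, 1e-4, 1000))
    print(update_with_master_copy(1.0, 1e-4, 1000))
```

Near 1.0, adjacent FP16 values are about 0.001 apart, so an update of 0.0001 is rounded away at every step in the pure FP16 loop, while the master copy accumulates it.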




Updated: 2020-10-30