NLRG at SemEval-2021 Task 5: Toxic Spans Detection Leveraging BERT-based Token Classification and Span Prediction Techniques



arXiv - CS - Computation and Language. Pub Date: 2021-02-24, DOI: arxiv-2102.12254
Gunjan Chhablani, Yash Bhartia, Abheesht Sharma, Harshit Pandey, Shan Suthaharan

Toxicity detection in text has been a popular NLP task in recent years. In SemEval-2021 Task 5, Toxic Spans Detection, the focus is on detecting toxic spans within passages. Most state-of-the-art span detection approaches employ a variety of techniques, each of which can be broadly classified as either a Token Classification or a Span Prediction approach. In our paper, we explore simple versions of both approaches and their performance on the task. Specifically, we use BERT-based models -- BERT, RoBERTa, and SpanBERT -- for both approaches. We also combine and modify these approaches to improve Toxic Spans prediction. To this end, we investigate results on four hybrid approaches -- Multi-Span, Span+Token, LSTM-CRF, and a combination of predicted offsets using union/intersection. Additionally, we perform a thorough ablative analysis and analyze our observed results. Our best submission -- a combination of SpanBERT Span Predictor and RoBERTa Token Classifier predictions -- achieves an F1 score of 0.6753 on the test set. Our best post-eval F1 score is 0.6895, obtained by intersecting the predicted offsets from the top-3 RoBERTa Token Classification checkpoints. On average, these approaches improve performance by 3% over the shared baseline models -- RNNSL and SpaCy NER.
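The union/intersection combination of predicted offsets mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes that each model outputs a list of toxic character offsets for a post, and that F1 is computed per post over sets of character offsets, with an empty prediction scoring 1 when the gold annotation is also empty. The function names `combine_offsets` and `offset_f1` and the example offsets are hypothetical; this is not the authors' code or the official scorer.

```python
from typing import Iterable, List, Set


def combine_offsets(predictions: List[Iterable[int]], mode: str = "union") -> List[int]:
    """Combine the toxic character offsets predicted by several models.

    `predictions` holds one collection of offsets per model (hypothetical format).
    """
    sets: List[Set[int]] = [set(p) for p in predictions]
    if not sets:
        return []
    if mode == "union":
        combined = set().union(*sets)        # toxic if ANY model flags the character
    elif mode == "intersection":
        combined = set.intersection(*sets)   # toxic only if ALL models flag it
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return sorted(combined)


def offset_f1(predicted: Iterable[int], gold: Iterable[int]) -> float:
    """Per-post F1 over character offsets (assumed metric definition)."""
    pred_set, gold_set = set(predicted), set(gold)
    if not gold_set:
        # Assumption: a post with no toxic span is scored 1 only for an empty prediction.
        return 1.0 if not pred_set else 0.0
    if not pred_set:
        return 0.0
    overlap = len(pred_set & gold_set)
    return 2 * overlap / (len(pred_set) + len(gold_set))


# Made-up offsets standing in for a span predictor and a token classifier.
span_model = [0, 1, 2, 3, 10, 11]
token_model = [2, 3, 4, 10, 11, 12]
gold = [2, 3, 10, 11]

union = combine_offsets([span_model, token_model], mode="union")
intersection = combine_offsets([span_model, token_model], mode="intersection")
print(union, offset_f1(union, gold))
print(intersection, offset_f1(intersection, gold))
```

In this toy case the intersection keeps only the characters both models agree on, trading recall for precision, while the union does the opposite. Which one helps on real data depends on how the individual models err.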


Updated: 2021-02-25
