Neural Text Classification and Stacked Heterogeneous Embeddings for Named Entity Recognition in SMM4H 2021, arXiv - CS - Computation and Language


arXiv - CS - Computation and Language. Pub Date: 2021-06-10, DOI: arxiv-2106.05823
Usama Yaseen, Stefan Langer

This paper presents our findings from participating in the SMM4H Shared Task 2021. We addressed Named Entity Recognition (NER) and Text Classification. For NER, we explored a BiLSTM-CRF with Stacked Heterogeneous Embeddings and linguistic features. For text classification, we investigated several machine learning algorithms: logistic regression, Support Vector Machines (SVM), and neural networks. Our proposed approaches can be generalized to different languages, and we have shown their effectiveness for English and Spanish. Our text classification submissions (team: MIC-NLP) achieved competitive performance, with F1-scores of 0.46 and 0.90 on ADE Classification (Task 1a) and Profession Classification (Task 7a), respectively. For NER, our submissions achieved F1-scores of 0.50 and 0.82 on ADE Span Detection (Task 1b) and Profession Span Detection (Task 7b), respectively.
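The abstract does not say how the embeddings are stacked. As a minimal sketch, assuming "stacked" means per-token concatenation of vectors from different sources (a common reading, not confirmed by the abstract), the idea could look like the following. All function names, the toy lookup table, and the vector sizes are hypothetical stand-ins, not the authors' implementation.

```python
from typing import Callable, Dict, List

Vector = List[float]
Embedder = Callable[[str], Vector]


def word_embedding(token: str) -> Vector:
    """Hypothetical stand-in for a pretrained word-vector lookup (2 dims)."""
    table: Dict[str, Vector] = {"headache": [0.9, 0.1], "nurse": [0.2, 0.8]}
    return table.get(token.lower(), [0.0, 0.0])  # unknown words map to zeros


def char_embedding(token: str) -> Vector:
    """Hypothetical stand-in for a character-level encoder (2 dims)."""
    upper_ratio = sum(c.isupper() for c in token) / max(len(token), 1)
    return [len(token) / 10.0, upper_ratio]


def linguistic_features(token: str) -> Vector:
    """Hypothetical hand-crafted features: capitalized?, all digits?"""
    return [float(token[:1].isupper()), float(token.isdigit())]


def stack_embeddings(tokens: List[str], embedders: List[Embedder]) -> List[Vector]:
    """Concatenate every embedder's output for each token, in order."""
    return [[value for embed in embedders for value in embed(token)]
            for token in tokens]


# Assumed usage: each token gets one 6-dimensional vector (2 + 2 + 2).
stacked = stack_embeddings(
    ["Severe", "headache"],
    [word_embedding, char_embedding, linguistic_features],
)
```

In a BiLSTM-CRF tagger, vectors like these would serve as the per-token input to the BiLSTM; the sequence model itself is omitted here.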

Updated: 2021-06-11
