LynyrdSkynyrd at WNUT-2020 Task 2: Semi-Supervised Learning for Identification of Informative COVID-19 English Tweets,arXiv - CS - Social and Information Networks

arXiv - CS - Social and Information Networks. Pub Date: 2020-09-08, DOI: arxiv-2009.03849
Abhilasha Sancheti, Kushal Chawla, Gaurav Verma

We describe our system for the WNUT-2020 shared task on the identification of informative COVID-19 English tweets. Our system is an ensemble of machine learning methods that combines traditional feature-based classifiers with recent pre-trained language models, which help capture the syntactic, semantic, and contextual features of the tweets. We further employ pseudo-labelling to incorporate the unlabelled Twitter data released on the pandemic. Our best-performing model achieves an F1-score of 0.9179 on the provided validation set and 0.8805 on the blind test set.
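
The abstract names two techniques, ensembling and pseudo-labelling, without giving their details. The following is a minimal Python sketch of one common way to combine them, not the authors' implementation. It assumes the ensemble averages each model's probability for the positive class (soft voting) and that an unlabelled tweet is kept only when the averaged probability is confident in either direction. The function names `soft_vote` and `select_pseudo_labels`, the 0.9 threshold, the 1/0 label encoding, and the example probabilities are all hypothetical.

```python
from typing import List, Sequence, Tuple


def soft_vote(prob_lists: Sequence[Sequence[float]]) -> List[float]:
    """Average the positive-class probabilities of several models.

    prob_lists[m][i] is model m's probability that tweet i is informative.
    Averaging is an assumption; the abstract does not say how the
    ensemble combines its members.
    """
    n_models = len(prob_lists)
    return [sum(per_tweet) / n_models for per_tweet in zip(*prob_lists)]


def select_pseudo_labels(
    texts: Sequence[str],
    probs: Sequence[float],
    threshold: float = 0.9,  # hypothetical confidence cut-off
) -> List[Tuple[str, int]]:
    """Keep unlabelled tweets whose ensemble probability is confident.

    Returns (text, label) pairs with label 1 (informative) or
    0 (uninformative); uncertain tweets are dropped.
    """
    selected = []
    for text, p in zip(texts, probs):
        if p >= threshold:
            selected.append((text, 1))
        elif p <= 1.0 - threshold:
            selected.append((text, 0))
    return selected


# Illustrative, made-up probabilities from two hypothetical models.
unlabelled = ["tweet a", "tweet b", "tweet c"]
model_probs = [
    [0.95, 0.50, 0.05],  # e.g. a feature-based classifier
    [0.97, 0.60, 0.01],  # e.g. a fine-tuned language model
]
ensemble_probs = soft_vote(model_probs)
pseudo_labelled = select_pseudo_labels(unlabelled, ensemble_probs)
```

In a setup like this, the pairs in `pseudo_labelled` would be appended to the labelled training set and the models retrained. The middle tweet is discarded because its averaged probability is not confident in either direction.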

Updated: 2020-09-09
