EdinburghNLP at WNUT-2020 Task 2: Leveraging Transformers with Generalized Augmentation for Identifying Informativeness in COVID-19 Tweets
arXiv - CS - Social and Information Networks. Pub Date: 2020-09-06, DOI: arxiv-2009.06375
Nickil Maveli

Twitter has become an important communication channel in times of emergency. The ubiquity of smartphones enables people to announce an emergency they are observing in real time. Because of this, more agencies (such as disaster relief organizations and news agencies) are interested in programmatically monitoring Twitter, and recognizing the informativeness of a tweet can help filter noise from large volumes of data. In this paper, we present our submission for WNUT-2020 Task 2: Identification of informative COVID-19 English Tweets. Our most successful model is an ensemble of transformers including RoBERTa, XLNet, and BERTweet trained in a semi-supervised experimental setting. The proposed system achieves an F1 score of 0.9011 on the test set (ranking 7th on the leaderboard), and shows significant gains in performance compared to a baseline system using fastText embeddings.
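
The abstract does not say how the ensemble combines its member models, so the following is only a minimal sketch under one common assumption: soft voting, where each model's predicted probability that a tweet is informative is averaged and then thresholded. The function names `soft_vote` and `f1_score`, the 0.5 threshold, and all probabilities and labels below are hypothetical and made up for illustration; they are not taken from the paper, and F1 is computed here for the positive (informative) class.

```python
from typing import Dict, List


def soft_vote(prob_by_model: Dict[str, List[float]], threshold: float = 0.5) -> List[int]:
    """Average each tweet's P(informative) across models, then threshold.

    Assumption: soft voting. The paper's actual combination rule may differ.
    """
    model_probs = list(prob_by_model.values())
    n_tweets = len(model_probs[0])
    if any(len(p) != n_tweets for p in model_probs):
        raise ValueError("every model must score the same tweets")
    averaged = [
        sum(p[i] for p in model_probs) / len(model_probs) for i in range(n_tweets)
    ]
    return [1 if a >= threshold else 0 for a in averaged]


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """F1 for the positive class (label 1 = informative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up probabilities for four tweets; not real model outputs.
probs = {
    "roberta": [0.9, 0.2, 0.6, 0.4],
    "xlnet": [0.8, 0.3, 0.4, 0.7],
    "bertweet": [0.7, 0.1, 0.7, 0.3],
}
gold = [1, 0, 1, 1]  # made-up gold labels

preds = soft_vote(probs)
score = f1_score(gold, preds)
print(preds, round(score, 4))
```

In this toy input the fourth tweet is missed because two of the three models score it below 0.5, which pulls the average under the threshold; that lowers recall while precision stays perfect.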

Updated: 2020-10-09