An easy numeric data augmentation method for early-stage COVID-19 tweets exploration of participatory dynamics of public attention and news coverage
Information Processing & Management (IF 8.6), Pub Date: 2022-08-29, DOI: 10.1016/j.ipm.2022.103073
Yuan Chen, Zhisheng Zhang

The onset of COVID-19 prompted extensive discussion on social media platforms such as Twitter, and many social media analyses of the pandemic followed. Despite this abundance of studies, little work has examined how the public and officials reacted on social networks and how those reactions were associated, especially during the early outbreak stage. In this paper, a total of 9,259,861 COVID-19-related English tweets published from 31 December 2019 to 11 March 2020 are collected to explore the participatory dynamics of public attention and news coverage during the early stage of the pandemic. An easy numeric data augmentation (ENDA) technique is proposed for generating new samples while preserving label validity. On text classification tasks with deep models (BERT), it outperforms an easier data augmentation method. To further demonstrate the efficacy of ENDA, experiments and ablation studies are also conducted on other benchmark datasets. The classification results for the COVID-19 tweets show tweet peaks triggered by momentous events and a strong positive correlation between the daily number of personal narratives and the daily number of news reports. We argue that the turning points on January 20 and February 23 divide the timeline into three periods, and that the low level of news coverage suggests missed windows for government response in early January and February. Our study not only contributes to a deeper understanding of the dynamic patterns and relationships of public attention and news coverage on social media during the pandemic but also sheds light on early emergency management and government response on social media during global health crises.




Updated: 2022-09-01