IRLCov19: A Large COVID-19 Multilingual Twitter Dataset of Indian Regional Languages
arXiv - CS - Information Retrieval Pub Date: 2021-07-26, DOI: arxiv-2107.12360 Deepak Uniyal, Amit Agarwal
Having emerged in Wuhan, China, in December 2019, COVID-19 continues to spread
rapidly across the world even though authorities have made a number of
vaccines available. Although the coronavirus has been around for a significant period of
time, people and authorities still feel the need for awareness because the
virus keeps mutating, and symptoms and prevention
strategies vary accordingly. People and authorities rely most heavily on social media platforms to
share awareness information and voice their opinions, because these platforms
can spread the word widely in practically no time. People use a number of
languages to communicate over social media platforms based on their
familiarity, language outreach, and availability on social media platforms. The
entire world has been hit by the coronavirus and India is the second worst-hit
country in terms of the number of active coronavirus cases. India, being a
multilingual country, offers a great opportunity to study the outreach of
various languages that have been actively used across social media platforms.
In this study, we examine a COVID-19-related dataset collected
between February 2020 and July 2020, focusing specifically on regional languages
in India. This could be helpful for the Government of India, various state
governments, NGOs, researchers, and policymakers in studying different issues
related to the pandemic. We found that English was the mode of
communication in over 64% of tweets, while as many as twelve regional languages
in India accounted for approximately 4.77% of tweets.
Updated: 2021-07-27