DravidianMultiModality: A Dataset for Multi-modal Sentiment Analysis in Tamil and Malayalam
arXiv - CS - Computation and Language Pub Date : 2021-06-09 , DOI: arxiv-2106.04853
Bharathi Raja Chakravarthi, Jishnu Parameswaran P. K, Premjith B, K. P Soman, Rahul Ponnusamy, Prasanna Kumar Kumaresan, Kingston Pal Thamburaj, John P. McCrae

Human communication is inherently multimodal and asynchronous. Analyzing human emotions and sentiment is an emerging field of artificial intelligence. We are witnessing an increasing amount of multimodal content in local languages on social media about products and other topics. However, few multimodal resources are available for under-resourced Dravidian languages. Our study aims to create a multimodal sentiment analysis dataset for the under-resourced Tamil and Malayalam languages. First, we downloaded product and movie review videos from YouTube for Tamil and Malayalam. Next, we created captions for the videos with the help of annotators. Then we labelled the videos for sentiment and verified the inter-annotator agreement using Fleiss's Kappa. This is the first multimodal sentiment analysis dataset for Tamil and Malayalam created by volunteer annotators.
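The inter-annotator agreement check the abstract mentions can be sketched as follows. This is a minimal, self-contained implementation of Fleiss' kappa (not the authors' code); the toy label matrix of 4 videos rated by 3 annotators across 2 sentiment classes is a hypothetical example, not data from the dataset.

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa for an (n_items, n_categories) matrix of rating
    counts, where every item is rated by the same number of raters."""
    counts = np.asarray(counts, dtype=float)
    n_items, _ = counts.shape
    n_raters = counts[0].sum()  # ratings per item (assumed constant)
    # Per-item agreement: fraction of rater pairs that agree on the item.
    p_i = (np.square(counts).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar = p_i.mean()  # mean observed agreement
    # Expected chance agreement from the marginal category proportions.
    p_j = counts.sum(axis=0) / (n_items * n_raters)
    p_e = np.square(p_j).sum()
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical toy labels: 4 videos, 3 annotators, 2 sentiment classes.
ratings = [[3, 0], [0, 3], [3, 0], [1, 2]]
print(round(fleiss_kappa(ratings), 3))  # → 0.657
```

Values near 1 indicate near-perfect agreement and values near 0 indicate agreement at chance level, which is why the metric is a common sanity check for crowd- or volunteer-annotated sentiment labels.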

Updated: 2021-06-10