Heterogeneous classifier ensemble for sentiment analysis of Bengali and Hindi tweets, Sādhanā

Sādhanā (IF 1.4), Pub Date: 2020-08-06, DOI: 10.1007/s12046-020-01424-z
Kamal Sarkar

Sentiment analysis is an essential step in analysing social media texts such as tweets and other posts on micro-blogging sites. Its basic step is sentiment polarity detection, which identifies whether an input piece of social media text is positive, negative or neutral. In this paper, we present an approach that combines heterogeneous classifiers in an ensemble for sentiment polarity detection in Bengali and Hindi tweets. The proposed method constructs an ensemble of three different base classifiers, each using a different feature set. We also incorporate an external knowledge base, a sentiment lexicon, to augment tweet words with the sentiment polarity information retrieved from it. Experimental results show the effectiveness of the proposed heterogeneous ensemble model for sentiment polarity detection in both Bengali and Hindi. Our system outperforms the existing Bengali and Hindi sentiment classification systems with which it is compared.
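
The abstract does not name the three base classifiers, their feature sets, or the rule used to combine their outputs. As a rough illustration of the general idea only, the Python sketch below makes several assumptions: a tiny Naive Bayes learner stands in for every base classifier, the three feature views are lexicon-augmented words, character trigrams and lexicon polarity tags, and the outputs are combined by majority vote. The lexicon, the English toy tweets and all function names are hypothetical.

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy lexicon; the lexicon used in the paper is not given in the abstract.
LEXICON = {"good": "positive", "great": "positive",
           "bad": "negative", "awful": "negative"}

def augment(tokens):
    """Append one polarity tag token for each word found in the lexicon."""
    return tokens + ["LEX_" + LEXICON[t] for t in tokens if t in LEXICON]

def word_features(text):
    # View 1 (assumed): word unigrams augmented with lexicon tags.
    return augment(text.lower().split())

def char_features(text, n=3):
    # View 2 (assumed): character n-grams.
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]

def lexicon_features(text):
    # View 3 (assumed): only the polarity tags of lexicon words.
    return [LEXICON[t] for t in text.lower().split() if t in LEXICON] or ["none"]

class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing over one feature view."""

    def __init__(self, extract):
        self.extract = extract

    def fit(self, texts, labels):
        self.priors = Counter(labels)
        self.counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.counts[label].update(self.extract(text))
        self.vocab = {f for c in self.counts.values() for f in c}
        return self

    def predict(self, text):
        feats = [f for f in self.extract(text) if f in self.vocab]
        total = sum(self.priors.values())

        def score(label):
            c = self.counts[label]
            denom = sum(c.values()) + len(self.vocab)
            return math.log(self.priors[label] / total) + sum(
                math.log((c[f] + 1) / denom) for f in feats)

        return max(sorted(self.priors), key=score)

def ensemble_predict(models, text):
    """Majority vote; a three-way tie falls back to the first model's label."""
    preds = [m.predict(text) for m in models]
    label, count = Counter(preds).most_common(1)[0]
    return label if count > 1 else preds[0]

# English toy data standing in for Bengali or Hindi tweets.
train = [("good movie", "positive"), ("great song", "positive"),
         ("bad movie", "negative"), ("awful song", "negative"),
         ("the movie is out today", "neutral"), ("new song released", "neutral")]
texts, labels = zip(*train)
models = [NaiveBayes(f).fit(texts, labels)
          for f in (word_features, char_features, lexicon_features)]

print(ensemble_predict(models, "great movie"))  # positive
```

Giving each base learner its own feature extractor is what makes the ensemble heterogeneous in this sketch; swapping in different learning algorithms per view would not change the voting step.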

Updated: 2020-08-06
