Highlighting keyphrases using senti-scoring and fuzzy entropy for unsupervised sentiment analysis
Expert Systems with Applications (IF 7.5), Pub Date: 2020-11-21, DOI: 10.1016/j.eswa.2020.114323
Srishti Vashishtha, Seba Susan

Sentiment analysis (SA) helps assess the performance of products or services from user-generated online posts. Many websites now allow customers to post reviews of movies, products, events, and services, which has led to a large accumulation of reviews written in natural language. The availability of online reviews and rising end-user expectations have motivated the development of opinion mining systems that automatically classify customers' reviews. In SA, highlighting the significant keyphrases that contribute to correct sentiment cognition is a tedious task. In this paper, we propose an unsupervised sentiment classification system that comprehensively formulates phrases and computes their senti-scores (sentiment scores) and polarity using the SentiWordNet lexicon and fuzzy linguistic hedges. It then extracts the keyphrases significant for SA using a fuzzy entropy filter and k-means clustering. We perform document-level SA on online reviews using n-gram techniques, specifically a combination of unigrams, bigrams, and trigrams. Experiments on two benchmark movie review datasets, the polarity dataset by Pang and Lee and the IMDB dataset, show that our approach achieves high accuracy compared to other state-of-the-art methods for phrase-level SA.
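To make two of the steps named in the abstract concrete, the following is a minimal sketch of n-gram phrase formation followed by a fuzzy entropy filter. It is not the authors' implementation. The abstract does not give the entropy formula, the threshold, or the direction of the filter, so the sketch assumes the standard De Luca-Termini fuzzy entropy (base-2 logarithm), a hypothetical threshold of 0.9, and that low-entropy (least ambiguous) phrases are the ones kept. The membership values are invented for illustration and do not come from SentiWordNet; the function names are likewise hypothetical.

```python
import math


def ngrams(tokens, n_max=3):
    """Return all unigrams, bigrams and trigrams as space-joined phrases."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def fuzzy_entropy(mu):
    """De Luca-Termini fuzzy entropy of one membership value mu in [0, 1].

    It is 0 for crisp memberships (mu = 0 or 1) and peaks at 1 for mu = 0.5,
    where the phrase is maximally ambiguous between the two polarities.
    """
    if mu <= 0.0 or mu >= 1.0:
        return 0.0
    return -(mu * math.log2(mu) + (1.0 - mu) * math.log2(1.0 - mu))


def low_entropy_phrases(memberships, threshold=0.9):
    """Keep phrases whose entropy is at most `threshold` (an assumed cut-off)."""
    return [p for p, mu in memberships.items() if fuzzy_entropy(mu) <= threshold]


phrases = ngrams("not very good".split())

# Hypothetical degrees of membership in the "positive" class, one per phrase.
memberships = {
    "not": 0.5,
    "very": 0.5,
    "good": 0.9,
    "not very": 0.45,
    "very good": 0.95,
    "not very good": 0.1,
}

keyphrases = low_entropy_phrases(memberships)
print(keyphrases)
```

Under these assumed values, phrases with memberships near 0.5 are filtered out as ambiguous, while strongly positive or strongly negative phrases survive. The k-means clustering step and the linguistic hedges mentioned in the abstract are left out of this sketch because the abstract does not describe how they are applied.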




Updated: 2020-12-30