Improving sentiment analysis with learning concepts from concept, patterns lexicons and negations
Ain Shams Engineering Journal (IF 6.0), Pub Date: 2021-08-24, DOI: 10.1016/j.asej.2021.08.004
Anima Pradhan, Manas Ranjan Senapati, Pradip Kumar Sahu

The way sentiment (negative or positive) is expressed in text depends on how people think. Extracting aspects and identifying their sentiment polarity from written text is a crucial task. This work takes a multi-level learning approach to aspect extraction that combines statistical, pattern-based, and rule-based methods. It applies two probabilistic graphical models, Latent Dirichlet Allocation (LDA) and Probabilistic Latent Semantic Analysis (PLSA), to generate latent topic terms as candidate aspects. Frequency-based and concept lexicons are then used to retrieve unigrams through multi-word phrases together with their associated opinion words. Polarity shift, which reverses the polarity of an aspect, is a significant issue that affects the system's sentiment classification. Therefore, to improve the performance of the machine learning classifier in aspect-based sentiment analysis (ABSA), a hybrid approach comprising rule-based methods and a graph-theoretic model is applied to handle explicit and implicit polarity shifts. The proposed method is evaluated with Naive Bayes, a machine learning classification algorithm, on two datasets: SemEval 2014 Restaurant and SemEval 2014 Laptop. Experimental results show that the aspect extraction method outperforms baseline methods, reaching 86.32% and 82.64% on the Restaurant and Laptop datasets, respectively. Similarly, for aspect-based sentiment classification, accuracy and F1 measure are 84.73% and 81.28% on the Restaurant domain and 82.06% and 80.71% on the Laptop domain.
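To make the idea of explicit polarity shift concrete, here is a minimal sketch of a rule-based negation handler. It is not the authors' method: the lexicon, the negation cue list, the three-token scope window, and the function names are all assumptions made for illustration. It covers only explicit negation and does not attempt implicit shifts or the graph-theoretic model mentioned in the abstract.

```python
# Hypothetical toy lexicon and negation cues; not taken from the paper.
OPINION_LEXICON = {"good": 1, "great": 1, "tasty": 1, "bad": -1, "slow": -1, "noisy": -1}
NEGATION_CUES = {"not", "no", "never", "n't", "hardly"}
WINDOW = 3  # assumed scope: a negation cue covers the next 3 tokens


def tokenize(text):
    """Lowercase, split "n't" off its verb, and strip surrounding punctuation."""
    tokens = []
    for raw in text.lower().replace("n't", " n't").split():
        token = raw.strip(".,!?;:")
        if token:
            tokens.append(token)
    return tokens


def score_with_negation(text):
    """Sum lexicon polarities, flipping opinion words inside a negation window."""
    total = 0
    remaining_scope = 0  # tokens still covered by the most recent negation cue
    for token in tokenize(text):
        if token in NEGATION_CUES:
            remaining_scope = WINDOW
            continue
        polarity = OPINION_LEXICON.get(token, 0)
        if remaining_scope > 0:
            polarity = -polarity  # explicit polarity shift
            remaining_scope -= 1
        total += polarity
    return total
```

With this sketch, a sentence such as "The food was not good." scores negative because "good" falls inside the window opened by "not". A fixed window is a crude choice: it ignores clause boundaries, so a real system would likely bound the scope with punctuation or syntactic cues.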




Updated: 2021-08-24