Sentiment analysis with genetic programming
Information Sciences, Pub Date: 2021-02-01, DOI: 10.1016/j.ins.2021.01.025
Airton Bordin Junior, Nádia Félix F. da Silva, Thierson Couto Rosa, Celso G.C. Junior

With the advent of online social networks, people have become more eager to express and share their opinions and sentiments about all kinds of targets. The overwhelming amount of opinion text soon attracted many entities (industry, e-commerce, celebrities, etc.) that wanted to analyze the sentiment people express about what they produce or communicate. This interest has led to a surge in the field of sentiment analysis (SA). One of the most studied subfields of SA is polarity detection, which is the problem of classifying a text as positive, negative, or neutral. This classification problem is difficult to solve automatically, and many hand-adjusted resources are needed to overcome the difficulties in detecting sentiment from text. These resources include hand-adjusted textual features as well as lexicons. Deciding which resource, or which combination of resources, is most appropriate to a given scenario is a time-consuming trial-and-error process. Thus, in this work, we propose the use of Genetic Programming (GP) as a tool for automatically choosing and combining these resources and for classifying sentiment from text. We propose a series of functions that allow GP to deal with preprocessing tasks, handcrafted features, and automatic weighting of lexicons for a given training set. Our experiments show that our GP solution is competitive with, and sometimes better than, SVM, and superior to naïve Bayes, logistic regression, and stochastic gradient descent, which are methods used in SA competitions.
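
To make the idea of evolving a combination of lexicon scores concrete, the following is a minimal Python sketch and not the authors' implementation. The two toy lexicons, the function set (`add`, `sub`, `mul`), the neutral margin of 0.5, the tiny labelled data set, and the mutation-only elitist search loop are all assumptions made for illustration; the abstract does not specify the function set or the evolutionary operators used in the paper.

```python
import operator
import random

# Hypothetical toy lexicons; a real system would load published sentiment lexicons.
LEXICON_A = {"good": 1.0, "great": 2.0, "bad": -1.0, "awful": -2.0}
LEXICON_B = {"love": 1.5, "nice": 1.0, "hate": -1.5, "boring": -1.0}

# Hypothetical labelled examples, used only to drive the toy search.
DATA = [
    ("great movie love it", "positive"),
    ("nice and good", "positive"),
    ("awful plot hate it", "negative"),
    ("bad and boring", "negative"),
    ("the movie starts at nine", "neutral"),
    ("good cast but boring story", "neutral"),
]

FUNCTIONS = [operator.add, operator.sub, operator.mul]  # assumed function set
TERMINALS = ["lex_a", "lex_b"]                          # one terminal per lexicon

def features(text):
    """Map a text to one summed score per lexicon."""
    tokens = text.lower().split()
    return {
        "lex_a": sum(LEXICON_A.get(t, 0.0) for t in tokens),
        "lex_b": sum(LEXICON_B.get(t, 0.0) for t in tokens),
    }

def random_tree(rng, depth=2):
    """Grow a random tree: (function, left, right), a terminal name, or a constant."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS) if rng.random() < 0.7 else rng.uniform(-1.0, 1.0)
    return (rng.choice(FUNCTIONS), random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, feats):
    """Recursively compute the numeric score of an expression tree."""
    if isinstance(tree, tuple):
        fn, left, right = tree
        return fn(evaluate(left, feats), evaluate(right, feats))
    if isinstance(tree, str):
        return feats[tree]
    return tree  # numeric constant, acts as an evolved weight or offset

def classify(tree, text, margin=0.5):
    """Turn the tree's score into a polarity label using an assumed margin."""
    score = evaluate(tree, features(text))
    if score > margin:
        return "positive"
    if score < -margin:
        return "negative"
    return "neutral"

def fitness(tree, data):
    """Accuracy of the tree on the labelled examples."""
    return sum(classify(tree, text) == label for text, label in data) / len(data)

def mutate(tree, rng):
    """Subtree mutation: replace one randomly chosen node with a fresh subtree."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng)
    fn, left, right = tree
    if rng.random() < 0.5:
        return (fn, mutate(left, rng), right)
    return (fn, left, mutate(right, rng))

def evolve(data, generations=20, pop_size=30, seed=0):
    """Mutation-only search with elitist truncation selection (a simplification of GP)."""
    rng = random.Random(seed)
    population = [random_tree(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda t: fitness(t, data), reverse=True)
        history.append(fitness(population[0], data))
        survivors = population[: pop_size // 2]  # the best half survives unchanged
        children = [mutate(rng.choice(survivors), rng) for _ in range(pop_size - len(survivors))]
        population = survivors + children
    best = max(population, key=lambda t: fitness(t, data))
    return best, history + [fitness(best, data)]

if __name__ == "__main__":
    best_tree, best_per_generation = evolve(DATA)
    print("best training accuracy per generation:", best_per_generation)
    print("label for 'great but boring':", classify(best_tree, "great but boring"))
```

Because the best half of each generation survives unchanged, the best training accuracy never decreases from one generation to the next. A fuller GP setup would normally add crossover, a richer function set (for example, preprocessing and feature functions of the kind the abstract mentions), and evaluation on held-out data.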




Updated: 2021-03-02