Automatic classification of emotions in news articles through ensemble decision tree classification techniques
Journal of Ambient Intelligence and Humanized Computing (IF 3.662), Pub Date: 2020-08-03, DOI: 10.1007/s12652-020-02373-5
S. Godfrey Winster, M. Naveen Kumar

Emotions play a major role in human life. As human interaction with online systems has increased drastically, predicting emotion from online text, which can otherwise be monotonous, would help provide a better environment for users. Identifying emotions in ordinary text is already difficult, and news text, which does not convey emotions explicitly, makes the task harder still. Data mining methods can be applied in this context. This work explores the potential of decision tree classifiers for emotion classification. The proposed methodology has two segments. The first deals with data preparation and involves dataset elicitation, translation, HTML tag removal, stop word elimination and stemming. The second implements data mining: it takes the output of the first segment as its input and applies feature vector formulation, correlation-based feature selection, construction of a bagged Grafted C4.5 learning model and performance evaluation. Based on the resulting classification rules, emotions are categorized as joy, surprise, fear, sadness, disgust, neutral or mixed. Experiments were conducted to analyse the effect of feature selection methods and ensemble methods on generating efficient rules. Accuracy is compared against eight other decision tree classifiers and a support vector machine learning model. The proposed methodology achieves a maximum accuracy of 87.83%, supporting its use in real-time applications.
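
The two segments can be illustrated with a minimal Python sketch. It is not the authors' implementation. The stop word list, the suffix-stripping stemmer, and all function names are hypothetical. A one-term decision stump stands in for the Grafted C4.5 base learner, and the dataset elicitation, translation, and correlation-based feature selection steps are omitted. Only the overall shape is taken from the abstract: clean the text, build term-count feature vectors, then take a majority vote over models trained on bootstrap samples (bagging).

```python
import html
import random
import re
from collections import Counter

# Hypothetical, tiny stop word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to", "was"}


def strip_html(text):
    """Replace HTML tags with spaces, then decode entities such as &amp;."""
    return html.unescape(re.sub(r"<[^>]+>", " ", text))


def crude_stem(word):
    """Rough suffix stripping; an assumed stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Segment 1: tag removal, stop word elimination, stemming."""
    tokens = re.findall(r"[a-z]+", strip_html(text).lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def feature_vector(tokens, vocabulary):
    """Segment 2, first step: term counts in a fixed vocabulary order."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def train_stump(X, y):
    """Base learner (assumption): split on the presence of one term."""
    default = Counter(y).most_common(1)[0][0]
    best = None
    for j in range(len(X[0])):
        present = [lab for row, lab in zip(X, y) if row[j] > 0]
        absent = [lab for row, lab in zip(X, y) if row[j] == 0]
        lab_p = Counter(present).most_common(1)[0][0] if present else default
        lab_a = Counter(absent).most_common(1)[0][0] if absent else default
        errors = sum(lab != lab_p for lab in present)
        errors += sum(lab != lab_a for lab in absent)
        if best is None or errors < best[0]:
            best = (errors, j, lab_p, lab_a)
    _, j, lab_p, lab_a = best
    return lambda row: lab_p if row[j] > 0 else lab_a


def bagged_predict(X, y, row, n_models=15, seed=0):
    """Bagging: train each model on a bootstrap sample, then majority vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in X]
        model = train_stump([X[i] for i in idx], [y[i] for i in idx])
        votes[model(row)] += 1
    return votes.most_common(1)[0][0]
```

For example, `preprocess("<p>The markets are crashing &amp; investors feared losses</p>")` yields `["market", "crash", "investor", "fear", "loss"]`. Vectors built with `feature_vector` over a shared vocabulary can then be passed, with their emotion labels, to `bagged_predict`.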




Updated: 2020-08-04