Sentiment classification using hybrid feature selection and ensemble classifier
Journal of Intelligent & Fuzzy Systems ( IF 1.7 ) Pub Date : 2021-02-17 , DOI: 10.3233/jifs-189738
Achin Jain 1 , Vanita Jain 2

This paper presents a hybrid feature selection technique for sentiment classification. It combines a Genetic Algorithm with three existing feature selection methods: Information Gain (IG), Chi-Square (CHI), and Gini Index (GINI). First, features are selected independently with each of the three methods, and the union of the three selections is taken as the reduced feature set. A Genetic Algorithm is then applied to optimize this feature set further. The paper also presents an ensemble approach based on the error rates obtained on datasets from different domains. To test the proposed hybrid feature selection and ensemble classification approach, we consider four Support Vector Machine (SVM) classifier variants. We use datasets from the UCI Machine Learning Repository covering three domains: IMDB movie reviews, Amazon product reviews, and Yelp restaurant reviews. The experimental results show that the proposed approach performs best on all three domain datasets. We also report a t-test for statistical significance between classifiers and compare them on precision, recall, F1-score, AUC, and model execution time.
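
The filter-and-union step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy corpus, the function names, the cutoff `k`, binary term presence as the feature, and the use of Gini impurity reduction as the Gini score are all hypothetical choices made here. The Genetic Algorithm refinement and the SVM ensemble are not shown.

```python
import math

# Hypothetical toy corpus: (review text, label), 1 = positive, 0 = negative.
DOCS = [
    ("great movie loved it", 1),
    ("great food friendly staff", 1),
    ("loved the product works great", 1),
    ("terrible movie waste of time", 0),
    ("bad food rude staff", 0),
    ("terrible product broke fast", 0),
]

def counts(term, docs):
    """2x2 counts: (pos with term, neg with term, pos without, neg without)."""
    a = sum(1 for text, y in docs if y == 1 and term in text.split())
    b = sum(1 for text, y in docs if y == 0 and term in text.split())
    pos = sum(y for _, y in docs)
    return a, b, pos - a, (len(docs) - pos) - b

def entropy(p, n):
    """Shannon entropy (bits) of a two-class count pair; 0 for an empty pair."""
    total = p + n
    return -sum((k / total) * math.log2(k / total) for k in (p, n) if k)

def gini(p, n):
    """Gini impurity of a two-class count pair; 0 for an empty pair."""
    total = p + n
    return 1.0 - (p / total) ** 2 - (n / total) ** 2 if total else 0.0

def split_gain(impurity, a, b, c, d):
    """Label impurity minus the weighted impurity after splitting on the term."""
    n = a + b + c + d
    return (impurity(a + c, b + d)
            - (a + b) / n * impurity(a, b)
            - (c + d) / n * impurity(c, d))

def chi_square(a, b, c, d):
    """Chi-square statistic of the 2x2 term/label table; 0 if a margin is empty."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

# The three filter scores named in the abstract (Gini variant is an assumption).
SCORERS = {
    "IG": lambda *t: split_gain(entropy, *t),
    "CHI": chi_square,
    "GINI": lambda *t: split_gain(gini, *t),
}

def top_k(scorer, docs, k):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    vocab = sorted({w for text, _ in docs for w in text.split()})
    ranked = sorted(vocab, key=lambda w: -scorer(*counts(w, docs)))
    return set(ranked[:k])

def union_features(docs, k):
    """Union of the top-k terms chosen by each filter method."""
    selected = set()
    for scorer in SCORERS.values():
        selected |= top_k(scorer, docs, k)
    return selected

selected = union_features(DOCS, k=3)
print(sorted(selected))
```

Taking the union rather than the intersection keeps any term that at least one filter ranks highly, so the resulting set can be larger than `k`; a subsequent search such as a Genetic Algorithm would then prune it.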

Updated: 2021-02-19