An automated learning model for sentiment analysis and data classification of Twitter data using balanced CA-SVM, Concurrent Engineering


Concurrent Engineering, Pub Date: 2021-07-20, DOI: 10.1177/1063293x211031485
C Pretty Diana Cyril ₁, J Rene Beulah ₁, Neelakandan Subramani ₂, Prakash Mohan ₃, A Harshavardhan ₄, D Sivabalaselvamani ₅


Modern society spends much of each day on social media. Web users spend most of their time on social media and share many details with their friends, and the information obtained from their chats has been used in several applications. Sentiment analysis has been applied to Twitter data sets to identify the emotion of a user, and various problems can be solved on that basis. First, the data from the Twitter database is preprocessed; this step performs tokenization, stemming, stop word removal, and number removal. The proposed automated learning model with CA-SVM based sentiment analysis reads the Twitter data set, which is then processed to extract features that yield a set of terms. Using these terms, the tweets are clustered with TGS-K means clustering, which measures Euclidean distance over features such as semantic sentiment score (SSS), gazetteer and symbolic sentiment support (GSSS), and topical sentiment score (TSS). The method then classifies each tweet with a support vector machine (CA-SVM) according to a support value measured from the above two measures. The results are validated using k-fold cross-validation. Classification is then performed with the Balanced CA-SVM (Deep Learning Modified Neural Network). The results are evaluated and compared with existing works: the proposed model achieved 92.48% accuracy and a 92.05% sentiment score.
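The preprocessing, distance, and validation steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word list, the suffix-stripping stemmer, and the function names (`preprocess`, `nearest_centroid`, `k_fold_indices`) are assumptions made for illustration, and the three-component score vectors stand in for SSS, GSSS, and TSS values whose computation the abstract does not specify.

```python
import math
import re
from typing import List, Sequence, Tuple

# Assumption: a tiny illustrative stop-word list; the abstract does not give one.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "to", "of", "and", "in", "on", "it", "this", "i"}

# Assumption: naive suffix stripping stands in for a real stemmer such as Porter's.
SUFFIXES = ("ing", "ed", "ly", "es", "s")


def naive_stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(tweet: str) -> List[str]:
    """Tokenize, drop tokens containing digits, drop stop words, then stem."""
    tokens = re.findall(r"[a-z0-9']+", tweet.lower())                # tokenization
    tokens = [t for t in tokens if not any(c.isdigit() for c in t)]  # number removal
    tokens = [t for t in tokens if t not in STOP_WORDS]              # stop word removal
    return [naive_stem(t) for t in tokens]                           # stemming


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_centroid(point: Sequence[float], centroids: Sequence[Sequence[float]]) -> int:
    """Assignment step of k-means: index of the closest centroid."""
    return min(range(len(centroids)), key=lambda i: euclidean(point, centroids[i]))


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k folds; return (train, test) index lists per fold."""
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i in range(k):
        train = [j for f_idx, fold in enumerate(folds) if f_idx != i for j in fold]
        splits.append((train, folds[i]))
    return splits


if __name__ == "__main__":
    print(preprocess("I loved the 3 movies, it was amazing!"))
    # Hypothetical (SSS, GSSS, TSS) vectors for one tweet and two cluster centroids.
    print(nearest_centroid((0.9, 0.8, 0.7), [(1.0, 1.0, 1.0), (-1.0, -1.0, -1.0)]))
    print(k_fold_indices(6, 3)[0])
```

In practice a library stemmer and an SVM implementation would replace the hand-written pieces; the sketch only shows how the stages fit together, with each tweet reduced to terms, scored into a feature vector, assigned to a cluster by Euclidean distance, and evaluated over k train/test splits.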


Updated: 2021-07-20
