New Clustering Algorithms for Twitter Sentiment Analysis
IEEE Systems Journal (IF 4.4), Pub Date: 2019-05-15, DOI: 10.1109/jsyst.2019.2912759
Hajar Rehioui, Abdellah Idrissi

Over the last decade, the use of social networks has become ubiquitous in daily life. Twitter, one of the best-known social networks, has become a rich source of discussed topics. Twitter users express their sentiments or points of view in tweets on topics in a variety of fields, such as politics and commercial products. This information is exploited by sentiment analysis tools. Clustering algorithms are one of the solutions used to discover the sentiment that users express in tweets. However, because users' sentiments are generally divided into three categories (positive, negative, and neutral), a robust clustering algorithm is needed: one that achieves good clustering performance and produces an appropriate number of clusters in an acceptable run time. To achieve this goal, we combine in this paper two well-known clustering methods: K-means and DENCLUE (DENsity-based CLUstEring) with its variants. This combination exploits the precise number of clusters from K-means and the clustering performance of DENCLUE and its variants. Experimental results on four Twitter datasets demonstrate the competitiveness of the proposed algorithms against state-of-the-art methods in providing a tradeoff between clustering performance, number of returned clusters, and run time.
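
The abstract does not say how K-means and DENCLUE are combined, so the following is only an illustrative sketch of one possible pairing, not the authors' algorithm. It assumes tweets have already been turned into numeric feature vectors (the rows of `X`), moves each vector uphill on a Gaussian kernel density estimate in the style of DENCLUE hill climbing, and then runs a plain K-means with k = 3 on the resulting density attractors so that exactly three groups come back. The function names, the bandwidth `h`, and the farthest-point initialisation are all assumptions made for this example.

```python
import numpy as np

def density_attractors(X, h=0.5, n_iter=50, tol=1e-5):
    """Hill-climb every point on a Gaussian kernel density estimate.

    Assumption: this mean-shift-style update stands in for DENCLUE's
    hill climbing; h is a hypothetical bandwidth, not a paper value.
    """
    X = np.asarray(X, dtype=float)
    Y = X.copy()
    for _ in range(n_iter):
        # Squared distances from current positions to the original data.
        d2 = ((Y[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        W = np.exp(-d2 / (2.0 * h * h))  # Gaussian kernel weights
        Y_new = (W @ X) / W.sum(axis=1, keepdims=True)
        shift = np.abs(Y_new - Y).max()
        Y = Y_new
        if shift < tol:
            break
    return Y

def kmeans(X, k=3, n_iter=100):
    """Plain Lloyd's K-means with a deterministic farthest-point start."""
    X = np.asarray(X, dtype=float)
    centers = [X[0]]
    for _ in range(1, k):
        # Pick the point whose nearest chosen center is farthest away.
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def cluster_three_way(X, k=3, h=0.5):
    """Density hill climbing first, then K-means on the attractors."""
    attractors = density_attractors(X, h=h)
    labels, _ = kmeans(attractors, k=k)
    return labels
```

Calling `cluster_three_way(X)` returns one integer label per row of `X`. Which label corresponds to positive, negative, or neutral sentiment is not determined by the clustering itself and would have to be assigned afterwards.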

Updated: 2020-04-22