Discovering cluster evolution patterns with the Cluster Association-aware matrix factorization
Knowledge and Information Systems (IF 2.5), Pub Date: 2021-04-09, DOI: 10.1007/s10115-021-01561-9
Wathsala Anupama Mohotti, Richi Nayak

Tracking document collections over time (or across domains) is helpful in several applications, such as finding the dynamics of terminology, identifying emerging and evolving trends, and detecting concept drift. We propose a novel ‘Cluster Association-aware’ Non-negative Matrix Factorization (NMF)-based method with graph-based visualization to identify the changing dynamics of text clusters over time or across domains. NMF is used to find similar clusters in the set of clustering solutions. Based on these similarities, four major lifecycle states of clusters, namely birth, split, merge and death, are tracked to discover their emergence, growth, persistence and decay. The novel concepts of ‘cluster associations’ and term frequency-based ‘cluster density’ are used to improve the quality of the evolution patterns. The cluster evolution is visualized using a k-partite graph. Empirical analysis on text data shows that the proposed method produces accurate and efficient solutions compared with state-of-the-art methods.
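
To make the lifecycle states concrete, the following is a minimal sketch of how birth, split, merge and death could be read off from cluster similarities between two consecutive clustering solutions. It is not the authors' method: it swaps in plain cosine similarity over term-frequency vectors in place of the NMF-based similarity, and it leaves out ‘cluster associations’ and ‘cluster density’ entirely. The function names (`cosine`, `lifecycle_states`), the `threshold` value, the extra `continues` label and the toy clusters are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def lifecycle_states(prev, curr, threshold=0.3):
    """Label cluster transitions between two clustering solutions.

    prev, curr: dict mapping cluster id -> term-frequency dict.
    threshold:  hypothetical similarity cut-off (an assumption, not from the paper).
    Returns (edges, prev_states, curr_states).
    """
    # An edge links an earlier cluster to a later one when they are similar enough.
    edges = [(p, c) for p in prev for c in curr
             if cosine(prev[p], curr[c]) >= threshold]

    out_links = defaultdict(list)  # earlier cluster -> similar later clusters
    in_links = defaultdict(list)   # later cluster -> similar earlier clusters
    for p, c in edges:
        out_links[p].append(c)
        in_links[c].append(p)

    prev_states = {}
    for p in prev:
        n = len(out_links[p])
        prev_states[p] = "death" if n == 0 else "split" if n > 1 else "continues"

    curr_states = {}
    for c in curr:
        n = len(in_links[c])
        curr_states[c] = "birth" if n == 0 else "merge" if n > 1 else "continues"

    return edges, prev_states, curr_states


# Toy example with made-up clusters at two time steps.
prev = {
    "A": {"virus": 3, "vaccine": 2},
    "B": {"election": 4, "vote": 1},
    "C": {"football": 2, "goal": 2},
}
curr = {
    "X": {"virus": 2, "vaccine": 1, "election": 2, "vote": 1},
    "Y": {"climate": 3, "carbon": 2},
}
edges, prev_states, curr_states = lifecycle_states(prev, curr)
print(edges)
print(prev_states)
print(curr_states)
```

The `edges` list forms a bipartite graph between the two solutions. Chaining such edge lists across k successive solutions gives a k-partite graph of the kind the abstract describes for visualization.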




Updated: 2021-04-09