Two-Stage Sparse Representation Clustering for Dynamic Data Streams
IEEE Transactions on Cybernetics (IF 11.8), Pub Date: 2022-09-28, DOI: 10.1109/tcyb.2022.3204894
Jie Chen, Zhu Wang, Shengxiang Yang, Hua Mao

Data streams are a potentially unbounded sequence of data objects, and the clustering of such data is an effective way of identifying their underlying patterns. Existing data stream clustering algorithms face two critical issues: 1) evaluating the relationship among data objects with individual landmark windows of fixed size and 2) passing useful knowledge from previous landmark windows to the current landmark window. Based on sparse representation techniques, this article proposes a two-stage sparse representation clustering (TSSRC) method. The novelty of the proposed TSSRC algorithm comes from evaluating the effective relationship among data objects in the landmark windows with an accurate number of clusters. First, the proposed algorithm evaluates the relationship among data objects using sparse representation techniques. The dictionary and sparse representations are iteratively updated by solving a convex optimization problem. Second, the proposed TSSRC algorithm presents a dictionary initialization strategy that seeks representative data objects by making full use of the sparse representation results. This efficiently passes previously learned knowledge to the current landmark window over time. Moreover, the convergence and sparse stability of TSSRC can be theoretically guaranteed in continuous landmark windows under certain conditions. Experimental results on benchmark datasets demonstrate the effectiveness and robustness of TSSRC.
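The two stages described in the abstract can be illustrated with a minimal sketch. It is not the TSSRC algorithm itself: the abstract does not give the objective, the dictionary update rule, or the selection criterion, so everything below is an assumption made for illustration. Sparse codes are computed with plain ISTA on an L1-regularized least-squares objective against a fixed dictionary (the iterative dictionary update is omitted), and the hypothetical helper `pick_representatives` seeds the next landmark window's dictionary with, for each atom, the data object that uses that atom most strongly. All function names and the parameter `lam` are made up for this example.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sparse_code(x, atoms, lam=0.1, iters=200):
    """ISTA for  min_c 0.5 * ||x - sum_j c[j] * atoms[j]||^2 + lam * ||c||_1."""
    k, d = len(atoms), len(x)
    # The squared Frobenius norm of the dictionary upper-bounds the Lipschitz
    # constant of the smooth term, so 1 / lip is a safe step size.
    lip = sum(a * a for atom in atoms for a in atom) or 1.0
    c = [0.0] * k
    for _ in range(iters):
        recon = [sum(c[j] * atoms[j][i] for j in range(k)) for i in range(d)]
        resid = [recon[i] - x[i] for i in range(d)]
        grad = [sum(atoms[j][i] * resid[i] for i in range(d)) for j in range(k)]
        c = [soft_threshold(c[j] - grad[j] / lip, lam / lip) for j in range(k)]
    return c


def pick_representatives(window, codes):
    """Assumed heuristic: for each atom, keep the object that uses it most."""
    reps = []
    for j in range(len(codes[0])):
        best = max(range(len(window)), key=lambda n: abs(codes[n][j]))
        if abs(codes[best][j]) > 0.0 and best not in reps:
            reps.append(best)
    return [window[n] for n in reps]


def process_stream(windows, initial_atoms, lam=0.1):
    """Code each window, then seed the next dictionary with representatives."""
    atoms, history = initial_atoms, []
    for window in windows:
        codes = [sparse_code(x, atoms, lam) for x in window]
        history.append(codes)
        # Fall back to the old dictionary if no representative was found.
        atoms = pick_representatives(window, codes) or atoms
    return atoms, history
```

Calling `process_stream(windows, initial_atoms)` with a list of landmark windows (each a list of equal-length numeric vectors) returns the dictionary carried out of the last window together with the per-window sparse codes. A clustering step, for example spectral clustering on an affinity built from the codes, would follow in a fuller pipeline and is left out here.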

Updated: 2022-09-28