Optimizing Data Stream Representation: An Extensive Survey on Stream Clustering Algorithms
Business & Information Systems Engineering (IF 7.4), Pub Date: 2019-01-21, DOI: 10.1007/s12599-019-00576-5
Matthias Carnein, Heike Trautmann

Analyzing data streams has received considerable attention over the past decades due to the widespread use of sensors, social media and other streaming data sources. A core research area in this field is stream clustering, which aims to recognize patterns in an unordered, infinite and evolving stream of observations. Clustering can be a crucial support in decision making, since it aims for an optimized aggregated representation of a continuous data stream over time and allows patterns to be identified in large and high-dimensional data. A multitude of algorithms and approaches have been developed that are able to find and maintain clusters over time in the challenging streaming scenario. This survey explores, summarizes and categorizes a total of 51 stream clustering algorithms and identifies core research threads over the past decades. In particular, it identifies categories of algorithms based on distance thresholds, density grids and statistical models, as well as algorithms for high-dimensional data. Furthermore, it discusses application scenarios, available software and how to configure stream clustering algorithms. This survey is considerably more extensive and more up-to-date than comparable studies, and it highlights how concepts are interrelated and have developed over time.

Updated: 2019-01-21