Time series clustering in linear time complexity

Li, Xiaosheng; Lin, Jessica; Zhao, Liang

doi:10.1007/s10618-021-00798-w

Published: 18 September 2021

Volume 35, pages 2369–2388 (2021)

Data Mining and Knowledge Discovery

Abstract

With the increasing capacity of data storage and advances in data generation and collection technologies, large volumes of time series data have become available, and their content changes rapidly. Data mining methods therefore need low time complexity to handle such large, fast-changing data. This article presents a novel time series clustering algorithm that has linear time complexity. The proposed algorithm partitions the data by checking randomly selected symbolic patterns in the time series. We provide theoretical analysis showing that this process can reveal group structures in the data. We evaluate the proposed algorithm extensively on all 128 datasets from the well-known UCR time series archive and compare it with state-of-the-art approaches using statistical analysis. The results show that the proposed method achieves better accuracy than the rival methods. We also conduct experiments to explore how the parameters and configuration of the algorithm affect the final clustering results.

Author information

Authors and Affiliations

George Mason University, 4400 University Dr., Fairfax, USA
Xiaosheng Li, Jessica Lin & Liang Zhao

Corresponding author

Correspondence to Xiaosheng Li.

Additional information

Responsible editor: Eamonn Keogh.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, X., Lin, J. & Zhao, L. Time series clustering in linear time complexity. Data Min Knowl Disc 35, 2369–2388 (2021). https://doi.org/10.1007/s10618-021-00798-w

Received: 06 May 2021
Accepted: 04 September 2021
Published: 18 September 2021
Issue Date: November 2021
DOI: https://doi.org/10.1007/s10618-021-00798-w
