Accelerating pattern-based time series classification: a linear time and space string mining approach
Knowledge and Information Systems (IF 2.7), Pub Date: 2019-07-12, DOI: 10.1007/s10115-019-01378-7
Atif Raza, Stefan Kramer

Subsequence-based time series classification algorithms provide interpretable and generally more accurate classification models than the nearest neighbor approach, albeit at a considerably higher computational cost. A number of algorithms based on discretized time series have been proposed to reduce this computational complexity; however, their asymptotic time complexity is still cubic or higher-order polynomial. We present a remarkably fast and resource-efficient time series classification approach which employs a linear time and space string mining algorithm for extracting frequent patterns from discretized time series data. Compared to other subsequence or pattern-based classification algorithms, the proposed approach requires only a few parameters, which can be chosen arbitrarily and do not require any fine-tuning for different datasets. The time series data are discretized using symbolic aggregate approximation, and frequent patterns are extracted using a string mining algorithm. An independence test is used to select the most discriminative frequent patterns, which are subsequently used to create a transformed version of the time series data. Finally, a classification model can be trained using any off-the-shelf algorithm. Extensive empirical evaluations demonstrate the competitive classification accuracy of our approach compared to other state-of-the-art approaches. The experiments also show that our approach is at least one to two orders of magnitude faster than the existing pattern-based methods due to the extremely fast frequent pattern extraction, which is the most computationally intensive process in pattern-based time series classification approaches.
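
The pipeline the abstract describes (discretize, mine patterns, select by an independence test, transform) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the linear time and space string mining algorithm is replaced here by a naive substring enumeration, which is not linear, and the abstract does not say which independence test is used, so a Pearson chi-square statistic is assumed. All function names and default parameters (`n_segments`, `alphabet_size`, `min_len`, `max_len`, `top_k`) are hypothetical choices for illustration.

```python
from bisect import bisect_left
from collections import defaultdict
from statistics import NormalDist, fmean, pstdev


def sax(series, n_segments=8, alphabet_size=4):
    """Discretize one series into a SAX word (z-normalize, PAA, quantize).

    Assumes len(series) >= n_segments so that no segment is empty.
    """
    mu, sigma = fmean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    n = len(z)
    # Piecewise aggregate approximation: mean of each near-equal segment.
    paa = [fmean(z[i * n // n_segments:(i + 1) * n // n_segments])
           for i in range(n_segments)]
    # Breakpoints split the standard normal into equiprobable regions.
    cuts = [NormalDist().inv_cdf(k / alphabet_size)
            for k in range(1, alphabet_size)]
    return "".join(chr(ord("a") + bisect_left(cuts, v)) for v in paa)


def pattern_support(words, min_len=2, max_len=4):
    """Map each substring to the set of word indices that contain it.

    Naive enumeration, used only as a stand-in for the paper's
    linear-time string mining step (assumption for illustration).
    """
    support = defaultdict(set)
    for idx, word in enumerate(words):
        for length in range(min_len, max_len + 1):
            for start in range(len(word) - length + 1):
                support[word[start:start + length]].add(idx)
    return support


def chi_square(present, labels):
    """Pearson chi-square statistic for pattern presence vs. class label."""
    n = len(labels)
    stat = 0.0
    for has in (True, False):
        row = sum(1 for i in range(n) if (i in present) == has)
        for c in sorted(set(labels)):
            col = sum(1 for y in labels if y == c)
            observed = sum(1 for i, y in enumerate(labels)
                           if y == c and (i in present) == has)
            expected = row * col / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def select_patterns(words, labels, top_k=5):
    """Keep the top_k patterns with the highest chi-square statistic."""
    support = pattern_support(words)
    ranked = sorted(support,
                    key=lambda p: (-chi_square(support[p], labels), p))
    return ranked[:top_k]


def transform(words, patterns):
    """Binary feature matrix: 1 if the pattern occurs in the word."""
    return [[int(p in w) for p in patterns] for w in words]
```

The rows returned by `transform` can then be passed to any off-the-shelf classifier, as the abstract notes. For example, `sax` could be applied to every training series, `select_patterns` to the resulting words and their class labels, and `transform` to both training and test words using the same selected patterns.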

Updated: 2019-07-12