The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances. Data Mining and Knowledge Discovery

Data Mining and Knowledge Discovery (IF 4.8), Pub Date: 2016-11-23, DOI: 10.1007/s10618-016-0483-9
Anthony Bagnall, Jason Lines, Aaron Bostrom, James Large, Eamonn Keogh

In the last five years, a large number of new time series classification algorithms have been proposed in the literature. These algorithms have been evaluated on subsets of the 47 datasets in the University of California, Riverside time series classification archive. The archive has recently been expanded to 85 datasets, over half of which have been donated by researchers at the University of East Anglia. Aspects of previous evaluations have made comparisons between algorithms difficult. For example, several different programming languages have been used, experiments involved a single train/test split, and some used normalised data whilst others did not. The relaunch of the archive provides a timely opportunity to thoroughly evaluate algorithms on a larger number of datasets. We have implemented 18 recently proposed algorithms in a common Java framework and compared them against two standard benchmark classifiers (and each other) by performing 100 resampling experiments on each of the 85 datasets. We use these results to test several hypotheses about whether the algorithms are significantly more accurate than the benchmarks and than each other. Our results indicate that only nine of these algorithms are significantly more accurate than both benchmarks and that one classifier, the collective of transformation ensembles (COTE), is significantly more accurate than all of the others. All of our experiments and results are reproducible: we release all of our code, results and experimental details, and we hope these experiments form the basis for more robust testing of new algorithms in the future.
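
The resampling design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Java framework. The abstract does not name the benchmark classifiers, so a 1-nearest-neighbour classifier with Euclidean distance is used here purely as a stand-in. The function names, the `train_fraction` parameter and the toy data are all hypothetical. The sketch also assumes every class has at least two instances.

```python
import math
import random
from collections import defaultdict
from statistics import mean


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def one_nn_predict(train, series):
    """Return the label of the training series closest to `series`."""
    return min(train, key=lambda item: euclidean(item[0], series))[1]


def stratified_resample(data, train_fraction, rng):
    """Shuffle within each class, then split so class proportions are kept."""
    by_class = defaultdict(list)
    for series, label in data:
        by_class[label].append((series, label))
    train, test = [], []
    for label in sorted(by_class):
        items = by_class[label]
        rng.shuffle(items)
        cut = max(1, round(len(items) * train_fraction))
        train.extend(items[:cut])
        test.extend(items[cut:])
    return train, test


def resampled_accuracy(data, n_resamples=100, train_fraction=0.5, seed=0):
    """Mean test accuracy of the stand-in 1-NN over repeated resamples."""
    accuracies = []
    for i in range(n_resamples):
        rng = random.Random(seed + i)  # one reproducible seed per resample
        train, test = stratified_resample(data, train_fraction, rng)
        correct = sum(one_nn_predict(train, s) == y for s, y in test)
        accuracies.append(correct / len(test))
    return mean(accuracies), accuracies


def make_toy_data(seed=42, per_class=10):
    """Hypothetical toy problem: noisy rising vs. falling ramps."""
    rng = random.Random(seed)
    data = []
    for _ in range(per_class):
        data.append(([i + rng.uniform(-0.2, 0.2) for i in range(8)], "up"))
        data.append(([7 - i + rng.uniform(-0.2, 0.2) for i in range(8)], "down"))
    return data


if __name__ == "__main__":
    mean_acc, accs = resampled_accuracy(make_toy_data(), n_resamples=100)
    print(f"mean accuracy over {len(accs)} resamples: {mean_acc:.3f}")
```

Averaging over many seeded resamples, rather than reporting a single train/test split, is what makes the per-dataset accuracy less sensitive to one lucky or unlucky partition. Fixing the seed per resample lets different classifiers be compared on identical splits.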

Last updated: 2016-11-23
