Early abandoning and pruning for elastic distances including dynamic time warping
Data Mining and Knowledge Discovery (IF 4.8), Pub Date: 2021-08-16, DOI: 10.1007/s10618-021-00782-4
Matthieu Herrmann, Geoffrey I. Webb

Nearest neighbor (NN) search under elastic distances is a key tool for time series analysis, supporting many applications. However, straightforward implementations of these distances have \(O(n^2)\) space and time complexity, which prevents these applications from scaling to long series. Much work has been devoted to speeding up NN search, mostly through the development of lower bounds, which make it possible to skip a costly distance computation when the bound already exceeds a given threshold. This threshold, provided by the similarity search process, also allows the computation of a distance itself to be abandoned early. Another approach is to prune parts of the computation. All these techniques are orthogonal to each other. In this work, we develop a new generic strategy, “EAPruned”, that tightly integrates pruning with early abandoning. We apply it to six elastic distance measures: DTW, CDTW, WDTW, ERP, MSM and TWE, showing substantial speedup in NN search applications. Pruning alone also shows substantial speedup for some distances, benefiting applications beyond the scope of NN search (e.g. those requiring all pairwise distances), where early abandoning is not applicable. We release our implementation as part of a new C++ library for time series classification, along with easy-to-use Python/NumPy bindings.
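To make the idea of early abandoning concrete, here is a minimal Python sketch of plain DTW that stops as soon as a threshold is provably exceeded, together with an NN search loop that supplies that threshold. It is not the EAPruned algorithm described in the abstract and does not use the authors' library: the function names `dtw_early_abandon` and `nearest_neighbor`, the squared-difference cost, and the row-minimum abandoning test are all assumptions chosen for illustration, and no pruning is performed.

```python
import math


def dtw_early_abandon(a, b, cutoff=math.inf):
    """Illustrative DTW with squared-difference cost (an assumption).

    Keeps only two rows of the cost matrix, so memory is linear in len(b).
    Returns the exact distance whenever it is <= cutoff. Returns math.inf
    as soon as a whole row exceeds cutoff, because every warping path
    crosses every row and all costs are non-negative.
    """
    if len(a) == 0 or len(b) == 0:
        raise ValueError("both series must be non-empty")

    prev = [math.inf] * (len(b) + 1)
    prev[0] = 0.0  # border cell: empty prefix aligned with empty prefix

    for x in a:
        curr = [math.inf] * (len(b) + 1)
        row_min = math.inf
        for j, y in enumerate(b, start=1):
            cost = (x - y) ** 2
            curr[j] = cost + min(prev[j], prev[j - 1], curr[j - 1])
            row_min = min(row_min, curr[j])
        if row_min > cutoff:
            return math.inf  # early abandon: no path can end below cutoff
        prev = curr

    return prev[len(b)]


def nearest_neighbor(query, candidates):
    """Hypothetical NN search: the best distance so far is the cutoff."""
    best_idx, best_dist = -1, math.inf
    for i, candidate in enumerate(candidates):
        d = dtw_early_abandon(query, candidate, cutoff=best_dist)
        if d < best_dist:
            best_idx, best_dist = i, d
    return best_idx, best_dist
```

In this sketch the threshold tightens as the search proceeds, so later candidates tend to be abandoned after fewer rows. When all pairwise distances are required there is no such threshold to pass in, which is the setting where the abstract notes that pruning alone can still help.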




Updated: 2021-08-19