The Swiss army knife of time series data mining: ten useful things you can do with the matrix profile and ten lines of code
Data Mining and Knowledge Discovery (IF 2.8), Pub Date: 2020-03-31, DOI: 10.1007/s10618-019-00668-6
Yan Zhu, Shaghayegh Gharghabi, Diego Furtado Silva, Hoang Anh Dau, Chin-Chia Michael Yeh, Nader Shakibay Senobari, Abdulaziz Almaslukh, Kaveh Kamgar, Zachary Zimmerman, Gareth Funning, Abdullah Mueen, Eamonn Keogh

The recently introduced data structure, the Matrix Profile, annotates a time series by recording the location of and distance to the nearest neighbor of every subsequence. This information trivially provides answers to queries for both time series motifs and time series discords, perhaps two of the most frequently used primitives in time series data mining. One attractive feature of the Matrix Profile is that it completely divorces the high-level details of the analytics performed from the computational “heavy lifting.” The Matrix Profile can be computed using the appropriate computational paradigm for the task at hand: CPU, GPU, FPGA, distributed computing, anytime computation, incremental computation, and so forth. However, all the details of such computation can be hidden from the analyst, who only needs to think about her analytical need. In this work, we expand on this philosophy and ask the following question: If we assume that we get the Matrix Profile for free, what interesting analytics can we do, writing at most ten lines of code? As we will show, the range of answers is surprisingly large and diverse. Our aim here is not to establish or compete with state-of-the-art results, but merely to show that we can both reproduce the results of many existing algorithms and find novel regularities in time series data collections with very little effort.
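
To make the idea in the abstract concrete, here is a minimal brute-force sketch in Python of a nearest-neighbor annotation of this kind. It is an illustration under stated assumptions, not the authors' implementation: the function names (`znorm`, `matrix_profile_naive`), the use of z-normalized Euclidean distance, and the exclusion zone of half the subsequence length are choices made for this example, and the quadratic loop stands in for the fast algorithms the abstract alludes to. Once the profile exists, a motif and a discord each take one line.

```python
import numpy as np

def znorm(x):
    # z-normalize a subsequence; constant subsequences map to all zeros
    s = x.std()
    return (x - x.mean()) / s if s > 0 else np.zeros_like(x)

def matrix_profile_naive(ts, m):
    """Brute-force sketch (hypothetical helper): for every length-m
    subsequence, return the distance to and the location of its nearest
    neighbor, skipping trivial matches close to the subsequence itself."""
    ts = np.asarray(ts, dtype=float)
    n = len(ts) - m + 1
    subs = np.array([znorm(ts[i:i + m]) for i in range(n)])
    excl = max(1, m // 2)  # exclusion zone width: an assumption of this sketch
    profile = np.full(n, np.inf)
    index = np.full(n, -1, dtype=int)
    for i in range(n):
        d = np.linalg.norm(subs - subs[i], axis=1)
        d[max(0, i - excl):min(n, i + excl + 1)] = np.inf  # mask trivial matches
        j = int(np.argmin(d))
        profile[i], index[i] = d[j], j
    return profile, index

# Synthetic data (made up for illustration): noise with one pattern planted twice.
rng = np.random.default_rng(0)
ts = rng.normal(size=300)
pattern = np.sin(np.linspace(0, 4 * np.pi, 20))
ts[50:70] = pattern
ts[200:220] = pattern

profile, index = matrix_profile_naive(ts, m=20)
motif = int(np.argmin(profile))    # best-matching subsequence pair starts here
discord = int(np.argmax(profile))  # subsequence farthest from its nearest neighbor
print("motif pair:", motif, int(index[motif]), "discord at:", discord)
```

The point of the sketch is the separation the abstract describes: everything above the last three lines is "heavy lifting" that could be swapped for a faster implementation, while the analytics themselves are a pair of `argmin`/`argmax` calls on the finished profile.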

Updated: 2020-03-31