Matrix profile goes MAD: variable-length motif and discord discovery in data series, Data Mining and Knowledge Discovery


Data Mining and Knowledge Discovery (IF 2.8), Pub Date: 2020-05-07, DOI: 10.1007/s10618-020-00685-w
Michele Linardi, Yan Zhu, Themis Palpanas, Eamonn Keogh

In the last 15 years, data series motif and discord discovery have emerged as two useful and widely used primitives for data series mining, with applications in many domains, including robotics, entomology, seismology, medicine, and climatology. Nevertheless, state-of-the-art motif and discord discovery tools still require the user to specify the subsequence length in advance. In several cases, this choice of length is critical and unforgiving. Unfortunately, the obvious brute-force solution, which tests all lengths within a given range, is computationally untenable. In this work, we introduce a new framework that provides an exact and scalable motif and discord discovery algorithm, which efficiently finds all motifs and discords in a given range of lengths. We evaluate our approach on five diverse real datasets and demonstrate that it is up to 20 times faster than the state of the art. Our results also show that removing the unrealistic assumption that the user knows the correct length can often produce more intuitive and actionable results that would otherwise have been missed.
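To make the problem concrete, the following is a minimal sketch of the brute-force baseline that the abstract describes as untenable, not the framework the paper introduces. It recomputes a naive matrix profile (the z-normalized Euclidean distance from each subsequence to its nearest non-overlapping neighbour) for every length in a range, then reports the closest pair as the motif and the subsequence with the farthest nearest neighbour as the discord. The function names and the exclusion zone of half the subsequence length are illustrative assumptions, not taken from the paper.

```python
import math


def znorm(seq):
    """Z-normalize a subsequence (zero mean, unit standard deviation)."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0.0:
        # Assumption: a constant subsequence is mapped to all zeros.
        return [0.0] * n
    return [(x - mean) / std for x in seq]


def matrix_profile_naive(series, m):
    """For each subsequence of length m, return (distance, index) of its
    nearest neighbour outside the exclusion zone. Quadratic in the number
    of subsequences, with O(m) work per pair."""
    n_sub = len(series) - m + 1
    subs = [znorm(series[i:i + m]) for i in range(n_sub)]
    excl = max(1, m // 2)  # assumed exclusion zone to skip trivial matches
    profile = []
    for i in range(n_sub):
        best, best_j = math.inf, -1
        for j in range(n_sub):
            if abs(i - j) < excl:
                continue
            d = math.dist(subs[i], subs[j])
            if d < best:
                best, best_j = d, j
        profile.append((best, best_j))
    return profile


def motifs_and_discords(series, min_len, max_len):
    """Brute force over every length in [min_len, max_len]."""
    results = {}
    for m in range(min_len, max_len + 1):
        profile = matrix_profile_naive(series, m)
        entries = [(d, i, j) for i, (d, j) in enumerate(profile) if j >= 0]
        if not entries:
            continue  # series too short for this length
        d_min, i_min, j_min = min(entries)  # closest pair: motif
        d_max, i_max, _ = max(entries)      # farthest nearest neighbour: discord
        results[m] = {"motif": (i_min, j_min, d_min), "discord": (i_max, d_max)}
    return results


if __name__ == "__main__":
    pattern = [0, 1, 3, 1, 0, 2, 5, 2]
    series = pattern + [7, 4, 9, 6, 8, 3] + pattern
    for m, found in motifs_and_discords(series, 6, 8).items():
        print(m, found)
```

Every additional length repeats the full quadratic pairwise comparison from scratch, which is why scanning a wide range of lengths this way quickly becomes impractical on long series.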


Updated: 2020-05-07
