Domain agnostic online semantic segmentation for multi-dimensional time series.
Data Mining and Knowledge Discovery (IF 2.8), Pub Date: 2018-09-25, DOI: 10.1007/s10618-018-0589-3
Shaghayegh Gharghabi 1 , Chin-Chia Michael Yeh 1 , Yifei Ding 1 , Wei Ding 2 , Paul Hibbing 3 , Samuel LaMunion 3 , Andrew Kaplan 3 , Scott E Crouter 3 , Eamonn Keogh 1

Unsupervised semantic segmentation in the time series domain is a much-studied problem because of its potential to detect unexpected regularities and regimes in poorly understood data. However, current techniques have four primary shortcomings that have limited the adoption of time series semantic segmentation beyond academic settings. First, most methods require setting or learning many parameters and thus may generalize poorly to novel situations. Second, most methods implicitly assume that all the data is segmentable and have difficulty when that assumption is unwarranted. Third, many algorithms are defined only for the single-dimensional case, despite the ubiquity of multi-dimensional data. Finally, most research efforts have been confined to the batch case, although online segmentation is clearly more useful and actionable. To address these issues, we present a multi-dimensional algorithm that is domain agnostic, has only one easily determined parameter, and can handle data streaming at a high rate. We test the algorithm on the largest and most diverse collection of time series datasets ever considered for this task and demonstrate its superiority over current solutions.

Last updated: 2018-09-25