Distributed detection of sequential anomalies in univariate time series
The VLDB Journal (IF 4.2), Pub Date: 2021-03-25, DOI: 10.1007/s00778-021-00657-6
Johannes Schneider, Phillip Wenig, Thorsten Papenbrock

The automated detection of sequential anomalies in time series is an essential task for many applications, such as the monitoring of technical systems, fraud detection in high-frequency trading, or the early detection of disease symptoms. All these applications require the detection to find all sequential anomalies as fast as possible on potentially very large time series. In other words, the detection needs to be effective, efficient, and scalable w.r.t. the input size. Series2Graph (S2G) is an effective solution based on graph embeddings that are robust against re-occurring anomalies; it can discover sequential anomalies of arbitrary length and works without training data. Yet, Series2Graph is not scalable due to its single-threaded approach; in particular, it cannot process arbitrarily large sequences due to the memory constraints of a single machine. In this paper, we propose our distributed anomaly detection system, DADS for short, which is an efficient and scalable adaptation of Series2Graph. Based on the actor programming model, DADS distributes the input time sequence, intermediate state, and the computation to all processors of a cluster in a way that minimizes communication costs and synchronization barriers. Our evaluation shows that DADS is orders of magnitude faster than S2G, scales almost linearly with the number of processors in the cluster, and can process much larger input sequences due to its scale-out property.
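
As a rough illustration of what distributing an input time sequence can involve, the sketch below splits a univariate series into overlapping index ranges so that every fixed-length subsequence lies entirely inside exactly one range, then processes the ranges in parallel. This is not the DADS implementation: the function names, the overlap scheme, and the use of a Python thread pool in place of actors on a cluster are all assumptions made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_with_overlap(n, window, parts):
    """Split indices 0..n-1 into at most `parts` half-open ranges (lo, hi).

    Hypothetical helper: each range overlaps the next by window - 1 points,
    so every length-`window` subsequence belongs to exactly one range.
    """
    starts_total = n - window + 1  # number of subsequence start positions
    if starts_total <= 0:
        return []
    parts = min(parts, starts_total)
    base, extra = divmod(starts_total, parts)
    ranges = []
    first = 0
    for i in range(parts):
        count = base + (1 if i < extra else 0)
        # `count` start positions plus the window - 1 trailing points they need.
        ranges.append((first, first + count + window - 1))
        first += count
    return ranges


def count_subsequences(series, bounds, window):
    """Stand-in for per-worker processing: count the subsequences in a range."""
    lo, hi = bounds
    chunk = series[lo:hi]
    return len(chunk) - window + 1


def distributed_subsequence_count(series, window, workers):
    """Process every range on its own worker thread and combine the results."""
    ranges = partition_with_overlap(len(series), window, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = pool.map(lambda b: count_subsequences(series, b, window), ranges)
        return sum(counts)
```

Because the ranges share only `window - 1` boundary points, each worker can handle its subsequences without exchanging data with its neighbours, which is one simple way to keep communication low. A real system would replace the counting step with the actual per-subsequence computation.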




Updated: 2021-03-25