SAZED: parameter-free domain-agnostic season length estimation in time series data
Data Mining and Knowledge Discovery (IF 4.8), Pub Date: 2019-07-26, DOI: 10.1007/s10618-019-00645-z
Maximilian Toller, Tiago Santos, Roman Kern

Season length estimation is the task of identifying the number of observations in the dominant repeating pattern of seasonal time series data. As such, it is a common pre-processing task crucial for various downstream applications. Inferring season length from a real-world time series is often challenging due to phenomena such as slightly varying period lengths and noise. These issues may, in turn, lead practitioners to dedicate considerable effort to preprocessing time series data, since existing approaches either require dedicated parameter tuning or perform in a heavily domain-dependent way. Hence, to address these challenges, we propose SAZED: spectral and average autocorrelation zero distance density. SAZED is a versatile ensemble of multiple, specialized time series season length estimation approaches. The combination of various base methods, selected with respect to domain-agnostic criteria, and a novel seasonality isolation technique allows broad applicability to real-world time series of varied properties. Further, SAZED is theoretically grounded and parameter-free, with a computational complexity of \(\mathcal{O}(n\log n)\), which makes it applicable in practice. In our experiments, SAZED was statistically significantly better than every other method on at least one dataset. The datasets we used for the evaluation consist of time series data from various real-world domains, sterile synthetic test cases and synthetic data that were designed to be seasonal and yet have no finite statistical moments of any order.
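To make the task concrete, the following is a minimal Python sketch of two generic season length estimators, one spectral and one autocorrelation-based, both running in \(\mathcal{O}(n\log n)\) via the FFT. It is not the SAZED ensemble described in the abstract and does not reproduce its zero distance density or seasonality isolation steps; the function names `spectral_season_length` and `acf_season_length` are hypothetical and chosen only for illustration.

```python
import numpy as np


def spectral_season_length(x):
    """Estimate season length from the strongest periodogram peak.

    Illustrative baseline only (an assumption of this sketch, not SAZED).
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # remove the zero-frequency offset
    n = len(x)
    power = np.abs(np.fft.rfft(x)) ** 2   # periodogram, O(n log n)
    if n < 4 or power[1:].max() == 0.0:
        return None                       # too short or constant series
    k = int(np.argmax(power[1:])) + 1     # dominant non-zero frequency bin
    return int(round(n / k))              # bin k corresponds to period n / k


def acf_season_length(x):
    """Estimate season length from the autocorrelation function (ACF).

    Takes the lag of the highest ACF value after the ACF first turns
    negative, searching up to n // 2. Illustrative baseline only.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    if n < 4 or not np.any(x):
        return None
    # Zero-padded FFT gives the linear (non-circular) autocorrelation.
    f = np.fft.rfft(x, 2 * n)
    acf = np.fft.irfft(f * np.conj(f))[:n]
    acf = acf / acf[0]
    negative = np.where(acf < 0)[0]
    if len(negative) == 0 or negative[0] >= n // 2:
        return None                       # no oscillation found in the ACF
    start = int(negative[0])
    return start + int(np.argmax(acf[start:n // 2]))


if __name__ == "__main__":
    # Synthetic example: 20 repetitions of a 12-step cycle plus mild noise.
    rng = np.random.default_rng(0)
    t = np.arange(240)
    series = np.sin(2 * np.pi * t / 12) + 0.1 * rng.standard_normal(240)
    print(spectral_season_length(series), acf_season_length(series))
```

Both baselines have the weaknesses the abstract alludes to: the spectral estimate is limited by frequency resolution when the series length is not a multiple of the period, and the autocorrelation estimate can lock onto a multiple of the true period under noise or trend. An ensemble of complementary estimators is one way to offset such failure modes.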

Last updated: 2019-07-26