


Evaluating time series forecasting models: an empirical study on performance estimation methods
Machine Learning (IF 4.3), Pub Date: 2020-10-13, DOI: 10.1007/s10994-020-05910-7
Vitor Cerqueira, Luis Torgo, Igor Mozetič

Performance estimation aims to estimate the loss that a predictive model will incur on unseen data. These procedures are part of the pipeline in every machine learning project and are used to assess the overall generalisation ability of predictive models. In this paper we address the application of these methods to time series forecasting tasks. For independent and identically distributed data, the most common approach is cross-validation. However, the dependency among observations in time series raises caveats about the most appropriate way to estimate performance in this type of data, and currently there is no settled way to do so. We compare different variants of cross-validation and of out-of-sample approaches using two case studies: one with 62 real-world time series and another with three synthetic time series. Results show noticeable differences between the performance estimation methods in the two scenarios. In particular, empirical experiments suggest that cross-validation approaches can be applied to stationary time series. However, in real-world scenarios, when different sources of non-stationary variation are at play, the most accurate estimates are produced by out-of-sample methods that preserve the temporal order of observations.
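To make the distinction in the abstract concrete, the following minimal Python sketch contrasts a shuffled K-fold split with two out-of-sample splits that keep training observations strictly before test observations. It is an illustration only, not the authors' experimental code: the function names (`kfold_splits`, `holdout_split`, `growing_window_splits`, `preserves_order`), the 30% test fraction, and the block layout are all assumptions chosen for the example.

```python
import random


def kfold_splits(n, k, seed=0):
    """Standard K-fold CV: shuffle the indices, then cut them into k folds.

    Temporal order is not preserved, so test points may precede training points.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for f in folds[:i] + folds[i + 1:] for j in f)
        yield train, test


def holdout_split(n, test_fraction=0.3):
    """Single out-of-sample split: the final block of observations is the test set."""
    cut = n - max(1, int(round(n * test_fraction)))
    return list(range(cut)), list(range(cut, n))


def growing_window_splits(n, k):
    """Out-of-sample evaluation over k + 1 contiguous blocks.

    At step i, train on everything up to block i and test on the next block.
    """
    block = n // (k + 1)
    for i in range(1, k + 1):
        train_end = i * block
        test_end = n if i == k else (i + 1) * block
        yield list(range(train_end)), list(range(train_end, test_end))


def preserves_order(train, test):
    """True when every training index comes before every test index."""
    return max(train) < min(test)


if __name__ == "__main__":
    n = 20
    print("holdout:", preserves_order(*holdout_split(n)))
    print("growing window:",
          all(preserves_order(tr, te) for tr, te in growing_window_splits(n, 4)))
    print("k-fold folds preserving order:",
          sum(preserves_order(tr, te) for tr, te in kfold_splits(n, 4)))
```

Because K-fold assigns every observation to exactly one test fold, at most one fold (the one that happens to consist of the final observations) can keep all training data before the test data; the out-of-sample splits satisfy that property by construction.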


Updated: 2020-10-13
