Missing value imputation in multivariate time series with end-to-end generative adversarial networks, Information Sciences


Information Sciences (IF 8.1), Pub Date: 2020-12-01, DOI: 10.1016/j.ins.2020.11.035
Ying Zhang, Baohang Zhou, Xiangrui Cai, Wenya Guo, Xiaoke Ding, Xiaojie Yuan

Missing values are inherent in multivariate time series for many reasons, such as collection errors, and they degrade the performance of downstream analytic applications. Numerous missing value imputation methods have been proposed to mitigate this effect. Recently, inspired by the success of generative adversarial networks (GANs) in image generation, GAN-2-Stage was proposed to address the imputation problem with a generative model. However, GAN-2-Stage requires an extra phase to optimize the random "noise" input of the generator. In addition, the imputed values can differ substantially from the real values because GANs are difficult to train and the generation process is unstable. Therefore, this paper proposes an end-to-end model to impute the missing values in a multivariate time series. Specifically, we introduce an encoder network into the standard GAN architecture, which eliminates the input optimization phase of GAN-2-Stage. Our generator uses real data during training to force the imputed values to be close to the real ones. Experiments on three real-world multivariate time series datasets demonstrate that the proposed model outperforms state-of-the-art methods in imputation tasks and downstream applications, including classification and regression.
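The abstract does not spell out the loss or the network details, so the following is only a rough sketch of one common way to keep imputed values close to real ones: score a reconstruction on the observed entries only, then fill the gaps from that reconstruction. The names `build_mask`, `masked_mse`, and `impute` are hypothetical, and the `reconstruction` argument merely stands in for whatever an encoder and generator pair would output; none of this is taken from the paper.

```python
from typing import List, Optional

# A series is a list of time steps, each a list of variables.
# None marks a missing value (an assumption made for this sketch).
Series = List[List[Optional[float]]]


def build_mask(series: Series) -> List[List[int]]:
    """Return 1 where a value is observed and 0 where it is missing."""
    return [[0 if v is None else 1 for v in row] for row in series]


def masked_mse(series: Series, reconstruction: List[List[float]]) -> float:
    """Mean squared error computed over observed entries only."""
    total, count = 0.0, 0
    for row, rec_row in zip(series, reconstruction):
        for value, rec in zip(row, rec_row):
            if value is not None:
                total += (value - rec) ** 2
                count += 1
    return total / count if count else 0.0


def impute(series: Series, reconstruction: List[List[float]]) -> List[List[float]]:
    """Keep observed values and fill missing ones from the reconstruction."""
    return [
        [rec if value is None else value for value, rec in zip(row, rec_row)]
        for row, rec_row in zip(series, reconstruction)
    ]


# Toy data: two time steps, two variables, two missing entries.
series = [[1.0, None], [None, 4.0]]
# Placeholder for a model output; these numbers are made up for illustration.
reconstruction = [[1.5, 2.0], [3.0, 3.0]]

mask = build_mask(series)
loss = masked_mse(series, reconstruction)
completed = impute(series, reconstruction)
```

In a real model the masked error would be one term of the generator objective alongside the adversarial term; how the paper weights or defines those terms is not stated in the abstract.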


Updated: 2020-12-16
