Applied Intelligence (IF 5.3), Pub Date: 2019-10-19, DOI: 10.1007/s10489-019-01560-y
Jie Lin, NianHua Li, Md Ashraful Alam, Yuqing Ma
Abstract
Due to cluster instability, data can go missing in the cluster monitoring system. This paper focuses on missing data imputation for cluster monitoring applications and proposes a new hybrid multiple imputation framework. The approach differs from conventional multiple imputation techniques in that it imputes missing data under an arbitrary missing pattern using an architecture that combines model-based and data-driven components. Essentially, a deep neural network serves as the data model and extracts deep features from the data; these features are then processed by a regression or data-driven strategy to estimate the missing data under the arbitrary missing pattern. The paper provides evidence that, if a deep neural network can be trained to construct deep features of the data, imputation based on those deep features is better than imputation performed directly on the original data. In the experiments, the proposed method is compared with conventional multiple imputation approaches across varying missing data patterns, missing ratios, and datasets, including real cluster data. The results show that, at larger missing ratios and across various missing patterns, the proposed algorithm achieves more accurate and stable imputation performance.
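The idea of estimating features from the observed entries and then reconstructing the missing ones can be illustrated with a minimal sketch. This is not the paper's method: it assumes a linear feature extractor (PCA via SVD) as a stand-in for the deep neural network, produces a single imputation rather than multiple imputations, and uses synthetic rank-2 data. The names `fit_feature_map` and `impute_row` are hypothetical.

```python
import numpy as np


def fit_feature_map(X_complete, n_features):
    """Fit a linear feature extractor (PCA via SVD) on complete rows.

    ASSUMPTION: this stands in for the deep neural network in the abstract.
    """
    mean = X_complete.mean(axis=0)
    _, _, vt = np.linalg.svd(X_complete - mean, full_matrices=False)
    return mean, vt[:n_features]


def impute_row(x, observed, mean, components):
    """Fill the unobserved entries of x; observed entries are left untouched."""
    # Regress the feature vector z on the observed coordinates only, so any
    # missing pattern with enough observed entries can be handled.
    A = components[:, observed].T
    b = x[observed] - mean[observed]
    z, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Decode the features back to data space and copy in the missing entries.
    x_hat = mean + z @ components
    filled = x.copy()
    filled[~observed] = x_hat[~observed]
    return filled


# Synthetic data with rank-2 structure (illustrative only, not cluster data).
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 6))
X_train = rng.normal(size=(200, 2)) @ W
x_true = rng.normal(size=2) @ W

# An arbitrary missing pattern: True marks an observed entry.
observed = np.array([True, False, True, True, False, True])
x_missing = np.where(observed, x_true, np.nan)

mean, components = fit_feature_map(X_train, n_features=2)
x_filled = impute_row(x_missing, observed, mean, components)
print(np.abs(x_filled - x_true).max())  # largest absolute imputation error
```

Because the mask is passed in per row, the same fitted feature map serves every missing pattern, which is the property the abstract emphasizes. A nonlinear network would replace the least-squares step with whatever regression or data-driven strategy suits its features.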
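Title (translated from the Chinese): Data-driven missing data imputation in a cluster monitoring system based on deep neural networks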