A Hybrid Model for Nonlinear Regression with Missing Data Using Quasilinear Kernel
IEEJ Transactions on Electrical and Electronic Engineering (IF 1.0), Pub Date: 2020-10-01, DOI: 10.1002/tee.23253
Huilin Zhu, Yanling Tian, Yanni Ren, Jinglu Hu

Missing data is a serious problem in both research and engineering and cannot be overlooked. Datasets with missing values are difficult to model with conventional global prediction models. In this paper, we propose a hybrid model, consisting of an autoencoder and a gated linear network, for solving regression problems in the presence of missing values. A sophisticated modeling and identification algorithm is developed. First, an extended affinity propagation (AP) clustering algorithm is applied to obtain a self‐organized competitive net that divides the dataset into several clusters. Second, a multiple imputation tool with top p% winner‐take‐all denoising autoencoders (DAE) is introduced to better predict missing values; rough estimates of the missing values, obtained by mean imputation and a similarity method within the clusters, serve as the teacher signals of the DAE. Finally, a gated linear network is designed to construct a piecewise linear regression model with interpolations, in exactly the same way as a support vector regression with a quasilinear kernel composed from the cluster information obtained in the AP clustering step. In experiments on five datasets, the proposed method demonstrates effectiveness and robustness compared with other traditional kernels and state‐of‐the‐art methods, even on datasets with a large percentage of missing values. © 2020 Institute of Electrical Engineers of Japan. Published by Wiley Periodicals LLC.
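
The abstract mentions rough estimates of missing values obtained by mean imputation within clusters, which then serve as teacher signals for the DAE. A minimal sketch of that one step might look like the following. It is an illustration under stated assumptions, not the authors' implementation: the function name `cluster_mean_impute` is hypothetical, cluster labels are taken as given (the paper obtains them from an extended AP clustering that is not reproduced here), the similarity method is omitted, and the fallback to a global column mean (or 0.0) is an assumption made only so the sketch always returns a value.

```python
import math


def cluster_mean_impute(rows, labels):
    """Fill NaN entries with the mean of the same column within the same cluster.

    rows   : list of equal-length lists of floats, NaN marks a missing value
    labels : list of cluster labels, one per row (assumed to be given)
    """
    n_cols = len(rows[0])

    # Accumulate per-cluster, per-column sums and counts over observed entries.
    sums, counts = {}, {}
    for row, lab in zip(rows, labels):
        s = sums.setdefault(lab, [0.0] * n_cols)
        c = counts.setdefault(lab, [0] * n_cols)
        for j, v in enumerate(row):
            if not math.isnan(v):
                s[j] += v
                c[j] += 1

    # Assumption: fall back to the global column mean (or 0.0 if the column
    # is entirely missing) when a cluster has no observed value in a column.
    global_means = []
    for j in range(n_cols):
        total = sum(sums[lab][j] for lab in sums)
        cnt = sum(counts[lab][j] for lab in counts)
        global_means.append(total / cnt if cnt else 0.0)

    # Build the imputed copy; observed values are left unchanged.
    filled = []
    for row, lab in zip(rows, labels):
        new_row = []
        for j, v in enumerate(row):
            if math.isnan(v):
                c = counts[lab][j]
                v = sums[lab][j] / c if c else global_means[j]
            new_row.append(v)
        filled.append(new_row)
    return filled


# Toy data: two clusters of two rows each, one missing value per cluster.
rows = [[1.0, 2.0], [3.0, math.nan], [10.0, 20.0], [math.nan, 40.0]]
labels = [0, 0, 1, 1]
filled = cluster_mean_impute(rows, labels)
```

Imputing within a cluster rather than over the whole dataset keeps each rough estimate local to similar samples, which is presumably why the abstract pairs this step with the clustering result.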

Updated: 2020-11-13