Fuzzy Clustering-Based Adaptive Regression for Drifting Data Streams
IEEE Transactions on Fuzzy Systems (IF 10.7) Pub Date: 2020-02-28, DOI: 10.1109/tfuzz.2019.2910714
Yiliao Song, Jie Lu, Haiyan Lu, Guangquan Zhang

Models and algorithms are increasingly required to learn in nonstationary environments, where concept drift (or pattern shift) may occur; that is, the assumption that data are identically distributed may not hold in data streams. Once the data pattern changes, a well-trained model built on the previous, now obsolete data cannot provide accurate predictions for future data. To obtain reliable predictions, it is important to understand the existing patterns in the data stream and to know which pattern the current examples belong to during the modeling process. However, in many real-world cases, assigning an example to a single pattern is ambiguous. In this paper, we propose a novel adaptive regression approach, called FUZZ-CARE, to dynamically recognize, train, and store patterns, and to assign to upcoming examples their membership degrees in these patterns. Membership degrees are represented by the membership matrix obtained from a kernel fuzzy c-means clustering, which is trained and adapted synchronously with the regression parameters. Rather than designing a complicated procedure to continuously chase the newest pattern, which is a common approach in the literature, FUZZ-CARE abstracts useful past information to help predict newly arrived examples. It thus effectively avoids the risk of insufficient training due to the lack of new data and improves prediction accuracy. Experiments on six synthetic datasets and 21 real-world datasets validate the high accuracy and robustness of our approach.
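
To make the idea of membership degrees concrete, the following is a minimal sketch, not the FUZZ-CARE algorithm itself. It assumes a Gaussian kernel, the standard kernel fuzzy c-means membership formula, fixed pattern centers, and one hypothetical linear regressor per pattern whose outputs are blended by membership degree. The function names, the parameters `m` and `sigma`, and the blending rule are all illustrative assumptions; the abstract does not specify them, and the joint training of clusters and regression parameters is not shown.

```python
import math


def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian (RBF) kernel between a point x and a center v (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kfcm_memberships(x, centers, m=2.0, sigma=1.0, eps=1e-12):
    """Membership degrees of x in each pattern (standard kernel FCM update).

    With a Gaussian kernel, the kernel-space distance to a center is
    proportional to 1 - K(x, v); memberships are the normalized inverse
    distances raised to the power 1 / (m - 1), where m > 1 is the fuzzifier.
    """
    dists = [max(1.0 - gaussian_kernel(x, v, sigma), eps) for v in centers]
    weights = [d ** (-1.0 / (m - 1.0)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


def predict(x, centers, models, m=2.0, sigma=1.0):
    """Blend per-pattern linear predictions by membership degree.

    `models` is a hypothetical list of (weights, bias) pairs, one per center.
    This weighted blend is an illustrative assumption, not the paper's rule.
    """
    u = kfcm_memberships(x, centers, m, sigma)
    preds = [sum(wi * xi for wi, xi in zip(w, x)) + b for (w, b) in models]
    return sum(ui * pi for ui, pi in zip(u, preds))


# Two stored patterns on a 1-D input, each with its own linear model.
centers = [[0.0], [10.0]]
models = [([1.0], 0.0), ([-1.0], 20.0)]

u_mid = kfcm_memberships([5.0], centers, sigma=5.0)
y_mid = predict([5.0], centers, models, sigma=5.0)
u_near = kfcm_memberships([1.0], centers, sigma=5.0)
```

An example halfway between the two centers receives equal membership in both patterns, while an example close to one center is dominated by that pattern. This is the sense in which a soft assignment sidesteps the ambiguity of forcing each example into a single pattern.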

Updated: 2020-02-28