Discernible neighborhood counting based incremental feature selection for heterogeneous data, International Journal of Machine Learning and Cybernetics


International Journal of Machine Learning and Cybernetics (IF 3.1), Pub Date: 2019-08-13, DOI: 10.1007/s13042-019-00997-4
Yanyan Yang, Shiji Song, Degang Chen, Xiao Zhang

Incremental feature selection updates a subset of informative features as new samples are added, without forgetting previously learned knowledge. However, most existing incremental feature selection algorithms have no explicit mechanism for handling heterogeneous data that mixes symbolic and real-valued features. This paper therefore presents an incremental feature selection method for heterogeneous data in which samples arrive sequentially in groups. Discernible neighborhood counting, which measures different types of features, is first introduced to establish a framework for feature selection from heterogeneous data. When new samples arrive, the discernible neighborhood counting of a feature subset is updated, which yields the incremental feature selection scheme. This scheme determines the criterion for efficiently adding informative features and deleting redundant ones. Based on this scheme, an incremental feature selection algorithm is formulated to select valuable features from heterogeneous data. Extensive experiments are conducted to demonstrate the effectiveness and efficiency of the proposed algorithm.
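The abstract does not give the formal definition of discernible neighborhood counting, so the following is only a rough sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes that two samples with different labels are "discernible" by a feature subset when they differ on at least one selected feature: symbolic features are compared by equality, and real-valued features by a hypothetical neighborhood radius `delta`. All function names and the pair-counting measure are illustrative. The incremental step only examines pairs that involve a newly arrived sample and adds them to the stored count.

```python
from itertools import combinations

# Each sample is (values, label); `features` holds indices into `values`,
# and `numeric` is the set of indices that are real-valued (assumed layout).

def discernible(x, y, features, numeric, delta=0.1):
    """True if x and y differ on at least one selected feature."""
    for f in features:
        if f in numeric:
            # Real-valued feature: outside the assumed delta-neighborhood.
            if abs(x[f] - y[f]) > delta:
                return True
        elif x[f] != y[f]:
            # Symbolic feature: plain inequality.
            return True
    return False

def count_pairs(pairs, features, numeric, delta=0.1):
    """Count differently labeled pairs that the feature subset discerns."""
    return sum(
        1
        for (x, lx), (y, ly) in pairs
        if lx != ly and discernible(x, y, features, numeric, delta)
    )

def initial_count(samples, features, numeric, delta=0.1):
    """Batch computation over all pairs of the current samples."""
    return count_pairs(combinations(samples, 2), features, numeric, delta)

def incremental_count(old_count, old_samples, new_samples,
                      features, numeric, delta=0.1):
    """Update the stored count using only pairs that touch a new sample."""
    cross = ((o, n) for o in old_samples for n in new_samples)
    within = combinations(new_samples, 2)
    return (old_count
            + count_pairs(cross, features, numeric, delta)
            + count_pairs(within, features, numeric, delta))

# Toy heterogeneous data: feature 0 is symbolic, feature 1 is real-valued.
old = [(("a", 0.10), 0), (("a", 0.15), 1), (("b", 0.90), 1)]
new = [(("b", 0.12), 0), (("a", 0.95), 1)]
features, numeric = [0, 1], {1}

base = initial_count(old, features, numeric)
updated = incremental_count(base, old, new, features, numeric)
# The incremental update agrees with recomputing from scratch.
assert updated == initial_count(old + new, features, numeric)
```

In this sketch the update costs work proportional to the number of old-new and new-new pairs rather than all pairs, which is the kind of saving an incremental scheme aims for. A feature could then be judged informative if adding it raises the count, and redundant if removing it leaves the count unchanged.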


Updated: 2019-08-13
