Robust Hierarchical Feature Selection Driven by Data and Knowledge
Information Sciences (IF 8.1), Pub Date: 2020-11-13, DOI: 10.1016/j.ins.2020.11.003
Xinxin Liu, Yucan Zhou, Hong Zhao

Feature selection faces major challenges from ever-larger label spaces and unavoidable noisy data. Flat feature selection methods fail to obtain a compact feature subset when there are numerous classes. In addition, these data-driven methods are sensitive to data outliers. Fortunately, many practical tasks organize their classes in a coarse-to-fine hierarchical structure and can be solved with a divide-and-conquer strategy. In this paper, we propose a hierarchical feature selection method driven by data and knowledge (HFSDK), which is robust to data outliers and produces compact feature subsets by splitting the original large label space. First, HFSDK decomposes a large-scale classification task into a group of small subclassification tasks at multiple granularities; this step is driven by knowledge of the hierarchical class structure. Then, the corresponding datasets are constructed from the bottom up using the class labels of the data, which is a data-driven process. Finally, robust and discriminative feature subsets are selected recursively for these subtasks by eliminating data outliers and adding a semantic relation constraint. Experiments on six real-world datasets validate the superior performance of the proposed method.
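
The first two steps described in the abstract (splitting the task along the class hierarchy, then building each subtask's dataset from the bottom up) might be sketched roughly as follows. This is only an illustration under assumptions, not the authors' implementation: the function name `build_subtasks`, the `parent` dictionary encoding of the hierarchy, and the toy classes and feature values are all hypothetical, and the robust feature selection step itself (outlier elimination and the semantic relation constraint) is not shown.

```python
from collections import defaultdict


def build_subtasks(parent, samples):
    """Group samples into one small classification task per internal node.

    parent  : dict mapping each class to its parent class (root maps to None);
              a hypothetical encoding of the class hierarchy.
    samples : list of (features, leaf_label) pairs.
    Returns a dict: internal node -> list of (features, child_label), where
    child_label is the child of that node lying on the path to the leaf.
    """
    subtasks = defaultdict(list)
    for features, leaf in samples:
        child = leaf
        node = parent[child]
        # Walk from the leaf up to the root. At each ancestor, the sample is
        # relabeled with the child through which that ancestor was reached.
        while node is not None:
            subtasks[node].append((features, child))
            child, node = node, parent[node]
    return dict(subtasks)


# Toy hierarchy and data, invented purely for illustration.
parent = {
    "root": None,
    "animal": "root", "vehicle": "root",
    "cat": "animal", "dog": "animal",
    "car": "vehicle", "bus": "vehicle",
}
samples = [
    ([0.1, 0.9], "cat"),
    ([0.2, 0.8], "dog"),
    ([0.9, 0.1], "car"),
    ([0.8, 0.3], "bus"),
]

subtasks = build_subtasks(parent, samples)
for node, data in subtasks.items():
    print(node, [label for _, label in data])
```

In this sketch the root subtask only has to separate "animal" from "vehicle", while each lower subtask separates the fine-grained classes under a single parent, so a feature selector run per subtask would see a much smaller label space than a flat selector would.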




Updated: 2020-11-13