Extending greedy feature selection algorithms to multiple solutions
Data Mining and Knowledge Discovery (IF 4.8), Pub Date: 2021-05-01, DOI: 10.1007/s10618-020-00731-7
Giorgos Borboudakis, Ioannis Tsamardinos

Most feature selection methods identify only a single solution. This is acceptable for predictive purposes, but it is not sufficient for knowledge discovery when multiple solutions exist. We propose a strategy that extends a class of greedy methods to efficiently identify multiple solutions, and we show under which conditions it identifies all solutions. We also introduce a taxonomy of features that takes the existence of multiple solutions into account. Furthermore, we explore different definitions of statistical equivalence of solutions, as well as methods for testing equivalence. We also introduce a novel algorithm for compactly representing and visualizing multiple solutions. In experiments we show that (a) the proposed algorithm is significantly more computationally efficient than the TIE* algorithm, the only alternative approach with similar theoretical guarantees, while identifying solutions similar to those of TIE*, and (b) the identified solutions have similar predictive performance.




Updated: 2021-05-02