ReliefE: feature ranking in high-dimensional spaces via manifold embeddings
Machine Learning (IF 4.3), Pub Date: 2021-06-17, DOI: 10.1007/s10994-021-05998-5
Blaž Škrlj, Sašo Džeroski, Nada Lavrač, Matej Petković

Feature ranking has been widely adopted in machine learning applications such as high-throughput biology and social sciences. The popular Relief family of algorithms assigns importances to features by iteratively accounting for the nearest relevant and irrelevant instances. Despite their high utility, these algorithms can be computationally expensive and are not well suited to high-dimensional sparse input spaces. In contrast, recent embedding-based methods learn compact, low-dimensional representations, potentially facilitating the downstream learning capabilities of conventional learners. This paper explores how the Relief branch of algorithms can be adapted to benefit from (Riemannian) manifold-based embeddings of instance and target spaces, where a given embedding's dimensionality is intrinsic to the dimensionality of the considered data set. The developed ReliefE algorithm is faster and can result in better feature rankings, as shown by our evaluation on 20 real-life data sets for multi-class and multi-label classification tasks. The utility of ReliefE for high-dimensional data sets is ensured by its implementation, which uses sparse matrix algebraic operations. Finally, the relation of ReliefE to other ranking algorithms is studied via the Fuzzy Jaccard Index.
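To make the idea of "nearest relevant and irrelevant instances" concrete, the following is a minimal sketch of the classic Relief weight update for numeric features. It is not the ReliefE algorithm and does not use embeddings or sparse operations. The function name `relief`, its parameters, and the choice of Manhattan distance on min-max scaled features are assumptions made for illustration, and the sketch assumes every class contains at least two instances.

```python
import numpy as np

def relief(X, y, n_iterations=None, seed=0):
    """Classic Relief sketch (hypothetical helper, not ReliefE).

    Returns one weight per feature; larger means more relevant.
    Assumes numeric features and at least two instances per class.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape

    # Scale each feature to [0, 1] so per-feature differences are comparable.
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # constant features: avoid division by zero
    Xs = (X - lo) / span

    rng = np.random.default_rng(seed)
    m = n if n_iterations is None else n_iterations
    w = np.zeros(d)

    for _ in range(m):
        i = rng.integers(n)  # pick a random instance
        dist = np.abs(Xs - Xs[i]).sum(axis=1)  # Manhattan distance to all
        dist[i] = np.inf  # never pick the instance itself
        same = y == y[i]

        # Nearest hit: closest instance with the same label.
        hit = np.argmin(np.where(same, dist, np.inf))
        # Nearest miss: closest instance with a different label.
        miss = np.argmin(np.where(~same, dist, np.inf))

        # Penalize features that differ on the hit,
        # reward features that differ on the miss.
        w -= np.abs(Xs[i] - Xs[hit]) / m
        w += np.abs(Xs[i] - Xs[miss]) / m

    return w
```

Under these assumptions, a feature ranking follows from sorting the weights in descending order, for example with `np.argsort(-relief(X, y))`.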




Updated: 2021-06-18