Robust hypergraph regularized non-negative matrix factorization for sample clustering and feature selection in multi-view gene expression data.
Human Genomics (IF 3.8), Pub Date: 2019-10-22, DOI: 10.1186/s40246-019-0222-6
Na Yu 1, Ying-Lian Gao 2, Jin-Xing Liu 1, Juan Wang 1, Junliang Shang 1

BACKGROUND: As one of the most popular data representation methods, non-negative matrix factorization (NMF) has received wide attention in clustering and feature selection tasks. However, most previously proposed NMF-based methods do not adequately explore the hidden geometrical structure in the data. In addition, noise and outliers are inevitably present in the data.

RESULTS: To alleviate these problems, we present a novel NMF framework named robust hypergraph regularized non-negative matrix factorization (RHNMF). In particular, hypergraph Laplacian regularization is imposed to capture the geometric information of the original data. Unlike graph Laplacian regularization, which captures only pairwise relationships between sample points, hypergraph Laplacian regularization captures high-order relationships among multiple sample points. Moreover, the robustness of RHNMF is enhanced by using the L2,1-norm when estimating the residual, because the L2,1-norm is insensitive to noise and outliers.

CONCLUSIONS: Clustering and common abnormal expression gene (com-abnormal expression gene) selection are conducted to test the validity of the RHNMF model. Extensive experimental results on multi-view datasets show that the proposed model outperforms other state-of-the-art methods.
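The abstract names two ingredients, the L2,1-norm and the hypergraph Laplacian, without giving formulas. The following is a minimal Python sketch of both, under stated assumptions: the L2,1-norm is taken as the sum of the Euclidean norms of the columns of the residual (one column per sample), and the Laplacian uses the common unnormalized form L = Dv - H W De^{-1} H^T built from an incidence matrix H and hyperedge weights w. The function names, the toy matrices, and the column-wise convention are illustrative choices, not taken from the paper, and this is not the authors' implementation.

```python
import numpy as np


def l21_norm(E):
    """L2,1-norm under the column-wise convention (an assumption here):
    the sum of the Euclidean norms of the columns of E."""
    return float(np.sum(np.linalg.norm(E, axis=0)))


def hypergraph_laplacian(H, w):
    """Unnormalized hypergraph Laplacian L = Dv - H W De^{-1} H^T.

    H : (n_vertices, n_edges) 0/1 incidence matrix, H[i, e] = 1 if
        vertex (sample) i belongs to hyperedge e.
    w : (n_edges,) positive hyperedge weights.
    """
    H = np.asarray(H, dtype=float)
    w = np.asarray(w, dtype=float)
    d_e = H.sum(axis=0)           # hyperedge degree: number of vertices in each edge
    d_v = H @ w                   # vertex degree: total weight of incident edges
    S = (H * (w / d_e)) @ H.T     # H W De^{-1} H^T
    return np.diag(d_v) - S


# Toy residual: a single non-zero column (3, 4, 0) with Euclidean norm 5.
E = np.zeros((3, 4))
E[:, 0] = [3.0, 4.0, 0.0]

# Scaling the outlier column by 10 scales the L2,1-norm by 10,
# but scales the squared Frobenius norm by 100.
print(l21_norm(E), l21_norm(10 * E))
print(np.sum(E ** 2), np.sum((10 * E) ** 2))

# Toy hypergraph on 4 samples: hyperedge 0 = {0, 1, 2}, hyperedge 1 = {2, 3}.
H = np.array([[1, 0],
              [1, 0],
              [1, 1],
              [0, 1]])
w = np.array([1.0, 1.0])
L = hypergraph_laplacian(H, w)
print(L)
```

In this sketch an outlier sample contributes to the L2,1-norm linearly in its magnitude rather than quadratically, which is the usual reason given for the norm's robustness. A single hyperedge can join three or more samples at once, whereas an ordinary graph edge joins exactly two.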

Updated: 2020-04-22