Sparse robust graph-regularized non-negative matrix factorization based on correntropy
Journal of Bioinformatics and Computational Biology (IF 1), Pub Date: 2020-11-02, DOI: 10.1142/s021972002050047x
Chuan-Yuan Wang 1 , Ying-Lian Gao 2 , Jin-Xing Liu 1 , Ling-Yun Dai 1 , Junliang Shang 1

Non-negative Matrix Factorization (NMF) has become a popular dimensionality reduction method in recent years. The traditional NMF method, however, is highly sensitive to noise in the data. In this paper, we propose a model called Sparse Robust Graph-regularized Non-negative Matrix Factorization based on Correntropy (SGNMFC). To improve the robustness of the algorithm, the model maximizes correntropy instead of minimizing the traditional Euclidean distance. Through its kernel function, correntropy gives less weight to outliers and noise in the data and greater weight to meaningful data. Meanwhile, graph regularization preserves the geometric structure of the high-dimensional data in the low-dimensional manifold. Feature selection and sample clustering are commonly used methods for analyzing genes, so sparse constraints are applied to the loss function to reduce matrix complexity and the difficulty of the analysis. In comparison with five other similar methods, the effectiveness of the SGNMFC model is demonstrated through differentially expressed gene selection and sample clustering experiments on three datasets from The Cancer Genome Atlas (TCGA).
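The weighting effect of correntropy described in the abstract can be illustrated with a minimal sketch. It assumes a Gaussian kernel applied entry-wise to the reconstruction residual; the abstract does not give the exact SGNMFC objective, so the function names, the kernel width `sigma`, and the toy matrices below are illustrative assumptions rather than the authors' formulation. Graph regularization and the sparse constraint are left out.

```python
import numpy as np


def correntropy_weights(X, U, V, sigma=1.0):
    """Gaussian-kernel weight for each entry of the residual X - U @ V.

    Assumption: an entry-wise Gaussian kernel with width sigma. Entries with
    a large residual (likely outliers) get a weight near 0, and entries that
    are reconstructed well get a weight near 1.
    """
    residual = X - U @ V
    return np.exp(-(residual ** 2) / (2.0 * sigma ** 2))


def correntropy(X, U, V, sigma=1.0):
    """Empirical correntropy between X and U @ V (mean kernel value)."""
    return correntropy_weights(X, U, V, sigma).mean()


# Toy data (hypothetical): an exactly low-rank non-negative matrix.
rng = np.random.default_rng(0)
U = rng.random((6, 2))
V = rng.random((2, 5))
X = U @ V

# Corrupt a single entry to act as an outlier.
X_noisy = X.copy()
X_noisy[0, 0] += 10.0

weights = correntropy_weights(X_noisy, U, V)
clean_score = correntropy(X, U, V)
noisy_score = correntropy(X_noisy, U, V)
squared_error = np.sum((X_noisy - U @ V) ** 2)
```

In this sketch the corrupted entry dominates the squared Euclidean error, while its kernel weight is close to zero. It therefore costs the correntropy score at most one entry's share, which is the intuition behind maximizing correntropy instead of minimizing the Euclidean distance.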

Updated: 2020-11-02