Visualization and Analysis of Single Cell RNA-Seq Data by Maximizing Correntropy Based Non-Negative Low Rank Representation
IEEE Journal of Biomedical and Health Informatics ( IF 6.7 ) Pub Date : 2021-09-08 , DOI: 10.1109/jbhi.2021.3110766
Cui-Na Jiao 1 , Jin-Xing Liu 1 , Juan Wang 1 , Junliang Shang 1 , Chun-Hou Zheng 1

Single-cell RNA sequencing (scRNA-seq) technology offers a new perspective for analyzing biological problems. One of the major applications of scRNA-seq data is discovering cell subtypes through cell clustering. However, traditional methods struggle with scRNA-seq data, which carry a high level of technical noise and frequent dropouts. To better analyze single-cell data, a novel scRNA-seq data analysis model called Maximum correntropy criterion based Non-negative and Low Rank Representation (MccNLRR) is introduced. Specifically, the maximum correntropy criterion, used as the loss function, is more robust to the high noise and large outliers present in the data. Moreover, low rank representation has proven to be a powerful tool for capturing the global and local structures of data, so important information, such as the similarity of cells in the subspace, is also extracted by it. An iterative algorithm based on half-quadratic optimization and the alternating direction method is then developed to solve the resulting complex optimization problem. Before the experiments, we also analyze the convergence and robustness of MccNLRR. Finally, the results of cell clustering, visualization analysis, and gene marker selection on scRNA-seq data show that the MccNLRR method can distinguish cell subtypes accurately and robustly.
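
To illustrate why a correntropy-style loss tolerates large outliers, and how half-quadratic optimization turns it into iteratively reweighted updates, here is a minimal Python sketch on a toy one-dimensional problem. It is not the MccNLRR model itself: the Gaussian kernel form, the kernel width `sigma`, the helper names, and the toy data are all assumptions chosen for illustration, and the non-negative low rank representation and alternating direction steps are omitted.

```python
import numpy as np

def correntropy_loss(residual, sigma=1.0):
    """Correntropy-induced loss 1 - exp(-r^2 / (2 sigma^2)); bounded in [0, 1]."""
    r = np.asarray(residual, dtype=float)
    return 1.0 - np.exp(-(r ** 2) / (2.0 * sigma ** 2))

def half_quadratic_weights(residual, sigma=1.0):
    """Auxiliary weights from the half-quadratic view: large residuals get weight near 0."""
    r = np.asarray(residual, dtype=float)
    return np.exp(-(r ** 2) / (2.0 * sigma ** 2))

def robust_mean(x, sigma=1.0, n_iter=50):
    """Toy location estimate: alternate between updating weights and a weighted mean."""
    x = np.asarray(x, dtype=float)
    mu = np.median(x)  # assumed starting point; any reasonable initial guess works here
    for _ in range(n_iter):
        w = half_quadratic_weights(x - mu, sigma)
        mu = np.sum(w * x) / np.sum(w)
    return mu

# Hypothetical data: five values near 1 and a single large outlier.
values = np.array([1.0, 1.1, 0.9, 1.05, 0.95, 50.0])

print("plain mean:", np.mean(values))
print("correntropy-weighted mean:", robust_mean(values))
print("squared error of a residual of 50:", 50.0 ** 2)
print("correntropy loss of a residual of 50:", correntropy_loss(50.0))
```

The squared error of an outlier grows without bound, so it dominates a least-squares fit, whereas the correntropy-induced loss saturates at 1 and the corresponding weight shrinks toward 0. The choice of `sigma` controls how quickly a residual is treated as an outlier.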

Updated: 2021-09-08