ARGLRR: A Sparse Low-Rank Representation Single-Cell RNA-Sequencing Data Clustering Method Combined with a New Graph Regularization.
Journal of Computational Biology (IF 1.7), Pub Date: 2023-07-20, DOI: 10.1089/cmb.2023.0077
Zhen-Chang Wang, Jin-Xing Liu, Jun-Liang Shang, Ling-Yun Dai, Chun-Hou Zheng, Juan Wang

The development of single-cell transcriptome sequencing technologies has opened new ways to study biological phenomena at the cellular level. A key application of these technologies is clustering single-cell RNA sequencing (scRNA-seq) data to identify distinct cell types, which in turn provides evidence of heterogeneity. Despite the promise of this approach, inherent characteristics of scRNA-seq data, such as high noise levels and low coverage, pose major challenges to existing clustering methods and compromise their accuracy. In this study, we propose Adjusted Random walk Graph regularization Sparse Low-Rank Representation (ARGLRR), a practical sparse subspace clustering method, to identify cell types. The basic low-rank representation (LRR) model captures the global structure of data. To address its limited ability to capture local structure, we introduce adjusted random walk graph regularization into the LRR framework, so that ARGLRR captures both local and global structures in scRNA-seq data. Additionally, imposing similarity constraints within the LRR framework further improves the model's ability to estimate cell-to-cell similarity and capture global structural relationships between cells. In clustering experiments on nine known scRNA-seq data sets, ARGLRR surpasses the other advanced methods compared: it outperforms the best-performing comparison method by 6.99% in normalized mutual information and by 5.85% in Adjusted Rand Index. In addition, we visualize the results using Uniform Manifold Approximation and Projection (UMAP). The visualizations show that ARGLRR enhances the separation of different cell types within the similarity matrix.
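The two metrics named in the abstract, normalized mutual information (NMI) and the Adjusted Rand Index (ARI), both compare predicted cluster labels against known cell-type labels and do not depend on how the clusters are numbered. The following is a minimal sketch of how they can be computed, not the authors' evaluation code. The function names and toy labels are hypothetical, and the NMI normalization shown (arithmetic mean of the two entropies) is an assumption, since the abstract does not say which variant was used.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(truth, pred):
    """Pair-counting agreement between two labelings, corrected for chance.

    Assumes both label lists have the same length, at least 2.
    """
    n = len(truth)
    cells = Counter(zip(truth, pred))  # contingency table entries
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case: both labelings are trivial
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def normalized_mutual_info(truth, pred):
    """Mutual information divided by the arithmetic mean of the entropies.

    The choice of arithmetic-mean normalization is an assumption.
    """
    n = len(truth)
    rows, cols = Counter(truth), Counter(pred)
    mi = sum(
        (c / n) * log(c * n / (rows[a] * cols[b]))
        for (a, b), c in Counter(zip(truth, pred)).items()
    )
    denom = (entropy(truth) + entropy(pred)) / 2
    if denom == 0:  # both labelings put every cell in one cluster
        return 1.0
    return mi / denom


if __name__ == "__main__":
    # Hypothetical toy labels: six cells, two true cell types.
    true_types = [0, 0, 0, 1, 1, 1]
    clusters = [0, 0, 1, 1, 2, 2]
    print("ARI:", adjusted_rand_index(true_types, clusters))
    print("NMI:", normalized_mutual_info(true_types, clusters))
```

Both scores reach 1 when the predicted clusters match the reference labels exactly, even if the cluster identifiers are permuted, which is why they suit comparisons between clustering methods.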

Updated: 2023-07-20