Single-Cell Clustering Based on Shared Nearest Neighbor and Graph Partitioning.
Interdisciplinary Sciences: Computational Life Sciences (IF 4.8) Pub Date: 2020-02-22, DOI: 10.1007/s12539-019-00357-4
Xiaoshu Zhu 1,2, Jie Zhang 2, Yunpei Xu 1, Jianxin Wang 1, Xiaoqing Peng 3, Hong-Dong Li 1

Clustering single-cell RNA sequencing (scRNA-seq) data enables the discovery of cell subtypes, which helps in understanding and analyzing disease processes. Determining edge weights is an essential component of graph-based clustering methods. Although several graph-based clustering algorithms for scRNA-seq data have been proposed, they are generally based on k-nearest neighbor (KNN) and shared nearest neighbor (SNN) graphs and do not consider the structural information of the graph. Here, to improve clustering accuracy, we present a novel method for single-cell clustering, called structural shared nearest neighbor-Louvain (SSNN-Louvain), which integrates the structural information of the graph with module detection. In SSNN-Louvain, the weight of an edge is defined from the distance between a node and its shared nearest neighbors, together with the ratio of the number of shared nearest neighbors to the number of nearest neighbors, thus incorporating the structural information of the graph. A modified Louvain community detection algorithm is then proposed and applied to identify modules in the graph; each community essentially represents a cell subtype. Notably, the proposed method combines the advantages of the SNN graph and community detection without requiring any parameter to be tuned other than the number of neighbors. To test the performance of SSNN-Louvain, we compare it on 16 real datasets with five existing methods: nonnegative matrix factorization, single-cell interpretation via multi-kernel learning, SNN-Cliq, Seurat and PhenoGraph. The experimental results show that our approach achieves the best average performance on these datasets.
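
The edge-weighting idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact SSNN weight formula, so the code below is an assumption, not the authors' method: it weights each KNN edge only by the ratio of shared nearest neighbors to k and omits the distance term the abstract mentions. The function names (knn_sets, snn_edge_weights) and the toy data are hypothetical.

```python
import math

def knn_sets(points, k):
    """For each point, return the index set of its k nearest neighbors
    (Euclidean distance, the point itself excluded)."""
    neighbors = []
    for i, p in enumerate(points):
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(order[:k]))
    return neighbors

def snn_edge_weights(points, k):
    """Weight each KNN edge (i, j) by |NN(i) & NN(j)| / k.

    Assumption: this plain ratio stands in for the SSNN weight, whose
    exact definition (including the distance term) is not in the abstract.
    Edges with no shared neighbor are dropped.
    """
    nn = knn_sets(points, k)
    weights = {}
    for i, nbrs in enumerate(nn):
        for j in nbrs:
            shared = len(nn[i] & nn[j])
            if shared:
                weights[(min(i, j), max(i, j))] = shared / k
    return weights

# Toy "cells": two well-separated groups of three points each.
cells = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
         (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
weights = snn_edge_weights(cells, k=2)
for edge, w in sorted(weights.items()):
    print(edge, w)
```

In this toy data every retained edge joins two cells of the same group, so a community detection step such as Louvain, run on the weighted graph, would have no cross-group edges to follow. The only parameter is k, the number of neighbors.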

Updated: 2020-02-22