Joint learning dimension reduction and clustering of single-cell RNA-sequencing data.
Bioinformatics (IF 5.8), Pub Date: 2020-04-04, DOI: 10.1093/bioinformatics/btaa231
Wenming Wu 1 , Xiaoke Ma 1

Motivation
Single-cell RNA-sequencing (scRNA-seq) profiles the transcriptome of individual cells, which enables the discovery of cell types or subtypes through unsupervised clustering. Current algorithms perform dimension reduction before cell clustering because scRNA-seq data are noisy, high-dimensional, and not linearly separable. However, treating dimension reduction and clustering as independent steps fails to fully characterize the patterns in the data, resulting in suboptimal performance.
Results
In this study, we propose a flexible and accurate algorithm for scRNA-seq data that jointly learns dimension reduction and cell clustering (DRjCC), where dimension reduction is performed by projected matrix decomposition and cell type clustering by nonnegative matrix factorization. We first formulate the joint learning of dimension reduction and cell clustering as a constrained optimization problem and then derive the optimization rules. The advantage of DRjCC is that feature selection in dimension reduction is guided by cell clustering, which significantly improves the performance of cell type discovery. Eleven scRNA-seq datasets are used to validate the performance of the algorithms, where the number of single cells varies from 49 to 68,579 and the number of cell types ranges from 3 to 14. The experimental results demonstrate that DRjCC significantly outperforms 13 state-of-the-art methods on various measures of cell type clustering (an average improvement of 17.44%). Furthermore, DRjCC is efficient and robust across different scRNA-seq datasets from various tissues. The proposed model and methods provide an effective strategy for analyzing scRNA-seq data. The software is implemented in MATLAB and is freely available for academic use at https://github.com/xkmaxidian/DRjCC.
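To make the clustering component concrete, the following is a minimal Python sketch of clustering cells with plain nonnegative matrix factorization. It is not the DRjCC objective or the authors' MATLAB code: it omits the projected matrix decomposition and the joint constraints entirely, and it uses the standard multiplicative updates for the Frobenius-norm loss. The function name `nmf_cluster`, its parameters, and the toy matrix `X_demo` are assumptions introduced here for illustration only.

```python
import numpy as np


def nmf_cluster(X, k, n_iter=500, eps=1e-10, seed=0):
    """Generic NMF clustering sketch (NOT the DRjCC algorithm).

    X is a nonnegative genes x cells matrix. It is approximated as W @ H,
    with W (genes x k) and H (k x cells), using standard multiplicative
    updates for the squared Frobenius loss. Each cell is then assigned to
    the component with the largest coefficient in its column of H.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_cells = X.shape
    # Strictly positive random initialization (hypothetical choice).
    W = rng.random((n_genes, k)) + eps
    H = rng.random((k, n_cells)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H nonnegative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    labels = H.argmax(axis=0)
    return W, H, labels


# Toy data (an assumption, not from the paper): 20 genes x 10 cells,
# where cells 0-4 express genes 0-9 and cells 5-9 express genes 10-19.
rng_demo = np.random.default_rng(1)
X_demo = 0.01 * rng_demo.random((20, 10))
X_demo[:10, :5] += 1.0
X_demo[10:, 5:] += 1.0

W, H, labels = nmf_cluster(X_demo, k=2)
print(labels)
```

On this toy matrix the two groups of cells should receive different labels, although which group is labeled 0 or 1 depends on the random initialization. In a joint formulation such as the one described above, the factorization would act on a learned low-dimensional projection rather than on the raw expression matrix.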
Supplementary information
Supplementary data are available at Bioinformatics online.


Updated: 2020-04-06