scClassify: sample size estimation and multiscale classification of cells using single and multiple reference.
Molecular Systems Biology (IF 8.5), Pub Date: 2020-06-22, DOI: 10.15252/msb.20199389
Yingxin Lin 1,2, Yue Cao 1,2, Hani Jieun Kim 1,2,3, Agus Salim 4,5,6, Terence P Speed 6, David M Lin 7, Pengyi Yang 1,2,3, Jean Yee Hwa Yang 1,2

Automated cell type identification is a key computational challenge in single‐cell RNA‐sequencing (scRNA‐seq) data. To capitalise on the large collection of well‐annotated scRNA‐seq datasets, we developed scClassify, a multiscale classification framework based on ensemble learning and cell type hierarchies constructed from single or multiple annotated datasets as references. scClassify enables the estimation of sample size required for accurate classification of cell types in a cell type hierarchy and allows joint classification of cells when multiple references are available. We show that scClassify consistently performs better than other supervised cell type classification methods across 114 pairs of reference and testing data, representing a diverse combination of sizes, technologies and levels of complexity, and further demonstrate the unique components of scClassify through simulations and compendia of experimental datasets. Finally, we demonstrate the scalability of scClassify on large single‐cell atlases and highlight a novel application of identifying subpopulations of cells from the Tabula Muris data that were unidentified in the original publication. Together, scClassify represents state‐of‐the‐art methodology in automated cell type identification from scRNA‐seq data.

Updated: 2020-06-30