RCA2: a scalable supervised clustering algorithm that reduces batch effects in scRNA-seq data
Nucleic Acids Research (IF 14.9), Pub Date: 2021-07-13, DOI: 10.1093/nar/gkab632
Florian Schmidt 1, Bobby Ranjan 1, Quy Xiao Xuan Lin 1, Vaidehi Krishnan 2, Ignasius Joanito 1, Mohammad Amin Honardoost 1,3, Zahid Nawaz 1, Prasanna Nori Venkatesh 1, Joanna Tan 1, Nirmala Arul Rayan 1, Sin Tiong Ong 2,4, Shyam Prabhakar 1

The transcriptomic diversity of cell types in the human body can be analysed in unprecedented detail using single-cell (SC) technologies. Unsupervised clustering of SC transcriptomes, which is the default technique for defining cell types, is prone to grouping cells by technical, rather than biological, variation. Using multiple benchmarks, we demonstrate that supervised clustering, which uses reference transcriptomes as a guide, is robust to batch effects and data quality artifacts when compared with de novo (unsupervised) clustering. Here, we present RCA2, the first algorithm to combine reference projection (batch effect robustness) with graph-based clustering (scalability). In addition, RCA2 provides a user-friendly framework incorporating multiple commonly used downstream analysis modules. RCA2 also provides new reference panels for human and mouse and supports generation of custom panels. Furthermore, RCA2 facilitates cell type-specific QC, which is essential for accurate clustering of data from heterogeneous tissues. We demonstrate the advantages of RCA2 on SC data from human bone marrow, healthy PBMCs and PBMCs from COVID-19 patients. Scalable supervised clustering methods such as RCA2 will facilitate unified analysis of cohort-scale SC datasets.
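
The combination of reference projection and graph-based clustering described in the abstract can be illustrated with a small sketch. This is not RCA2's API or implementation: the function names, the toy expression values, the use of Pearson correlation as the projection, and the use of connected components in a nearest-neighbour graph as a stand-in for graph-based community detection are all assumptions made for illustration. The toy "batch effect" is a simple per-cell rescaling and offset, which correlation-based projection removes by construction; real batch effects are more complex.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def project(cells, reference):
    """Represent each cell by its correlation with every reference profile."""
    return [[pearson(cell, ref) for ref in reference] for cell in cells]


def knn_graph(points, k):
    """Undirected graph linking each point to its k nearest neighbours."""
    adj = {i: set() for i in range(len(points))}
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        for j in others[:k]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def components(adj):
    """Label connected components (assumed stand-in for community detection)."""
    labels, current = {}, 0
    for start in adj:
        if start in labels:
            continue
        stack = [start]
        while stack:
            node = stack.pop()
            if node in labels:
                continue
            labels[node] = current
            stack.extend(n for n in adj[node] if n not in labels)
        current += 1
    return [labels[i] for i in range(len(adj))]


# Hypothetical reference panel: two cell types measured over four genes.
reference = [[10, 9, 1, 0], [0, 1, 9, 10]]

# Hypothetical cells. Batch 2 is batch 1 rescaled (x3) and shifted (+5).
cells = [
    [9, 10, 2, 1],    # batch 1, resembles type A
    [1, 2, 10, 9],    # batch 1, resembles type B
    [32, 35, 11, 8],  # batch 2, resembles type A
    [8, 11, 35, 32],  # batch 2, resembles type B
]

projection = project(cells, reference)
labels = components(knn_graph(projection, k=1))
print(labels)
```

In this toy setting the cells are paired by cell type across the two batches, because each cell is compared to the reference rather than directly to the other cells. A quadratic nearest-neighbour search is used here only for brevity; it would not scale to large datasets.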

Updated: 2021-07-13