Choice of pre-processing pipeline influences clustering quality of scRNA-seq datasets
BMC Genomics (IF 3.5), Pub Date: 2021-09-14, DOI: 10.1186/s12864-021-07930-6
Inbal Shainer, Manuel Stemmer

Single-cell RNA sequencing (scRNA-seq) has quickly become one of the most dominant techniques in modern transcriptome assessment. In particular, 10X Genomics’ Chromium system, with its high-throughput approach, turnkey operation, and thorough user guide, has made this cutting-edge technique accessible to many laboratories using diverse animal models. However, standard pre-processing, including the alignment and cell filtering pipelines, might not be ideal for every organism or tissue.
Here we applied an alternative strategy, based on the pseudoaligner kallisto, to twenty-two publicly available single-cell sequencing datasets from a wide range of tissues of eight organisms and compared the results with the standard 10X Genomics’ Cell Ranger pipeline. In most of the tested samples, kallisto produced higher sequencing read alignment rates and total gene detection rates than Cell Ranger. Although datasets processed with Cell Ranger had higher cell counts, outside of human and mouse datasets these additional cells were routinely of low quality, with low gene detection rates. Thorough downstream analysis of one kallisto-processed dataset, obtained from the zebrafish pineal gland, revealed clearer clustering, allowing the identification of an additional photoreceptor cell type that had previously gone undetected. The finding of the new cluster suggests that the photoreceptive pineal gland is essentially a bi-chromatic tissue containing both green and red cone-like photoreceptors, and implies that the alignment and pre-processing pipeline can affect the discovery of biologically relevant cell types. While Cell Ranger favors higher cell numbers, kallisto yields datasets with higher median gene detection per cell. We demonstrate that cell type identification was not hampered by the lower cell count, but in fact improved as a result of the high gene detection rate and the more stringent filtering.
Depending on the acquired dataset, it can be beneficial to favor high quality cells and accept a lower cell count, leading to an improved classification of cell types.
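The trade-off described above, fewer cells but a higher median number of detected genes per cell, can be illustrated with a toy calculation. The sketch below is not the authors' filtering procedure or any kallisto or Cell Ranger API. The count matrix is made up, and the `MIN_GENES` threshold and the helper names are hypothetical, chosen only to show how a stricter per-cell cutoff changes both numbers.

```python
from statistics import median


def genes_detected_per_cell(counts):
    """Return, for each cell (row), the number of genes with a nonzero count."""
    return [sum(1 for c in cell if c > 0) for cell in counts]


def filter_cells(counts, min_genes):
    """Keep only cells with at least `min_genes` detected genes."""
    detected = genes_detected_per_cell(counts)
    return [cell for cell, n in zip(counts, detected) if n >= min_genes]


# Toy count matrix (made-up values): 4 cells x 5 genes.
toy_counts = [
    [3, 0, 1, 2, 5],
    [0, 0, 1, 0, 0],
    [1, 2, 0, 4, 1],
    [0, 0, 0, 1, 0],
]

MIN_GENES = 3  # hypothetical threshold, not taken from the paper

before = genes_detected_per_cell(toy_counts)
kept = filter_cells(toy_counts, MIN_GENES)
after = genes_detected_per_cell(kept)

median_before = median(before)
median_after = median(after)

print(f"cells: {len(toy_counts)} -> {len(kept)}")
print(f"median genes per cell: {median_before} -> {median_after}")
```

In this toy case the stricter cutoff halves the cell count while raising the median gene detection per cell. A real analysis would choose the threshold per dataset rather than use a fixed value.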

Updated: 2021-09-14