Comparison of high-throughput single-cell RNA sequencing data processing pipelines
Briefings in Bioinformatics (IF 9.5), Pub Date: 2020-07-07, DOI: 10.1093/bib/bbaa116
Mingxuan Gao 1, Mingyi Ling 1, Xinwei Tang 1, Shun Wang 1, Xu Xiao 1, Ying Qiao 1, Wenxian Yang 2, Rongshan Yu 3

With the development of single-cell RNA sequencing (scRNA-seq) technology, it has become possible to perform large-scale transcript profiling for tens of thousands of cells in a single experiment. Many analysis pipelines have been developed for data generated from different high-throughput scRNA-seq platforms, bringing a new challenge to users to choose a proper workflow that is efficient, robust and reliable for a specific sequencing platform. Moreover, as the amount of public scRNA-seq data has increased rapidly, integrated analysis of scRNA-seq data from different sources has become increasingly popular. However, it remains unclear whether such integrated analysis would be biassed if the data were processed by different upstream pipelines. In this study, we encapsulated seven existing high-throughput scRNA-seq data processing pipelines with Nextflow, a general integrative workflow management framework, and evaluated their performance in terms of running time, computational resource consumption and data analysis consistency using eight public datasets generated from five different high-throughput scRNA-seq platforms. Our work provides a useful guideline for the selection of scRNA-seq data processing pipelines based on their performance on different real datasets. In addition, these guidelines can serve as a performance evaluation framework for future developments in high-throughput scRNA-seq data processing.

Updated: 2020-07-07