Evaluating the reproducibility of single-cell gene regulatory network inference algorithms
bioRxiv - Systems Biology Pub Date : 2020-11-10 , DOI: 10.1101/2020.11.10.375923
Yoonjee Kang , Denis Thieffry , Laura Cantini

Networks are powerful tools to represent and investigate biological systems. The development of algorithms inferring regulatory interactions from functional genomics data has been an active area of research. With the advent of single-cell RNA-seq (scRNA-seq) data, numerous methods specifically designed to take advantage of single-cell datasets have been proposed. However, published benchmarks on single-cell network inference are mostly based on simulated data. When applied to real data, these benchmarks consider only a small set of genes and compare the inferred networks only against an imposed ground truth. Here, we benchmark four single-cell network inference methods based on their reproducibility, i.e. their ability to infer similar networks when applied to two independent datasets for the same biological condition. We tested each of these methods on real data from three biological conditions: human retina, T-cells in colorectal cancer, and human hematopoiesis. GENIE3 proves to be the most reproducible algorithm, independently of the single-cell sequencing platform, the cell type annotation system, the number of cells constituting the dataset, or the thresholding applied to the links of the inferred networks. To ensure the reproducibility of this benchmark study and to ease its extension, we implemented all the analyses in scNET, a Jupyter notebook available at https://github.com/ComputationalSystemsBiology/scNET.
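
As an illustration of the reproducibility idea described in the abstract, the following minimal Python sketch thresholds two inferred networks to their top-scoring links and compares the resulting edge sets. It is an assumption-laden example, not the scNET implementation: the Jaccard index is used here as one plausible similarity measure, and the function names (`top_edges`, `jaccard`, `reproducibility`) and gene labels (`TF1`, `G1`, ...) are hypothetical.

```python
from typing import Dict, Set, Tuple

# An edge is a (regulator, target) pair; names below are hypothetical.
Edge = Tuple[str, str]


def top_edges(scores: Dict[Edge, float], k: int) -> Set[Edge]:
    """Thresholding step: keep the k highest-scoring edges."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def jaccard(a: Set[Edge], b: Set[Edge]) -> float:
    """Jaccard index of two edge sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def reproducibility(scores_a: Dict[Edge, float],
                    scores_b: Dict[Edge, float],
                    k: int) -> float:
    """Similarity of two networks inferred from independent datasets."""
    return jaccard(top_edges(scores_a, k), top_edges(scores_b, k))


# Toy edge scores from two independent datasets of the same condition.
scores_a = {("TF1", "G1"): 0.9, ("TF1", "G2"): 0.8, ("TF2", "G1"): 0.1}
scores_b = {("TF1", "G1"): 0.7, ("TF2", "G1"): 0.6, ("TF1", "G2"): 0.2}

score_top2 = reproducibility(scores_a, scores_b, k=2)
score_top3 = reproducibility(scores_a, scores_b, k=3)
```

With the toy scores above, keeping the top two links of each network leaves one shared edge out of three distinct edges, while keeping all three links makes the two edge sets identical. This shows why a reproducibility score should be checked across several thresholds.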

Updated: 2020-11-12