ESCO: single cell expression simulation incorporating gene co-expression
bioRxiv - Genetics Pub Date : 2020-10-21 , DOI: 10.1101/2020.10.20.347211
Jinjin Tian , Jiebiao Wang , Kathryn Roeder

Motivation: Gene-gene co-expression networks (GCN) are of biological interest for the useful information they provide for understanding gene-gene interactions. The advent of single cell RNA-sequencing allows us to examine more subtle gene co-expression occurring within a cell type. Many imputation and denoising methods have been developed to deal with the technical challenges observed in single cell data; meanwhile, several simulators have been developed for benchmarking and assessing these methods. Most of these simulators, however, either do not incorporate gene co-expression or generate co-expression in an inconvenient manner.

Results: Therefore, with the focus on gene co-expression, we propose a new simulator, ESCO, which adopts the idea of the copula to impose gene co-expression, while preserving the highlights of available simulators, which perform well for simulation of gene expression marginally. Using ESCO, we assess the performance of imputation methods on GCN recovery and find that imputation generally helps GCN recovery when the data are not too sparse, and the ensemble imputation method works best among leading methods. In contrast, imputation fails to help in the presence of an excessive fraction of zero counts, where simple data aggregating methods are a better choice. These findings are further verified with mouse and human brain cell data.

Availability: The ESCO implementation is available as R package SplatterESCO (https://github.com/JINJINT/SplatterESCO).
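
To make the copula idea in the abstract concrete, the following is a minimal Python sketch of how a Gaussian copula can impose a chosen correlation structure on count data while leaving each gene's marginal distribution unchanged. It is not the SplatterESCO code (which is an R package) and does not reproduce its API. The function names, the Poisson marginals, and the example correlation matrix are all assumptions made for illustration only.

```python
import math
import random
from statistics import NormalDist


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def poisson_quantile(u, mean):
    """Smallest k with P(X <= k) >= u for X ~ Poisson(mean)."""
    u = min(u, 1.0 - 1e-12)  # guard against u rounding to 1.0
    k = 0
    pmf = math.exp(-mean)
    cdf = pmf
    while cdf < u:
        k += 1
        pmf *= mean / k
        cdf += pmf
    return k


def simulate_counts(corr, means, n_cells, seed=0):
    """Hypothetical helper: draw n_cells count vectors via a Gaussian copula.

    corr  : gene-by-gene correlation matrix of the latent Gaussian
    means : Poisson mean per gene (an assumed, simplified marginal)
    """
    rng = random.Random(seed)
    lower = cholesky(corr)
    std_normal = NormalDist()
    n_genes = len(means)
    cells = []
    for _ in range(n_cells):
        # 1. Independent standard normals, then correlate them with the factor.
        z = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        latent = [sum(lower[i][k] * z[k] for k in range(i + 1))
                  for i in range(n_genes)]
        # 2. Map each latent value to a uniform through the normal CDF.
        uniforms = [std_normal.cdf(x) for x in latent]
        # 3. Push each uniform through the gene's own inverse CDF.
        cells.append([poisson_quantile(u, m) for u, m in zip(uniforms, means)])
    return cells


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Example: genes 0 and 1 are co-expressed, gene 2 is independent of both.
corr = [[1.0, 0.8, 0.0],
        [0.8, 1.0, 0.0],
        [0.0, 0.0, 1.0]]
counts = simulate_counts(corr, means=[5.0, 5.0, 5.0], n_cells=2000, seed=0)
gene0, gene1, gene2 = (list(col) for col in zip(*counts))
print("corr(gene0, gene1):", round(pearson(gene0, gene1), 3))
print("corr(gene0, gene2):", round(pearson(gene0, gene2), 3))
```

The design point is the separation of concerns: the correlation matrix controls only the dependence between genes, while the inverse-CDF step controls only each gene's marginal distribution, so either can be swapped out (for example, a different count distribution) without touching the other. The correlation observed in the counts is generally somewhat weaker than the latent Gaussian correlation, because discretization loses information.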

Updated: 2020-10-27