Bayesian estimation of cell-type-specific gene expression per bulk sample with prior derived from single-cell data
bioRxiv - Genomics, Pub Date: 2020-08-06, DOI: 10.1101/2020.08.05.238949
Jiebiao Wang, Kathryn Roeder, Bernie Devlin

When assessed over a large number of samples, bulk RNA sequencing provides reliable data for gene expression at the tissue level. Single-cell RNA sequencing (scRNA-seq) deepens those analyses by evaluating gene expression at the cellular level. Both data types lend insights into disease etiology. With current technologies, however, scRNA-seq data are known to be noisy. Moreover, constrained by costs, scRNA-seq data are typically generated from a relatively small number of subjects, which limits their utility for some analyses, such as identification of gene expression quantitative trait loci (eQTLs). To address these issues while maintaining the unique advantages of each data type, we develop a Bayesian method (bMIND) to integrate bulk and scRNA-seq data. With a prior derived from scRNA-seq data, we propose to estimate sample-level cell-type-specific (CTS) expression from bulk expression data. The CTS expression enables large-scale sample-level downstream analyses, such as detecting CTS differentially expressed genes (DEGs) and eQTLs. Through simulations, we demonstrate that bMIND improves the accuracy of sample-level CTS expression estimates and power to discover CTS-DEGs when compared to existing methods. To further our understanding of two complex phenotypes, autism spectrum disorder and Alzheimer's disease, we apply bMIND to gene expression data of relevant brain tissue to identify CTS-DEGs. Our results complement findings for CTS-DEGs obtained from snRNA-seq studies, replicating certain DEGs in specific cell types while nominating other novel genes in those cell types. Finally, we calculate CTS-eQTLs for eleven brain regions by analyzing GTEx V8 data, creating a new resource for biological insights.
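
The abstract does not spell out the model, so the following is only a minimal sketch of the general idea of combining a bulk measurement with a prior on cell-type-specific (CTS) expression. It assumes, for illustration, that the bulk expression of one gene in one sample is a fraction-weighted sum of CTS expression plus Gaussian noise, and that the prior on CTS expression is Gaussian with a mean and covariance that could be taken from scRNA-seq data. The function name `posterior_cts`, the noise variance, and all numbers are hypothetical, and this is not the authors' bMIND implementation.

```python
def posterior_cts(bulk, fractions, prior_mean, prior_cov, noise_var):
    """Gaussian posterior of CTS expression for one gene in one sample.

    Assumed illustrative model (not taken from the paper):
        bulk = sum_k fractions[k] * a[k] + noise
        a ~ N(prior_mean, prior_cov), noise ~ N(0, noise_var)
    """
    k = len(fractions)
    # Prior covariance applied to the cell-type fractions (Sigma w).
    sw = [sum(prior_cov[i][j] * fractions[j] for j in range(k)) for i in range(k)]
    # Marginal variance of the bulk measurement (w' Sigma w + noise_var).
    s = sum(fractions[i] * sw[i] for i in range(k)) + noise_var
    # Difference between observed bulk and the bulk predicted by the prior mean.
    resid = bulk - sum(fractions[i] * prior_mean[i] for i in range(k))
    # Standard conjugate Gaussian update for mean and covariance.
    post_mean = [prior_mean[i] + sw[i] * resid / s for i in range(k)]
    post_cov = [[prior_cov[i][j] - sw[i] * sw[j] / s for j in range(k)]
                for i in range(k)]
    return post_mean, post_cov


# Hypothetical example: two cell types making up 60% and 40% of the sample.
mean, cov = posterior_cts(
    bulk=4.8,
    fractions=[0.6, 0.4],
    prior_mean=[5.0, 2.0],
    prior_cov=[[1.0, 0.0], [0.0, 1.0]],
    noise_var=0.48,
)
print(mean)
print(cov)
```

In this toy setting the observed bulk value is higher than the prior predicts, so both CTS estimates shift upward, with the more abundant cell type absorbing more of the difference. The posterior covariance becomes negatively correlated across cell types, which reflects that a single bulk value cannot fully separate their contributions.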

Updated: 2020-08-08