Identification of cell type-specific methylation signals in bulk whole genome bisulfite sequencing data
Genome Biology (IF 10.1), Pub Date: 2020-07-01, DOI: 10.1186/s13059-020-02065-5
C Anthony Scott, Jack D Duryea, Harry MacKay, Maria S Baker, Eleonora Laritsky, Chathura J Gunasekara, Cristian Coarfa, Robert A Waterland

Background: The traditional approach to studying the epigenetic mechanism CpG methylation in tissue samples is to identify regions of concordant differential methylation spanning multiple CpG sites (differentially methylated regions). Variation limited to single or small numbers of CpGs has been assumed to reflect stochastic processes. To test this, we developed software, Cluster-Based analysis of CpG methylation (CluBCpG), and explored variation in read-level CpG methylation patterns in whole genome bisulfite sequencing data.

Results: Analysis of both human and mouse whole genome bisulfite sequencing datasets reveals read-level signatures associated with cell type and cell type-specific biological processes. These signatures, which are mostly orthogonal to classical differentially methylated regions, are enriched at cell type-specific enhancers and allow estimation of proportional cell composition in synthetic mixtures and improved prediction of gene expression. In tandem, we developed a machine learning algorithm, Precise Read-Level Imputation of Methylation (PReLIM), to increase coverage of existing whole genome bisulfite sequencing datasets by imputing CpG methylation states on individual sequencing reads. PReLIM both improves CluBCpG coverage and performance and enables identification of novel differentially methylated regions, which we independently validate.

Conclusions: Our data indicate that, rather than stochastic variation, read-level CpG methylation patterns in tissue whole genome bisulfite sequencing libraries reflect cell type. Accordingly, these new computational tools should lead to an improved understanding of epigenetic regulation by DNA methylation.
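
To make the idea of a read-level methylation pattern concrete, here is a minimal Python sketch. It is not the CluBCpG implementation: the function names, the toy reads, and the `min_reads` threshold are all hypothetical, and exact-match grouping of patterns stands in for whatever clustering the published tool actually performs. Each read is assumed to be a tuple of 0/1 methylation calls over the same ordered CpG sites in one genomic bin.

```python
from collections import Counter


def pattern_counts(reads):
    """Count reads per methylation pattern.

    Each read is a tuple of 0/1 calls (1 = methylated) over the same
    ordered CpG sites in one genomic bin (an assumed input format).
    """
    return Counter(tuple(r) for r in reads)


def sample_specific_patterns(reads_a, reads_b, min_reads=2):
    """Return patterns supported by >= min_reads reads in exactly one sample.

    min_reads is a hypothetical threshold chosen only for illustration.
    """
    a = pattern_counts(reads_a)
    b = pattern_counts(reads_b)
    only_a = {p: n for p, n in a.items() if n >= min_reads and b[p] == 0}
    only_b = {p: n for p, n in b.items() if n >= min_reads and a[p] == 0}
    return only_a, only_b


# Hypothetical toy data: five reads from sample A, four from sample B,
# each covering the same 4 CpGs.
reads_a = [(1, 1, 1, 1), (1, 1, 1, 1), (0, 0, 0, 0), (1, 0, 1, 1), (1, 0, 1, 1)]
reads_b = [(1, 1, 1, 1), (0, 0, 0, 0), (0, 0, 0, 0), (1, 1, 1, 1)]

only_a, only_b = sample_specific_patterns(reads_a, reads_b)
print(only_a)  # {(1, 0, 1, 1): 2}
```

In this toy case the two samples differ only at the second CpG, on a subset of reads. A region-level comparison could dismiss that single-site difference as noise, whereas grouping whole reads shows a recurring pattern present in only one sample, which is the kind of signal the abstract attributes to cell type.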

Updated: 2020-07-01