Predictive modeling of single-cell DNA methylome data enhances integration with transcriptome data
Genome Research (IF 6.2), Pub Date: 2021-01-01, DOI: 10.1101/gr.267047.120
Yasin Uzun 1, 2 , Hao Wu 3, 4 , Kai Tan 1, 2, 3, 4, 5

Single-cell DNA methylation data have become increasingly abundant and have uncovered many genes with a positive correlation between expression and promoter methylation, challenging the common dogma based on bulk data. However, computational tools for analyzing single-cell methylome data lag far behind. A number of tasks, including cell type calling and integration with transcriptome data, require a robust gene activity matrix as a prerequisite, and constructing that matrix is itself challenging. The advent of multi-omics data enables measurement of both DNA methylation and gene expression in the same single cells. Although such data are rather sparse, they are sufficient to train supervised models that capture the complex relationship between DNA methylation and gene expression and predict gene activities at the single-cell level. Here, we present methylome association by predictive linkage to expression (MAPLE), a computational framework that learns the association between DNA methylation and expression using both gene- and cell-dependent statistical features. Using multiple data sets generated with different experimental protocols, we show that using predicted gene activity values significantly improves several analysis tasks, including clustering, cell type identification, and integration with transcriptome data. Application of MAPLE revealed several interesting biological insights into the relationship between methylation and gene expression, including the asymmetric importance of methylation signals around the transcription start site for predicting gene expression, and the increased predictive power of methylation signals in promoters located outside CpG islands and shores. With the rapid accumulation of single-cell epigenomics data, MAPLE provides a general framework for integrating such data with transcriptome data.

Updated: 2021-01-04