Pan-cancer analysis of differential DNA methylation patterns
BMC Medical Genomics (IF 2.7), Pub Date: 2020-10-22, DOI: 10.1186/s12920-020-00780-3
Mai Shi, Stephen Kwok-Wing Tsui, Hao Wu, Yingying Wei

DNA methylation is a key epigenetic regulator contributing to cancer development. To understand the role of DNA methylation in tumorigenesis, it is important to investigate and compare differential methylation (DM) patterns between normal and case samples across different cancer types. However, current pan-cancer analyses call DM separately for each cancer, which suffers from lower statistical power and fails to provide a comprehensive view of patterns across cancers.

In this work, we propose a rigorous statistical model, PanDM, to jointly characterize DM patterns across diverse cancer types. PanDM uses the hidden correlations in the combined dataset to improve statistical power through joint modeling. PanDM takes summary statistics from separate analyses as input and performs methylation site clustering, differential methylation detection, and pan-cancer pattern discovery. We demonstrate the favorable performance of PanDM using simulation data. We apply our model to methylome data for 12 cancer types collected from The Cancer Genome Atlas (TCGA) project. We further conduct ontology- and pathway-enrichment analyses to gain new biological insights into the pan-cancer DM patterns learned by PanDM.

PanDM outperforms two types of separate analyses in the power of DM calling in the simulation study. Application of PanDM to TCGA data reveals 37 pan-cancer DM patterns in the 12 cancer methylomes, including both common and cancer-type-specific patterns. These 37 patterns are in turn used to group cancer types. Functional ontology and biological pathways enriched in the non-common patterns not only underpin the cancer-type-specific etiology and pathogenesis but also unveil the common environmental risk factors shared by multiple cancer types. We also identify PanDM-specific DM CpG sites that the common strategy fails to detect.

PanDM is a powerful tool that provides a systematic way to investigate aberrant methylation patterns across multiple cancer types. Results from real data analyses suggest a novel angle from which to understand the common and specific DM patterns in different cancers. Moreover, as PanDM works on the summary statistics for each cancer type, the same framework can in principle be applied to pan-cancer analyses of other functional genomic profiles. We implement PanDM as an R package, which is freely available at http://www.sta.cuhk.edu.hk/YWei/PanDM.html.
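
To make the notion of a pan-cancer DM pattern concrete, the following is a minimal Python sketch under stated assumptions. It is not PanDM's model and does not use the PanDM R package: it simply thresholds a per-cancer z-score for each CpG site independently (the separate-analysis strategy the abstract contrasts with joint modeling) and then tallies the resulting 0/1 vectors across cancer types. The function names, the 1.96 threshold, and the toy z-scores are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def dm_patterns(z_scores, threshold=1.96):
    """Turn per-cancer z-scores into binary DM patterns.

    z_scores: list of rows, one row per CpG site, one value per cancer type.
    Returns one tuple of 0/1 calls per site (1 = called DM in that cancer).
    The threshold is an illustrative assumption, not a PanDM parameter.
    """
    return [tuple(int(abs(z) > threshold) for z in row) for row in z_scores]


def count_patterns(patterns):
    """Count how many CpG sites share each binary DM pattern."""
    return Counter(patterns)


# Toy summary statistics: 5 CpG sites (rows) x 3 hypothetical cancer types.
toy_z = [
    [3.1, 2.8, 4.0],     # DM in all three cancers (a "common" pattern)
    [0.2, -0.5, 0.1],    # DM in none
    [2.5, 0.3, -0.4],    # DM only in the first cancer
    [-3.3, -2.9, -2.2],  # DM in all three (direction ignored here)
    [2.2, 0.1, 0.9],     # DM only in the first cancer
]

patterns = dm_patterns(toy_z)
counts = count_patterns(patterns)

for pattern, n_sites in sorted(counts.items(), reverse=True):
    print(pattern, n_sites)
```

In this toy setting a pattern such as (1, 1, 1) plays the role of a common pattern and (1, 0, 0) a cancer-type-specific one. Because each cancer is thresholded on its own, the sketch borrows no information across cancer types, which is the limitation that joint modeling is meant to address.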

Updated: 2020-10-26