A novel statistical method for modeling covariate effects in bisulfite sequencing derived measures of DNA methylation
Biometrics (IF 1.4), Pub Date: 2020-06-05, DOI: 10.1111/biom.13307
Kaiqiong Zhao 1,2, Karim Oualkacha 3, Lajmi Lakhal-Chaieb 4, Aurélie Labbe 5, Kathleen Klein 2, Antonio Ciampi 1,2, Marie Hudson 2,6, Inés Colmegna 6,7, Tomi Pastinen 8, Tieyuan Zhang 9, Denise Daley 10, Celia M T Greenwood 1,2,11

Identifying disease-associated changes in DNA methylation can help us gain a better understanding of disease etiology. Bisulfite sequencing allows the generation of high-throughput methylation profiles at single-base resolution of DNA. However, optimally modeling and analyzing these sparse and discrete sequencing data is still very challenging due to variable read depth, missing data patterns, long-range correlations, data errors, and confounding from cell type mixtures. We propose a regression-based hierarchical model that allows covariate effects to vary smoothly along genomic positions and we have built a specialized EM algorithm, which explicitly allows for experimental errors and cell type mixtures, to make inference about smooth covariate effects in the model. Simulations show that the proposed method provides accurate estimates of covariate effects and captures the major underlying methylation patterns with excellent power. We also apply our method to analyze data from rheumatoid arthritis patients and controls. The method has been implemented in R package SOMNiBUS.
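To make the idea of covariate effects that "vary smoothly along genomic positions" concrete, the following is a minimal Python sketch, not the SOMNiBUS implementation and not its API. It assumes a plain binomial model for methylated read counts given read depth, represents each effect curve with a Gaussian radial basis (an illustrative stand-in for whatever smoother the paper uses), and fits it by ridge-penalized Newton iterations. It deliberately omits the experimental-error component, the EM algorithm, and cell type mixtures described in the abstract. All function names, the simulated curves, and the tuning values are hypothetical.

```python
import numpy as np


def simulate_methylation(n_samples=40, n_sites=50, seed=0):
    """Simulate methylated read counts with a smooth covariate effect (toy data)."""
    rng = np.random.default_rng(seed)
    pos = np.linspace(0.0, 1.0, n_sites)               # rescaled genomic positions
    z = rng.integers(0, 2, size=n_samples)             # binary covariate, e.g. case/control
    beta0 = -1.0 + np.sin(2 * np.pi * pos)             # assumed smooth intercept curve
    beta1 = 1.5 * np.exp(-((pos - 0.5) ** 2) / 0.02)   # assumed smooth covariate effect
    eta = beta0[None, :] + z[:, None] * beta1[None, :]
    prob = 1.0 / (1.0 + np.exp(-eta))                  # methylation probability per sample/site
    depth = rng.poisson(10, size=(n_samples, n_sites)) # variable read depth (0 = no coverage)
    meth = rng.binomial(depth, prob)                   # methylated read counts
    return pos, z, depth, meth, beta1


def rbf_basis(pos, n_basis=10):
    """Gaussian radial basis evaluated at each position (illustrative smoother)."""
    centers = np.linspace(pos.min(), pos.max(), n_basis)
    width = centers[1] - centers[0]
    return np.exp(-0.5 * ((pos[:, None] - centers[None, :]) / width) ** 2)


def fit_smooth_effect(pos, z, depth, meth, n_basis=10, ridge=1.0, n_iter=50):
    """Estimate the covariate effect curve beta1(pos) by penalized binomial regression."""
    basis = rbf_basis(pos, n_basis)                    # shape (n_sites, n_basis)
    i_idx, j_idx = np.nonzero(depth)                   # drop sample/site pairs with no reads
    b = basis[j_idx]
    # Columns: basis for the intercept curve, then basis times covariate for the effect curve.
    X = np.hstack([b, z[i_idx][:, None] * b])
    n = depth[i_idx, j_idx].astype(float)
    y = meth[i_idx, j_idx].astype(float)

    coef = np.zeros(X.shape[1])
    penalty = ridge * np.eye(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ coef)))
        grad = X.T @ (y - n * mu) - ridge * coef       # gradient of penalized log-likelihood
        hess = X.T @ (X * (n * mu * (1.0 - mu))[:, None]) + penalty
        step = np.linalg.solve(hess, grad)             # Newton step
        coef += step
        if np.max(np.abs(step)) < 1e-8:
            break
    return basis @ coef[n_basis:]                      # fitted effect curve at each site


if __name__ == "__main__":
    pos, z, depth, meth, beta1_true = simulate_methylation()
    beta1_hat = fit_smooth_effect(pos, z, depth, meth)
    print("correlation with simulated effect:", np.corrcoef(beta1_hat, beta1_true)[0, 1])
```

Sharing basis coefficients across neighbouring sites is what lets sparse, variable-depth counts at individual CpGs borrow strength from one another; a per-site test would treat each position independently.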

