Single-gene negative binomial regression models for RNA-Seq data with higher-order asymptotic inference
Statistics and Its Interface (IF 0.3), Pub Date: 2015-01-01, DOI: 10.4310/sii.2015.v8.n4.a1
Yanming Di

We consider negative binomial (NB) regression models for RNA-Seq read counts and investigate an approach where such NB regression models are fitted to individual genes separately and, in particular, the NB dispersion parameter is estimated from each gene separately without assuming commonalities between genes. This single-gene approach contrasts with the more widely used dispersion-modeling approach, where the NB dispersion is modeled as a simple function of the mean or other measures of read abundance and then estimated from a large number of genes combined. We show that, through the use of higher-order asymptotic techniques, inferences with correct type I errors can be made about the regression coefficients in a single-gene NB regression model even when the dispersion is unknown and the sample size is small. The motivations for studying single-gene models include: 1) they provide a basis of reference for understanding and quantifying the power-robustness trade-offs of the dispersion-modeling approach; 2) they can also be potentially useful in practice if moderate sample sizes become available and diagnostic tools indicate potential problems with simple models of dispersion.
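
For orientation, the single-gene setup described in the abstract can be sketched in a common NB parameterization. The notation below ($Y_{ij}$, $N_j$, $x_j$, $\beta_i$, $\phi_i$) and the log link with a library-size offset are assumptions made for illustration; the abstract itself does not fix them. The signed likelihood-ratio root $r$ is the usual first-order statistic, and $r^*$ is shown in its generic Barndorff-Nielsen form as one example of a higher-order adjustment, without claiming that it is the exact statistic used in the paper.

```latex
% Assumed notation: gene i, sample j, library size N_j, covariates x_j.
\begin{align*}
  Y_{ij} &\sim \mathrm{NB}(\mu_{ij}, \phi_i), &
  \mathrm{E}(Y_{ij}) &= \mu_{ij}, &
  \mathrm{Var}(Y_{ij}) &= \mu_{ij} + \phi_i \mu_{ij}^2, \\
  \log \mu_{ij} &= \log N_j + x_j^{\top} \beta_i .
\end{align*}

% Test H_0: \beta_{ik} = 0 for gene i, with \phi_i and the remaining
% coefficients treated as nuisance parameters estimated from gene i alone.
\begin{align*}
  r &= \operatorname{sign}(\hat\beta_{ik})
       \sqrt{2\,\bigl\{ \ell_i(\hat\beta_i, \hat\phi_i)
                      - \ell_i(\tilde\beta_i, \tilde\phi_i) \bigr\}}, \\
  r^{*} &= r + \frac{1}{r} \log\frac{u}{r},
\end{align*}
% where (\hat\beta_i, \hat\phi_i) is the unrestricted MLE,
% (\tilde\beta_i, \tilde\phi_i) is the MLE under H_0, and u is a
% model-dependent adjustment term (left unspecified here).
```

Under this sketch, the standard normal approximation to $r$ has error of order $n^{-1/2}$, which can be noticeable at the small sample sizes typical of RNA-Seq experiments; an adjusted statistic such as $r^*$ is referred to the same standard normal distribution with a smaller approximation error.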

Updated: 2015-01-01