A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros
Statistics in Medicine (IF 2), Pub Date: 2021-05-07, DOI: 10.1002/sim.9005
Roberto Ascari, Sonia Migliorati

Binary outcomes are extremely common in biomedical research. Despite its popularity, binomial regression often fails to model this kind of data accurately due to the overdispersion problem. Many alternatives can be found in the literature, the beta-binomial (BB) regression model being one of the most popular. The additional parameter of this model enables a better fit to overdispersed data. It also has an attractive interpretation in terms of the intraclass correlation coefficient. Nonetheless, in many real data applications, a single additional parameter cannot handle the entire excess of variability. In this study, we propose a new finite mixture distribution with BB components, namely, the flexible beta-binomial (FBB), which is characterized by a richer parameterization. This allows us to enhance the variance structure to account for multiple causes of overdispersion while also preserving the intraclass correlation interpretation. The novel regression model, based on the FBB distribution, exploits the flexibility and large variety of the distribution's possible shapes (which include bimodality and various tail behaviors). Thus, it succeeds in accounting for several (possibly concomitant) sources of overdispersion stemming from the presence of latent groups in the population, outliers, and excessive zero observations. Adopting a Bayesian approach to inference, we perform an intensive simulation study that shows the superiority of the new regression model over existing ones. Its better performance is also confirmed by three applications to real datasets extensively studied in the biomedical literature, namely, bacteria data, atomic bomb radiation data, and control mice data.
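To make the overdispersion argument concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard mean and intraclass-correlation parameterization of the beta-binomial, with rho = 1/(alpha + beta + 1), and a generic two-component mixture of BB distributions sharing the same rho. The function names and the mixture parameters `p`, `mu1`, `mu2` are hypothetical and are not the FBB parameterization of the paper, which is not given in the abstract.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def bb_pmf(y, n, mu, rho):
    """Beta-binomial pmf with mean parameter mu and intraclass correlation rho.

    Assumed parameterization: alpha = mu * phi, beta = (1 - mu) * phi,
    with phi = (1 - rho) / rho, so that rho = 1 / (alpha + beta + 1).
    """
    phi = (1.0 - rho) / rho
    a, b = mu * phi, (1.0 - mu) * phi
    log_choose = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    return math.exp(log_choose + log_beta(y + a, n - y + b) - log_beta(a, b))


def mixture_pmf(y, n, p, mu1, mu2, rho):
    """Hypothetical two-component BB mixture (illustrative, not the paper's FBB)."""
    return p * bb_pmf(y, n, mu1, rho) + (1.0 - p) * bb_pmf(y, n, mu2, rho)


def moments(pmf_values):
    """Mean and variance of a distribution on 0..n given its pmf values."""
    mean = sum(y * f for y, f in enumerate(pmf_values))
    var = sum((y - mean) ** 2 * f for y, f in enumerate(pmf_values))
    return mean, var


n, mu, rho = 10, 0.3, 0.2
bb = [bb_pmf(y, n, mu, rho) for y in range(n + 1)]
mix = [mixture_pmf(y, n, 0.5, 0.1, 0.5, rho) for y in range(n + 1)]

binomial_var = n * mu * (1 - mu)
bb_mean, bb_var = moments(bb)
mix_mean, mix_var = moments(mix)
print(binomial_var, bb_var, mix_var)
```

Under this parameterization the BB variance is n·mu·(1 − mu)·[1 + (n − 1)·rho], so a single extra parameter inflates the binomial variance by a fixed factor. The mixture keeps the same overall mean here but adds a between-component term to the variance, which illustrates how latent groups can produce overdispersion beyond what one BB component captures.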

Updated: 2021-07-12