Modelling Excess Zeros in Count Data: A New Perspective on Modelling Approaches
International Statistical Review (IF 2), Pub Date: 2021-11-07, DOI: 10.1111/insr.12479
John Haslett, Andrew C. Parnell, John Hinde, Rafael Andrade Moral

We consider the analysis of count data in which the observed frequency of zero counts is unusually large, typically with respect to the Poisson distribution. We focus on two alternative modelling approaches: over-dispersion (OD) models and zero-inflation (ZI) models, both of which can be seen as generalisations of the Poisson distribution; we refer to these as implicit and explicit ZI models, respectively. Although sometimes seen as competing approaches, they can be complementary; OD is a consequence of ZI modelling, and ZI is a by-product of OD modelling. The central objective in such analyses is often concerned with inference on the effect of covariates on the mean, in light of the apparent excess of zeros in the counts. Typically, the modelling of the excess zeros per se is a secondary objective, and there are choices to be made between, and within, the OD and ZI approaches. The contribution of this paper is primarily conceptual. We contrast, descriptively, the impact on zeros of the two approaches. We further offer a novel descriptive characterisation of alternative ZI models, including the classic hurdle and mixture models, by providing a unifying theoretical framework for their comparison. This in turn leads to a novel and technically simpler ZI model. We develop the underlying theory for univariate counts and touch on its implication for multivariate count data.
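The claim that OD and ZI imply one another can be checked numerically with two textbook generalisations of the Poisson. The sketch below is an illustration only, not the paper's framework or its new model: it uses the standard zero-inflated Poisson mixture (an explicit ZI model) and the negative binomial (an OD model), and the function names and parameter values (pi = 0.3, lam = 2, k = 1) are assumptions chosen for the example.

```python
import math


def poisson_p0(mu):
    """P(Y = 0) for a Poisson count with mean mu."""
    return math.exp(-mu)


def zip_summary(pi, lam):
    """Zero-inflated Poisson mixture: a structural zero with probability pi,
    otherwise a Poisson(lam) draw. Returns (P(Y = 0), mean, variance)."""
    p0 = pi + (1.0 - pi) * math.exp(-lam)
    mean = (1.0 - pi) * lam
    var = mean * (1.0 + pi * lam)  # exceeds the mean whenever pi > 0
    return p0, mean, var


def negbin_summary(mu, k):
    """Negative binomial with mean mu and dispersion (size) k.
    Returns (P(Y = 0), mean, variance)."""
    p0 = (k / (k + mu)) ** k
    var = mu + mu ** 2 / k  # exceeds the mean for any finite k
    return p0, mu, var


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    zi_p0, zi_mean, zi_var = zip_summary(pi=0.3, lam=2.0)
    nb_p0, nb_mean, nb_var = negbin_summary(mu=zi_mean, k=1.0)
    base_p0 = poisson_p0(zi_mean)

    print(f"Poisson, same mean : P(0) = {base_p0:.4f}")
    print(f"ZIP (explicit ZI)  : P(0) = {zi_p0:.4f}, var/mean = {zi_var / zi_mean:.3f}")
    print(f"NegBin (OD)        : P(0) = {nb_p0:.4f}, var/mean = {nb_var / nb_mean:.3f}")
```

With the means matched, both models put more probability on zero than the Poisson and both have a variance-to-mean ratio above 1, which is the sense in which ZI induces OD and OD induces extra zeros.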

Updated: 2021-11-07