A comparison of zero-inflated and hurdle models for modeling zero-inflated count data, Journal of Statistical Distributions and Applications

Journal of Statistical Distributions and Applications, Pub Date: 2021-06-24, DOI: 10.1186/s40488-021-00121-4
Cindy Xin Feng

Count data with excess zeros are frequently encountered in practice. For example, the number of health service visits often includes many zeros, representing patients with no utilization during the follow-up period. A common feature of this type of data is that the count measure tends to have more zeros than a common count distribution, such as the Poisson or negative binomial, can accommodate. Zero-inflated (ZI) or hurdle models are often used to fit such data. Despite the increasing popularity of ZI and hurdle models, the fundamental differences between these two types of models have received little investigation. In this article, we reviewed the zero-inflated and hurdle models and highlighted their differences in terms of their data-generating processes. We also conducted simulation studies to evaluate the performance of both types of models. The final choice of regression model should be made after a careful assessment of goodness of fit and should be tailored to the particular data in question.
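The difference in data-generating processes mentioned in the abstract can be illustrated with a small sketch. It is not taken from the article: it assumes a Poisson count component, and the function names and the symbols `lam`, `pi`, and `p0` are hypothetical labels chosen for illustration. In the zero-inflated model, zeros come from two sources (structural zeros plus zeros from the count distribution), whereas in the hurdle model all zeros come from a single binary process and positive counts follow a zero-truncated distribution.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability P(Y = k) for rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """Zero-inflated Poisson.

    With probability pi the outcome is a structural zero; otherwise it is
    drawn from Poisson(lam), which can itself produce a zero.
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def hurdle_pmf(k: int, lam: float, p0: float) -> float:
    """Hurdle Poisson.

    A zero occurs with probability p0; positive counts follow a
    zero-truncated Poisson(lam), so the count part never produces zeros.
    """
    if k == 0:
        return p0
    truncated = poisson_pmf(k, lam) / (1.0 - poisson_pmf(0, lam))
    return (1.0 - p0) * truncated


if __name__ == "__main__":
    # Illustrative parameter values only (assumed, not from the article).
    lam, pi, p0 = 2.0, 0.3, 0.3
    print("k  ZIP     hurdle")
    for k in range(5):
        print(k, f"{zip_pmf(k, lam, pi):.4f}", f"{hurdle_pmf(k, lam, p0):.4f}")
```

With the same mixing weight, the zero-inflated model assigns more mass to zero than `pi` alone, because the Poisson component contributes its own zeros; in the hurdle model the zero probability is exactly `p0`.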

Last updated: 2021-06-24
