Multiple imputation of binary multilevel missing not at random data, The Journal of the Royal Statistical Society: Series C (Applied Statistics)



The Journal of the Royal Statistical Society: Series C (Applied Statistics) (IF 1.0), Pub Date: 2020-02-24, DOI: 10.1111/rssc.12401
Angelina Hammon, Sabine Zinn

We introduce a selection model‐based multilevel imputation approach to be used within the fully conditional specification framework for multiple imputation. Concretely, we apply a censored bivariate probit model to describe binary variables assumed to be missing not at random. The first equation of the model defines the regression model for the missing data mechanism. The second equation specifies the regression model of the variable to be imputed. The non‐random selection of the binary data is mapped by correlations between the error terms of the two regression models. Hierarchical data structures are modelled by random intercepts in both equations. To fit the novel imputation model we use maximum likelihood and adaptive Gauss–Hermite quadrature. A comprehensive simulation study shows the overall performance of the approach. We test its usefulness for empirical research by applying it to a common problem in social scientific research: the emergence of educational aspirations. Our software is designed to be used in the R package mice.
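To make the selection mechanism described in the abstract concrete, the following is a minimal simulation sketch, not the authors' software (which targets the R package mice). It generates clustered binary data from two latent-variable equations with random intercepts and correlated errors, then compares the mean of the binary variable among observed cases with the mean among all cases. Every name and numeric value here (`simulate_mnar_binary`, `observed_gap`, the coefficients, the random-intercept standard deviations, the single covariate `x`) is an assumption chosen for illustration and does not come from the paper.

```python
import math
import random


def simulate_mnar_binary(n_clusters=200, cluster_size=20, rho=0.6,
                         sd_u_sel=0.5, sd_u_out=0.5, seed=1):
    """Simulate clustered binary data with a selection (missingness) equation.

    All coefficients below are illustrative assumptions, not estimates
    from the paper.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_clusters):
        # Cluster-level random intercepts, one per equation.
        u_sel = rng.gauss(0.0, sd_u_sel)
        u_out = rng.gauss(0.0, sd_u_out)
        for _ in range(cluster_size):
            x = rng.gauss(0.0, 1.0)
            # Standard normal errors of the two equations with correlation rho.
            e_sel = rng.gauss(0.0, 1.0)
            e_out = rho * e_sel + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
            # First equation: missing-data mechanism (r = 1 means y is observed).
            r = int(0.3 + 0.5 * x + u_sel + e_sel > 0)
            # Second equation: the binary variable that would be imputed.
            y = int(-0.2 + 0.8 * x + u_out + e_out > 0)
            rows.append((j, x, r, y))
    return rows


def observed_gap(rows):
    """Mean of y among observed cases minus mean of y among all cases."""
    all_y = [y for (_, _, _, y) in rows]
    obs_y = [y for (_, _, r, y) in rows if r == 1]
    return sum(obs_y) / len(obs_y) - sum(all_y) / len(all_y)


gap_correlated = observed_gap(simulate_mnar_binary(rho=0.6))
gap_uncorrelated = observed_gap(simulate_mnar_binary(rho=0.0))
print(f"gap with rho = 0.6: {gap_correlated:.3f}")
print(f"gap with rho = 0.0: {gap_uncorrelated:.3f}")
```

In this sketch, part of the gap comes from the shared covariate `x`, which an imputation model that conditions on `x` can account for. The additional gap that appears when `rho` is nonzero comes from the correlated errors, which is the missing not at random component that the selection model in the abstract is meant to capture.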


Updated: 2020-02-24
