Semiparametric model for regression analysis with nonmonotone missing data
Statistical Methods & Applications (IF 1.1), Pub Date: 2020-06-13, DOI: 10.1007/s10260-020-00530-w
Yang Zhao

Semiparametric likelihoods for regression models with missing at random data (Chen in J Am Stat Assoc 99:1176–1189, 2004, Zhang and Rockette in J Stat Comput Simul 77(2):163–173, 2007, Zhao et al. in Biom J 51:123–136, 2009, Zhao in Commun Stat Theory Methods 38:3736–3744, 2009) are robust because they use nonparametric models for the covariate distributions and do not require modeling the missing data probabilities. Furthermore, the EM algorithms based on the semiparametric likelihoods have closed form expressions for both the E-step and the M-step. As far as we know, the semiparametric likelihoods can deal only with the simple monotone missing data pattern. In this research we extend the semiparametric likelihood approach to regression models with arbitrary nonmonotone missing at random data. We propose a pseudo-likelihood model, which uses an empirical distribution to model the conditional distribution of the missing covariates given the observed covariates, separately for each missing data pattern. We show that an EM algorithm with closed form updating formulas can be used to compute maximum pseudo-likelihood estimates for regression models with nonmonotone missing data. We then propose estimating the asymptotic variance of the maximum pseudo-likelihood estimator through a profile log likelihood and the EM algorithm. We examine the finite sample performance of the new methods in simulation studies and further illustrate the methods in a real data example investigating high risk gambling behavior and its associated factors.
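
The distinction between monotone and nonmonotone missing data patterns, and the idea of handling each pattern separately, can be illustrated with a minimal Python sketch. This is not the paper's estimator: the function names `missing_patterns` and `is_monotone`, the use of `None` to mark a missing value, and the toy rows are all assumptions made here for illustration. The sketch relies only on the standard definition that a pattern is monotone when the sets of missing variables are nested, so that the variables can be ordered with every missing entry followed only by missing entries.

```python
from collections import defaultdict


def missing_patterns(rows):
    """Group row indices by the set of column positions that are missing (None)."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        pattern = frozenset(j for j, v in enumerate(row) if v is None)
        groups[pattern].append(i)
    return dict(groups)


def is_monotone(patterns):
    """Return True if the distinct missing sets are nested (a chain under inclusion)."""
    ordered = sorted(patterns, key=len)
    # Nested sets, sorted by size, must each be contained in the next one.
    return all(a <= b for a, b in zip(ordered, ordered[1:]))


# Hypothetical toy data: three covariates per subject, None marks a missing value.
monotone_rows = [(1.0, 2.0, 3.0), (1.5, 2.5, None), (0.7, None, None)]
nonmonotone_rows = [(1.0, 2.0, 3.0), (1.5, None, 2.0), (0.7, 1.1, None)]

monotone_groups = missing_patterns(monotone_rows)
nonmonotone_groups = missing_patterns(nonmonotone_rows)
print(is_monotone(monotone_groups))
print(is_monotone(nonmonotone_groups))
```

In the second data set one subject is missing only the second covariate and another only the third, so neither missing set contains the other and the pattern is nonmonotone. The grouping returned by `missing_patterns` is the kind of per-pattern partition on which a pattern-specific empirical distribution could be built.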




Updated: 2020-07-24