Efficient estimation in a partially specified nonignorable propensity score model
Computational Statistics & Data Analysis (IF 1.8), Pub Date: 2021-07-21, DOI: 10.1016/j.csda.2021.107322
Mengyan Li, Yanyuan Ma, Jiwei Zhao

Consider the regression setting where the response variable is subject to missing data and the covariates are fully observed. A nonignorable propensity score model is assumed throughout the paper; that is, the probability that the response is observed, conditional on all variables, depends on the possibly missing response values themselves. In such problems, model misspecification and model identifiability are two critical issues. A fully parametric approach can produce results that are sensitive to the model assumptions, while a fully nonparametric approach may not be sufficient for model identification. A new flexible semiparametric propensity score model is proposed in which the relationship between the missingness indicator and the partially observed response is left completely unspecified and estimated nonparametrically, while the relationship between the missingness indicator and the fully observed covariates is modeled parametrically. The proposed estimator is constructed via a semiparametric treatment and is proved to be semiparametrically efficient. Comprehensive simulation studies are conducted to examine the finite-sample performance of the estimators. While the naive parametric method yields heavily biased estimators and poor coverage, the proposed method yields estimators with negligible finite-sample bias and correct inference results. The proposed method is further illustrated via an electronic health records (EHR) data application concerning the albumin level in blood samples. The empirical analyses demonstrate that the proposed semiparametric propensity score model is more sensible than a purely parametric model. The proposed method could be very useful for uncovering the unknown and possibly nonlinear dependence of the propensity score on the albumin level, and is recommended for practical use.
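To make the setting concrete, the following is a minimal simulation sketch of nonignorable missingness with a partially specified propensity score. It is not the estimator proposed in the paper. Everything specific in it is an assumption chosen for illustration: the logistic link, the additive form `g(y) + beta * x`, the particular nonlinear `g`, the value of `beta`, and the outcome model `y = 1 + x + noise`. It only shows that when the chance of observing the response depends on the response itself, a naive complete-case mean can be far from the full-data mean.

```python
import numpy as np


def expit(t):
    """Logistic function, mapping the real line to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-t))


def g(y):
    """Hypothetical nonlinear effect of the response on missingness.

    In the semiparametric setting described above this function would be
    left unspecified; a concrete form is needed only to simulate data.
    """
    return 0.5 - 1.5 * np.tanh(y)


def simulate_nonignorable(n=20000, beta=0.5, seed=0):
    """Simulate (x, y, r), where r = 1 means the response y is observed.

    Assumed propensity: P(r = 1 | x, y) = expit(g(y) + beta * x),
    i.e. unspecified in y (via g) and parametric in x (via beta).
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                  # fully observed covariate
    y = 1.0 + x + rng.normal(size=n)        # response, true mean is 1
    p_obs = expit(g(y) + beta * x)          # depends on y: nonignorable
    r = rng.binomial(1, p_obs)              # missingness indicator
    return x, y, r


if __name__ == "__main__":
    x, y, r = simulate_nonignorable()
    print("fraction observed:       ", r.mean())
    print("full-data mean of y:     ", y.mean())
    print("complete-case mean of y: ", y[r == 1].mean())
```

Because larger responses are less likely to be observed under this assumed `g`, the complete-case mean understates the full-data mean; a method that ignores the dependence on `y` would inherit that bias.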




Updated: 2021-07-21