Proxy Pattern-Mixture Analysis for a Binary Variable Subject to Nonresponse
Journal of Official Statistics (IF 1.1), Pub Date: 2020-09-01, DOI: 10.2478/jos-2020-0035
Rebecca R. Andridge, Roderick J.A. Little

Abstract: Given increasing survey nonresponse, good measures of the potential impact of nonresponse on survey estimates are particularly important. Existing measures, such as the R-indicator, make the strong assumption that data are missing at random, meaning that missingness depends only on variables that are observed for both respondents and nonrespondents. We consider assessing the impact of nonresponse for a binary survey variable Y when data may be missing not at random, meaning that missingness may depend on Y itself. Our work is motivated by missing categorical income data in the 2015 Ohio Medicaid Assessment Survey (OMAS), where whether or not income is missing may be related to the income value itself, with low-income earners more reluctant to respond. We assume there is a set of covariates observed for both nonrespondents and respondents, which for item nonresponse (as in OMAS) is often a rich set of variables, but which may be limited in cases of unit nonresponse. To reduce dimensionality and for simplicity, we reduce these available covariates to a continuous proxy variable X, available for both respondents and nonrespondents, that has the highest correlation with Y, estimated from a probit regression analysis of respondent data. We extend the previously proposed proxy pattern-mixture (PPM) analysis for continuous outcomes to the binary outcome using a latent variable approach for modeling the joint distribution of Y and X. Our method does not assume data are missing at random but includes it as a special case, thus creating a convenient framework for sensitivity analyses. Maximum likelihood, Bayesian, and multiple imputation versions of PPM analysis are described, and robustness of these methods to model assumptions is discussed. Properties are demonstrated through simulation and with the 2015 OMAS data.
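
The proxy construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the proxy X is the linear predictor from a probit regression of Y on the covariates, fitted to respondents only and then evaluated for every sampled unit. The simulated data, the coefficient values, the response mechanism (which depends only on covariate z1), and all function names (`fit_probit`, `simulate`, `linear_predictor`) are hypothetical choices made for illustration. Only the Python standard library is used.

```python
import math
import random


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def solve(a, b):
    """Solve the small linear system a @ x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def linear_predictor(beta, row):
    return sum(b * v for b, v in zip(beta, row))


def fit_probit(rows, y, iters=50, tol=1e-8):
    """Probit regression by Fisher scoring; each row starts with 1.0 (intercept)."""
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(iters):
        info = [[0.0] * p for _ in range(p)]
        score = [0.0] * p
        for x, yi in zip(rows, y):
            eta = linear_predictor(beta, x)
            prob = min(max(norm_cdf(eta), 1e-10), 1.0 - 1e-10)
            dens = norm_pdf(eta)
            var = prob * (1.0 - prob)
            for j in range(p):
                score[j] += (yi - prob) * dens / var * x[j]
                for k in range(p):
                    info[j][k] += dens * dens / var * x[j] * x[k]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


def simulate(n, seed):
    """Hypothetical data: binary Y from a latent normal, two covariates."""
    rng = random.Random(seed)
    rows, y, responded = [], [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        latent = 0.5 + 1.0 * z1 - 0.8 * z2 + rng.gauss(0, 1)
        rows.append([1.0, z1, z2])
        y.append(1 if latent > 0 else 0)
        # Assumed response mechanism: depends on z1 only (missing at random).
        responded.append(rng.random() < norm_cdf(0.5 + 0.5 * z1))
    return rows, y, responded


rows, y, responded = simulate(2000, seed=1)
# Fit the probit model on respondents only; Y is treated as unobserved otherwise.
beta_hat = fit_probit([r for r, m in zip(rows, responded) if m],
                      [v for v, m in zip(y, responded) if m])
# The proxy X is computed for every unit, respondent or not.
proxy = [linear_predictor(beta_hat, r) for r in rows]
resp = [x for x, m in zip(proxy, responded) if m]
nonresp = [x for x, m in zip(proxy, responded) if not m]
print("estimated coefficients:", [round(b, 3) for b in beta_hat])
print("mean proxy, respondents:", round(sum(resp) / len(resp), 3))
print("mean proxy, nonrespondents:", round(sum(nonresp) / len(nonresp), 3))
```

A gap between the respondent and nonrespondent proxy means is the kind of observable signal a sensitivity analysis can build on, since X is available for both groups while Y is not. The sketch stops at the proxy and does not implement the PPM estimators themselves.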

Updated: 2020-09-01