Inference for Multivariate Regression Model Based on Synthetic Data Generated Using Plug-in Sampling
Journal of the American Statistical Association (IF 3.7), Pub Date: 2021-04-27, DOI: 10.1080/01621459.2021.1900860
Ricardo Moura, Martin Klein, John Zylstra, Carlos A. Coelho, Bimal Sinha

Abstract

In this article, the authors derive likelihood-based exact inference for singly and multiply imputed synthetic data in the context of a multivariate regression model. The synthetic data are generated via the Plug-in Sampling method: the unknown parameters of the model are set equal to the observed values of their point estimators based on the original data, and synthetic data are then drawn from this estimated version of the model. Simulation studies are carried out to confirm the theoretical results. The authors provide exact test procedures which, when multiple synthetic datasets are permissible, are compared with the asymptotic results of Reiter. An application using 2000 U.S. Current Population Survey public use data is discussed. Furthermore, properties of the proposed methodology are evaluated in scenarios where some of the conditions used to derive it do not hold, namely for non-normal and discretely distributed random variables; in these cases the inferential procedures developed still perform very well.
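
The Plug-in Sampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `plug_in_synthetic`, its arguments, and the choice of least squares for the coefficient matrix with the residual covariance divided by n - p are all assumptions made for illustration. The sketch assumes a model Y = XB + E with normally distributed error rows, fits it to the original data, and then draws one or more synthetic response matrices from the fitted model.

```python
import numpy as np

def plug_in_synthetic(X, Y, n_datasets=1, seed=None):
    """Hypothetical helper: draw synthetic responses from the fitted model.

    X is the (n, p) design matrix and Y the (n, m) original response matrix.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    m = Y.shape[1]

    # Point estimate of the coefficient matrix (ordinary least squares).
    B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

    # Point estimate of the error covariance; the n - p divisor is an assumption.
    resid = Y - X @ B_hat
    Sigma_hat = resid.T @ resid / (n - p)

    # Plug the estimates in as if they were the true parameters and sample.
    L = np.linalg.cholesky(Sigma_hat)
    synthetic = []
    for _ in range(n_datasets):
        Z = rng.standard_normal((n, m))
        # Rows of Z @ L.T have covariance L @ L.T = Sigma_hat.
        synthetic.append(X @ B_hat + Z @ L.T)
    return B_hat, Sigma_hat, synthetic
```

Calling the function with `n_datasets=1` corresponds to the singly imputed case, and a larger value to the multiply imputed case. The exact inferential procedures derived in the article are not reproduced here.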




Updated: 2021-06-08