Statistical Challenges in Combining Survey and Auxiliary Data to Produce Official Statistics
Journal of Official Statistics (IF 0.5), Pub Date: 2020-03-01, DOI: 10.2478/jos-2020-0004
Andreea L. Erciulescu, Nathan B. Cruze, Balgobin Nandram

Abstract: Combining survey and auxiliary data to produce official statistics is gaining interest at federal agencies and among policy makers due to its efficiency. Recent studies have shown the practicality of small area estimation modeling approaches for integrating data from multiple sources to improve estimation at fine levels of aggregation. In this article, agricultural predictions are constructed using a hierarchical Bayes subarea-level model fit to data available from different sources. Auxiliary data are used first to complement the survey data and define the prediction space, and then to define covariates for the model. Finally, not-in-sample predictions are constructed using the model output, and benchmarking constraints are imposed on the final set of in-sample and not-in-sample predictions. Unlike most studies discussing not-in-sample prediction, this article illustrates a method that uses the data available from multiple sources to define the prediction space. As a consequence, the resulting framework provides a larger set of nationwide predictions as candidates for official statistics, and extrapolation is not a concern. Challenges in developing methods to combine different data sources are discussed in the context of planted acreage prediction.
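
The abstract mentions imposing benchmarking constraints on the combined in-sample and not-in-sample predictions but does not say which adjustment is used. As a rough illustration only, the sketch below applies simple ratio benchmarking, a common choice in small area estimation, so that lower-level predictions sum to a fixed higher-level total. The function name `ratio_benchmark`, the county labels, and all numbers are hypothetical and are not taken from the article.

```python
def ratio_benchmark(predictions, target_total):
    """Scale every prediction by one common factor so the sum equals target_total.

    This is plain ratio benchmarking, assumed here for illustration;
    the article may use a different adjustment.
    """
    current_total = sum(predictions.values())
    if current_total <= 0:
        raise ValueError("predictions must have a positive total")
    factor = target_total / current_total
    return {area: value * factor for area, value in predictions.items()}


# Hypothetical county-level planted-acreage predictions (made-up numbers).
county_predictions = {
    "county_A": 1200.0,  # in-sample: survey data available
    "county_B": 800.0,   # in-sample
    "county_C": 500.0,   # not-in-sample: predicted from the model only
}
state_target = 2750.0  # hypothetical state-level total to be matched

benchmarked = ratio_benchmark(county_predictions, state_target)
```

Because every area is scaled by the same factor, the relative ordering of the predictions is preserved, and in-sample and not-in-sample areas are treated alike once they belong to the same prediction set.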

Updated: 2020-03-01