Multivariate Statistical Matching Using Graphical Modeling
International Journal of Approximate Reasoning (IF 3.9), Pub Date: 2021-03-01, DOI: 10.1016/j.ijar.2020.12.006
Pier Luigi Conti, Daniela Marella, Paola Vicard, Vincenzina Vitale

Abstract The goal of statistical matching, at a macro level, is to estimate the joint distribution of variables observed separately in independent samples. The lack of joint information on the variables of interest leads to uncertainty about the data-generating model. In this paper we propose using graphical models to deal with statistical matching uncertainty for multivariate categorical variables. Using Bayesian networks in the statistical matching context makes it possible both to introduce extra-sample information on the dependence structure among the variables of interest and to use that information to factorize the joint probability distribution according to the graph, decomposing a multivariate dependence into lower-dimensional components. This representation of the joint probability distribution, which exploits local relationships, simplifies both parameter estimation and the evaluation of statistical matching quality in a multivariate context. A simulation experiment evaluates the performance of the proposed methodology with and without auxiliary information and compares it with the saturated multinomial model in terms of uncertainty reduction. Finally, an application to a real case is provided. The results show a considerable improvement in the quality of statistical matching when the dependence structure is taken into account.
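
The factorization idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a hypothetical three-variable network Y <- X -> Z, where X is observed in both samples, Y only in the first and Z only in the second. All probabilities are made-up toy values, and the function names are illustrative. The Fréchet bounds shown are one standard way to express matching uncertainty when no conditional independence is assumed; the abstract does not state that the paper uses exactly this measure.

```python
from itertools import product


def joint_from_factorization(p_x, p_y_given_x, p_z_given_x):
    """Build P(x, y, z) = P(x) * P(y | x) * P(z | x) for the DAG Y <- X -> Z.

    This encodes conditional independence of Y and Z given X, which is an
    assumption of this sketch, not something the data can confirm.
    """
    joint = {}
    for x, px in p_x.items():
        pairs = product(p_y_given_x[x].items(), p_z_given_x[x].items())
        for (y, py), (z, pz) in pairs:
            joint[(x, y, z)] = px * py * pz
    return joint


def frechet_bounds(p_y_given_x, p_z_given_x, x, y, z):
    """Range of P(y, z | x) compatible with the two observed conditionals."""
    py = p_y_given_x[x][y]
    pz = p_z_given_x[x][z]
    return max(0.0, py + pz - 1.0), min(py, pz)


# Hypothetical toy distributions for binary X, Y, Z.
p_x = {0: 0.6, 1: 0.4}
p_y_given_x = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
p_z_given_x = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.9, 1: 0.1}}

joint = joint_from_factorization(p_x, p_y_given_x, p_z_given_x)
lower, upper = frechet_bounds(p_y_given_x, p_z_given_x, x=0, y=0, z=0)
```

The width `upper - lower` gives a rough sense of how much the unobserved dependence between Y and Z is left undetermined within one cell; the value implied by conditional independence, `P(y | x) * P(z | x)`, always lies inside that interval.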

Updated: 2021-03-01