Concurrent generation of multivariate mixed data with variables of dissimilar types
Journal of Statistical Computation and Simulation (IF 1.1), Pub Date: 2016-04-22, DOI: 10.1080/00949655.2016.1177530
Anup Amatya, Hakan Demirtas

ABSTRACT Data sets originating from a wide range of research studies are composed of multiple variables that are correlated and of dissimilar types, primarily count, binary/ordinal and continuous. The present paper builds on previous work on multivariate data generation and develops a framework for generating multivariate mixed data with a pre-specified correlation matrix. The generated data consist of components that are marginally count, binary, ordinal and continuous, where the count and continuous variables follow the generalized Poisson and normal distributions, respectively. The use of the generalized Poisson distribution provides a flexible mechanism that accommodates the under- and over-dispersed count variables generally encountered in practice. A step-by-step algorithm is provided and its performance is evaluated using simulated and real-data scenarios.
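
To make the idea of jointly generating count, binary, ordinal and continuous variables concrete, here is a minimal Python sketch. It is not the authors' algorithm: it is a simplified Gaussian-copula construction that draws correlated standard normals and pushes each one through the inverse CDF of its target marginal. The paper's step of calibrating intermediate correlations so that the final data match the pre-specified matrix is omitted, so `CORR` below applies only to the latent normals. All names (`generate`, `gpois_quantile`, `CORR`) and all parameter values are hypothetical, and the generalized Poisson is restricted to 0 <= lam < 1 (equi- or over-dispersed) because the under-dispersed case needs a truncated support.

```python
import math
import random
from statistics import NormalDist

# Hypothetical latent correlation matrix (strictly diagonally dominant, hence positive definite).
CORR = [
    [1.0, 0.3, 0.2, 0.4],
    [0.3, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.3],
    [0.4, 0.2, 0.3, 1.0],
]


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def gpois_quantile(u, theta, lam, max_x=10_000):
    """Smallest x whose generalized Poisson CDF reaches u (assumes 0 <= lam < 1).

    pmf: P(X = x) = theta * (theta + x*lam)**(x-1) * exp(-theta - x*lam) / x!
    mean = theta / (1 - lam), variance = theta / (1 - lam)**3
    """
    x = 0
    cdf = math.exp(-theta)  # P(X = 0)
    while cdf < u and x < max_x:
        x += 1
        log_p = (math.log(theta) + (x - 1) * math.log(theta + x * lam)
                 - theta - x * lam - math.lgamma(x + 1))
        cdf += math.exp(log_p)
    return x


def generate(n, corr, seed=0):
    """Return n rows of (count, binary, ordinal, continuous) values."""
    rng = random.Random(seed)
    low = cholesky(corr)
    std = NormalDist()
    rows = []
    for _ in range(n):
        e = [rng.gauss(0.0, 1.0) for _ in range(4)]
        # Correlated standard normals: z = L e
        z = [sum(low[i][k] * e[k] for k in range(i + 1)) for i in range(4)]
        u = [std.cdf(v) for v in z]  # uniform scores on (0, 1)
        count = gpois_quantile(u[0], theta=2.0, lam=0.3)  # over-dispersed count
        binary = 1 if u[1] > 0.6 else 0                   # P(binary = 1) = 0.4
        ordinal = sum(u[2] > c for c in (0.3, 0.7))       # 3 levels, probs 0.3/0.4/0.3
        continuous = 10.0 + 2.0 * z[3]                    # normal, mean 10, sd 2
        rows.append((count, binary, ordinal, continuous))
    return rows


sample = generate(2000, CORR, seed=1)
```

Because the transformations are monotone, the signs of the latent correlations carry over to the generated variables, but their magnitudes generally shrink for the discrete components. Closing that gap is the purpose of the correlation-calibration step that this sketch leaves out.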

Updated: 2016-04-22