A semiparametric latent factor model for large scale temporal data with heteroscedasticity
Journal of Multivariate Analysis (IF 1.6), Pub Date: 2021-07-17, DOI: 10.1016/j.jmva.2021.104786
Lyuou Zhang, Wen Zhou, Haonan Wang

Large scale temporal data have flourished in a vast array of applications, and their sophisticated structures, especially the heteroscedasticity among subjects with inter- and intra-temporal dependence, have fueled a great demand for new statistical models. In this paper, with covariate information, we consider a flexible model for large scale temporal data with subject-specific heteroscedasticity. Formally, the model employs latent semiparametric factors to simultaneously account for the subject-specific heteroscedasticity and the contemporaneous and/or serial correlations. The subject-specific heteroscedasticity is modeled as the product of the unobserved factor process and the subject’s covariate effect, which is further characterized via additive models. For estimation, we propose a two-step procedure. First, the latent factor process and the nonparametric loading are recovered through projection-based methods; then, we estimate the regression components by approaches motivated by generalized least squares. By scrupulously examining the non-asymptotic rates for recovering the factor process and its loading, we show the consistency and efficiency of the estimated regression coefficients in the absence of prior knowledge of the latent factor process and the subject’s covariate effect. The statistical guarantees remain valid even for a finite number of time points, which makes our method particularly appealing when the subjects significantly outnumber the observation time points. Using comprehensive simulations, we demonstrate the finite sample performance of our method, which corroborates the theoretical findings. Finally, we apply our method to a data set of air quality and energy consumption collected at 129 monitoring sites in the United States in 2015.




Updated: 2021-08-01