Multivariate functional data modeling with time-varying clustering
TEST ( IF 1.2 ) Pub Date : 2020-09-09 , DOI: 10.1007/s11749-020-00733-z
Philip A. White , Alan E. Gelfand

We consider the setting of multivariate functional data collected over time at each of a set of sites. Our objective is to implement model-based clustering of the functions across the sites where we allow such clustering to vary over time. Anticipating dependence between the functions within a site as well as across sites, we model the collection of functions using a multivariate Gaussian process. With many sites and several functions at each site, we use dimension reduction to provide a computationally manageable stochastic process specification. To jointly cluster the functions, we use the Dirichlet process which enables shared labeling of the functions across the sites. Specifically, we cluster functions based on their response to exogenous variables. Though the functions arise over continuous time, clustering in continuous time is extremely computationally demanding and not of practical interest. Therefore, we employ partitioning of the timescale to capture time-varying clustering. Our illustrative setting is bivariate, monitoring ozone and PM\(_{10}\) levels over time for one year at a set of monitoring sites. The data we work with is from 24 monitoring sites in Mexico City for 2017 which record hourly ozone and PM\(_{10}\) levels. Hence, we have 48 functions to work with across 8760 hours. We provide a Gaussian process model for each function using continuous-time meteorological variables as regressors along with adjustment for daily periodicity. We interpret the similarity of functions in terms of their shape, captured through site-specific coefficients, and use these coefficients to develop the clustering.
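As a rough illustration of how a Dirichlet process induces shared cluster labels, and how partitioning the timescale lets those labels change over time, the sketch below draws labels for the 48 functions from a Chinese restaurant process, once per time block. It is a prior draw under stated assumptions, not the authors' model: the number of blocks (`N_BLOCKS = 12`), the concentration parameter `alpha = 1.0`, the independence of draws across blocks, and all function and variable names are hypothetical. No data, Gaussian process, or regression coefficients are involved.

```python
import random

def crp_labels(n_items, alpha, rng):
    """Draw cluster labels for n_items from a Chinese restaurant process."""
    labels = []
    counts = []  # counts[k] = number of items currently in cluster k
    for _ in range(n_items):
        # An existing cluster k has weight counts[k]; a new cluster has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
        labels.append(k)
    return labels

N_SITES = 24                 # monitoring sites (from the abstract)
N_POLLUTANTS = 2             # ozone and PM10 (from the abstract)
N_FUNCTIONS = N_SITES * N_POLLUTANTS   # 48 functions
N_HOURS = 365 * 24           # 8760 hourly observations in 2017
N_BLOCKS = 12                # ASSUMPTION: hypothetical number of time blocks

rng = random.Random(0)

# ASSUMPTION: labels are drawn independently in each block, purely for
# illustration; the paper's actual time-varying specification may differ.
labels_by_block = [
    crp_labels(N_FUNCTIONS, alpha=1.0, rng=rng) for _ in range(N_BLOCKS)
]

for block, labels in enumerate(labels_by_block):
    print(f"block {block}: {len(set(labels))} clusters among {N_FUNCTIONS} functions")
```

Because the labels index all 48 functions jointly, an ozone function at one site can share a cluster with a PM\(_{10}\) function at another, which is the sense of shared labeling across sites.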




Updated: 2020-09-10