A Multiple Imputation Strategy for Eddy Covariance Data
Journal of Environmental Informatics (IF 6.0), Pub Date: 2018-07-01, DOI: 10.3808/jei.201800391
D. Vitale, M. Bilancia, D. Papale

Half-hourly time series of net ecosystem exchange (NEE) of CO2, latent heat flux (LE) and sensible heat flux (H), measured with the micro-meteorological eddy covariance (EC) technique, are noisy and contain a high percentage of missing data. Using EC measurements from the FLUXNET2015 dataset, we evaluate the performance of a multiple imputation (MI) strategy based on the efficient computational approach introduced by Honaker and King (2010), which combines the classic Expectation-Maximization (EM) algorithm with a bootstrap in order to take draws from a suitable approximation of the posterior distribution of the model parameters. With these tools, we introduce three new multiple imputation models of increasing complexity, all built on the assumption of multivariate normality: 1) MLR, which imputes missing EC values using a static multiple linear regression on observed values of suitable input variables; 2) ADL, which adds dynamic properties to the static specification of MLR through an autoregressive distributed lag specification; 3) PADL, which adds further complexity by embedding the ADL model in a panel-data perspective. Under several artificial gap scenarios, we show that PADL is better able to model the complex dynamics of ecosystem fluxes and to reconstruct missing data points, thus providing unbiased imputations and preserving the original sampling distribution. The added flexibility arising from the time-series cross-section structure of PADL yields improved performance, exceeding that of the other imputation methods as well as that of the marginal distribution sampling (MDS) algorithm, a widely used gap-filling approach introduced by Reichstein et al. (2005), especially for nighttime flux data.
It is expected that the strategy proposed in this paper will become useful in creating multiple imputations for a variety of EC datasets, providing valid inferences for a broad range of scientific estimands (such as annual budgets).
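
The EM-with-bootstrap idea summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' MLR, ADL or PADL model, nor the implementation of Honaker and King (2010); it is a simplified illustration under stated assumptions: the data are rows of a jointly normal vector (for example a flux and a few input variables), gaps are coded as NaN, every column keeps at least one observed value in each bootstrap sample, and the function names `em_mvn`, `impute_once` and `emb_impute` are hypothetical.

```python
import numpy as np


def em_mvn(X, n_iter=50, ridge=1e-6):
    """EM estimates of the mean and covariance of a multivariate normal
    from rows that contain NaN gaps (assumed setup, not the paper's code)."""
    n, p = X.shape
    miss = np.isnan(X)
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0) + ridge)
    for _ in range(n_iter):
        filled = np.where(miss, mu, X)   # start from the current mean
        extra = np.zeros((p, p))         # summed conditional covariances
        for i in range(n):
            m = miss[i]
            if not m.any():
                continue
            o = ~m
            if o.any():
                # E-step: conditional mean and covariance of the gaps
                coef = sigma[np.ix_(m, o)] @ np.linalg.inv(sigma[np.ix_(o, o)])
                filled[i, m] = mu[m] + coef @ (X[i, o] - mu[o])
                cond = sigma[np.ix_(m, m)] - coef @ sigma[np.ix_(o, m)]
            else:
                cond = sigma[np.ix_(m, m)]
            extra[np.ix_(m, m)] += cond
        # M-step: update the parameters from the completed statistics
        mu = filled.mean(axis=0)
        centered = filled - mu
        sigma = (centered.T @ centered + extra) / n + ridge * np.eye(p)
    return mu, sigma


def impute_once(X, mu, sigma, rng):
    """Fill each gap with one draw from its conditional normal distribution."""
    out = X.copy()
    for i in range(X.shape[0]):
        m = np.isnan(X[i])
        if not m.any():
            continue
        o = ~m
        if o.any():
            coef = sigma[np.ix_(m, o)] @ np.linalg.inv(sigma[np.ix_(o, o)])
            mean = mu[m] + coef @ (X[i, o] - mu[o])
            cov = sigma[np.ix_(m, m)] - coef @ sigma[np.ix_(o, m)]
        else:
            mean, cov = mu[m], sigma[np.ix_(m, m)]
        cov = (cov + cov.T) / 2          # remove round-off asymmetry
        out[i, m] = rng.multivariate_normal(mean, cov)
    return out


def emb_impute(X, n_imputations=5, seed=0):
    """Bootstrap the rows, run EM on each resample, then impute the
    original data once per resample."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    completed = []
    for _ in range(n_imputations):
        boot = X[rng.integers(0, n, size=n)]   # resample rows with replacement
        mu, sigma = em_mvn(boot)
        completed.append(impute_once(X, mu, sigma, rng))
    return completed
```

Each call to `emb_impute` returns several completed copies of the input array. An estimand such as an annual sum would then be computed on every copy and the results combined, so that the spread between copies reflects the uncertainty due to the gaps. Resampling rows independently ignores the serial dependence of flux time series, which is one reason this sketch should not be read as a stand-in for the dynamic models described in the abstract.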

Updated: 2018-07-01