The Effectiveness of a Probabilistic Principal Component Analysis Model and Expectation Maximisation Algorithm in Treating Missing Daily Rainfall Data
Asia-Pacific Journal of Atmospheric Sciences (IF 2.2), Pub Date: 2019-06-13, DOI: 10.1007/s13143-019-00135-8
Zun Liang Chuan, Sayang Mohd Deni, Soo-Fen Fam, Noriszura Ismail

The reliability and accuracy of risk assessments of extreme hydro-meteorological events depend heavily on the quality of historical rainfall time series data. Missing values in such a series can lower that quality. This paper therefore proposes a multiple-imputation algorithm that treats missing data without requiring information from adjoining monitoring stations. The proposed imputation algorithms are based on the M-component probabilistic principal component analysis model and an expectation-maximisation algorithm (MPPCA-EM). To evaluate the effectiveness of the MPPCA-EM imputation algorithm, six historical daily rainfall time series recorded at six monitoring stations were used. These stations are located in the coastal and inland regions of the East-Coast Economic Region (ECER), Malaysia. The results show that, for missing historical daily rainfall data recorded at coastal monitoring stations, the 2-component probabilistic principal component analysis model with the expectation-maximisation algorithm (2PPCA-EM) was superior to the single- and multiple-imputation algorithms proposed in previous studies. In contrast, the single-imputation algorithms proposed in previous studies were superior to the MPPCA-EM imputation algorithms for missing historical daily rainfall data recorded at inland monitoring stations.
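
To make the idea of PPCA fitted by EM on incomplete data concrete, here is a minimal sketch in Python. It is not the authors' MPPCA-EM implementation. It assumes that "M" is the number of latent components (called q below), that the single-station series has been arranged as a samples-by-features matrix (for example years by days, which the abstract does not state), and that every column has at least one observed value. The function name ppca_em_impute and all parameters are hypothetical. The sketch returns only the posterior-mean fill, so it behaves as a single imputation. A multiple-imputation scheme would additionally draw several fills from the fitted model.

```python
import numpy as np

def ppca_em_impute(X, q=2, n_iter=100, seed=0):
    """Fill NaNs in X (n_samples x n_features) with a q-component PPCA fitted by EM.

    Illustrative sketch only; assumes each column has at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    obs = ~np.isnan(X)                       # True where a value was observed
    rng = np.random.default_rng(seed)

    mu = np.nanmean(X, axis=0)               # initial mean from observed values
    W = rng.normal(size=(d, q))              # initial loading matrix
    sigma2 = float(np.nanvar(X))             # initial isotropic noise variance
    X_hat = np.tile(mu, (n, 1))

    for _ in range(n_iter):
        # E-step: posterior mean/covariance of each latent vector,
        # using only the observed coordinates of that row.
        Ez = np.zeros((n, q))
        Cz = np.zeros((n, q, q))
        for i in range(n):
            o = obs[i]
            Wo = W[o]
            Minv = np.linalg.inv(Wo.T @ Wo + sigma2 * np.eye(q))
            Ez[i] = Minv @ Wo.T @ (X[i, o] - mu[o])
            Cz[i] = sigma2 * Minv
        Ezz = Cz + Ez[:, :, None] * Ez[:, None, :]    # E[z z^T] per row
        X_hat = Ez @ W.T + mu                         # model reconstruction

        # M-step: update loadings and mean one feature at a time,
        # using only the rows where that feature was observed.
        for j in range(d):
            r = obs[:, j]
            A = Ezz[r].sum(axis=0)
            b = ((X[r, j] - mu[j])[:, None] * Ez[r]).sum(axis=0)
            W[j] = np.linalg.solve(A, b)
            mu[j] = np.mean(X[r, j] - Ez[r] @ W[j])

        # Noise variance from observed residuals plus latent uncertainty.
        resid = X - (Ez @ W.T + mu)                   # NaN where missing
        extra = np.einsum('jk,nkl,jl->nj', W, Cz, W)  # W_j Cz_n W_j^T
        sigma2 = max(float(np.sum((resid ** 2 + extra)[obs]) / obs.sum()), 1e-8)

    # Keep observed values, fill the gaps with the model reconstruction.
    return np.where(obs, X, X_hat)
```

Because the model is Gaussian, the filled values can be negative, which is not physical for rainfall. Clipping them at zero afterwards (for example with np.clip) is one possible workaround, again an assumption rather than something the abstract describes.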

Updated: 2019-06-13