Comparative analysis of seven machine learning algorithms and five empirical models to estimate soil thermal conductivity
Agricultural and Forest Meteorology (IF 6.2), Pub Date: 2022-07-13, DOI: 10.1016/j.agrformet.2022.109080
Tianyue Zhao, Shuchao Liu, Jia Xu, Hailong He, Dong Wang, Robert Horton, Gang Liu

Soil thermal conductivity (λ) is an important thermal property that is crucial for surface energy balance and water balance studies. A total of 1602 measured soil thermal conductivity values representing 189 soils were used to evaluate five empirical models (the de Vries (1963), Campbell (1985), Johansen (1975), Côté and Konrad (2005), and Lu et al. (2007) models) and seven machine learning (ML) algorithms (Decision Tree (DT), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), Linear Regression (LR), K-Nearest Neighbors (KNN), Neural Network (NN), and Gaussian Process (GP)) for estimating λ. Our results demonstrated that the average root mean squared error (RMSE) values of the ML algorithms were 66% and 82% of the empirical model values on the validation and test sets, respectively. The three best ML algorithms (GBDT, NN, RF) performed significantly better than the three best empirical models (Lu 2007, Côté and Konrad 2005, Johansen 1975): 0.183 < RMSE < 0.259 (W m⁻¹ K⁻¹) for the ML algorithms and 0.293 < RMSE < 0.320 (W m⁻¹ K⁻¹) for the empirical models. For ML, we recommend the GBDT, NN and RF algorithms. For empirical models, we recommend the three normalized models (Lu 2007, Côté and Konrad 2005, Johansen 1975) over the physically based model (DV1963) and the regression model (CG1985). The feature importance rankings produced by the RF and GBDT algorithms show that soil moisture content and soil bulk density are the most critical factors affecting λ; together they account for more than 80% of the total feature importance for λ. RF gives more consistent feature importance rankings than GBDT; we therefore recommend RF for selecting features.
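As a rough illustration of the comparison metric used in the abstract, the sketch below computes RMSE for two sets of predictions and expresses one as a percentage of the other. The function name `rmse` and all numeric values are hypothetical and chosen only for illustration; they are not data or code from the paper.

```python
import math


def rmse(measured, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical thermal conductivity values in W m^-1 K^-1 (not from the paper).
measured = [0.50, 1.00, 1.50, 2.00]
ml_predicted = [0.60, 0.90, 1.60, 1.90]         # stand-in for an ML estimate
empirical_predicted = [0.70, 0.80, 1.70, 1.80]  # stand-in for an empirical model

rmse_ml = rmse(measured, ml_predicted)
rmse_empirical = rmse(measured, empirical_predicted)

# Ratio of the two errors, analogous to the "66%" and "82%" figures above.
ratio_percent = 100.0 * rmse_ml / rmse_empirical

print(f"ML RMSE: {rmse_ml:.3f}")
print(f"Empirical RMSE: {rmse_empirical:.3f}")
print(f"ML RMSE as a share of empirical RMSE: {ratio_percent:.0f}%")
```

A ratio below 100% means the ML predictions have the lower error on that data set; the abstract reports such ratios averaged over all algorithms and models in each group.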




Updated: 2022-07-14