Sugarcane Yield Prediction Through Data Mining and Crop Simulation Models
Sugar Tech (IF 1.8), Pub Date: 2019-10-21, DOI: 10.1007/s12355-019-00776-z
Ralph G. Hammer, Paulo C. Sentelhas, Jean C. Q. Mariano

Understanding the hierarchical importance of the factors that influence sugarcane yield can support yield modeling and thereby contribute to better agricultural planning and crop yield estimates. The objectives of this study were to identify the main variables that condition sugarcane yield, rank them by relative importance, and develop mathematical models for predicting sugarcane yield using data mining (DM) techniques. To this end, three DM techniques were applied to databases from several sugar mills in the state of São Paulo, Brazil. Meteorological and crop management variables were analyzed with random forest, boosting, and support vector machine, and the resulting models were tested against an independent data set. Finally, the predictive performance of these models was compared with that of a simple agrometeorological model applied to the same data set. Of all the variables assessed, the number of cuts was the most important factor in all three DM techniques. Comparing observed yields with those estimated by the DM models gave a root mean square error (RMSE) between 19.70 and 20.03 t ha⁻¹, substantially lower than that of the Agroecological Zone Model, which had an RMSE of approximately 34 t ha⁻¹.
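
The models above are compared by RMSE, so a minimal sketch of how that metric is computed may help. The function name `rmse` and all yield values below are hypothetical and invented for illustration only; they are not data or code from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one observation is required")
    # Square each residual, average the squares, then take the square root.
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical yields in t/ha (NOT from the study; illustration only).
observed_yield = [80.0, 95.0, 70.0, 110.0]
model_a = [85.0, 90.0, 75.0, 100.0]   # stand-in for a closer-fitting model
model_b = [60.0, 120.0, 95.0, 80.0]   # stand-in for a poorer-fitting model

print(f"model A RMSE: {rmse(observed_yield, model_a):.2f} t/ha")
print(f"model B RMSE: {rmse(observed_yield, model_b):.2f} t/ha")
```

RMSE is expressed in the same unit as the yield itself, so a lower value means predictions that sit closer to the observed yields on the independent data set.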

Updated: 2019-10-21