A data mining approach for global burned area mapping
International Journal of Applied Earth Observation and Geoinformation (IF 7.5), Pub Date: 2018-06-15, DOI: 10.1016/j.jag.2018.05.027
Rubén Ramo, Mariano García, Daniel Rodríguez, Emilio Chuvieco

Global burned area algorithms provide valuable information for climate modellers, since fire disturbance is responsible for a significant part of emissions and their related impact on humans. This work explores how four classification algorithms widely used in remote sensing, namely Random Forest (RF), Support Vector Machine (SVM), Neural Networks (NN) and a well-known decision tree algorithm (C5.0), perform when classifying burned areas at global scale through a data mining methodology using 2008 MODIS data. A training database of burned and unburned pixels was created from 130 Landsat scenes. The resulting database was highly unbalanced, with the burned class representing less than one percent of the total. Therefore, the ability of the algorithms to cope with this imbalance was evaluated.
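To illustrate why such an imbalance matters, the following minimal sketch computes the overall accuracy of a trivial classifier that labels every pixel as unburned. The pixel counts and the function name are hypothetical and chosen only for illustration; they are not taken from the study's database.

```python
def majority_baseline_accuracy(n_burned: int, n_unburned: int) -> float:
    """Accuracy of a classifier that labels every pixel as unburned."""
    total = n_burned + n_unburned
    return n_unburned / total


# Hypothetical sample: 500 burned pixels among 100,000 (0.5% burned).
accuracy = majority_baseline_accuracy(n_burned=500, n_unburned=99_500)
print(f"accuracy = {accuracy:.3f}")  # accuracy = 0.995
```

The baseline scores very high on overall accuracy while detecting no burned pixels at all, which is why class-specific error measures are more informative than accuracy for this kind of database.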

Attribute selection was performed using three filters to remove potential noise and reduce the dimensionality of the data: Random Forest, an entropy-based filter, and logistic regression. Eight of the fifty-two attributes were selected, most of them related to the temporal difference in band reflectance. Models were trained on 80% of the database following a ten-fold cross-validation approach to reduce possible overfitting and to select the optimum parameters.
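As a rough illustration of what an entropy-based filter can look like, the sketch below ranks attributes by information gain. This is an assumption: the abstract does not state which entropy-based score was used, and the attribute names and toy data here are hypothetical. Continuous attributes such as reflectance differences would need to be discretized first.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in label entropy after splitting on a discrete attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: four pixels, two discretized attributes.
labels = ["burned", "burned", "unburned", "unburned"]
attributes = {
    "nir_drop": ["high", "high", "low", "low"],  # separates the classes perfectly
    "noise": ["a", "b", "a", "b"],               # carries no class information
}

# Rank attributes from most to least informative.
ranking = sorted(
    attributes,
    key=lambda name: information_gain(attributes[name], labels),
    reverse=True,
)
print(ranking)  # ['nir_drop', 'noise']
```

A filter of this kind would keep the top-ranked attributes and discard the rest before any classifier is trained.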

Finally, the performance of the algorithms was evaluated over six regions, using official statistics where available and two benchmark burned area products, MCD45 (V5.1) and MCD64 (V6). Compared to official statistics, the best agreement was obtained by MCD64 (omission error, OE = 0.15; commission error, CE = 0.29), followed by RF (OE = 0.27, CE = 0.21). For the remaining three areas (Angola, Sudan and South Africa), RF (OE = 0.47, CE = 0.45) yielded the best results when compared to the reference data. NN and SVM showed the worst performance, with omission and commission errors reaching 0.81 and 0.17, respectively. SVM and NN were more sensitive to unbalanced datasets such as burned area, with a clear bias towards the majority class. Tree-based algorithms, by contrast, are more robust to this issue given their own mechanisms for dealing with large, unbalanced databases.
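The OE and CE values above can be read with the following minimal sketch, which assumes the usual definitions for the burned class: omission error is the share of reference burned pixels that the classifier missed, and commission error is the share of pixels classified as burned that were not burned in the reference. The abstract does not spell these formulas out, and the counts below are hypothetical.

```python
def omission_error(true_pos: int, false_neg: int) -> float:
    """Fraction of reference burned pixels that were missed."""
    return false_neg / (true_pos + false_neg)


def commission_error(true_pos: int, false_pos: int) -> float:
    """Fraction of pixels classified as burned that were not burned."""
    return false_pos / (true_pos + false_pos)


# Hypothetical counts for the burned class in one validation region.
true_pos, false_neg, false_pos = 60, 40, 20

oe = omission_error(true_pos, false_neg)
ce = commission_error(true_pos, false_pos)
print(f"OE = {oe:.2f}, CE = {ce:.2f}")  # OE = 0.40, CE = 0.25
```

Under these definitions, a classifier biased towards the majority (unburned) class tends to show a high OE together with a low CE, which matches the pattern reported for NN and SVM.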




Updated: 2018-06-15