Respiratory Diseases, Malaria and Leishmaniasis: Temporal and Spatial Association with Fire Occurrences from Knowledge Discovery and Data Mining.
International Journal of Environmental Research and Public Health Pub Date : 2020-05-25 , DOI: 10.3390/ijerph17103718
Lucas Schroeder 1 , Mauricio Roberto Veronez 1 , Eniuce Menezes de Souza 2 , Diego Brum 1 , Luiz Gonzaga 1 , Vinicius Francisco Rofatto 3

The relationship between fire occurrences and diseases is an essential issue for public health policy and environmental protection strategy. Thanks to the Internet, a huge amount of health data and fire occurrence reports is now at our disposal. The challenge, therefore, is how to deal with the 4 Vs (volume, variety, velocity and veracity) associated with these data. To overcome this problem, we propose a method that combines techniques based on Data Mining and Knowledge Discovery from Databases (KDD) to discover spatial and temporal associations between diseases and fire occurrences. The case study addresses malaria, leishmaniasis and respiratory diseases in Brazil. Instead of spending a lot of time verifying the consistency of the database, the proposed method uses a Decision Tree, a machine learning-based supervised classifier, to manage the data quickly and extract only relevant and strategic information, with knowledge of how reliable the database is. Namely, the States, Biomes and periods of the year (months) with the highest rate of fires could be identified with high success rates and in a few seconds. Then K-means, an unsupervised learning algorithm that solves the well-known clustering problem, is employed to identify the groups of cities where fire occurrences are most pronounced. Finally, the steps associated with KDD are performed to extract useful information from the mined data. In this step, Spearman’s rank correlation coefficient, a nonparametric measure of rank correlation, is computed to infer the statistical dependence between fire occurrences and those diseases. Maps are also generated to represent the distribution of the mined data. The results show that each region is susceptible to some disease and exhibits some degree of correlation with fire outbreaks, mainly in the drought period.
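
The abstract does not give implementation details, so the following is only a minimal sketch of the last step it names: Spearman's rank correlation between a fire-count series and a disease-count series. The function names `average_ranks` and `spearman_rho` are hypothetical, and the monthly counts are made up for illustration; they are not data from the paper. The coefficient is computed here as the Pearson correlation of the two rank vectors, with tied values sharing their mean rank.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's coefficient as the Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one series is constant")
    return cov / sqrt(var_x * var_y)


# Made-up monthly counts for illustration only (NOT data from the paper).
fires = [12, 30, 45, 80, 150, 90]
cases = [5, 9, 14, 33, 20, 25]
rho = spearman_rho(fires, cases)
print(f"Spearman rho = {rho:.3f}")
```

Because the coefficient depends only on ranks, it captures any monotone association rather than just a linear one, which is presumably why a nonparametric measure suits count data of this kind. In practice a library routine such as `scipy.stats.spearmanr` would normally be used instead of a hand-written version.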

Updated: 2020-05-25