Support vector machine enhanced empirical reference evapotranspiration estimation with limited meteorological parameters

https://doi.org/10.1016/j.compag.2020.105577

Highlights

  • Accurate estimation of ET0 is needed although meteorological parameters are limited.

  • Simple empirical models gave estimates with insufficient accuracy and stability.

  • SVM has high computational cost when the problems have high dimensionality.

  • SVM mimics empirical models to reduce the dimensions and enhances ET0 estimations.

  • The SVM-Rom model was recommended for tropical climate regions.

Abstract

In regions where economies are highly dependent on agricultural activities, estimation of evapotranspiration is vital for scheduling and managing water resources. In this study, the reference evapotranspiration (ET0) was estimated for the west coast of Peninsular Malaysia, where oil palm is the major crop. Three stations, namely Station 48603 (Alor Setar), Station 48620 (Sitiawan) and Station 48650 (KLIA Sepang), were selected to represent different regions along the west coast. Estimation of ET0 using the popular and conventional Penman-Monteith (PM) model is data intensive. In contrast, simpler empirical models such as the temperature-based Hargreaves-Samani (HS), mass transfer-based Romanenko and radiation-based Makkink models are less accurate, less stable and greatly affected by local climate conditions. To address the problems of the PM model and the simple empirical models, the support vector machine (SVM) was used to mimic the simple empirical models. The results revealed that the HS and Makkink models were favorable for ET0 estimation in the northern region, whereas the Makkink model was not suitable for the central region. Simple averaging of the empirical models improved the estimations. However, the averaged model was still deemed poor, as the mean absolute relative error (MARE) ranged from 0.087 to 0.099 and the R2 could still dip below 0.900. The SVM was found to greatly improve the ET0 estimation. Unlike for the empirical models, the regional factor had minimal effect on the SVM-based ET0 estimation: the SVM-Mak model performed better than the SVM-HS and SVM-Rom models over the whole study area. The best performance was achieved by the SVM-Mak model at Station 48620 (Sitiawan), where the MARE was only 0.020 when compared to the chosen standard PM model. Simple averaging of the SVM models could further boost the performance, but such an approach was not rewarding. The SVM models were found to be better alternatives to the empirical models when using similar inputs. The study concluded that the SVM model which imitated the Makkink model was suitable for ET0 estimation along the west coast of Peninsular Malaysia.

Introduction

A large amount of water circulates through the hydrological cycle to sustain the soil water budget and the energy balance in the atmosphere (Fan et al., 2018). According to Falamarzi et al. (2014), this amount of water can be as high as 60% of the total global precipitation. Evapotranspiration (ET) is the combined effect of the evaporation of water from moist soil and wet plant surfaces and the escape of water molecules from the stomata of vegetation through transpiration. It has been regarded as one of the most important components of the hydrological cycle, as it measures the transfer of surface water to the atmosphere (Jing et al., 2019). Because of this vital role, careful and accurate estimation of ET is extremely important for proper water resources management, especially in the plantation industry, where irrigation is essential for the sustainability of an agricultural economy. This is the case in many agriculture-based countries, one of which is Malaysia, which lies close to the Equator and has a tropical climate. Based on the statistics provided by the Department of Statistics Malaysia, oil palm contributed 37.9% of the agricultural sector's gross domestic product (GDP), followed by other agricultural activities with a share of 25.1% (Mahidin, 2019). Another study, which deployed the Google Earth Engine and machine learning tools to map the distribution of oil palm plantations, showed that the 2.71 million hectares covered by oil palm plantations are highly concentrated along the upper west coast of Peninsular Malaysia, as well as in the central southern part (Shaharum et al., 2020). Hence, for these regions, ET must be well reported so that policy makers can make appropriate decisions regarding water resources allocation.

Traditionally, ET can be measured by using the lysimetric method (Holmes, 1984). However, lysimeters have high setup and operational costs and low area coverage, not to mention the tedious manual work involved. Hence, there is increasing demand for robust models that work on meteorological data from weather stations. The Food and Agriculture Organization of the United Nations proposed and recommended the Penman-Monteith (PM) model as the standard for estimating reference evapotranspiration (ET0). The significance of ET0 is that it provides the ET rate from the surface of an ideal crop, which is “8 to 15 cm tall, green grass cover of uniform height, actively growing, completely shading the ground and not short of water” (Allan et al., 1998). ET0 can then be multiplied by the crop coefficient (which depends on the crop and its growth stage) to obtain the potential ET. However, the PM model requires a large amount of data, which can be costly and must be tediously and meticulously collected. Hence, this study looks for a simpler alternative model for the estimation of ET0.

Considerable effort has been made over the past years to develop simple models for ET0 estimation that work with a reduced amount of meteorological data. Generally, the approach is to generate simpler empirical models. There are three main categories of simple empirical models, namely the temperature-based, radiation-based and mass transfer-based models. Some of the recently developed models and their respective categories are shown in Table 1.
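As an illustration of the temperature-based category, the sketch below implements the commonly cited form of the Hargreaves-Samani (HS) equation, ET0 = k × 0.408 × Ra × (Tmean + 17.8) × sqrt(Tmax - Tmin). The function name, the argument names and the default coefficient k = 0.0023 are assumptions made for illustration; the exact equations and calibrated coefficients used in the study are given in its tables, which are not reproduced here.

```python
import math


def hargreaves_samani_et0(t_max, t_min, ra_mj, k=0.0023):
    """Daily ET0 (mm/day) from the commonly cited Hargreaves-Samani form.

    t_max, t_min: daily maximum and minimum air temperature (deg C)
    ra_mj: extraterrestrial radiation (MJ m^-2 day^-1)
    k: empirical coefficient (0.0023 is the usual default; an assumption here)
    """
    if t_max < t_min:
        raise ValueError("t_max must not be lower than t_min")
    t_mean = (t_max + t_min) / 2.0
    # 0.408 converts MJ m^-2 day^-1 into the equivalent depth of evaporation (mm/day)
    ra_mm = 0.408 * ra_mj
    return k * ra_mm * (t_mean + 17.8) * math.sqrt(t_max - t_min)
```

Only temperature measurements are needed as station inputs, because Ra can be computed from latitude and day of year. That is what makes a model of this kind attractive when meteorological parameters are limited.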

The performance of empirical models in estimating ET0 has been studied extensively. Tabari et al. (2011) compared 31 empirical models for estimating ET0 under humid conditions. The study was carried out in Iran and concluded that temperature-based and radiation-based models performed better than mass transfer-based models. The HS model gave the closest estimation to the standard PM model. Valle Júnior et al. (2020) carried out a comparative study of 29 empirical models for estimating ET0 in the Brazilian savanna. Their investigation revealed that the Priestley-Taylor (PT) model and other radiation-based models could perform better than both the temperature-based and mass transfer-based models. Kumar et al. (2012) reported that the HS model was suitable for ET0 estimation in semi-arid climates, whereas radiation-based models were favored for humid environments. However, another study in the Senegal River Valley showed that mass transfer-based models were more accurate (Djaman et al., 2015). Therefore, the main drawback of empirical models is their lack of consistency: their accuracy depends heavily on local calibration and climate conditions (Ali Ghorbani et al., 2018). Simple empirical models do not offer the universal approach needed for ET0 estimation (Kisi et al., 2017, Gafurov et al., 2018).

To overcome the dependency on data (the shortcoming of the PM model) and the lack of consistency (the shortcoming of the simpler empirical models), machine learning tools have been developed to take over the task of estimating ET0. Citakoglu et al. (2013) studied the performance of the artificial neural network (ANN) and the adaptive neuro-fuzzy inference system (ANFIS) in predicting the mean reference evapotranspiration in Turkey. Saggi and Jain (2019), on the other hand, utilized deep learning to estimate ET0 in the northern region of India. As ET0 estimation involves a highly complex hydrological process, the support vector machine (SVM) proposed by Cortes and Vapnik (1995) has attracted the attention of many researchers worldwide. This is due to its ability to handle extremely complex problems by mapping the inputs into a feature space via a kernel function transformation (Raghavendra and Deka, 2014). In addition, Moazenzadeh et al. (2018) opined that it has fast processing speed with outstanding accuracy. These properties allow the SVM to be applied in many hydrological models, including the estimation of ET0. This line of work was started by Kisi and Cimen (2010), who compared the SVM with several empirical models such as the Turc, Ritchie and Penman models in California. Their study showed that the SVM has the potential to be used as an alternative to those empirical models. A similar approach was taken by Shrestha and Shukla (2015), who extended the work of Kisi and Cimen (2010) by using the SVM to estimate the crop coefficients of pepper and watermelon. With the development of other soft computing techniques, the performance of the SVM is often compared with that of other machine learning tools. For instance, Tabari et al. (2012) compared the performance of the SVM with that of the ANFIS for the estimation of ET0 in a semi-arid environment. The SVM was also compared with ANN models for estimating ET0 in Brazil, where meteorological parameters were limited (Ferreira et al., 2019, Wen et al., 2015). The SVM could perform better in certain cases, depending on the types of inputs as well as the climatic conditions. However, some have claimed that the SVM is not suitable for dealing with high-dimensional problems such as ET0 estimation (Raghavendra and Deka, 2014, Chia et al., 2020). This is due to the increasing computational cost that burdens the computers (Huang et al., 2019). In other words, to achieve the optimum performance of the SVM, only a minimal number of inputs should be fed into the model.

The PM model and the SVM share a similar issue: a high data requirement that impairs their suitability and performance. The purpose of this study is to develop a simple ET0 estimation tool that does not need site calibration. Hence, in this study, the authors attempted to use the SVM to imitate, or mimic, the simple empirical models. In addition, the black-box nature of the SVM could give it more flexibility to overcome the rigidity inherent in empirical models. The performance of each SVM model was compared with that of its respective empirical model. Three notable empirical models, one from each category, were chosen: the HS model (temperature based), the Romanenko model (mass transfer based) and the Makkink model (radiation based). At the end of the modelling process, simple averages of the empirical models and of the SVM models were computed. To the best of the authors' knowledge, no study has been carried out on the application of the SVM for ET0 estimation in a region with a tropical climate, let alone in a region densely covered with oil palm plantations. This paper could fill this research gap and subsequently ease the decision-making process of water policy makers. The overall conceptualization of this paper is illustrated in Fig. 1.
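The simple averaging step and the comparison against the PM reference can be sketched as follows. This is a minimal illustration, assuming that the MARE reported in the abstract is the mean of |estimate - reference| / |reference|; the function names are hypothetical and not taken from the paper.

```python
def simple_average(*model_estimates):
    """Element-wise, equally weighted mean of several models' ET0 series."""
    lengths = {len(series) for series in model_estimates}
    if len(lengths) != 1:
        raise ValueError("provide at least one series, all of the same length")
    return [sum(values) / len(values) for values in zip(*model_estimates)]


def mare(estimated, reference):
    """Mean absolute relative error against a reference series (e.g. PM ET0).

    Assumed definition: mean of |estimate - reference| / |reference|.
    """
    if len(estimated) != len(reference) or not reference:
        raise ValueError("series must be non-empty and of equal length")
    total = sum(abs(e - r) / abs(r) for e, r in zip(estimated, reference))
    return total / len(reference)
```

Averaging helps when the individual models err in opposite directions, because over- and under-estimates partly cancel. It offers less benefit when all models share the same bias.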

Study area and data

Three meteorological stations located on the west coast of Peninsular Malaysia were selected as the study sites. This area has a tropical climate and is warm and humid all year long. The selected stations are Station 48603 (Alor Setar), Station 48620 (Sitiawan) and Station 48650 (Kuala Lumpur International Airport, KLIA Sepang). These stations were chosen because of the high land coverage of oil palm plantations, where accurate ET0 estimation is important. Daily meteorological data

Calibration of empirical models

A characteristic of empirical ET0 models is that they must be calibrated when used in different regions. Calibration helps to maximize the accuracy of the model in the area of interest. Table 5 shows the calibrated coefficients of the HS, Romanenko and Makkink models at different stations based on the equations shown in Table 3.
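The calibration procedure itself is not described in this excerpt. As one hedged illustration, the sketch below fits the single coefficient of the commonly cited HS form by least squares through the origin against a reference ET0 series (for example PM estimates). The function names and the choice of fitting method are assumptions, not the authors' documented approach.

```python
import math


def hs_driver(t_max, t_min, ra_mj):
    """HS equation without its coefficient: 0.408 * Ra * (Tmean + 17.8) * sqrt(Tmax - Tmin)."""
    t_mean = (t_max + t_min) / 2.0
    return 0.408 * ra_mj * (t_mean + 17.8) * math.sqrt(t_max - t_min)


def calibrate_hs_coefficient(records, et0_reference):
    """Least-squares coefficient k minimising sum((ET0_ref - k * driver)^2).

    records: iterable of (t_max, t_min, ra_mj) tuples, one per day
    et0_reference: reference ET0 values (mm/day) for the same days
    """
    drivers = [hs_driver(*record) for record in records]
    if len(drivers) != len(et0_reference):
        raise ValueError("records and reference must have the same length")
    sum_xx = sum(x * x for x in drivers)
    if sum_xx == 0.0:
        raise ValueError("cannot calibrate: all driver values are zero")
    sum_xy = sum(x * y for x, y in zip(drivers, et0_reference))
    return sum_xy / sum_xx
```

A coefficient fitted this way is specific to the station whose data were used, which is consistent with the observation that empirical models do not have a universal form for all regions.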

From Table 5, it can be seen that empirical models do not have a universal form for all regions. For the Romanenko model, Station 48603 (Alor Setar) and

Conclusions

ET0 was estimated at three stations along the west coast of Peninsular Malaysia, where the climate is tropical and oil palm plantations are abundant. The estimations were done via two approaches: empirical models and SVM models. The three chosen empirical models were the temperature-based HS model, the mass transfer-based Romanenko model and the radiation-based Makkink model. Calibration was done to fit the empirical models so that they suit the local conditions

CRediT authorship contribution statement

Min Yan Chia: Investigation, Writing - original draft, Methodology. Yuk Feng Huang: Writing - review & editing, Resources, Software, Supervision, Conceptualization. Chai Hoon Koo: Validation, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

This research was funded by Universiti Tunku Abdul Rahman (UTAR), Malaysia through Universiti Tunku Abdul Rahman Research Fund under project number IPSR/RMC/UTARRF/2018-C2/K03.

References (35)

  • N.S.N. Shaharum et al. Oil palm mapping over Peninsular Malaysia using Google Earth Engine and machine learning algorithms. Remote Sens. Appl.: Soc. Environ. (2020)
  • N.K. Shrestha et al. Support vector machine based modeling of evapotranspiration using hydro-climatic variables in a sub-tropical environment. Agric. For. Meteorol. (2015)
  • H. Tabari et al. SVM, ANFIS, regression and climate based models for reference evapotranspiration modeling using limited climatic data in a semi-arid highland environment. J. Hydrol. (2012)
  • L.C.G. Valle Júnior et al. Comparative assessment of modelled and empirical reference evapotranspiration methods for a Brazilian savanna. Agric. Water Manage. (2020)
  • M. Ali Ghorbani et al. Forecasting pan evaporation with an integrated artificial neural network quantum-behaved particle swarm optimization model: a case study in Talesh, Northern Iran. Eng. Appl. Comput. Fluid Mech. (2018)
  • Allan, R. G., Pereira, L., Raes, D. and Smith, M. 1998. Crop evapotranspiration - Guidelines for computing crop water...
  • M.Y. Chia et al. Recent advances in evapotranspiration estimation using artificial intelligence approaches with a focus on hybridization techniques—a review. Agronomy (2020)