Introduction

Precision agriculture, as a crop management practice based on differential resource application at highly resolved space and time, requires readily available information about the crop status at a resolution that corresponds to its goals. Leaf area index (LAI), defined as the amount of (the upper side) of foliar area per unit of projected ground area, is one of the crop biophysical parameters that convey the most valuable information on plant status. The amount of LAI is directly linked to quantitative aspects of photosynthesis, respiration and canopy water exchange and, therefore, any change in LAI can be associated with changes in vegetation productivity (Bréda, 2003) resulting from plant health issues or nutritional status, which in turn, makes LAI an effective diagnostic parameter of crop status (Huang et al., 2019).

LAI is a key variable in process-based crop growth models (Abrahamsen & Hansen, 2000; Kersebaum, 2011; Li et al., 2015; Nendel et al., 2014) because it represents the interface between plant growth and radiation, water and gas exchange with the environment (Asam, 2014; Kasampalis et al., 2018). At the same time, LAI being a parameter with typical light interception and reflectance properties (Zheng & Moskal, 2009) and a dominant surface area component, it is particularly suitable to be characterized and measured by remote sensing. LAI quantification through remote sensing, therefore, offers a unique opportunity to incorporate measured reference values of this biophysical parameter into modeling routines to improve the accuracy of model predictions, in a process called data assimilation (Huang et al., 2019; Jin et al., 2018).

LAI is one of the most common canopy parameters that have been estimated, measured or mapped with remote sensing (Fang et al., 2019). Satellite remote sensing in comparison with other platforms has the advantage of offering readily available, sometimes free of charge, reflectance data over large areas. Satellites can also support monitoring programs through time spans of several years, ensuring long term information continuity through global LAI products (Defourny et al., 2019; Fang et al., 2019; Sinha et al., 2020). Most available global LAI products such as “LAI 300 m”, “LAI 1 km” (both from Copernicus Global Land Service), and “MODIS Leaf Area Index” (NASA), however, cannot offer LAI estimates at a spatial resolution higher than 300 m, which makes them unsuitable for precision agriculture. Nevertheless, low spatial and temporal resolutions (typical satellite limitations) are gradually being overcome with the development of recent satellites (Kooistra et al., 2012; Mulla, 2013), to the point that satellite-driven site-specific planning is now possible for larger crop fields.

European Space Agency’s Sentinel-2 mission represents the most effective combination of high to moderate spatial resolution, high multispectral resolution and data accessibility. These characteristics made Sentinel a perfect candidate to test different approaches to estimate LAI at a spatial scale appropriate for within-field differential crop management. Therefore, in the study presented here, a broad array of methods to estimate LAI in barley (Hordeum vulgare) before flowering stages, at relatively high resolution using Sentinel-2 data, was examined. The context of this study was provided by a European project with the focus on Barley fertilization and productivity.

There are several methods to estimate crop biophysical parameters using remote sensing information. Two main basic approaches are commonly available: empirical (or statistical) and physical (Richter et al., 2012b; Sadeh et al., 2021; Verrelst et al., 2019). In the first case, a direct statistical relationship between the reference vegetation parameter (for example, LAI measured in the field) and the remote sensing reflectance data is established. Reflectance information from sensor bands can be combined into vegetation indices (VIs) (Leonenko et al., 2013) or used directly without a priori considerations of band combination (Upreti et al., 2019). In the second case, typically machine learning is used (Dahms et al., 2016; Mao et al., 2019) given its ability to deal with high-dimensionality complex data structures and non-parametric distributions (Maxwell et al., 2018).

The physical approach uses light-plant interaction models (also known as radiative transfer models, or RTM) to relate vegetation spectra (model output) to the target biophysical parameter (model input; e.g. LAI). This relationship can be inverted by means of a model inversion to retrieve the target biophysical parameter (Asam, 2014; Boegh et al., 2013; Fang et al., 2019; Verrelst et al., 2019). Look-up tables (LUT) are one of the most common methods to generalize reflectance-LAI relationships to, then, be used to retrieve LAI from image reflectance (Richter et al., 2012b; Upreti et al., 2019). A physical model or RTM can also be used to train a neural network to establish reflectance-LAI relationships, in what is called a hybrid modeling-statistic approach (Weiss et al., 2002), as in the case of the Sentinel-2 for Agriculture system (Defourny et al., 2019).

Despite the vast number of studies applying and comparing any possible combination of these methods, and despite numerous review studies, there is still a fragmentary view of the advantages and disadvantages of each method in the context of what can be achieved with respect to the user’s goals and needs. Moreover, since these methods can be applied on different sensors and platforms and with different objectives, judgement on the relative efficacy and appropriateness of any of these methods has become more and more difficult. At the same time, the availability of tools for applying all these methods has been increasing dramatically in recent years. Therefore, it is suggested that testing many methods simultaneously on specific situations seems an apt way of achieving the necessary overview to assess the applicability of any potential method on precision agriculture.

Even though the availability of Sentinel-2 products has significantly expanded the monitoring capabilities for precision agriculture (Segarra et al., 2020), the prevalence of clouds and the need for precise timing, sensor flexibility and higher spatial resolution often makes it necessary to resource remotely piloted aircrafts (RPA) (Hunt et al., 2013; Schirrmann et al., 2016; Tsouros et al., 2019; Yang et al., 2017). This means that methods of retrieving key biophysical parameters need to be tested also for RPA-born sensors, which often have fewer bands than Sentinel-2. To account for this issue, most of the methods tested in this study were also applied, when possible, to a simpler multispectral configuration of four bands, RGB (Red, Green, Blue) + NIR (Near Infrared), with the hypothesis that simpler configurations can be as effective as more complex ones to retrieve LAI.

The goals of this study were: (a) to test a broad array of methods to estimate LAI in barley using Sentinel-2 data, (b) to determine the transferability of the methods to simpler RPA systems, and (c) to provide a comprehensive account of remote sensing-based LAI retrieval methods and compare them in terms of both, usability and performance.

Methods

Field data

Between the years 2018 and 2020, a total of nine barley experimental fields were established near Müncheberg and Dedelow (Brandenburg, Germany), and Gembloux (Wallonia, Belgium) (Table 1). Each field was subdivided into subplots (treatment units; TUs) where different levels of nitrogen fertilization were applied to produce crop growth variability. TUs in the Dedelow and Müncheberg fields were designed to be 10–25 times larger than 10 × 10 m to allow the inclusion of at least 2–3 Sentinel-2 pixels well centered in the field. TUs in Gembloux were designed for a different experiment, and were just 6 × 2 m.

Table 1 Field experiment characteristics

At Müncheberg (M) and Dedelow (D), crop leaf area index (LAI) was measured with a SunScan device (SS1 Canopy Analysis System, Delta-T Devices Ltd, Cambridge, UK.), following the company’s instructions. At each TU, depending on its size, 1–3 points were established and marked with a flag. Three SunScan readings were made within a radius of 1 m from the point flag at a microsite with uniform canopy characteristics. Readings were then averaged for each point and then averaged for each TU. At least one LAI measurement was taken between the time plants grew taller than 0.15–0.2 m (to obtain a LAI larger than 0.3, which was considered to be the SunScan sensitivity threshold) and the emergence of awns (at which point it was presumed that the reflectance of awns could significantly mask the physiological status of the blades in the canopy).

At Gembloux (G), the green leaf area index (GLAI) was measured by removing all plants from a 1 × 1 m sampling quadrat and measuring the projected area of each green barley blade and stem. The total sum of the green area was related to the biomass of the sample, and the relationship green area/biomass was used to extrapolate GLAI to the entire treatment plot. The distinction between GLAI (which is based only on green leaves) and LAI (which includes also senescent leaves) in a destructive method is experimentally possible because individual leaves can be discriminated by color and measured separately.

In the case of SunScan, since the effect of non-green leaves on the measurements is not well understood, it is not known whether the device measures LAI or GLAI. For this reason, in this study, it was decided to use the term LAI in the broad sense, as a generic term for the amount of plant tissue to be recorded by remote sensors.

Remote sensing data

Sentinel-2 satellites are part of a constellation of the European Space Agency satellites characterized by high frequency (3–5 days) and high resolution (10 m in the visual-NIR range) ground imaging (ESA, 2015). Their sensors are active in 13 spectral bands: 4 visible and 9 infra-red. Level 2A (L2A) data, representing reflectance at the bottom of atmosphere (BOA) in cartographic geometry, were used for the analyses. A series of Sentinel-2 images around the dates of LAI measurements at Dedelow and Müncheberg were downloaded and tested for cloud cover over the experimental fields.

Other remote sensing data, whose availability was restricted to a few images, was used to test the applicability of the LAI models to other sensors. These sensors comprised RapidEye and PlanetScope satellites, from the Dedelow and Müncheberg sites, and a remotely piloted aircraft (RPA) at Gembloux.

RapidEye data consist of five bands in the blue, green, red, red-edge and near-infrared ranges with a spatial resolution of 5 m. Images were acquired at Level 3A and then translated from DN into top-of-atmosphere reflectance with the method suggested in its image specification documentation (Planet webpage, Products, Planet archive: assets.planet.com /docs/ Planet_Combined_Imagery_Product_Specs_letter_screen.pdf). RapidEye satellites were decommissioned in 2020. PlanetScope data consist of four bands in the blue, green, red and near-infrared ranges with a spatial resolution of 3 m. The images used in this study corresponded to a Level 3B product type, containing surface reflectance values.

The RPA, equipped with a Sony RX100-MK II (Sony Corporation, Tokyo, Japan; bands: RGB) and a Micasense RedEdge-MX (MicaSense, AgEagle Sensor Systems Inc., Seattle, WA, USA; bands: BGR, Red Edge and NIR) cameras was flown at 50 m, producing images at 0.05 m pixel size. The software PIX4DMapper (Pix4D SA, Switzerland) was used for radiometric pre-processing and mosaicking.

Analyses

Vegetation indices

A total of 12 Sentinel, one RapidEye and one PlanetScope satellite images were selected for the analyses. TUs at Dedelow and Müncheberg were located on the images and 2–3 pixels were selected at the center of each TU and at least 10 m away from the plot limit to avoid mixed pixels. Values of these pixels were averaged per TU to be used in the analyses.

Vegetation indices (VIs) chosen from the current literature (Clevers et al., 2017; Frampton et al., 2013; Haboudane et al., 2002; Kross et al., 2015; Prabhakara et al., 2015) were tested to explore the possibility of predicting LAI with satellite data using selected band combinations (Table 2).

Table 2 List and definition of vegetation indices used

The parameter S in the WDVI was calculated with pixels that could be visually identified as bare soil on the images. Whenever possible, pixels of varied brightness were chosen to better represent the red-near-infrared (NIR) soil line.

The relationship between the measured LAI and the satellite indices was initially explored using Sentinel images from 2018 and 2019, which corresponded to field experiments designed for this purpose. VI-LAI functional relationship was described by a non-linear least squares curve fitting approach (Shattuck, 2015). Index performance as a predictor of LAI was assessed by measuring the tightness of the fit. Plots established at Müncheberg in 2020 were later used to test the index performance.

Additional satellite images (RapidEye and PlanetScope) were radiometrically normalized to Sentinel data to make sensor reflectance values comparable. Invariant features of different brightness intensities were visually identified, and their pixel values plotted in a red-NIR space to obtain a linear function to be applied to recalculate the target image pixels. The best performing indices were then calculated with the normalized reflectance values to assess the stability of the indices across sensors.

The site at Gembloux was used to test the appropriateness of using RPA reflectance information on the best performing VIs. Out of the 6-m of length of each TU, 2 m were reserved for destructive sampling, leaving a TU effective area of 2 × 4 m. A polygon of 1 × 2 m was overlaid on the RPA images at the center of each TU effective area and the pixels enclosed were used to obtain an average reflectance value for each of the five bands. The VI values calculated with this information and the LAI measured were compared with the already built VI models described above.

The VI-based models that best correlated with measured LAI were further validated with a leave-one-out cross-validation approach using the ARTMO (Automated Radiative Transfer Model Operator) tool (Rivera et al., 2013) on the same dataset used for the VIs. Predicted LAI values from each of the selected indices were used as variables to be processed in ARTMO with a series of machine learning algorithms described below (See Machine learning regression section).

Machine learning regression

A series of machine learning regression algorithms (MLRA) from the ARTMO tool (Rivera et al. 2014), version 3.28, was used for selecting potential models to predict LAI from Sentinel bands. In principle, MLRAs can be applied using all the 12 Sentinel-2 bands. However, since most high to very high resolution satellites and drones are equipped with sensors in the visible-NIR spectrum only, the applicability of the regression models to these sensors, made it necessary to analyze the possibility of using only the four most prevalent bands, which are the bands blue, green, red and NIR.

Pixel values of the TUs from the 12 Sentinel-2 images (years 2018–2020, 76 measurements) used for the indices analyses, together with the field LAI measurements were combined in a database to apply the following MLRAs: Bagging Trees (BaT), Boosting Trees (BoT), Gaussian process regression (GPR), least squares linear regression (LSLR), partial least square regression (PLSR), principal components regression (PCR), random forest fit ensemble (RFFE), random forest tree bagger (RFTB), regression tree (RT), relevance vector machine (RVM) and support vector regression (SVR). Regression performance as a predictor of LAI was assessed by measuring the tightness of the fit. Model validation was performed using the k-fold cross-validation method on the same dataset. This method randomly divides the observations into subsamples according to the number “k” provided by the user (in this study k = 10). All subsamples minus one (k-1) are then used to train the dataset and the excluded subsample is used to validate the results. This process is repeated as many times (k times) as subsets exist.

PROSAIL inverse modeling and look up tables

PROSAIL, a tool to predict the reflection spectra of plant canopies as measured by remote sensing (Jacquemoud et al., 2009), uses the interaction of two models: PROSPECT and SAIL. PROSPECT is a radiative transfer model of a leaf, based on a simple representation of its physical and chemical structure (Jacquemoud & Baret, 1990); and SAIL is a canopy reflectance model that produces reflectance values (Verhoef, 1984). When the reflectance is known (as measured by a satellite, for instance), inverting PROSAIL allows estimation of the crop target variable, in the present case LAI, by creating possible scenarios that can be represented in the form of look up tables (LUT).

The entire process was performed with the ARTMO tool. The model versions Prospect 4 (leaf model) and 4Sail (canopy model) were chosen for the forward modeling. Input parameters for both models were taken from known value ranges from wheat (Berger et al., 2018), in situ measurements and authors’ experience (Table 3). Soil reflectance spectra were obtained from bare soil pixels on the corresponding images.

Table 3 Parameter value used in the forward PROSAIL run to generate the LUT

The Latin Hypercube Sampling (LHS) option was chosen to generate a sample of 5000 model outputs (forward modeling). Each of these outputs represents a “scenario”, that is, a spectrum originated from a given, possible value of leaf area index. LHS generates a near-random sampling of parameters while ensuring a good representation (full stratification) of the sample variability (McKay et al., 2000).

In the following step, the ARTMO tool utilizes a cost function (or merit function) (Verrelst et al., 2013), to compare satellite input spectra to the spectra generated with the forward modeling, with the aim of retrieving candidate LAI values from the inverse modeling (Verrelst et al., 2019). Out of more than 60, seven cost functions were selected: RMSE, Geman and McClure, K(x) = log(x) + x, K(x) = (log(x))2, least absolute error, power divergence measure, and trigonometric (Leonenko et al., 2013; Rivera et al., 2013; Verrelst et al., 2013). In the next step, Gaussian noise of 5% and 10% were introduced and the possibility of choosing two multiple best solutions were selected.

To test the possibility that the inversion process can be accurate enough if done with four bands in the visible-NIR range (as in a typical drone camera), as opposed to the 12 Sentinel bands, the entire analysis was repeated with the Sentinel B, G, R and NIR (2, 3, 4 and 8) bands.

Sentinel-2 for agriculture

Sen2-Agri is an open-source operational system consisting of a processing chain that generates a set of products for agriculture monitoring from Sentinel-2 and Landsat 8 satellite data. Products include image composites of cloud-free surface reflectance data (L3A), cropland masks (L4A) and crop type maps (L4B), and layers with normalized difference vegetation index (NDVI) and leaf area index (LAI) (L3B) (Defourny et al., 2019).

An important feature of Sen2-Agri is that once installed, it can automatically connect to a set of configurable data sources, download the necessary satellite level 1 or level 2 products and generate all the desired products from a chosen area at a selected season. Since the system uses Sentinel-2 data, products can be delivered at 10-m resolution. LAI layer retrieval is performed from the Sentinel-2 bands 3–9, 12 and 13 using machine learning to build a non-linear regression model. The regression model is estimated using reflectance values simulated with the PROSAIL model (see 0).

About 100 Sentinel-2 images covering the barley growing seasons at the Dedelow and Müncheberg sites in 2018, 2019 and 2020 were processed with the Sen2-Agri system to produce the corresponding LAI maps. TUs were identified and 2–3 pixels were selected at the center of each TU and at least 10 m away from the plot limit to avoid mixed pixels. Estimated LAI values from these pixels were averaged per TU at selected image dates to be compared with the ground LAI data.

Results

Field data

LAI values measured on the field ranged from 0.53 to 5.15, with an average of 1.86. Averages by site were 1.82, 2.45 and 1.85 for Dedelow, Gembloux and Müncheberg respectively (Fig. 1). Averages by crop were 1.84 and 2.38 for spring and winter barley respectively (Fig. 1).

Fig. 1
figure 1

Distribution of LAI measured on the field according to A site and B crop type. SB Spring barley, WB Winter barley. Cross marker indicates the mean value

The Gembloux site showed a more symmetrical distribution of LAI compared to the other sites. WB developed a significantly higher LAI than SB.

Analyses

Vegetation indices

A preliminary analysis of the correlation between indices and LAI revealed the presence of some obvious outliers. Images from two dates in Müncheberg 2018 had hazy pixels over the spring barley fields and needed to be discarded. A visual inspection of the WDVI performance suggested that a standardization analogous to NDVI would improve its correlation to LAI, and therefore a normalized WDVI (NWDVI) was created using Eq. 1:

$${\text{NDWVI}} = {\text{WDVI}}/\left( {{\text{B}}8 + {\text{B}}4} \right)$$
(1)

where WDVI is the weighted differential vegetation index as defined in Table 2, and B8 and B4 are the Sentinel bands 8 and 4 respectively.

Table 4 VI-LAI curve fitting

The closest fit to LAI was observed for OSAVI, NWDVI and NDVI84 (i.e. the conventional NDVI with bands NIR and red) with a coefficient of determination (r2) of 0.905, 0.894 and 0.893 respectively (Table 4). If a fixed value of the constant S in NWDVI is used, then the overall relationship of this VI to LAI is almost identical to NDVI.

The results of applying the fitted models to the two best performing indices show that the predictions can adequately replace the observations, even when 2020 survey data are included (Fig. 2). This inclusion did not constitute a true validation test, but just a preliminary test. A rigorous validation was done by cross-validation as described in Methods.

Fig. 2
figure 2

Prediction accuracy of the indices OSAVI and NWDVI. Black points represent values from 2018 and 2019, used to build the models; grey points are values from 2020

Fig. 3
figure 3

Prediction accuracy of OSAVI for three satellite sensors

The same OSAVI and NWDVI models were applied to PlanetScope and RapidEye images, which resulted in a slight reduction of the coefficient of determination (Table 5), from 0.905 to 0.857 in the case of OSAVI (Fig. 3) and from 0.894 to 0.852 in the case of NWDVI (Fig. 4, left). However, when the S parameter in NWDVI (see definition in Table 2) was obtained from each sensor separately, the overall fit increased to 0.881 (Fig. 4, right). The improvement can be seen clearly when focusing on the changes in squares and empty circles on Fig. 4.

Fig. 4
figure 4

NWDVI prediction accuracy for three satellite sensors. Left, with the same soil parameter value for all sensors. Right, with soil parameters extracted for each sensor

Fig. 5
figure 5

NWDVI prediction accuracy for all sensors

Table 5 VI prediction accuracy for all satellite sensors (Sentinel, PlanetScope and RapidEye) together using the models described in Table 4

Data extracted from RPA at the Gembloux site was included in the LAI-NWDVI relationship. A new exponential function was fitted with the data, achieving a r2 of 0.794 with a RMSE of 0.324 (Fig. 5).

As described in Methods, the VI-based models that best correlated with measured LAI were processed with the ARTMO tool to validate the predictions. Using the same dataset (years 2018 and 2019), predicted LAI with OSAVI and NWDVI were used (separately) as variables to be processed in ARTMO with a series of machine learning algorithms and validated with a leave-one-out cross-validation approach. Cross validation between the VIs and observed LAI with the machine learning algorithm least squares linear regression (LSLR) resulted in RMSEs of 0.32–0.34 (Table 6). Coefficients of determination resulted in similar values (0.88–0.89; Table 6) to those obtained in the curve fitting procedure (Table 5).

Table 6 Cross validation results of the best performing vegetation indices processed with the LSLR algorithm

Machine learning regression

Unlike the VI approach where only 2018 and 2019 were used to build the model, for the machine learning regressions, it was necessary to increase the database size and therefore, all three years were included.

Relevance vector machine (RVM) and Gaussian process regression (GPR) were the best performing algorithms for both 12- and 4-band configurations, with RMSEs of 0.34–0.37 and coefficients of determination of 0.80–0.83 (Table 7). Both in terms of RMSE and r2, the 4-band configuration performed slightly better than the 12 bands (Table 7). Predicted-observed relationships look very similar for both 12- and 4-band configurations (Figs. 6 and 7) with a tendency to diverge from the observations at higher values of LAI.

Table 7 Performance of the machine learning regression algorithms (ordered by accuracy with 12 bands) with all 12 Sentinel bands and with the four 10-m visual-NIR bands
Fig. 6
figure 6

LAI prediction accuracy of the machine learning algorithms using all 12 Sentinel bands

Fig. 7
figure 7

LAI prediction accuracy of the machine learning algorithms using 4 visual-NIR Sentinel bands

PROSAIL inverse modeling and look up tables

In an initial step, PROSAIL generated reflectance data (canopy spectra) using the range of plant parameters, including LAI, proposed in Table 3. Spectra generated with the forward modeling showed a pattern consistent with the expected spectra shape for canopies ranging from low to high LAI (Fig. 8), where low LAI spectra displayed high reflectance in red and blue wavelengths (depicted with orange-yellow tones), typical of sparse, less green canopies.

Fig. 8
figure 8

Look-up table (LUT) vegetation spectra (N = 5000) generated by PROSAIL in the forward modeling process. Colors represent the value of leaf area index (LAI) corresponding to each spectrum

The two cost functions resulting in best fits, Geman and McClure and least absolute error, showed RMSEs of around 0.5 and 0.6 for the tests with 12 Sentinel bands and the four visible-NIR bands, respectively (Table 8). The coefficient of determination (r2) was somewhat higher in the case of the 4-band configuration (0.84) than the 12 bands (0.81), for the two best performing cost functions (Table 8).

Table 8 Validation of retrieved LAI from the simulated spectra in PROSAIL with respect to measured LAI, for tests with 12 and 4 satellite bands

When plotting the predicted versus the observed LAI, it can be seen that the relatively high RMSE (in comparison with the RMSE of other methods in previous and subsequent sections) was due to a bias in the prediction towards the underestimation of LAI, more noticeable in the 4-band configuration (Fig. 9).

Fig. 9
figure 9

Inverse modeling. LAI retrieval using the Geman and McClure (G&M) cost function with 12 Sentinel bands (left) and 4 visual-NIR bands (right)

Sentinel-2 for agriculture

Sen2-Agri was able to generate more than 100 LAI products from Müncheberg and Dedelow for the 2018–2020 period. These included products with missing information due to clouds and other image quality issues.

The relationship between values extracted from the LAI products and the ground surveys reached a coefficient of determination (r2) of 0.856 with a RMSE of 0.304 (Fig. 10). One LAI product from April 19, 2019, few days away from a ground LAI measurement of winter barley at Müncheberg was missing. Sen2-Agri’s quality control system most likely discarded the entire image, even though the information on this particular experimental plot was suitable for the analysis.

Fig. 10
figure 10

Sen2-Agri model accuracy. D Dedelow, M Müncheberg

Comparison between methods

The two best performing VI-based models constructed with the NLSCF tool were processed with ARTMO to see if both tools would provide comparable RMSE and r2 values. To do this, OSAVI and NWDVI were used separately in place of satellite bands as variables to be processed in ARTMO with the LSLR algorithm and validated with a leave-one-out cross-validation approach. OSAVI and NWDVI statistics were very similar (maximum difference of ± 0.024) with both ARTMO and NLSCF methods (Table 9) indicating that the fitting procedures and the resulting predictions were comparable.

Table 9 Comparison between model fitting methods

Sen2-Agri is the only method that did not use data from this study to develop the predictive model. A comparison between predictions from Sen2-Agri and any of the other methods provides an estimation of the extent at which predicted LAI correspond to physical variables related to one another. A comparison between predictions from Sen2-Agri and OSAVI (Fig. 11), for example, showed a very tight linear relationship, suggesting that both methods truly describe the same crop parameter.

Fig. 11
figure 11

Comparison between OSAVI and Sen2-Agri LAI prediction models

Discussion

Most of the indices tested, that is NDVIs using bands 7 or 8, simple ratios (SR), OSAVI and the normalized WDVI, resulted in RMSEs lower than 0.4 and r2 higher than 0.8. This is consistent with previous studies on VIs on crops, using satellites (Asam, 2014; Boegh et al., 2013; Clevers et al., 2017; Dong et al., 2019; Reisi Gahrouei et al., 2020) and RPAs (Marino & Alvino, 2019; Yao et al., 2017). The vegetation indices that showed the highest accuracy to measure LAI in barley seem particularly convenient because they only require red and NIR bands, the two most common bands in all multispectral platforms.

In almost all VIs, there was an exponential-type relationship to LAI with a clear tendency to underestimate LAI values around 5 (Figs. 2, 3, 4, 5). This saturation effect of indices at higher values of LAI continues to be a drawback of many indices (Cao et al., 2017; Dong et al., 2019; Gu et al., 2013), however, in cases such as model data assimilation in which often LAI estimates are needed at early stages of crop development, indices can still be useful.

Another common limitation of indices is the effect of soil reflectance at low levels of plant cover, a problem intended to be solved by OSAVI (Steven, 1998) and WDVI (Clevers, 1989). Adding a normalization term to WDVI improved the prediction substantially (compare WDVI with NWDVI in Table 4). Moreover, the possibility of being able to adjust the soil parameter can improve further its performance when using different sensors. It is suggested that this index is tested in other situations where a strong soil reflectance influence is suspected.

Canopy properties at different development stages are also relevant in LAI estimation with RS. In this work, it was attempted to encompass the widest range of stages possible with the limitation of plants being too small for LAI measurements (before the beginning of stem elongation) or the canopy reflectance being strongly affected by the appearance of ear awns (beginning of the heading stage).

In general, the best performing machine learning algorithms did slightly worse than the best VIs and had higher RMSE values. Relevance vector machine (RVM) and Gaussian process regression (GPR), the best performers at both 12- and 4-band combinations, represent two different approaches.

RVM is based on the concept of support vector machines but placed in a Bayesian context and using Gaussian-type mechanisms (Samui & Dixon, 2012). RVM has been found to perform as well as support vector machines, while requiring fewer model terms (Elarab et al., 2015). GPR, an approach based on a Bayesian framework and a Gaussian probability distribution, performs well when spectral data reduction is needed, and has been applied with success to estimate plant biophysical parameters (Verrelst et al., 2012), but showed a poor performance compared to other methods in another study (Mao et al., 2019) and mixed results in another (Upreti et al., 2019). Both GPR and SVM were also among the best performing machine learning regression algorithms in a study to estimate crop LAI in Spain (Verrelst et al., 2015). However, since many top performers in that study did not show good results in the present study and vice versa, it is possible that the performance of particular algorithms can vary with site, crop and sensor. Rather than the choice of one or another algorithm, the main intention behind this exercise was to assess the general performance of machine learning in the context of other methods. In this respect, ML seems to be at least as powerful as the use of Vis, with the advantage over the latter that no previous knowledge about the canopy-light interaction is needed to decide on what band combination and relationship is more appropriate for parameter retrieval.

Surprisingly, the 4-band configuration showed better results than when using 12 bands. One possible explanation is that discarding bands functions as a sort of band selection and a consequent dimensionality reduction. In a study to determine the minimum number of satellite bands to achieve the optimal LAI retrieval accuracy (Mao et al., 2019), results indicated this number to be between 6 and 5, but in no case using more bands than the minimal number resulted in lower accuracies, as it did in the presentr study. In some studies with RPAs predicting plant biophysical parameters, a small number of bands was increased by adding band combinations (VIs) (Romero et al., 2018) and plant height metrics (Han et al., 2019) but none of them considered the possibility of few bands in its original form. If these results can be verified in other situations, it follows that relatively simple sensor spectral configurations are sufficient for monitoring crop development and productivity.

The inverse modeling approach produced less accurate LAI predictions than the machine learning algorithms. Even though the coefficient of determination may still indicate that model is satisfactory, RMSE values are much higher than the two previous methods, and most were higher than 0.5, indicating a below-excellent status according to Richter et al. (2012a). In the predicted vs. observed plots from the inverse modeling with 12 bands (Fig. 9, left), all points in general are farther away around the one-to-one line compared to the other methods. In the case of the 4 bands (Fig. 9, right), there is also a bias towards the underestimation of LAI.

The performance of the inverse modeling can in theory be improved by testing different values or ranges of input parameters. This flexibility typical from models, however, is also a disadvantage, because the method of parameter estimation can become very complicated. Besides this, canopy and leaf radiative transfer models are specific to species and need to be adjusted accordingly. Despite the intrinsic complexity of models and their inversion, in the present study, this process was relatively simplified by ARTMO. Among other features, the possibility of using many cost functions simultaneously reduced the number of uncertainties in the process.

Sen2-Agri also achieved an accuracy comparable to the VIs, with a coefficient of determination slightly lower than the best performing indices. Part of its accuracy is due to the fact that Sen2-Agri uses the information from previous dates in an explicitly multitemporal data frame (Defourny et al., 2019). This method requires all bands of Sentinel-2 images, and it is unsuitable for other platforms at higher spatial resolution (e.g. Planet or RPAs), which typically also have less spectral bands. Unlike the methods based on indices presented here, Sen2-Agri is able to estimate LAI of many crop types and species simultaneously, which makes it particularly appropriate for global surveys.

Of all the methods considered, VIs were among the best performers, which is an important result given their simplicity. Additionally, VIs do not require more than two or three bands and a minimum training data size. VIs’ simplicity, however, make them prone to be affected by confounding variables other than leaf and canopy reflectance properties, which in turn affects VIs’ applicability to other conditions (Mourad et al., 2020). Based on these considerations, the use of VIs is suggested when the training data is scarce, and their application is restricted to a narrow range of varying conditions. Among all the possible VIs available, it is recommended to favor indices that include a soil correction factor.

Machine learning regressions can handle multivariate problems much better than VI models, eliminating the need to decide for one or another band. Whether ML is able to manage information from different sensors, as VIs did in this study, is something that needs to be investigated, although this study suggests that sensors with few bands, such as RPAs, are well suited for ML methods. In general, the application of ML can be useful, provided that enough information is available. In that case, instead of using ML as an alternative to VIs, it can be used parallel to VIs while rapidly exploring different input variables.

LAI retrieval with RTMs has also produced relatively accurate results. However, even though tools like ARTMO are user friendly, the inherent complexity of RTMs still requires a level of expertise and experience that makes this approach less useful for finding simple solutions, and probably much more suitable for a deeper understanding of light-canopy interactions in different circumstances and with different crops.

Sen2-Agri also showed that it can perform satisfactorily with barley, which means that it could be effective for other crops. However, since this tool depends on Sentinel-2 and cannot use higher spatial resolution sensors, the spatial resolution of the biophysical parameter to be retrieved cannot be improved. Therefore, Sen2-Agri is more suited for global studies rather than for precision agriculture.

Conclusions

A normalized version of the weighted difference vegetation index tested for the first time proved to be very accurate for the estimation of LAI in barley. This VI is almost equivalent to the widespread NDVI, except for a soil correction factor that can be directly extracted from the target image.

The fact that LAI measured in this study was consistent with an independent estimation method such as the Sen2-Agri system, indicates that both are measuring the same parameter and that this quantity can reliably assist crop models.

This study stresses on the importance of assessing the appropriateness of most current approaches for retrieving crop biophysical parameters, by placing these methods on an equal footing. This task has become simpler with the increasing availability of statistical algorithms and tools.

Here it is suggested that the precision agriculture community could truly profit from these broad comparisons, because the relative performance of retrieval methods varies greatly among production systems and environments. For example, in this study, some VIs showed a remarkable accuracy, but because VIs have a strong empirical component, their application in most cases needs the availability of in situ or specific data. Also, the rapid development of both far- and close remote sensing data sources constantly challenges established criteria and makes it more and more necessary having at hand efficient assessment tools.

Remotely piloted aircrafts (RPAs) probably constitute the fastest growing remote sensing industry. RPAs are becoming increasingly available, they are easy to use, accurate, and their high spatial resolution is particularly suited for precision farming. In addition, as this study suggests, LAI can be effectively retrieved with few sensor bands, which further supports the use of drones. Nevertheless, and exactly for that reason, more comprehensive testing is still necessary to further appraise the convenience and effectivity of RPAs.

The proliferation and diversification of remote sensing tools seems to be pointing towards cooperation between sensors rather than deciding which tool is more appropriate than the other. In this respect, remote sensing data integration or fusion seems to be a valid path to explore in future research.