Introduction

Chenopodium quinoa Willd. (quinoa) is a highly nutritious crop that can provide an excellent balance of fiber, lipids, carbohydrates, vitamins, and minerals (Vega‐Gálvez et al., 2010). Moreover, quinoa is naturally salt tolerant and has the ability to grow within a range of harsh environments (Adolf et al., 2012; Hariadi et al., 2011). As such, quinoa has the potential to provide a highly nutritious food source that can be grown on marginal lands that are not suitable for other major crops, either now or in the future. Despite its potential, quinoa is still an underutilized crop (Massawe et al., 2016), with active breeding programs (Zurita-Silva et al., 2014) and field management strategies (Alvar-Beltrán et al., 2020) having only recently emerged. To improve agronomic traits and increase the productivity of quinoa, ongoing monitoring of the phenotypic traits of different varieties under a range of field conditions is required to help identify suitable accessions for cultivation.

In recent years, unmanned aerial vehicles (UAVs) have become an increasingly useful tool for the collection of ultra-high spatial resolution, near real-time sensor-agnostic information (Manfreda et al., 2018; Xiang et al., 2019). While they have been increasingly employed for a variety of plant-phenotyping and crop-breeding applications, to date, relatively few studies have assessed their potential for mapping quinoa phenotypic traits. In a recent example, Sankaran et al. (2019) evaluated correlations between features extracted from multispectral and thermal UAV imagery and quinoa stomatal conductance, plant height and seed yield under non-irrigated and irrigated treatments, and found that the UAV-based green normalized difference vegetation index (GNDVI) and canopy temperature data were able to discriminate between non-irrigated and irrigated quinoa plants. In a study on predicting quinoa stomatal conductance, Ivushkin et al. (2019) utilized three UAV-mounted sensors, including a light detection and ranging (LiDAR) scanner, a thermal camera and a hyperspectral camera, and showed that combining the datasets improved prediction accuracy when using a multiple linear regression model. However, apart from these two studies, no other research using UAV-based data to predict quinoa phenotypic traits has been identified.

Among a range of crop phenotypic traits, UAV-derived leaf area index (LAI) as a morphological trait and chlorophyll content as a physiological trait provide fundamental information with considerable application potential in crop breeding programs (Jin et al., 2020). While UAV imagery has not previously been used for predicting LAI and chlorophyll content of quinoa plants, it has been used extensively to map LAI and chlorophyll content of other crops. For example, multispectral UAV imagery has been used to predict winter wheat LAI at critical growth stages (Jiang et al., 2019b), seasonal leaf area dynamics of sorghum breeding lines (Potgieter et al., 2017), soil–plant analysis development (SPAD) measured chlorophyll content values of maize (Deng et al., 2018) and the leaf chlorophyll content of a potato crop (Roosjen et al., 2018). Vegetation indices (VIs) extracted from UAV-based imagery have been the most commonly used information to predict LAI and chlorophyll content of crops (Jin et al., 2020). VIs represent a combination of spectral bands designed to maximize sensitivity to vegetation characteristics while minimizing confounding factors such as soil background reflectance, directional viewing angles and atmospheric effects (Huete et al., 2002). In addition to VIs, the UAV-based 3D point cloud (including information on canopy thickness, height and leaf density distribution) and canopy textural information (e.g., mean, variance, homogeneity, entropy etc.), representing the structural characteristics of the crop canopy, can also be used for phenotyping, especially for morphological traits such as LAI (Comba et al., 2020; Guillen-Climent et al., 2012; Zheng et al., 2019). However, the capacity of UAV-based imagery to map LAI and chlorophyll of quinoa plants remains unexplored.

Three broad types of approaches are commonly employed to assist with mapping plant phenotypic traits using remote sensing or UAV-based data. These comprise parametric, physically-based and non-parametric methods. Parametric methods are the simplest approach for UAV-based crop growth monitoring, allowing the relationship between crop traits and variables to be easily interpreted. However, they are generally not particularly robust (Zheng et al., 2018b). Physically-based methods (e.g., radiative transfer models) are based on physical laws that can enhance the reliability of predicted results, but the requirement of site-specific information (e.g., biophysical and biochemical variables, sensor viewing and solar angles, and crop structure parameters) for model characterization tends to reduce their implementation and efficiency (Fenghua et al., 2017; Jacquemoud et al., 2009). More recently, machine learning methods (which fall under non-parametric methods) have been employed in crop monitoring due to their ability to discover hidden information and their lower sensitivity to data skewness (Maimaitijiang et al., 2017; Yang et al., 2017). Zheng et al. (2018b) compared a simple linear parametric model, a physically-based model and 13 non-parametric models (i.e., three non-parametric linear regressions and 10 machine learning methods) for estimating winter wheat leaf nitrogen content and found that random forest regression achieved the highest accuracy. Many other studies have also applied machine learning methods to improve the estimation of UAV-based phenotypic traits of various crops, such as corn (Lee et al., 2020), maize (Singhal et al., 2019), tomato (Johansen et al., 2020) and rice (Zha et al., 2020). However, the performance of machine learning techniques for predicting phenotypic traits of quinoa using UAV-based imagery is yet to be explored.

Overall, the research aimed to evaluate the utility of UAV-based imagery to estimate two key phenotypic traits: LAI and chlorophyll content. To do this, a random forest machine learning approach was employed to predict LAI and SPAD-based chlorophyll for both control and saline-irrigated quinoa plots based on collected UAV-based multispectral imagery. The specific objectives of this study were: (1) to retrieve the LAI and SPAD-based chlorophyll of quinoa at the plot level using multispectral UAV data and random forest models; and (2) to evaluate the effects of control and saline treatments on prediction accuracies through using training data based on control plots only, saline-irrigated plots only and all available information.

Materials and methods

Experimental site and design

The field-based quinoa phenotyping experiment was conducted during the 2019–2020 growing season at the King Abdulaziz University Agricultural Research Station in Hada Al-Sham (21.7961° N, 39.7256° E), Saudi Arabia. The site is located in a tropical arid climate with annual rainfall of < 100 mm and has a predominantly sandy loam soil type. The field trial covered an area of 90 m × 50 m, comprising 756 plots of 1 m × 1 m distributed within four control blocks and four saline-irrigated blocks (see Fig. 1). Forty seeds were planted in each plot on November 17 and 18, 2019. Across the 756 plots, there were 141 quinoa accessions, of which 2 accessions were replicated four times, 47 accessions were replicated twice, and 92 accessions were replicated three times, in each of the control and saline-irrigated treatments. Using a drip-irrigation delivery system, freshwater irrigation was applied daily for 10 min until December 18, 2019, when the saline treatments were started. Here, the water salinity was 7.5 dS/m with irrigation occurring for 20 min every second day. Each of the 756 plots contained four tubes with 12 drippers spaced 0.3 m apart. The dripper discharge was 4 l/h. Based on previous irrigation optimization experiments at the same location, the irrigation times were set to 10 min per day until flowering and subsequently 20 min every second day until harvest, producing a daily average irrigation volume of 8 l for each plot. The salt concentration was gradually increased, reaching approximately 17.19 dS/m on January 23, 2020, where it remained until harvest, which started on February 17, 2020. Supplementary NaCl was added because it is the major salt component in sea water, as described in Negrão et al. (2017). The electrical conductivity (EC) of the groundwater used for irrigation was measured as a baseline. Then, the amount of NaCl was calculated to reach the required salt concentration level. Finally, NaCl was dissolved externally in 25 l of the groundwater, with the water EC calibrated before the solution was added to the irrigation tank.
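The reported daily average of 8 l per plot can be cross-checked from the dripper figures. The sketch below is only an arithmetic check and assumes that the 12 drippers are the total per plot and that each delivers the stated 4 l/h; the function name is illustrative.

```python
def irrigation_volume_l(drippers, discharge_l_per_h, minutes):
    """Water delivered to one plot during a single irrigation event, in litres."""
    return drippers * discharge_l_per_h * minutes / 60.0


# Assumption: 12 drippers in total per plot, each discharging 4 l/h.
daily_event = irrigation_volume_l(12, 4.0, 10)      # 10 min every day
alternate_event = irrigation_volume_l(12, 4.0, 20)  # 20 min every second day

# Average litres per plot per day under each schedule.
daily_average_early = daily_event
daily_average_late = alternate_event / 2

print(daily_average_early, daily_average_late)
```

Under these assumptions both schedules give the same daily average, which agrees with the 8 l per plot stated in the text.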

Fig. 1

Quinoa phenotyping experiment at the King Abdulaziz University Agricultural Research Station in Hada Al-Sham showing the eight blocks and associated water tanks for control and saline-irrigated treatments. UAV-based MicaSense RedEdge imagery collected on February 23, 2020 shows the 40 selected plots (yellow squares) for focused field analysis and nine ground control points (blue circles) for geometric correction of the unmanned aerial vehicle (UAV) imagery. The photos of plot samples show examples of a plot under the saline and control treatments at three dates during the growing season

Field measurements at the plot level

A total of 40 plots, representing five plots from each of the eight blocks, were selected for sampling to ensure that replicated accessions were assessed across different plots (Fig. 1). LAI and chlorophyll measurements were collected from the 40 selected plots on January 23, February 10 and February 23, 2020. The LAI was measured at twilight using an LAI-2200C Plant Canopy Analyzer (LI-COR Biosciences, Lincoln, NE, USA). For each plot, samples were collected next to quinoa plants at the center and from each of the four corners, with an average of the five measurements per plot subsequently calculated (LAIavg). It should be noted that LAI measurements were collected next to quinoa plants, even for plots (especially those under saline-irrigation) with a small number of unevenly distributed plants. To preclude overestimation of LAI for plots with reduced plant density and to allow comparison of LAI between plots with different plant numbers, LAIavg was normalized (LAInor) based on the mapped vegetation fraction (VF) within each plot:

$$LAI_{nor} = LAI_{avg} \times \frac{VF_{p}}{VF_{max}}, \quad VF_{p} = \frac{VF_{veg}}{VF_{all}} \tag{1}$$

where VFp and VFmax represent the VF of each individual plot and the maximum VF of all 40 selected plots, respectively. VFveg and VFall are the number of mapped vegetation pixels and all pixels in each plot, respectively. Further information regarding the extraction of vegetation pixels is presented in the Data analysis section. The plant chlorophyll, hereafter referred to as SPAD-based chlorophyll, from each plot was determined using a SPAD-502 Plus Chlorophyll Meter (Konica Minolta Optics Inc., Osaka, Japan). Five different upper canopy leaves of selected plants within each plot were measured three times per leaf. Upper canopy leaves in each plot were selected to provide a representation of the UAV-imaged leaves. The 15 SPAD readings per plot were averaged to derive an overall SPAD-based chlorophyll estimate for each plot. It should be noted that SPAD readings from 13 out of the 40 sampled plots were not available on February 23 as the leaves had become withered and senescent. As such, a total of 107 SPAD-based chlorophyll measurements were used for subsequent analysis.
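As an illustration of Eq. 1, the normalization can be sketched in a few lines. The plot names, LAI readings and pixel counts below are made-up values for demonstration only, and the function names are not taken from the study.

```python
def vegetation_fraction(veg_pixels, all_pixels):
    """VF_p: share of pixels in a plot polygon mapped as vegetation."""
    return veg_pixels / all_pixels


def normalize_lai(lai_avg, vf_plot, vf_max):
    """LAI_nor = LAI_avg * VF_p / VF_max (Eq. 1)."""
    return lai_avg * vf_plot / vf_max


# Hypothetical plots: five LAI readings each, plus pixel counts per polygon.
plots = {
    "plot_a": {"lai": [1.0, 1.2, 0.8, 1.1, 0.9], "veg": 3000, "all": 4000},
    "plot_b": {"lai": [2.0, 2.0, 2.0, 2.0, 2.0], "veg": 1000, "all": 4000},
}

vf = {name: vegetation_fraction(p["veg"], p["all"]) for name, p in plots.items()}
vf_max = max(vf.values())  # maximum VF over all sampled plots

lai_nor = {
    name: normalize_lai(sum(p["lai"]) / len(p["lai"]), vf[name], vf_max)
    for name, p in plots.items()
}
print(lai_nor)
```

In this toy case the sparsely vegetated plot_b has the higher LAIavg but the lower LAInor, which is the behaviour the normalization is intended to produce.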

UAV campaigns and image pre-processing

Data collection

A DJI Matrice 100 quadcopter (DJI Innovations, Shenzhen, China) provided the UAV platform for collection of MicaSense RedEdge-MX (MicaSense, Seattle, WA, USA) multispectral data. With a sensor resolution of 1280 × 960 pixels, the multispectral data consisted of blue (459–491 nm), green (546.5–573.5 nm), red (660–676 nm), red edge (711–723 nm) and near infrared (NIR; 813.5–870.5 nm) bands. Three UAV campaigns were performed in concert with the in-situ collection of quinoa LAI and SPAD-based chlorophyll data. The MicaSense image data were collected within one hour of solar noon under cloud-free conditions with 80% side-lap and 89% forward overlap (1 photo per second). The Universal Ground Control Station (UgCS) Client application (SPH Engineering, SIA, Riga, Latvia) was used for flight planning to collect image data, with the UAV flying at a speed of 2 m/s with a flight line distance of 4.8 m and at a height of 27 m, which produced a ground sample distance of 0.018 m. Each flight took approximately 16 min to complete. Nine ground control points (GCPs) were placed within the study area for geometric correction (Fig. 1). The coordinates of the GCPs were measured using a Leica GS10 base station with an AS10 antenna and a Leica GD15 smart antenna as a rover (Leica Geosystems, St Gallen, Switzerland). All collected global navigation satellite system (GNSS) data were processed using the Leica Geo Office software (Leica Geosystems, St Gallen, Switzerland) to achieve sub-centimeter positional accuracy. Before each flight, eight near-Lambertian reflectance panels were placed within the UAV flight overpass for radiometric calibration. The eight panels were produced from Masonite and had reflectance intensities of approximately 5%, 15%, 25%, 40%, 50%, 55%, 70% and 85%, with measurements taken before each flight using an ASD FieldSpec HandHeld2 spectroradiometer (Malvern Panalytical, Malvern, UK). A Spectralon panel with nominal reflectance of 100% was used to optimize the ASD before measurements. For each panel, five ASD reference spectra were recorded and averaged into a single spectral sample. By averaging the ASD reflectance values within the full width at half maximum (FWHM) for each band, the ASD measurements were resampled to match the corresponding spectral bands of the MicaSense camera.
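The flight parameters reported above can be checked against each other with simple geometry. The sketch below assumes a nadir view and that the 1280-pixel sensor dimension is oriented across the flight line; neither is stated in the text, so the result is indicative only.

```python
def footprint_m(pixels, gsd_m):
    """Ground extent covered by one image dimension."""
    return pixels * gsd_m


def overlap_fraction(spacing_m, footprint):
    """Overlap between neighbouring images whose centres are spacing_m apart."""
    return 1.0 - spacing_m / footprint


GSD_M = 0.018            # reported ground sample distance
LINE_SPACING_M = 4.8     # reported flight line distance
SPEED_M_PER_S = 2.0      # reported flying speed
PHOTO_INTERVAL_S = 1.0   # 1 photo per second

# Assumption: 1280 pixels across track, 960 pixels along track.
across_track_m = footprint_m(1280, GSD_M)
along_track_m = footprint_m(960, GSD_M)

side_lap = overlap_fraction(LINE_SPACING_M, across_track_m)
forward_lap = overlap_fraction(SPEED_M_PER_S * PHOTO_INTERVAL_S, along_track_m)

print(f"side-lap ~ {side_lap:.0%}, forward overlap ~ {forward_lap:.0%}")
```

Under these assumptions the computed overlaps are roughly 79% and 88%, close to the reported 80% side-lap and 89% forward overlap.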

Image pre-processing

All MicaSense imagery was processed in Agisoft Metashape 1.7 (Agisoft LLC, St. Petersburg, Russia) to produce a georeferenced orthomosaic for each data capture. An initial reflectance calibration step was applied to the MicaSense photos in Agisoft Metashape based on the calibration panel with a surface reflectance of 25% (Cao et al., 2019). A region of interest covering the center of this panel was created within a selected photo, and the corresponding ASD measurements for that panel were imported for each band. Based on the relationship between image digital numbers and ASD measurements, all photos collected within a single flight were calibrated to ensure consistent brightness between photos before further processing. The camera model was determined automatically when the software loaded the metadata of the imported images. The band alignment information was also included in the metadata. After the calibration process, the photos were aligned using a key point limit of 40,000 and a tie point limit of 10,000. Tie points across bands were also generated during this process to strengthen the geometric accuracy between bands. Then, the imagery was georeferenced based on GCPs, achieving root mean square errors ranging from 0.006 to 0.014 m. The dense point cloud was produced at high quality using moderate depth filtering for all the image datasets. Since the focus was on quinoa plants at the plot level rather than the individual plant level, a point density of high quality combined with moderate depth filtering was found to provide both an accurate surface representation and computational efficiency. Subsequently, a digital elevation model (DEM) was generated using the dense point cloud as source data, with hole filling enabled. Finally, an orthomosaic was produced using the DEM surface and the default Mosaic blending mode, producing a pixel size of 0.018 m. The output bands of the MicaSense orthomosaics were 16-bit integer values. The average digital number (DN) of the pixels that covered each panel was calculated from the orthomosaic. The ASD reflectance of each panel was calculated by resampling the ASD measurements within the full width at half maximum (FWHM) for each band to match the corresponding spectral bands of the MicaSense camera. Finally, a linear empirical function was developed between the average DN of the calibration panels in the orthomosaics and the resampled ASD reflectance to convert the orthomosaics to at-surface reflectance (Barreto et al., 2019; Johansen et al., 2019).
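The two calibration steps described here, band resampling of the ASD spectra and the linear empirical fit, can be sketched as follows. This is a minimal illustration and not the processing chain used in the study: it assumes that the band limits quoted for the camera are the FWHM limits, and the spectra and panel DNs are invented values.

```python
def band_average(wavelengths_nm, reflectance, band_min_nm, band_max_nm):
    """Mean ASD reflectance over the wavelengths falling inside one band."""
    inside = [r for w, r in zip(wavelengths_nm, reflectance)
              if band_min_nm <= w <= band_max_nm]
    if not inside:
        raise ValueError("no ASD samples inside the requested band")
    return sum(inside) / len(inside)


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical ASD samples around the red band (660-676 nm).
wavelengths = [655, 660, 668, 676, 680]
spectrum = [0.10, 0.20, 0.30, 0.40, 0.50]
red_reflectance = band_average(wavelengths, spectrum, 660, 676)

# Hypothetical mean panel DNs and their band-resampled ASD reflectance.
panel_dn = [2000, 7000, 12000, 20000]
panel_reflectance = [0.05, 0.15, 0.25, 0.41]
slope, intercept = fit_line(panel_dn, panel_reflectance)


def dn_to_reflectance(dn):
    """Apply the fitted empirical line to an orthomosaic pixel value."""
    return slope * dn + intercept


print(red_reflectance, slope, intercept, dn_to_reflectance(10000))
```

In practice one such line would be fitted per band and per flight, using all eight panels.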

Data analysis

Extraction of variables at the plot level

Since spectral data are the most commonly used and valuable information from UAV-based imagery to predict both LAI and chlorophyll of crops, the five spectral bands and 25 commonly used VIs (Table 1) were extracted from the MicaSense orthomosaics for each quinoa plot and used to build prediction models of LAI and SPAD-based chlorophyll at the plot level. To extract variables at the plot level, a shapefile was produced and manually placed for each plot using a square polygon with dimensions of 1.2 m × 1.2 m (1.44 m²). As some plants appeared outside the boundary of the 1 m² plots due to plant growth, especially during the last two campaigns, the larger polygon area ensured most plants per plot were included. Calculation of LAI was based on all pixels within this 1.44 m² area. Since SPAD-based chlorophyll is specific to the plant leaves, only the pixels representing quinoa plants were used when comparing pixel-derived and field-measured SPAD-based chlorophyll, with canopy gaps within the 1.44 m² area excluded.
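The difference between the all-pixel extraction used for LAI and the plant-pixel extraction used for SPAD-based chlorophyll can be illustrated with a normalized-difference index. The sketch below uses the standard NDVI formulation, which may differ in detail from the entries in Table 1, and it assumes that the index is computed from plot-mean reflectance; pixel values and the plant mask are invented.

```python
def plot_mean(values, mask=None):
    """Mean of pixel values, optionally restricted to masked-in pixels."""
    selected = list(values) if mask is None else [
        v for v, keep in zip(values, mask) if keep
    ]
    return sum(selected) / len(selected)


def normalized_difference(a, b):
    """Generic (a - b) / (a + b) index, e.g. NDVI with a = NIR and b = red."""
    return (a - b) / (a + b)


# Hypothetical reflectance for four pixels in one plot polygon.
nir = [0.50, 0.40, 0.30, 0.20]
red = [0.05, 0.05, 0.20, 0.20]
is_plant = [True, True, False, False]  # e.g. from a plant/non-plant classification

# LAI case: every pixel inside the polygon contributes.
ndvi_all = normalized_difference(plot_mean(nir), plot_mean(red))

# SPAD case: canopy gaps are excluded before averaging.
ndvi_plant = normalized_difference(
    plot_mean(nir, is_plant), plot_mean(red, is_plant)
)

print(round(ndvi_all, 3), round(ndvi_plant, 3))
```

Soil pixels pull the all-pixel index down, which is useful for a canopy-cover-dependent trait such as LAI but would dilute a leaf-level signal such as chlorophyll.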

Table 1 Vegetation indices used in this study

Quinoa plant pixels were identified and extracted using a support vector machine (SVM) classification performed in the ENVI software (EXELIS, Boulder, CO, USA). An SVM is a supervised machine learning model that has been widely used for the classification of remotely sensed imagery (Maulik & Chakraborty, 2017; Roli & Fumera, 2001). Regions of interest were selected for two classes, i.e., quinoa and non-quinoa features, with all spectral bands used as training data. The SVM used a radial basis function kernel with the default settings (i.e., gamma in kernel function = 0.25, penalty parameter = 100, pyramid levels = 0, and classification probability threshold = 0). Based on ground truth data, the Kappa coefficients of the SVM classifications for all UAV imagery were > 97%. Consequently, the variables of the pixels representing quinoa plants were extracted for SPAD-based chlorophyll estimation. Additionally, the extracted plant pixels were used to calculate the vegetation fraction values for each plot (Eq. 1).
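For readers unfamiliar with the Kappa coefficient reported for the classification, the following sketch computes Cohen's kappa from a two-class confusion matrix. The pixel counts are hypothetical and are not the validation data of this study.

```python
def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: reference, columns: predicted)."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / total ** 2
    return (observed - expected) / (1.0 - expected)


# Hypothetical validation pixels: quinoa vs non-quinoa.
confusion = [
    [490, 10],   # reference quinoa: 490 correct, 10 missed
    [5, 495],    # reference non-quinoa: 5 false alarms, 495 correct
]
kappa = cohens_kappa(confusion)
print(round(kappa, 3))
```

Kappa corrects the overall agreement (98.5% in this toy matrix) for the agreement expected by chance, giving 0.97 here.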

Estimation and evaluation of LAI and SPAD-based chlorophyll predictions

Here, random forest regression models were used to estimate LAI and SPAD-based chlorophyll. The random forest machine learning approach has been shown to provide a fast and accurate method for estimating crop traits (Houborg & McCabe, 2018; Shah et al., 2019). Three random forest models were developed for each of the LAI and SPAD-based chlorophyll datasets, using training data based on: (1) all observations, (2) control observations only, and (3) saline observations only. To develop the models, the optimal number of variables available for splitting at each tree node (mtry) was first identified. Values of mtry from 1 to 30 (the number of input variables) were tested iteratively, and the value achieving the smallest mean squared error was selected for each of the three models for both LAI and SPAD-based chlorophyll. The optimal mtry values were 15 and 8 for the LAI and SPAD-based chlorophyll models, respectively. A total of 1000 decision trees (ntree) were used for all random forest models to ensure stable predictions and computational efficiency. The variables used in each random forest regression model included the five spectral bands and 25 commonly used VIs identified in Table 1. Similar to previous research (e.g., Johansen et al., 2020), training sets were compiled by randomly sampling approximately 2/3 of the entire dataset with replacement, with the remaining data used as the testing set.
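One reading of the sampling scheme described above is a bootstrap draw: sampling n observations with replacement leaves roughly 63% of them in the training set on average, close to the stated 2/3, and the undrawn observations form the test set. The sketch below illustrates this interpretation only; it is an assumption and not a reproduction of the R workflow used in the study.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; undrawn indices form the test set."""
    drawn = [rng.randrange(n_samples) for _ in range(n_samples)]
    in_training = set(drawn)
    test = [i for i in range(n_samples) if i not in in_training]
    return drawn, test


rng = random.Random(2020)  # fixed seed so the illustration is repeatable
n_observations = 107       # e.g. the number of SPAD-based chlorophyll measurements

train_idx, test_idx = bootstrap_split(n_observations, rng)
unique_share = len(set(train_idx)) / n_observations

print(f"{len(set(train_idx))} unique training observations "
      f"({unique_share:.0%}), {len(test_idx)} test observations")
```

With repeated draws the unique training share averages about 1 - (1 - 1/n)^n, which is approximately 0.63 for n = 107.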

Individual random forest models were developed based on all quinoa plots, the control plots only, and the saline plots only. In each dataset, there were 30 variables, comprising the five spectral bands and the 25 VIs (see Table 1). To assess the performance of the 25 VIs, they were divided into three categories: (1) VIs with only the visible bands (i.e., the blue, green and red bands); (2) VIs with the NIR band, but excluding the red-edge band; and (3) VIs including the red-edge band. The importance of each variable was evaluated based on how much the accuracy decreased when a variable was excluded (Liaw & Wiener, 2002), determined here using the percentage increase in mean squared error (%IncMSE). The accuracy of the least-squares and random forest models was evaluated using the coefficient of determination (R²) and the root-mean-squared error (RMSE) between predicted and measured LAI and SPAD-based chlorophyll. The best performing models were subsequently applied to all 756 plots to predict LAI and SPAD-based chlorophyll for each of the three UAV dates and to assess differences between control and saline-irrigated plots. In all cases, the random forest models were implemented with the R package “randomForest” (https://CRAN.R-project.org/package=randomForest).
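The two accuracy metrics named above can be written out explicitly. In this sketch the coefficient of determination is computed as 1 minus the ratio of residual to total sum of squares; the study may instead have used the squared correlation between predicted and measured values, so this definition is an assumption. The example values are invented.

```python
import math


def rmse(measured, predicted):
    """Root-mean-squared error between measured and predicted values."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted LAI for four plots.
measured_lai = [1.0, 2.0, 3.0, 4.0]
predicted_lai = [1.1, 1.9, 3.2, 3.8]

print(round(r_squared(measured_lai, predicted_lai), 3),
      round(rmse(measured_lai, predicted_lai), 3))
```

RMSE is expressed in the units of the trait (LAI or SPAD units), whereas R² is unitless, which is why the two are reported together.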

Results

Evaluation of variable importance

The importance of different variables for estimation of LAI and SPAD-based chlorophyll using random forest regression was evaluated, with results presented in Figs. 2 and 3. As can be seen, individual spectral bands were generally less important than VIs. For instance, the percentage increase in mean squared error (%IncMSE) values were close to 0 for the green and red-edge bands of the MicaSense camera for LAI estimation (Fig. 2) and the NIR band for SPAD-based chlorophyll estimation (Fig. 3), indicating little contribution of those bands for the predictions.
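The idea behind a %IncMSE value near 0 can be illustrated with a simplified permutation test: if shuffling a variable leaves the prediction error unchanged, the model does not rely on it. The sketch below is not the procedure implemented in the randomForest package (which works on out-of-bag samples within each tree); the toy predictor, data and function names are all hypothetical.

```python
import random


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def percent_increase_mse(predict, rows, y_true, column, rng):
    """Percentage change in MSE after shuffling one input column."""
    baseline = mse(y_true, [predict(r) for r in rows])
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted_rows = [
        r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)
    ]
    permuted = mse(y_true, [predict(r) for r in permuted_rows])
    return 100.0 * (permuted - baseline) / baseline


# Toy data: the target depends on column 0 only; column 1 is irrelevant.
rows = [[i / 10, (i * 7 % 20) / 20] for i in range(20)]
target = [2 * r[0] + (0.1 if i % 2 else -0.1) for i, r in enumerate(rows)]


def toy_model(row):
    """Stand-in for a fitted model that uses only the first variable."""
    return 2 * row[0]


rng = random.Random(0)
inc_informative = percent_increase_mse(toy_model, rows, target, 0, rng)
inc_irrelevant = percent_increase_mse(toy_model, rows, target, 1, rng)
print(inc_informative > 0, inc_irrelevant)
```

The irrelevant column yields an increase of exactly zero because the toy model ignores it, mirroring the near-zero values reported for some individual bands.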

Fig. 2

Variable importance, as measured using the percentage increase in mean squared error (%IncMSE), in the random forest regression models constructed on: a the control plots only; b the saline plots only; and c all plots for predicting leaf area index (LAI). Vegetation index abbreviations are expanded in Table 1

Fig. 3

Variable importance, as measured using the percentage increase in mean squared error (%IncMSE), in the random forest regression models constructed on: a the control plots only; b the saline plots only; and c all plots for predicting SPAD-based chlorophyll. Vegetation index abbreviations are expanded in Table 1

The top 20% most important variables were identified for each model, representing 6 out of 30 variables (see boxed outlines in Figs. 2 and 3). For the LAI models, most VIs with the NIR band (but without the red-edge band) were highly ranked in their importance. Five VIs including the NIR band (but excluding the red-edge band) were among the six highest ranked variables when predicting LAI of the control plots only (Fig. 2a), whereas the LAI models based on the saline plots only (Fig. 2b) and all plots (Fig. 2c) comprised three and four NIR VIs, respectively, among their six highest ranked variables. Apart from RENDVI, which was ranked in the top 20% most important variables in the LAI model developed on the control plots only (Fig. 2a), the contributions of VIs including the red-edge band for LAI predictions were limited (Fig. 2). In contrast, VIs with the red-edge band were considerably more important in the SPAD-based chlorophyll models (Fig. 3). This was particularly the case for the SPAD-based chlorophyll model using the control plots only, where the top 20% most important variables consisted entirely of VIs with the red-edge band (Fig. 3a). Overall, the results of variable importance showed that the red-edge band performed well for estimating SPAD-based chlorophyll (R² > 0.7), while it correlated poorly with LAI (R² < 0.2). On the other hand, the NIR band correlated well with LAI (R² > 0.45) but showed almost no correlation with chlorophyll (R² < 0.08).

To compare variable importance in the random forest models including different datasets (all, control only, saline only) for predicting LAI and SPAD-based chlorophyll, the variable importance was sorted into five ranked categories (Fig. 4). The variables with the highest importance ranking in Fig. 4 corresponded to the top 20% most important variables outlined in Figs. 2 and 3. For the LAI prediction models developed on the control plots only, saline plots only and all plots, the variable importance ranking was the same in 13 out of 30 cases, while the remaining variables (except NIR, CIgreen, and MTVI2) were within a single importance ranking category of each other. Conversely, the ranking of variable importance for predicting chlorophyll showed larger variation across the three models (control only, saline only, all). Only 7 out of 30 variables had the same importance ranking across the three models, while 11 variables were within a single importance ranking of each other. The remaining 12 variables showed larger variation in importance between the three models. For instance, NPCI was one of the top 20% most important variables in the model developed on saline plots only, but was ranked 24th out of 30 variables in the model trained on control plots only (Figs. 3 and 4). Given this variation in variable importance between models, variables should not be excluded from one model based on findings from another model.
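The five ranked categories can be reproduced by sorting variables by importance and splitting them into equal-sized groups, so that with 30 variables the first category holds the top 20% (six variables). Equal-sized groups are an assumption consistent with that description; the variable names and importance scores below are placeholders.

```python
import math


def importance_categories(importance, n_categories=5):
    """Map each variable to a category, where 1 is the most important group."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    group_size = math.ceil(len(ranked) / n_categories)
    return {name: idx // group_size + 1 for idx, name in enumerate(ranked)}


# Placeholder importance scores for 30 variables (higher means more important).
scores = {f"var_{i:02d}": float(30 - i) for i in range(1, 31)}

categories = importance_categories(scores)
top_group = [name for name, cat in categories.items() if cat == 1]
print(top_group)
```

Comparing the category assigned to a given variable by each of the three models then shows whether its rank is identical, within one category, or further apart.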

Fig. 4

The rankings of variable importance in the random forest regression models developed on the control plots only (C), the saline plots only (S), and all plots (All) for the estimation of leaf area index (LAI) and SPAD-based chlorophyll. Abbreviations for the vegetation indices are expanded upon in Table 1

Assessment of model predictions

To explore the generality of the different model constructions, the random forest models developed on all quinoa plots were applied to all plots, the control plots only and the saline plots only, whereas the models developed on the control plots only and the saline plots only were only applied to those specific plots. Compared to the models trained on only control or saline plots, the models trained using all plots slightly improved the LAI and SPAD-based chlorophyll predictions when applied to the control or saline plots (Fig. 5). This may be attributed to the larger range of measurements in the dataset of all plots (including both control and saline plots), as the lowest and highest measures of LAI and SPAD-based chlorophyll belonged to the control and saline datasets, respectively. For predicting LAI, a reduction in RMSE of 9.58% and 36.13% was observed when using the model developed on all plots and applied to the control plots and saline plots, respectively, while the increase in R² was negligible (≤ 0.01) (Fig. 5a, b). For predicting SPAD-based chlorophyll, the models developed on and applied to the control or saline plots had very similar values of R² (difference ≤ 0.002) and RMSE (difference < 0.18) to the results produced when using the model developed on all plots (Fig. 5d, e). These results indicate that the models developed on all plots could improve the LAI predictions of the control or saline plots, whereas the SPAD-based chlorophyll predictions for control or saline plots were not influenced by the inclusion of additional plots from both treatments within the models.

Fig. 5

Scatterplots, coefficient of determination (R²), root mean square error (RMSE) and number of samples (N) of predicted measurements of (a–c) leaf area index (LAI) and (d–f) SPAD-based chlorophyll based on the MicaSense RedEdge unmanned aerial vehicle (UAV) imagery using random forest models developed on (a, d) all quinoa plots or only control plots, applied only to the control plots, (b, e) all quinoa plots or only saline plots, applied only to the saline plots, and (c, f) all plots, applied to all plots

Mapping of LAI and SPAD-based chlorophyll

Since the models developed on all measured quinoa plots provided the best results, they were applied to predict the variations of LAI and SPAD-based chlorophyll for all 756 quinoa plots across each of the three dates. As shown in Fig. 6, values of LAI and SPAD-based chlorophyll for both control and saline quinoa plots generally decreased from January 23 to February 23, reflecting the observations that plant leaves tended to wither and senesce towards harvest. Compared to SPAD-based chlorophyll, the LAI predictions showed considerably more outliers, with high values located outside the whiskers of the boxplot. These outliers were likely due to lodging of plants, which is the permanent displacement of crop stems from their vertical position as a result of stem buckling and/or root displacement. Lodging was found in 25 control and 17 saline plots and appeared to be wind-related rather than influenced by treatment type. The LAI values (except the minimum value) of saline plots were slightly higher than those of the control plots on January 23 and February 11, while the control plots showed higher LAI values (e.g., the maximum, median and mean values) than the saline plots on February 23. The SPAD-based chlorophyll values (except the maximum value) of the control plots were higher than those of the saline plots on all three dates. The differences in LAI and SPAD-based chlorophyll values between control and saline plots for the last image campaign on February 23 likely reflected the final field irrigation occurring on February 19 (as quinoa plants need to dry out prior to harvest; Aguilar & Jacobsen, 2003) and the accumulating salinity stress in the saline plots. It is noted that the range of SPAD-based chlorophyll values increased through the growing season, likely reflecting the varying salt tolerance of the different quinoa accessions.

Fig. 6

Box plots of the variation in the leaf area index (LAI) and SPAD-based chlorophyll estimates from the MicaSense RedEdge camera using the random forest regression models developed on all measured plots for all 756 control and saline plots on January 23, February 11 and February 23, 2020

Figure 7 shows the spatial pattern of LAI and SPAD-based chlorophyll for the control and saline-irrigated quinoa plots, as derived from the MicaSense imagery collected on February 23. Most of the control plots (59.0%) and saline plots (59.8%) had LAI values between 0.3 and 0.9. LAI values higher than 0.9 were mainly found in the control plots (4.5% more plots than in the saline treatment), while lower values (LAI < 0.3) mostly appeared in the saline plots (3.7% more plots than in the control treatment). Similarly, the predicted distribution of SPAD-based chlorophyll showed more control plots in the high value range (> 50) and more saline plots in the low value range (< 10). Although 4.0% more saline plots than control plots had SPAD-based chlorophyll values between 40 and 50, values higher than 50 mostly appeared in the control plots (37.8%), with 13.8% more occurrences than in the saline plots. Conversely, 6.6% fewer control plots than saline plots had a SPAD-based chlorophyll value < 10. While the low LAI or SPAD-based chlorophyll values in the saline plots were likely due to salinity stress, 51 out of 141 plant accessions in the saline plots had similar or even larger average LAI and SPAD-based chlorophyll values than their corresponding replicates in the control plots, possibly indicating high salt tolerance.

Fig. 7

MicaSense RedEdge-based maps of leaf area index (LAI) and SPAD-based chlorophyll for all 756 control (C) and salinity-treated (S) plots on February 23, 2020, using the random forest regression models developed on all measured plots. Frequency plots show the distribution of LAI and SPAD-based chlorophyll estimates

Discussion

To predict LAI and SPAD-based chlorophyll in quinoa at the plot level, random forest machine learning models were used, and the effect of different variables derived from a UAV-based multispectral camera on the prediction accuracy was evaluated. The results illustrate that vegetation indices (VIs) were generally more important than the individual spectral bands for predicting LAI and SPAD-based chlorophyll, which is in line with the findings of previous studies using VIs to mitigate the effects of illumination geometry and enhance the spectral information of crops (Fernandez-Gallego et al., 2019; Lillesand et al., 2015; Xue & Su, 2017). Many studies have indicated that using red-edge reflectance bands can improve the prediction of chlorophyll in crops such as rice (Evri et al., 2008), winter wheat (Kanke et al., 2012), and maize (Deng et al., 2018). Likewise, in this research on quinoa, the red-edge band and VIs that include the red-edge band proved useful for predicting SPAD-based chlorophyll. This result is likely due to the sensitivity of the red-edge wavelength region (680–760 nm) to chlorophyll content. In the red-edge range, the reflectance of green plants increases rapidly in response to the increased chlorophyll absorption of light in the red portion of the spectrum (Horler et al., 1983; Le Maire et al., 2008; Zarco-Tejada et al., 2019). Thus, the red-edge band and associated VIs are relatively sensitive to the SPAD-based chlorophyll value. In addition, Deng et al. (2018) found that the red-edge VIs were less sensitive to areas with low vegetation coverage. Hence, non-vegetation features (e.g., soil and shadows) have limited effects on the prediction of SPAD-based chlorophyll from the red-edge VIs.
Measurements of other biophysical and biochemical parameters of crops have also been found to be highly correlated with the red-edge reflectance, with examples including nitrogen content in winter wheat (Zheng et al., 2018a), biomass in corn and soybean (Kross et al., 2015), and LAI in wheat and maize (Sun et al., 2019). On the other hand, the results showed that predictions of LAI were not improved by using the red-edge band or the VIs that included the red-edge band. Instead, the NIR band and the VIs based on the NIR band generally had higher importance in the random forest models. This is consistent with a number of previous studies reporting that NIR reflectance (700–1000 nm) can be strongly influenced by LAI, while being less sensitive to variations in chlorophyll content (Delegido et al., 2013; Sun et al., 2019; Yao et al., 2017). Overall, the findings demonstrate the utility of using multispectral cameras that include both a red-edge and an NIR band when mapping phenotypic traits such as chlorophyll and LAI. Therefore, VIs that include red-edge and NIR bands are recommended for predicting LAI- or chlorophyll-related phenotypic information, such as biomass, nitrogen and yield. While this study focused on spectral information (i.e. spectral bands and VIs) and achieved high accuracies for mapping LAI and SPAD-based chlorophyll, the use of 3D point clouds and canopy textural information from UAV-based imagery, and their combination with VIs, should be explored in future research for quinoa phenotyping, particularly of morphometric characteristics.
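To make the distinction between NIR-based and red-edge-based VIs concrete, the sketch below computes two widely used normalized-difference indices, NDVI (NIR and red) and NDRE (NIR and red-edge), from plot-mean reflectances. These two indices are chosen only as familiar examples and are not claimed to be the specific VIs that ranked highest in this study; the reflectance values and function names are hypothetical.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    total = a + b
    if total == 0:
        return None
    return (a - b) / total


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return normalized_difference(nir, red)


def ndre(nir, red_edge):
    """Normalized difference red-edge index from NIR and red-edge reflectance."""
    return normalized_difference(nir, red_edge)


# Hypothetical plot-mean reflectances on a 0-1 scale.
plot = {"red": 0.05, "red_edge": 0.20, "nir": 0.45}
plot_ndvi = ndvi(plot["nir"], plot["red"])
plot_ndre = ndre(plot["nir"], plot["red_edge"])
print(round(plot_ndvi, 3), round(plot_ndre, 3))
```

Both indices share the same functional form, so a camera that records red, red-edge and NIR bands allows either to be computed from a single flight.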

The results also showed that separate random forest models specifically developed for LAI estimation of control or saline plots improved predictions of plots under different treatment regimes. Conversely, SPAD-based chlorophyll of control and saline plots could be predicted with similar accuracies using models developed on either plots of the specific treatments or all plots including both treatments, possibly because the control plots (0.82–70.22 based on field data) and saline plots (0.80–69.76 based on field data) had a similar range of SPAD-based chlorophyll values. Considering the high accuracies of predicting LAI (R² > 0.97 and RMSE < 0.19) and SPAD-based chlorophyll (R² > 0.98 and RMSE < 3.04), the developed random forest models did not seem to be affected by the large variety of accessions (34 out of 141 accessions included in the training data). Other studies have also achieved good results when using a random forest approach for predicting phenotypic traits in crops of different accessions. For example, Sun et al. (2021) found that random forest regression resulted in better accuracies than linear regressions for predicting canopy chlorophyll content in 24 varieties of winter wheat and one variety of soybean. Yuan et al. (2017) compared random forest, artificial neural network, support vector machine and partial least-squares regression models for retrieving LAI in 46 varieties of soybeans and found that random forest models achieved the best performance. In line with the current study, these previous efforts demonstrate that the random forest approach is suitable for predicting phenotypic traits from a large selection of sample plots even when variations in accessions, treatments and progression in crop growth are present. An advantage of random forest is its ability to accommodate large datasets through non-linear regression based on a large number of decision trees and high-dimensional data training (Breiman, 2001).
However, it is worth noting that a random forest cannot generally predict beyond the training data range, as it only uses the values of the training dataset to grow regression trees (Iqbal et al., 2018). Therefore, it is important to carefully select training datasets covering a wide and representative variable range to improve model accuracy (Maxwell et al., 2018).
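The range limitation noted above follows from how regression trees predict: each leaf returns the mean of the training targets it contains, so an average over trees cannot exceed the largest training value. The sketch below demonstrates this with a deliberately small, pure-Python bagged-tree ensemble on one predictor. It is a toy stand-in written for illustration, not the random forest implementation used in this study (it omits, for example, random feature selection), and all names and data are hypothetical.

```python
import random
import statistics


def sse(ys):
    """Sum of squared deviations from the mean."""
    m = statistics.fmean(ys)
    return sum((y - m) ** 2 for y in ys)


def build_tree(points, depth=0, max_depth=4, min_leaf=2):
    """Fit a 1-D regression tree on (x, y) pairs.

    A leaf is a float (mean of its training targets); an internal node is a
    tuple (threshold, left_subtree, right_subtree).
    """
    xs = sorted({x for x, _ in points})
    best = None
    if depth < max_depth:
        for lo, hi in zip(xs, xs[1:]):
            thr = (lo + hi) / 2
            left = [p for p in points if p[0] <= thr]
            right = [p for p in points if p[0] > thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse([y for _, y in left]) + sse([y for _, y in right])
            if best is None or cost < best[0]:
                best = (cost, thr, left, right)
    if best is None:
        return statistics.fmean(y for _, y in points)
    _, thr, left, right = best
    return (thr,
            build_tree(left, depth + 1, max_depth, min_leaf),
            build_tree(right, depth + 1, max_depth, min_leaf))


def predict_tree(tree, x):
    """Walk down the tree until a leaf (a float) is reached."""
    while isinstance(tree, tuple):
        thr, left, right = tree
        tree = left if x <= thr else right
    return tree


def fit_forest(points, n_trees=25, seed=0):
    """Grow each tree on a bootstrap sample of the training points."""
    rng = random.Random(seed)
    return [build_tree([rng.choice(points) for _ in points])
            for _ in range(n_trees)]


def predict_forest(forest, x):
    """Average the predictions of all trees."""
    return statistics.fmean(predict_tree(t, x) for t in forest)


# Hypothetical training data: predictor in [0, 1], trait = 3 * predictor.
xs = [i / 20 for i in range(21)]
ys = [3.0 * x for x in xs]
forest = fit_forest(list(zip(xs, ys)))
outside = predict_forest(forest, 2.0)  # the underlying relation would give 6.0
print(outside, max(ys))
```

In this toy case the prediction at 2.0 stays at or below the training maximum of 3.0, even though the underlying relation would give 6.0, which is why training plots should span the full range of trait values expected in the field.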

Using the LAI and SPAD-based chlorophyll results from the random forest models, traits likely to contribute to and/or reflect salinity tolerance in quinoa plants could be quantified. Approximately 36% of the 141 quinoa accessions in the saline plots showed LAI and SPAD-based chlorophyll values that were either similar to or higher than those of their corresponding replicates in the control plots. These specific accessions might indicate a genetic tolerance to environmental stress, and provide an avenue for further identification of salinity-tolerant accessions for breeding studies. However, more phenotypic information, e.g. plant height, canopy temperature, biomass and yield, may further facilitate UAV image-derived identification of salt tolerance in quinoa accessions. Based on the results of quinoa phenotyping in this study, it is likely that multispectral UAV data combined with random forest modeling can be applied for deriving information on additional quinoa traits. In addition, observations from multiple sensors may assist in identifying accessions with high salinity tolerance. For example, Awlia et al. (2016, 2021) used red–green–blue (RGB) and chlorophyll fluorescence (ChlF) imaging to capture the growth, morphology, color and photosynthetic performance for identifying salt tolerance in Arabidopsis thaliana plants. Malbeteau et al. (2021) and Stutsel et al. (2021) used UAV-based thermal data to monitor canopy temperature of wild tomato species in response to salt stress. Ivushkin et al. (2019) combined plant height from a LiDAR scanner, the UAV-based hyperspectral physiological reflectance index (PRI) and canopy temperature from UAV-based thermal cameras to increase the discrimination of quinoa plants in the control and salt-treated plots. While all of these studies showcase the potential of UAV data to detect salt stress in crops, the underlying genetic insights that deliver the observable phenotypic features need to be investigated (McCabe & Tester, 2021).
Such investigations will be assisted by the collection of additional phenotypic traits and identification of the most suitable image datasets and modeling approaches from which to derive these observable features.
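The comparison of saline and control replicates per accession can be sketched as follows. The criterion for "similar" is not defined in the text, so the relative tolerance `rel_tol` used here (saline mean within 5% of, or above, the control mean for every trait) is purely an assumption, as are the function name, accession labels and trait values.

```python
def tolerant_accessions(control, saline, rel_tol=0.05):
    """Return accessions whose saline means are similar to or higher than
    their control means for every trait.

    "Similar" is taken to mean saline >= (1 - rel_tol) * control; this
    threshold is an illustrative assumption, not the criterion of the study.
    """
    flagged = []
    for accession, control_traits in control.items():
        saline_traits = saline.get(accession)
        if saline_traits is None:
            continue  # no saline replicate available for this accession
        if all(saline_traits[trait] >= (1 - rel_tol) * value
               for trait, value in control_traits.items()):
            flagged.append(accession)
    return flagged


# Hypothetical accession-level means of predicted LAI and SPAD-based chlorophyll.
control = {
    "A1": {"lai": 0.80, "spad": 45.0},
    "A2": {"lai": 1.00, "spad": 50.0},
    "A3": {"lai": 0.60, "spad": 40.0},
}
saline = {
    "A1": {"lai": 0.85, "spad": 44.0},
    "A2": {"lai": 0.50, "spad": 30.0},
    "A3": {"lai": 0.58, "spad": 41.0},
}
candidates = tolerant_accessions(control, saline)
print(candidates, f"{100 * len(candidates) / len(control):.1f}%")
```

Additional traits such as plant height or canopy temperature could be added to the per-accession dictionaries without changing the function, although a trait for which lower values are desirable would need its own comparison rule.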

Conclusion

The potential of multispectral UAV imagery to map LAI and SPAD-based chlorophyll and then to discriminate plant response to control and salt treatments in a quinoa phenotyping experiment was explored using a random forest machine learning approach. An examination of the relative information content delivered through using individual spectral bands and VIs derived from multispectral UAV imagery was also detailed. Results illustrated that VIs were generally more informative than individual bands for predicting LAI and SPAD-based chlorophyll. Most of the VIs with the red-edge band were found to be important predictor variables in the SPAD-based chlorophyll models, whereas the VIs including the NIR band (but without the red-edge band) had the highest importance for predicting LAI. To capture the largest number of observable phenotypes, it is recommended to use a camera with both NIR and red-edge bands for mapping quinoa traits. The performance of different training datasets for predicting LAI and SPAD-based chlorophyll of control and saline-irrigated plots was also evaluated. Compared to those models trained on control or saline plots only, the models developed using all plots decreased the RMSE by up to 9.58% and 36.13% when predicting LAI of the control and saline plots, respectively. Limited RMSE reduction (< 5.8%) occurred in the results for predicting SPAD-based chlorophyll. Overall, the results demonstrate the practical utility of employing multispectral UAV-based sensing for the estimation of key phenotypic traits of quinoa and provide a foundation for further exploration and identification of potential salinity tolerances. In addition to LAI and SPAD-based chlorophyll, other observable traits related to yield and salt tolerance, such as plant height, canopy temperature and growth rate, can be integrated into the analysis to identify desirable commercial traits of the quinoa accessions studied.
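For reference, a relative RMSE reduction of the kind quoted above can be computed as in the following minimal sketch. It assumes the reduction is expressed relative to the RMSE of the treatment-specific model, which the text implies but does not state as a formula; the function names and the observed and predicted values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_reduction_pct(rmse_baseline, rmse_new):
    """Percentage reduction of rmse_new relative to rmse_baseline."""
    return 100.0 * (rmse_baseline - rmse_new) / rmse_baseline


# Hypothetical field-measured LAI and predictions from two models.
observed = [1.0, 2.0, 3.0, 4.0]
pred_treatment_specific = [1.5, 2.5, 3.5, 4.5]
pred_all_plots = [1.25, 2.25, 3.25, 4.25]

baseline = rmse(observed, pred_treatment_specific)
improved = rmse(observed, pred_all_plots)
print(baseline, improved, rmse_reduction_pct(baseline, improved))
```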