Introduction

Yield data were one of the earliest forms of precision agriculture (PA) data available to growers and researchers, with Global Navigation Satellite System (GNSS) enabled grain yield monitors becoming commercially available in the early 1990s. The spatial variation displayed in these early grain maps was certainly a principal driver for the initial development of site-specific crop management. Grain yield mapping remains widely accessible to growers in mechanized production systems and is the only definitive method for spatially auditing the actual at-harvest yield. Despite this, the adoption of yield mapping has been relatively slow (Bramley 2009; Fountas et al. 2005; Kutter et al. 2011; Say et al. 2018; Schimmelpfennig and Ebel 2011), and a recent survey in Australia (Bramley and Ouzman 2018) has identified that yield maps are considered to be under-used by the industry and growers. Despite this, Bramley and Ouzman (2018) reported that the adoption and use of yield maps was highly “valuable as a lever to gaining ‘buy-in’ to sensing and PA more broadly”.

There are several complementary reasons for yield data being under-utilised.

  1. (i)

    Yield data can be inherently noisy and there are a variety of operational and calibration conditions that contribute to the noise in the data (Leroux et al. 2018a; Lyle et al. 2013). Noisy data are often difficult to interpret when mapped, and while many approaches for data-filtering and mapping have been proposed (e.g. Lyle et al. 2013; Robinson and Metternicht 2005; Sudduth and Drummond 2007; Sun et al. 2013 among others), these have not been well translated into industry services for growers to date (Leroux et al. 2018b).

  2. (ii)

    Spatial yield patterns are also often affected by managerial as well as environmental effects. Local expertise is therefore often needed for correct interpretation, which requires a time investment from the grower and the service provider. Yield maps can also be potentially unstable, particularly in dryland systems, where differences in the amount and timing of precipitation often influences yield patterns (see Taylor et al. 2007 for examples).

  3. (iii)

    Increasing time-series may make interpretation more difficult, not easier, in the absence of clear agri-diagnostic tools to simplify the multi-seasonal yield data. Even if data are correctly processed and local expertise is available to interrogate the yield maps, adding additional layers each year/season can make long-term visual and statistical interpretation more difficult.

All these issues mean that interpreting yield patterns, and therefore deriving spatial decisions from spatio-temporal yield data, is rarely a straightforward process. In the first instance, growers would like to know where yield is temporally consistent; consistently high, low or average, and where yield response is more variable over time, i.e. inconsistent.

From an early stage of yield monitor data analysis, it has been recognized that the inter-annual temporal variability in yield patterns may be as important as annual spatial yield variability (McBratney et al. 1997). Based on per-pixel calculations, multiple studies in the late 1990s and early 2000s investigated temporal variation using local yield means and variance statistics on relatively short time series of yield maps (2–6 years) (Blackmore and Moore 1999; Blackmore et al. 2003; Li et al. 2007; Ping and Dobermann 2005; Ping et al. 2005; Stafford et al. 1996). These statistical approaches assumed that crop yield response and its spatial distribution are mainly affected by permanent soil characteristics, which leads to the same spatial yield pattern in each year (Blackmore et al. 2003). However, non-stable factors (e.g. climate patterns, crop type and rotation, soil water content, infiltration and crop water use) also have a significant influence on spatio-temporal yield variability and interact dynamically with spatially stable properties (e.g. soil texture, landscape position) (Basso et al. 2012). Such interactions impose a clear limitation on statistical approaches to analysing spatio-temporal yield variability. However, it should be noted that the analysis of the variability in multi-year yield data is not necessarily only limited to the coherence and stability of yield patterns. Spatio-temporal yield data has also been used to inform other decision models, such as drainage effects in irrigated corn (Delbecq et al. 2012), however, the models used in such approaches are not valid for yield pattern analysis.

Most previous research using yield data has focused on spatial analysis of yield and not on the temporal or spatio-temporal variability in yield data (Griffin et al. 2008; McKinion et al. 2010; Stafford et al. 1996). In part, this may be explained by a realization that the early temporal studies were data-limited and based on some assumptions of stability between crop types and rotations (Bakhsh et al. 2000; Kitchen et al. 2005; Ping and Dobermann 2005). These short time series were unlikely to capture the full range of potential crops, rotations and potential production climatic conditions. This limitation no longer exists and time series of yield data > 10 years and across multiple rotations are commonly available (e.g. Leroux et al. 2018b) when working with growers who were early adopters of grain yield monitoring in the 1990s and early 2000s. These longer time series form a more coherent and robust data set, which should facilitate a better understanding of temporal and spatio-temporal variation in production. It also permits temporal analyses to be targeted to specific conditions e.g. a specific crop type or specific climatic pattern (Leroux et al. 2018b. However, as noted above, longer time series pose a problem for visual interpretation. Interpreting 10–20 + individual yield maps is not an easy task, especially if the maps contain noise and artefacts.

To realize the potential of having long time series of high spatial resolution yield data, easy-to-use tools with modern data analysis techniques are needed to handle these data (Bramley and Ouzman 2018). Such tools must support high throughput computation and deliver interpretable and understandable results to producers and agronomists at a low computational cost (Bauckhage and Kersting 2013; Raj et al. 2015). Despite the recognized need, methods to simplify and interrogate observed (spatio-) temporal yield variation for site-specific decision-making are not available to growers. Recent papers have revisited this need: Leroux et al. (2018b) worked from a presumption that the yield data are cleaned but not interpolated, i.e. they exist as irregular points, and using a multivariate segmentation approach. Layton et al. (2019) developed a probability-based approach for identifying yield productivity zones, assuming that the yields corresponding to a given yield productivity zone behave similarly and therefore derive from the same probability distribution. Other studies have formed some assumption between crop vigour and yield and used multi-temporal satellite imagery to interrogate temporal variation in production (e.g. Boydell and McBratney 2002; Georgi et al. 2018; Tagarakis et al. 2013). These approaches tend to use image analysis techniques (or derivatives). When yield is correctly interpolated onto a regular grid, an image is generated. It is therefore possible to directly apply image analysis techniques to interpolated yield dat—either by assuming each map is a single-band image or each map is a band within a constructed multi-band image.

Generally, digital image processing and analysis in the remote sensing discipline uses simple or complex computer algorithms to reduce noise in data and prepare images for subsequent processing, information extraction and interpretation. Simple arithmetic pixel operations (e.g. data sums and differences, colour mixing, multispectral ratio and vegetation indices) and more complex pixel operations (e.g. spectral transformations) allow for image enhancement, feature extraction and pattern recognition by exploiting the resulting (sometimes synthetic) images (Albertz 2009; Sabins 1996; Schowengerdt 2007). In a multi-temporal context (e.g. satellite time series analysis), these spectral operations are highly suitable for the evaluation of spatio-temporal pattern variability and change detection (e.g. Blasch et al. 2015a, b). They should equally be suitable to the analysis of yield map time-series; however, such an approach has yet to be tested.

To address the limitation of commonly-used purely statistical approaches, the need for easy-to-use tools capable of the efficient handling of high-resolution yield data and long yield map time series, and the demand of interpretable and understandable results for growers and agronomists, this study proposes a novel pattern recognition-based method—the Multi-temporal Yield Pattern Analysis (MYPA)—to reveal the long-term variation in yield, expressed in categorical productivity-stability units (PSUs). A PSU is a relatively homogenous unit in terms of its relative productivity (yield) each year and the temporal stability of this relative yield. The overall goal is to develop an alternative method to purely statistical approaches, based on modern image analysis techniques. The specific objectives are: to i) detect erroneous data sets within the yield map time series, ii) synthesis the information within multiple yield maps into a single understandable and interpretable layer that indicates the relative yield response and its stability over a 10 + years period by using principal component analysis (PCA), statistical per-pixel calculations and k-means clustering, and iii) evaluate the hypothesis that the novel MYPA approach enhances insights into spatio-temporal yield variation compared to existing statistical approaches. To evaluate the hypothesis, the MYPA output was compared and validated to a statistical approach (STAT), adopted from the methods proposed by Blackmore et al. (2003) and Ping and Dobermann (2005).

Materials and methods

Sites description

Four fields were chosen for validation of the MYPA. The fields were characterized by having a relatively long time series of available yield data (10 + years). Two relatively large fields (Field 1: 80 ha; Field 2: 132 ha) were available from the cropping belt near Gurley (− 9.8204, 150.0314) in North-West New South Wales, Australia, and two smaller fields (Field 3: 21 ha; Field 4: 25 ha) from a lowland arable farm near Warkworth (55.3597, − 1.6485) Northumberland, UK. These provided two contrasting production systems on which to test the methodology. The Australian fields, with predominantly heavy cracking clays, are dryland production systems in a humid, subtropical zone with warm to hot summers (average annual maximum temperature of 26.1 °C) and summer dominant rainfall. The annual average precipitation is ~ 650 mm. Winter cropping systems are short (~ 6 months) and reliant on stored moisture rather than in-season precipitation. Wheat production is strongly water-limited and typically 2–5 ton ha−1 on average. The UK fields mainly contained loamy and light clayey soils with slow hydraulic permeability. The fields are in a cool-climate production system (annual average maximum temperature is 12.2 °C) with regular monthly rainfall dispersed throughout the year (annual average precipitation of 690 mm). The growing season for winter crops is longer (~ 9 to 11 months) and wheat yields tend to be 7–10 ton ha−1 on average. Yield is strongly temperature/radiation limiting, and despite the small difference in annual precipitation with the Australian sites, waterlogging is often more problematic than water-deficit due to the much lower levels of evapotranspiration in northern England.

Pre-processing of yield data and yield map generation

Yield maps were obtained for a 10 + year period for all four fields. Not every year had available yield data, either due to crop failure (Australian fields), fallow rotations, alternative crops (particularly potatoes in the UK) or poor yield sensor operation (missing data). Yield data from the Australian fields was not available after 2006 due to a reorganization of the field boundaries. If the yield data for a year was missing, nonsensical or not relevant, then it was not included in the analysis. The decision on which years to include, or not include, was subjective and involved some local knowledge. Table 1 lists the yield data from grain and seed crops that were considered suitable after discussion with the growers for mapping and use in the analysis.

Table 1 All study fields—descriptive statistics of yield data collected from each year, including crop harvested. All data expressed as ton ha−1

The following data pre-processing steps are based on the protocol of Taylor et al. (2007) for establishing management classes, developed for broadacre production systems at The University of Sydney. The yield data were filtered and cleaned to remove outliers and erroneous values before interpolation. Due to differences in mean yield between the Australian and UK fields and between canola and wheat in the UK, potential yield values were set to 0.1–10 ton ha−1 for all Australian fields and canola fields in the UK and to 1–20 ton ha−1 for UK wheat fields. Values outside these ranges were considered improbably low, as even bare areas in fields still record yield due to deconvolution within the combines (Whelan and McBratney 2000), or impossibly high based on known yield potentials and removed as a first step in data clean-up. Each field was then individually trimmed to values within ± 2.5σ (σ = standard deviation) of the field mean. This follows the simple filtering approach proposed by Taylor et al. (2007) that was derived using similar yield data sets. Interpolation onto a 5 m grid was done using block-kriging (20 m blocks) with a local variogram using the Vesper shareware (Minasny et al. 2005). Statistics of the filtered yield are shown in Table 1. It is recognized that the approach of Taylor et al. (2007) is simplistic and will not identify local spatial outliers and that other more sophisticated yield filtering approaches are now available (e.g. Leroux et al. 2018a; Vega et al. 2019). However, the use of local block kriging interpolation will provide additional smoothing of the data and has been shown to diminish the effect of local spatial outliers in the maps (Whelan and McBratney 1999). There are various ways that the yield data could be pre-processed, but the key points are that a well understood and robust method was used and that the same interpolated data will used as an input into the different methods of deriving yield patterns. The interpolation grid was kept constant for each year so that all yield predictions were co-located at the end of the process and the same yield maps were used for all comparisons. Georeferenced single-band raster data sets (saved in the GeoTIFF format) were generated for each year yield map using R (R Core Team 2018) and the R package “sp” (Bivand et al. 2013; Pebesma and Bivand 2005). These raster data sets were stacked to a multi-temporal layer stack for subsequent use as input data.

Multi-temporal yield pattern analysis workflow

The intent of the analysis is to synthesis the information within multiple yield maps into a single layer as a final output/map.

The MYPA process uses the multi-temporal layer stack for both pattern detection and pattern stability analysis before determining yield PSUs. The four processing steps: (1) Data Preparation, (2) Pattern Detection, (3) Pattern Stability Analysis and (4) Delineation of PSUs are described in the following sections. The MYPA final output is a map representing yield PSUs (Fig. 1a) that indicates the relative yield response (e.g. high, low or medium) and how stable that response is (i.e. the certainty associated with it). While the final output is a single image, intermediate outputs are also generated and can also be interrogated for relative yield only or stability only. The process is designed to be as automated as possible once a decision on suitable yield maps (and correct yield interpolation) has been performed. Overall, the MYPA algorithm was implemented in R (R Core Team 2018)—a free software environment for statistical computing—using specific R packages for spatial and image analysis and raster/vector data such as “cluster” (Maechler et al. 2018), “gstat” (Pebesma 2004; Graeler et al. 2016), “raster” (Hijmans 2017), “rgdal” (Bivand et al. 2018), “RStoolbox” (Leutner et al. 2018) and “sp” (Bivand et al. 2013; Pebesma and Bivand 2005), and basic R packages for plotting (“ggfortify”; Tang et al. 2016; Horikoshi and Tang 2016) and statistical analysis (“stats”; R Core Team 2018). Maps were generated in QGIS (QGIS Development Team QGIS Development 2009) and all plots in R.

Fig. 1
figure 1

Schematic illustrating the workflow and processing steps of a the proposed MYPA approach and b the conventional statistical-based STAT approach, including data preparation and key actions, to generate a final map

Data preparation

In order to analyse multi-temporal yield data from the same field, two main production constraints must be overcome: (a) the crop rotation effect and (b) the presence of erroneous data sets. The crop rotation effect refers to the varying absolute yield from year to year due to changing crop types within a specific crop rotation system or due to varying inter-annual weather conditions. Table 1 shows clearly that years with wheat or sorghum or canola have relatively high absolute yield values compared to years with chickpea (Australia: wheat: 0.3–8.8 ton ha−1, sorghum: 0.1–7.8 ton ha−1, chickpea: 0.1–2.6 ton ha−1; UK: wheat: 4.5–15.7 ton ha−1, canola: 1.8–6.3 ton ha−1). Generally, the physical soil properties (e.g. organic matter content, soil texture) of a field are relatively stable over the analysis period of 10 + years, if no major erosion events or extensive land use changes occur. Secondly, unusual or outlying data sets might exist in the time series. If these are erroneous due to specific management effects (e.g. one-off incorrect or unusual management decisions) and/or sensor/calibration errors, they must be eliminated from the time series in order to avoid “contaminated” results. However, unusual data sets originating from soil, weather, site-specific conditions or variable weather patterns should remain in the time series.

To ensure a yield map time series free of erroneous yield map(s) for subsequent processing, data normalization and semi-automated map outlier detection are performed on the layer stack. This ensures that maps will exhibit the same range of yield response, without any influence of different crop types or crop rotation effects (Blackmore 2000). The two main goals of the data preparation step are;

  1. a)

    Data normalization to permit comparison of yield from different crops with potentially very different absolute yield values (Li et al. 2007; Schenatto et al. 2017). For each layer (year), the yield maps were normalized to the maximum observed yield in that year (Table 1) using Eq. 1. This results in all maps having normalized yield values expressed between 1 and 2.

$$Y_{norm} = \frac{{Y_{{}} - Y_{min} }}{{Y_{max} - Y_{min} }} + 1$$
(1)

where Ynorm is the normalized yield pixel value, Y the original yield pixel value, and Ymin and Ymax the minimum and maximum pixel value from the same yield map within data set. A constant of 1 was added to the equation so that the output is in the range [1,2] and suitable for any type of subsequent analysis, i.e. it cannot contain zero values. The natural variation in yield originating from soil properties, other site-specific conditions (e.g. relief, hydrology) or weather are still preserved in the data.

  1. b)

    The identification and elimination of outlier yield maps from the yield map time series; years with unusual yield data should be identified and omitted before analysis in consultation with the grower/agronomist. However, this may not always be done properly. To validate the selected yield maps, a PCA was performed on the multi-temporal layer stack of all yield maps (see Yield Pattern Detection for PCA description).

Yield pattern detection

To detect spatio-temporal yield patterns, a transformation using PCA (R package “RStoolbox”; Leutner et al. 2018) was used. The normalized yield maps form an image where each yield map layer is one spectral band in the image and each pixel is considered as a discrete point with n years of yield data. The PCA transformation—a linear transformation of the (usually) correlated original spectral bands—is a common complex pixel operation in image analysis. It allows the analysis of highly correlated multi-dimensional data to remove redundant information, isolate noise components and reduce dimensionality without significant information loss (Panda et al. 2010). By rotating original spectral bands to a data variance maximum, the uncorrelated output bands (principal components—PCs) allow for the identification of patterns in the data, here yield data. The PCs highlight similarities and differences between bands (maps) to reveal the underlying dimensionality in the multi-band image. They also form a method of compressing the original image data (Abdel-Kader 2011; Mulla 2013; Smith 2002). This makes PCA a powerful tool for pattern recognition. Compared to the classic statistical approaches, as proposed by Blackmore et al. (2003), the PCA approach is free of any one or more factorial assumptions regarding inter-annual repetitiveness of yield response and its spatial distribution.

Following the PCA, plots of the first and second PCs (PC1 and PC2) and the eigenvectors associated with each layer (yield map) can be used to quickly and easily identify potential issues. If an outlying map is detected, the reasons for it being an outlier are investigated. If it is considered to be erroneous (e.g. due to specific management effects, sensor/calibration errors), it is removed, usually after consultation with an expert with local field knowledge, such as the grower or agronomist. If data is removed, the PCA is repeated on the remaining layers until a stable solution (no clear outlying maps) is found. If a good selection process was employed in the pre-processing, then it is very plausible to have no outliers identified and no data removed at this step. This process is considered useful and necessary because it permits spatial differences in yield response to be simply expressed as a graph. The PC1 vs PC2 graph may help to identify issues not immediately obvious from a visual inspection of the maps. It also permits the user to err towards keeping, instead of omitting, maps in the initial selection of the yield data in the knowledge that truly unusual maps will be removed at this step.

Once a stable outcome from the PCA is achieved, the band (normalized yield) and PC values were extracted for each pixel, a linear regression performed for all combinations of years and PCs and the coefficient of determination (R2) computed. PCs with at least a moderate relationship (R2 ≥ 0.5) with at least one year of data were selected for the final processing according to the rationale that if over half the variation in any one yield map was explained, then the PC was relevant for management and must be included.

Temporal pattern stability analysis

The temporal stability of the normalized yield response was determined by using statistical per-pixel analysis on the layer stack. For each pixel, the standard deviation (σ) of normalized yield was calculated across the multi-year stack. This generated a single final image of the σ of yield response, i.e. the relative stability of normalized yield response, across all analysed field-years.

Yield productivity-stability units delineation

The best performing PCs (§ Yield pattern detection) selected for yield productivity and the overall stability σ (§ Temporal pattern stability analysis) were subsequently stacked into one final layer stack. The k-means clustering algorithm (R package “cluster”; Maechler et al. 2018) was used to find groups (clusters) in the unlabelled data of the final stack. Briefly, the k-means clustering method aims to allocate the pixel values of desired data layers (here: various PCs and σ) into k-classes (clusters) in order to minimize the within-class variability and maximize the differences between the means of the k-classes (Hartigan and Wong 1979). Hard and soft (fuzzy) k-means clustering are widely used approaches for management class/zone delineation in PA (e.g. Guastaferro et al. 2010; Lark and Stafford 1997; Moral et al. 2010; Morari et al. 2009; Ortega and Santibáñez 2007; Pedroso et al. 2010; Ping and Dobermann 2005; Ping et al. 2005; Rodrigues Junior et al. 2011; Tagarakis et al. 2013; Taylor et al. 2007; Van Meirvenne et al. 2013). Easy-to-manage and coherent management units are desired for site-specific management (Tisseyre and McBratney 2008). Statistical approaches to calculate the optimum solution from fuzzy k-means classification algorithms exist (Boydell and McBratney 2002; Córdoba et al. 2013), but an agronomic solution for the optimum number of management units remains somewhat subjective. From the final layer stack, a 2-, 3-, 4-, 5-, 6- and 7-class solution from the k-means classification was performed to produce final PSUs options. In order to identify a preferred value for k, a validation step was performed using the zonal opportunity index derived from McBratney et al. (2000). This index is based on the result of a multivariate analysis of variance (MANOVA) and the number and size of discrete PSUs generated from the 2–7-class solutions (details given in § Validation of final map quality and comparison of methods). Once the best solution was identified, PSU mean values of all yield data sets (excluding removed erroneous data sets), selected PCs, and the overall stability (σ) were calculated for each PSU. These PSU statistics were used to assign a descriptive label to each specific PSU indicating both productivity and stability, such as (a) high productive and stable, (b) high productive and unstable, (c) low productive and stable and (d) low productive and unstable. To assign the labels, all PSUs were first ranked according to their average yield response across all years and then secondly, the overall stability (high or low) of the two highest yield response clusters were analysed and described. For example, if for a specific cluster, the majority of yield data sets showed a high zonal mean yield response and a low σ, it was assigned to the high productive and stable unit, if the majority showed a high yield response and a higher σ than the neighbouring PSU, it is the high productive and unstable unit.

Statistical approach for multi-temporal yield data analysis

As the intent is to propose an alternative approach to multi-temporal yield analysis, the MYPA output was compared and validated against a purely statistical approach (STAT), derived from the method of Blackmore et al. (2003) and Ping and Dobermann (2005), applied to the same initial data set.

The STAT approach differs in two main areas, i) the visual outlier detection in the data preparation step, such that the final selected yield data stacks could differ between the two approaches and ii) the yield pattern detection step that does not use any form of data compression or translation (PCA) (Fig. 1b). It only uses a statistical simplification to calculate the pixel mean (µ) across the normalized yield maps. All other processing steps were kept equal to the MYPA approach in order to maintain comparability. Consequently, the σ layer is the same as the layer used for the MYPA approach (§ Temporal Pattern Stability Analysis). A two-layer stack, comprising the µ and σ layers, was generated to delineate PSUs. To make a direct comparison between both approaches possible, the PSUs were derived using the same k-means classification as the MYPA approach (and again with a range of 2–7 for k) and validated the same way (§ Validation of final map quality and comparison of methods) to assess the STAT PSU outputs. Linear regression analysis was used to reveal the relationships between the individual year yield data, individual PCs and µ.

Validation of final map quality and comparison of methods

To validate the quality of the PSU map outputs and statistically compare which approach delineated more feasible PSUs (PSUMYPA vs. PSUSTAT), the following three methods were used: (a) MANOVA, (b) descriptive statistical analysis and (c) zonal opportunity index. The zonal opportunity index is also a key component of the cluster validation step during the PSUs delineation procedure. It allows for the identification of the best class solution (i.e. the most suitable number of PSUs) and the corresponding k-level (§ Yield productivity-stability units delineation).

Multivariate analysis of variance

A MANOVA analysis was performed with the yield data as the dependent variables and the derived PSUs from the two different approaches (PSUMYPA and PSUSTAT) as the independent variables. MANOVA uses discriminant analysis to reduce the multi-temporal yield data to a single variable that is then related to the yield units of both approaches (Olson 1974). It has been shown to be effective in assessing the quality of zoning in agricultural fields (Ping et al. 2005; Taylor and Whelan 2011; Uribeetxebarria et al. 2018). Pillai’s Trace, with a higher value indicating a better model fit, and Wilks’ Lambda, with lower values indicating better model fit, were generated to determine how well the MANOVA performed. The MANOVA analysis was conducted in R.

For each field the MANOVA outputs for PSUMYPA and PSUSTAT were compared and ranked between each level of k, with a score of 1 assigned to the method with the higher Pillai’s value and the lower Wilks’ lambda. The mean ranks for both statistics were derived across all k-levels for each individual field. This provided a summary of which method, PSUMYPA or PSUSTAT, performed best according to the MANOVA output across all possible outcomes (k[2,7]).

Descriptive statistics of yield productivity-stability units

The final PSU raster maps across all levels of k for both approaches were converted in R into a polygon map and the area of each polygon recorded. Descriptive statistics across the PSUs generated by each method, for each field were calculated. This included the total number of discrete zones generated, the number of zones > 0.25 ha and the mean area of all zones and mean area of zones > 0.25 ha. The choice of the 0.25 ha threshold was made as zones smaller than this are likely to be unmanageable in these cereal systems given the speed and width (often 24 m) of typical, current input applications. However, this threshold is arbitrary, and could be easily altered to accommodate a change in cropping type, e.g. if MYPA is applied into narrow row perennial horticultural systems, or a change in operational resolution of future machinery. From these data, the percentage area of the field that was considered ‘unmanageable’ was calculated. While the MANOVA tests how well the PSUs account for yield variance, these statistics relate to the technical opportunity to respond to the yield PSU patterning (Tisseyre and McBratney 2008). The presence of many small units would increase the uncertainty associated with variable-rate applications.

Zonal Opportunity Index

To identify the preferred output for each field from the range of potential k-levels [2,7], an adaption of the zonal opportunity index (McBratney et al. 2000), based in turn on the Akaike Information Criterion (AIC) (Akaike 1974), was used

$$O_{z} = 2z - n_{s} *ln\left( {1 - \varLambda } \right)$$
(2)

where Oz is the Zonal Opportunity Index, z the number of zones generated by each level of k-classification (i.e. discrete units not the number of classes), Λ (Wilks’ lambda) from MANOVA is a measure of the percent variance in dependent variables (yields) not explained by differences in levels of the independent variable (classes) thus subtracted from 1 to provide an estimation of ‘goodness of fit’ and, ns the number of (independent) data (errors) in the data set. Due to spatial auto-correlation in the data, ns< n (and n is the number of grid points in the interpolated data). For the comparison and ranking of k outputs within a field, ns will be constant. For this analysis, ns was set at 1000 for all fields, which is approximately 12% of the data in the smallest field (Field 3) and greater than the maximum z observed in the analyses (z = 755, Field 3, PSUMYPA, k = 7).

The Oz allowed the k-level outputs for both approaches to be compared within a field. However, the use of a constant ns value across the four fields restricted comparisons to within, and not between, fields. Approaches to estimating ns exist (see discussion of Taylor and Bates 2012), but, as the intent here is to identify the preferred solution within a field (and not to try to identify if one field outperforms another), the use of a constant was considered sufficient. The use of the form of the AIC permits the quality of the fit to be corrected against the number of discrete units (zones) generated, with higher numbers of zones penalized as this makes management potentially more difficult (Pedroso et al. 2010). The value of z can be altered to reflect either the total number of zones (polygons) in the output or the total number of ‘manageable’ zones (here defined as zones > 0.25 ha). For this analysis, both were used in a two-step process. Firstly, the Oz was calculated using the total number of polygons in the output to identify the k-level associated with the best PSUSTAT solution based on the whole data set. Then, for this selected level of k, the Oz was recalculated and compared for the PSUSTAT and PSUMYPA approach based on the number of ‘manageable’ zones (i.e. polygons > 0.25 ha in the output). This was termed zonal opportunity of manageable zones (Oz-man) and considered a more rigorous comparison as the intent is to identify PSUs that could be managed agronomically, and not to find a purely statistical solution. The percentage area associated with small, ‘unmanageable’ zones was also considered when comparing the outputs and, if it was > 10% of the total area, then the nearest k-level with an ‘unmanageable’ area < 10% was also considered in the analysis.

Results and discussion

Data preparation

The normalized interpolated yield maps for Field 1—Australia are shown in Fig. 2 (other normalized map series are presented in Supplementary Data S1a–c). Field 1—Australia has been chosen as an illustration as it contains maps that appear to have linear and unusual features (particularly 1997 and 2006). The field had seven years of yield data between 1997 and 2006 with wheat, sorghum and chick-pea crops (Table 1, Fig. 2). The plotting of PC1 vs. PC2 and the eigenvectors (Fig. 3a) indicated that the 1997 yield map (Fig. 2a) was an outlier in the yield map stack. In 1997, the northern part of the field had a very low yield response relative to the southern part that does not repeat in any other year. This was due to different management in the northern section creating lower-yielding conditions. As this management was a ‘one-off’, and not intended to be repeated, this information is not relevant to future management. The 1997 yield data was removed and the PCA repeated, which showed no remaining outlying years (Fig. 3b). However, there is some management effect in the Sorghum 2006 data (Fig. 2g). The 2006 yield map was similar to the Wheat 2004 map (Fig. 2e) in the attribute space (Fig. 3b). This map could have been removed based on local opinion and an apparent observable management effect (Fig. 2g). However, since statistically it was not mapped as an outlier, the decision was made to include it in the subsequent analysis. The PCA plots of the other three fields did not indicate any outlying yield maps (Fig. 4).

Fig. 2
figure 2

Field 1—Australia: Normalized yield maps with varying crop types (Sorghum, Chickpea and Wheat) including the outlying yield map (wheat 1997)

Fig. 3
figure 3

Field 1—Australia: Eigenvectors (red arrows) and data values by PC1 and PC2 (black points)—a of all yield data sets after first iteration; b of remaining yield data sets after second iteration of the outlier detection process (Color figure online)

Fig. 4
figure 4

Eigenvectors (red arrows) and data values by PC1 and PC2 (black points) of all yield data sets after first iteration: a field 2—Australia; b Field 3UK; (c) Field 4UK (Color figure online)

Yield pattern detection

From the PCA, the individual PCs were extracted and mapped. The first PC extracted a pattern that best describes the dominant pattern across all years. This pattern was then removed, and the second PC found the dominant pattern in the remaining data and so forth. Again, Field 1—Australia is used as an example to visualize the individual PCs arising from the PCA (Fig. 5) (other PC map series are presented in Supplementary Data S2a-c). Table 2 shows the R2 values associated with a linear regression between the individual PCs and the individual yield maps for all 4 fields as well as the percentage of total variation explained by each PC. The R2 values between the annual yields and the mean yield map (µ map) from the STAT method are also shown.

Fig. 5
figure 5

Field 1Australia: Maps of the first 6 Principal components showing the spatial patterns associated with each PC (af are respectively PCs 1–6)

Table 2 Model fits (R2) for the linear relationship between normalized yield (dependent variable) and the PCs/µ (independent variable)

The PC maps tended to show an increasing trend in short-range variation as PC number increased. This is typical as the first PCs tend to extract the larger trends and patterns in the data, leaving the later PCs to explain more random effects and a much lower percentage of the overall variation (Table 2). The first two PCs explained 76-81% of the yield variation in each year (Table 2).

Using the PCA-based pattern recognition approach, relatively good relationships for individual yield data years were found with individual PCs, however, i) not all the PCs were found to have a good relationship with a yield year and ii) different fields had different PCs that exhibited a good relationship with individual year yield maps (R2 > 0.5). This threshold value (0.5) for a good relationship could be altered to be more or less strict in the selection of PCs. The rationale here was that if over half the variation in a yield map was explained, then the PC was relevant for management. It also meant that for any one yield year, only one PC could be selected as representative for that year. Using this threshold, there were between one and four PCs selected per field (Table 3). As expected, PC1 was always selected, and was related to multiple years in all fields. Fields where the yield maps differed considerably between years, tended to have PCs that related to individual years and more PCs chosen. Conversely, Field 2, with a very similar year-to-year pattern, only had PC1 selected (maps in Supplementary Data S2a).

Table 3 The Principal Components (PCs) selected in the yield pattern step to be used for clustering in the MYPA methodology (derived from Table 2)

The relationships between the µ map and the individual year yield maps (Table 2) were more varied than for the PCs. For example, Field 3—UK had 4 PCs selected, but the µ map did not have a R2 > 0.5 with any year. In contrast, Field 2—Australia, had 5 years related to µ and only 4 years related to a PC. The µ maps for each field are shown in Fig. 6. The mean pattern tends to follow the PC1 pattern (Fig. 4 and in Supplementary Data S2a–c).

Fig. 6
figure 6

Detected mean yield pattern by the STAT approach for each demonstration field: a Field 1Australia, b Field 2Australia, c Field 3UK and d Field 4UK

Temporal pattern stability analysis

To evaluate the temporal-spatial stability of yield, statistical pixel-wise analysis using σ over time as a measure was applied on the layer stack of the remaining normalized yield data. This pattern stability evaluation method enabled the identification and visualization of yield variability in space and time for all four fields (Fig. 7), where a high σ per pixel indicated high variability (unstable yield pattern) and low σ per pixel indicated low variability (stable yield pattern). Using Field 1—Australia again as an example (Fig. 7a), there was a highly stable area in the south-western part of the field and moderate to highly unstable areas along the field boundary, especially in the northern and central-eastern part (denoted “A”). This area of instability was associated with a small gully and an area of heavier clay soils in the field. In drier years, this area had better soil moisture availability, generating higher yields. However, in ‘wet’ years, its landscape position and heavier soil type was susceptible to water-logging effects. It tended to be either an area of relatively higher or lower yield.

Fig. 7
figure 7

Temporal stability of yield pattern as σ of all normalized yield maps, showing the spatial yield variability over time for each demonstration field: a Field 1Australia, b Field 2Australia, c Field 3UK and d Field 4UK

Yield productivity-stability units delineation

In order to produce information layers to support management decisions (especially risk management), the derived yield pattern layers (Fig. 5, Supplementary Data S2a-c and Fig. 6) were merged with the stability layer (Fig. 7). All possible solutions (k = [2,7]) for both the MYPA and STAT approaches were derived. An example, using the k = 4 solution for Field 1, is shown in Fig. 8 for both the MYPA and STAT approaches. The outputs for all fields and all levels of k were passed to the validation and comparison analyses to identify the optimal output.

Fig. 8
figure 8

Examples of the final PSU maps derived for Field 1 using k = 4 for both the MYPA (a) and STAT (b) approaches. Maps are shown in raster format. Comparisons between classes within a map were used to assign labels (high/low productivity; stable/unstable response). The mean annual yield, overall mean yield and mean PC response are given in Supplementary Data (S3a and S3b)

Within Fig. 8, similar broad patterns were observed for the two methods, but clear differences were also noticeable. At this stage it is unclear which is preferred, but the key observation is that there are differences in the two outputs (see also Fig. 10).

Validation of final map quality and comparison of methods

Multivariate analysis of variance

Visually, the PSUMYPA appears to be slightly more coherent, with larger more contiguous zones (Figs. 8 and 10). It may be expected that this increased coherence would come at a statistical price. However, the mean scores from the ranking of the MANOVA output (Table 4 and Supplementary Data S4) indicated that the MYPA model was statistically better than the STAT (lower Wilks’ Lambda and the higher Pillai’s Trace) in three of the 4 fields (Fields 1, 3 and 4). In these fields, the MYPA approach outperformed STAT for all values of k. For Field 2, which only had PC1 selected for the MYPA yield stability analysis, Pillai’s trace indicated that the STAT approach was optimal (but not for all k-levels) and there was no difference between the STAT and MYPA approach using Wilks’ lambda (mean score of 0.5).

Table 4 Mean scores for the performance of the MYPA vs. STAT calculated across all fields and all levels of k classes

Descriptive statistics of clustering

Comparing both approaches, for a given k-level, the MYPA approach generated more discrete zones (polygons) at lower values of k (≤ 4) in the larger Australian fields (Fields 1–2) (Fig. 9a). With higher values of k (> 4), the MYPA almost always produced fewer polygons than the STAT approach (Field 2, k = 7 is the only exception). Figure 9b shows the total area associated with small (considered ‘unmanageable’) polygons in each output. The smaller UK fields tended to have a more rapidly escalating percentage area associated with small polygons, but the trend for the MYPA to have less area associated with ‘unmanageable’ polygons was constant in Fields 3 and 4. At low k-levels, Field 1 is similar for the two approaches, but the MYPA had a consistently lower percentage area of ‘unmanageable’ zones at higher k-levels, while Field 2 showed little divergence between the two methods.

Fig. 9
figure 9

a The total number of discrete polygons (zones) created by k [2,7] classes using the MYPA (solid bars) and STAT (hatched bars) methodologies in four different fields and, b the total percentage area of small (‘unmanageable’ polygons < 0.25 ha) in each field for both methodologies (MYPA solid lines; STAT dashed lines)

Zonal Opportunity Index

Results for the calculation of Oz for all values of k[2,7] for the two methods (MYPA and STAT) in the four fields are shown in Table 5 (full details of variables used in the calculation are given in Supplementary Data S5). For Field 1 and 2, the larger Australian fields, lower values of k indicated a better fit for the STAT method, while higher values showed a preference for MYPA. However, Field 2 was very similar in Oz responses at all levels of k. Fields 3 and 4 indicated that the MYPA was consistently the better performed method based on the Oz metric. The best performed k-level is indicated in bold in Table 5 for both STAT and MYPA. Although a small sample size (four fields), the optimum k-level for MYPA was equal to or less than the STAT optimum k-level in all cases. For the four fields, the k-level associated with the best performed STAT model was selected and used for a direct comparison between the MYPA and STAT output (Table 6).

Table 5 Zonal Opportunity (Oz) calculated for 6 levels of classification (k[2,7]) using both the STAT and MYPA methods for generating Productivity-Stability Units (PSUs)
Table 6 Zonal opportunity of manageable zones (Oz-man) calculated for the best performed k-level from the Oz STAT analysis in each field (see Table 5)

When the zonal opportunity was recalculated using only zones of a manageable size, i.e. z = number of zones > 0.25 ha, a more pronounced outcome was achieved in favour of MYPA (Table 6). Despite selecting the k-level associated with the best performed STAT method in Table 5, the MYPA outperformed STAT in 3 of the 4 fields (lower Oz-man). Field 1 switched from having a lower STAT Oz to having a lower MYPA Oz-man. For Field 2, the STAT Oz-man was lower, but the difference between the two methods was much closer, with similar Oz-man and percentage area values. For the other three field (Fields 1, 3-4), the lower MYPA Oz-man was also associated with a smaller percentage of ‘unmanageable’ area, i.e. more of the field was manageable and the zoning was preferable using MYPA. For Field 3, the analysis was redone using k = 3 as the preferred result (k = 5) from Table 5 had 14.9% of the field area in zones < 0.25 ha for both methods. A maximum of 10% was set as an acceptable threshold, but this is arbitrary and should be adjusted to suit the target production system. It is noted that the levels of k selected in this process (3-4) reflect other work that has indicated that a 2-4 class solution tends to be optimal when classification is applied to spatial data sets (Pedroso et al. 2010).

Figures 8 and 10 illustrate that the two approaches generate similar patterns although there are some differences. In particular, the MYPA approach tended to generate larger and more coherent units, which translated into fewer and larger zones when converted into polygons and less area in small ‘unmanageable zones. This is a desired result, as the more coherent the zoning the fewer the changes and decisions that need to be made when performing differential (variable-rate) operations. There is a cost and a level of uncertainty associated with any rate change (Tissyre and McBratney 2008). The smoother zoning is particularly evident in the UK fields (Field 3-4) and indicated that the MYPA pattern is more amenable to site-specific management and technical constraints (Tisseyre and McBratney 2008), although a formal analysis of this has not yet been done.

Fig. 10
figure 10

Final PSU maps derived for Field 2, 3 and 4 using the optimal k-level solution for both the MYPA (ac) and STAT (df) approaches. Maps are shown in raster format. Comparisons between classes within a map were used to assign labels (high/low productivity; stable/unstable response). The mean annual yield, overall mean yield and mean PC response are given in Supplementary Data S6ab–8ab

Field 2 exhibited very little difference (visually and statistically) between the two approaches. This is the most temporally uniform field of the four. It had the highest percentage of variance explained by PC1, and PC1 was the only PC that was selected. This indicated a stable spatio-temporal pattern. In addition, 5 of the 6 individual years had a R2 > 0.5 with the mean yield map (Table 2). Under these conditions there appears to be no advantage to the MYPA method. Using a straight statistical (µ and σ) approach to yield zoning is just as effective in temporally stable systems. This, however, is not usually the norm in dryland systems, as indicated by the other three fields where the MYPA approach was better.

Field 1 presented the biggest deviation between the two maps, particularly in the unstable area “A” identified in Fig. 7a that was classed as stable with the MYPA approach and unstable in the STAT approach. As described previously, this area was characterized by deeper, clay soils that tended to yield the highest in a normal year (e.g. 1998, 2000) but were prone to waterlogging in ‘wet’ years, with significant in-season rainfall (e.g. 2004), and had a relatively lower yield in those years. In the ‘wet’ years, the shallower, lower clay soils in other areas of the field tended to improve their yield. However, while the relative yield was lower in the “A” area in 2004, the actual yield tended to be fairly stable, and it was actually slightly higher in this area in 2004 compared to 2000. The MYPA has identified that this area was stable in absolute terms, if not in relative terms between normal and ‘wet’ years.

Further discussion

Using the Oz statistic and the MANOVA outputs, the MYPA explained more of the yield variance and generated larger, more coherent zones than the established STAT approach. Furthermore, it offered an opportunity to assess the quality of the yield maps being used, unlike the STAT approach. However, while the Oz provides a metric for ranking and comparison, it is not intended to be an absolute statistic. It should be used with some degree of interpretation. In this study, it was used in conjunction with information relating to the percentage field area that was described by small, unmanageable zones (< 0.25 ha for these fields) and the patterns and mean area of the PSU zones. The MYPA was particularly more effective when only areas of the field that could be differentially managed were considered.

The MYPA is semi-automated; with the specification of fixed threshold values, associated with the selection of yield maps and of PCs, it could be fully automated. However, in the form presented here it invites the user to verify and engage in the PSU process. The STAT approach could equally be fully automated, but provides no options to vary the yield pattern analysis to cope with different types of yield data, e.g. arable cereal yield vs. field horticulture root crops vs. vineyard yields vs. tree crop yields etc. Different types of production have different effects and limitations with their yield data and, while the interpolation process can smooth some of these differences, it is not always intuitive to use a one-size fits all approach to analysing very different time-series of yield data.

The PSUs presented here are not potential management units (or zones), they are yield units. They could be used as management units; however, this is not recommended. The PSUs are intended to simplify and synthesize time-series of yield data. As such, the PSU maps could be used in the derivation of management units or for on-the-go differential management, but they should be used in conjunction with other relevant and available information, e.g. spatial soil data, maps of canopy vigour etc.…, when making agronomic decisions. The intent with this MYPA workflow is not to generate management units but to present a simpler, more coherent yield data layer that growers can interpret and work from (instead of a long time-series of yield maps) and to do this in a logical and quasi-automated fashion.

The MYPA primarily presents a new approach to assessing productivity patterns. Yield stability uses the same method, the σ over time at a point, as the STAT. The σ was considered a good method to assess stability. However, PSU delineation may benefit from alternative methods of assessing yield stability and is a potential area for further development.

Conclusions

In this study, a novel pattern recognition-based method—Multi-temporal Yield Pattern Analysis (MYPA)—was proposed for multi-year yield data analysis in arable cropping systems as an alternative approach to purely statistical approaches. Applying pattern recognition techniques to long yield time series offered a high potential to reveal long-term spatio-temporally yield production units at the local scale. The MYPA method was designed free of assumptions of how both stable (e.g. permanent soil conditions) and non-stable factors (e.g. weather pattern, crop type and cycle) affect yield response and temporal variability. This approach synthesizes the information within multiple yield maps into a single layer that indicated the relative yield response and how stable that response was across a 10 + year period. Image analysis techniques, principal component analysis (PCA) and k-means clustering, were demonstrated as useful techniques, capable of handling high-resolution yield data and long yield map time series efficiently. For the detection of erroneous yield data sets from multi-year yield maps, PCA was demonstrated as an efficient approach to identify years with management effects (e.g. “one-off” management decisions such as double cropping). Compared to more commonly used purely statistical analysis approaches, the MYPA approach generated larger, more coherent units and with these four examples presented a better statistical solution. The hypothesis that the novel MYPA approach enhances the insights into spatio-temporal yield variation compared to purely statistical approaches was demonstrated by multivariate analysis of variance (MANOVA) and a zonal opportunity index. The advantages of the MYPA approach appear to be stronger when yield data do not exhibit strong spatio-temporal patterns, i.e. yield maps are changeable between years. Overall, the MYPA demonstrated its potential high value as a robust approach that can potentially be encoded to easily produce interpretable and understandable information layers to support management decisions, especially for risk management. Further work will examine if this holds true with other yield time series, such as satellite biomass time-series, and how it compares to other emerging approaches (e.g. Layton et al. 2019; Leroux et al. 2018b).