1 Introduction

Warming of the global climate system has occurred in response to rising atmospheric concentrations of anthropogenic greenhouse gases (GHGs; IPCC 2014). Resulting climatic changes include an increased frequency of extreme weather events such as heat waves, droughts, floods, cyclones, and fires. These changes impact all aspects of our lives including agricultural systems and thus food security. A business-as-usual approach to global GHG emissions is projected to have “severe, widespread, and irreversible impacts” (IPCC 2014), and so management to reduce emissions is imperative. The agriculture, forestry, and other land use sectors generate a greater share of anthropogenic GHG emissions than any other economic sector (24%; Smith et al. 2014), and so has an important role to play in reducing emissions. The reductions from this sector can be achieved by reducing the emissions from different management practices, as well as by mitigating emissions of atmospheric CO2 from sequestering carbon (C) in soil and biomass.

Unfortunately, the effectiveness of different agricultural practices for reducing or abating GHG emissions can be variable (e.g., Lui et al., 2016; Du at al., 2017; Dumbrell et al. 2017; Feng et al. 2018; Huang et al. 2018; Meier et al. 2017), making it difficult for land managers and policy makers to identify widely applicable abatement actions. Practices that provide abatement in one location may not be effective in others (e.g., Lam et al. 2013; Powlson et al. 2016; Robertson et al. 2015; Sun et al. 2020; Trost et al. 2013). This can be caused by differences in local climate, through effects on biomass production and thus upon the amount of carbon returned to soil in residues (Meier et al. 2020; Robertson et al. 2015; Allen et al., 2013), or differences in local soil properties such as soil organic carbon (SOC), texture, and pH which affect processes of SOC sequestration and nitrous oxide (N2O) emissions (Barton et al. 2016; Charles et al. 2017; Huang et al. 2018; Rosace et al. 2020; Trost et al. 2013). Another consideration is that some practices may have contrasting effects on different GHG emissions. A relevant example is that reducing CO2 by increasing SOC can result in increased N2O emissions (e.g., Bos et al. 2016; Gregorich et al. 2005; Kumara et al. 2020; Mei et al. 2018; Meier et al. 2020; Trost et al. 2013; Xia et al. 2018). Such interactions are especially important where GHG abatement comes from storing C in soil, because the rate of SOC increase declines over time as SOC stores saturate. So, while many studies have quantified GHG abatement potential achieved through a single mechanism (e.g., increasing SOC or decreasing N2O emissions; Lui et al., 2016), there is little information on the net GHG abatement across multiple GHG and a range of practices each implemented at multiple sites with different soils, climates, etc.

Gaining such information from experiments is a daunting endeavour because of the resources required to run appropriate field experiments over the time scales (e.g., 25 to 100 years) required to properly assess GHG abatement. In the face of these difficulties, simulation models are useful tools (Smith et al. 2020; Lui et al., 2016). One example of the application of these models is the simulation studies undertaken of GHG abatement potential of multiple management practices (Fig. 1) possible at case study sites in some Australian grain production regions (Dumbrell et al. 2017; Meier et al. 2017; Palmer et al. 2017). Each of these studies determined the abatement potential of different practices for a small number of sites or limited area. The strength of this approach was its relevance to each case study through using data on local climate and soils, and having management practices tailored to these conditions. However, the practices providing the greatest abatement varied across the different sites reflecting the heterogenous environments and practices. For example, conserving rather than burning crop stubble (scenario 2, Table 1) resulted in contrasting GHG abatement in the case study sites at Dalwallinu (Fig. 1g) and Wimmera (Fig. 1e) because of differences in SOC sequestration at the two sites (Meier et al. 2017). The challenge now is to generalise across case studies such as these, in order to provide comprehensive insights into the practices that provide abatement across the diverse environments.

Fig. 1
figure 1

On-farm management practices with potential to mitigate GHG emissions: (a) retaining instead of burning crop stubbles, (b) application of animal manure, (c) use of cover crops to replace uncropped fallow (credit: Carl Davies), and (d) matching nitrogen fertiliser inputs to crop requirements. The mitigation benefits of these and other practices were assessed at eight case study sites located in the (e) northern, (f) southern, and (g) western grain production regions of Australia.

Table 1 Management practice scenarios simulated across sites and soils and associated predictors of GHG emissions (defined in Table 2).

As computing power increases, it is becoming common to run large numbers of simulations with complex cropping system models, with statistical techniques used to explore and/or summarise the output from these simulations (Godde et al. 2016; Gladish et al. 2019; Shahhosseini et al. 2019). Machine learning algorithms are well suited to finding patterns in large datasets, and random forest is a popular machine learning method for application to heterogeneous data (Breiman, 2001; Paluszynska, 2020; SAS 2022; Shaikhina et al., 2019). It is effective for evaluating data that have many variables (termed “predictors” within this methodology, e.g., annual rainfall, soil texture, fertiliser rate), many classes within a predictor (e.g., loam, clay, and sandy for a soil texture predictor), heterogeneous data types (e.g., continuous, categorical), data interactions (e.g., records with more than one treatment implemented), and small or imbalanced datasets. Regression and classification trees exploit this variability to group classes of predictors that are important for a targeted outcome; random subsampling from a population permits many such trees to be calculated. In the random forest approach, the ensemble of trees is analysed to build a model that identifies the most important predictors for an outcome (Breiman, 2001; Fawagreh et al. 2014). The approach is highly regarded for its accuracy and power as a data mining tool (Fawagreh et al. 2014), and has been used in several recent studies for prediction of SOC (e.g., Nabiollahi et al. 2019; Payen et al. 2021).

In this study, we used the random forest method to determine management practices that consistently provide GHG abatement across diverse environments of Australian grain production areas. The method was applied to simulated changes in SOC sequestration and soil N2O emissions from eight case studies sites (described in Section 2.2.1). To our knowledge, the random forest methodology has had little or no previous application to the problem of identifying practices for net abatement from changes in SOC and from non-CO2 GHGs such as N2O (Fawagreh et al. 2014; Liaw and Wiener 2002) despite the importance of considering multiple GHGs in the search for effective practices and policy for GHG abatement (e.g., Nong et al. 2021). We found that increased cropping intensity (through incorporating cover crops, increasing frequency of grains crops, or selecting crops with higher biomass in the rotation) was markedly and consistently of greater importance for predicting GHG abatement balance than any other predictor, so we focus the discussion on this predictor.

2 Materials and methods

2.1 The case study sites

2.1.1 Description

This study was based on soil, crop, climate, GHG emissions (from changes in SOC and soil N2O emissions), and gross margins from 15 GHG abatement scenarios (Table 1) simulated for eight case studies (Fig. 1e–g). Full details of the data and simulations are given by Dumbrell et al. (2017), Meier et al. (2017, 2020, 2022), and Palmer et al. (2017) and only an overview is provided here. The case study sites were located in different agro-ecological zones of Australian’s grain-producing areas, spanning diverse climates (subtropical, temperate, Mediterranean), soils (sands, texture contrast, clay), and crop management (summer and winter crops). There were up to 7 soil types at each of the sites.

The GHG abatement scenarios (Table 1) were chosen to (1) potentially provide GHG abatement and (2) fit within constraints to cropping system management at each site. Management practices in the scenarios included changes to crop stubble management, manure application and N fertiliser rates, as well as changes to the amount of crop biomass grown each year (referred to as changes to “cropping intensity”) achieved by using short-term cover crops or making alternative crop choices. The abatement scenarios were “tuned” for local implementation. For example, for sites with a Mediterranean climate where crops are grown in winter, an increase in cropping intensity could be achieved by planting a cover crop in summer (which would grown in years of adequate summer rainfall). By contrast, in northern Australian locations where rainfall is more evenly distributed throughout the year and both winter and summer crops can be grown, an increase in cropping intensity could include replacing bare fallows with a cover crop, increasing the frequency of grains crops to more than one per year, or selecting a crop potentially having larger biomass (e.g., replacing chickpeas with faba beans) in the crop rotation. The tailoring of scenarios to local practices meant that not all scenarios were simulated at all sites.

The management scenarios were simulated for all site-soil combinations with the Agricultural Production Systems sIMulator v7.5 (APSIM; https://www.apsim.info/). APSIM is a widely used and well-tested model being applied to and validated for diverse farming systems in many countries (Holzworth et al. 2014; Keating and Thorburn 2018) including Australian grain production systems (Hochman and Horan 2018). Specific case study site validations are described by Godde et al. (2016), Meier et al. (2017, 2020), and Palmer et al. (2017). All simulations were run for 100 years. Each 100-year simulation was repeated ten times, each starting in different years (giving 526,000 records). Outputs from the simulations were used to calculate annual GHG emissions and gross margins for each site-soil-scenario-start year combination. Gross margins were calculated as crop income less direct costs of crop production (Dumbrell et al. 2017). The GHG emissions consisted of (a) N2O emissions from the soil (0.0–1.0 m) and (b) CO2 associated with the change in SOC (0.0–0.3 m) (Meier et al. 2017).

The global warming potential (GWP) of N2O and CO2 was converted to the carbon dioxide equivalent (CO2e) mass using the 100-year GWP conversion factors of 3.67 for CO2 (CO2e-SOC) and 298 for N2O (CO2e-N2O), respectively (Myhre et al. 2013). The GHG balance in each year was calculated from the sum of CO2e-SOC and CO2e-N2O. GHG abatement was determined by comparing the emissions from the scenarios (S2-15, Table 1) to a high emissions baseline (S1). The average annual change in GHG emissions (∆GHG balance) and gross margins of each scenario compared to the baseline was determined over 25 and 100 years after the simulation commenced, consistent with the time horizons identified by the Intergovernmental Panel on Climate Change (Solomon and Srinivasan, 1995) and aligning with permanence periods for sequestered carbon in Australia under the Carbon Credits (Carbon Farming Initiative) Act 2011 (Cth) s. 86A (Austl.).

2.1.2 Predictors for GHG emissions

Independent variables (termed “predictors” in the random forest terminology) that potentially predict the ∆GHG balance were identified from the dataset for management practice scenarios and location-based characteristics. Four management-based predictors were derived, namely crop residue management, manure application, change in cropping intensity, and N fertiliser rate (Table 2). Climate and soil properties known to influence GHG emissions were also selected as potential predictors for GHG abatement because they can override the effect of management practices when compared between locations. The predictors were derived from the climate and soil data used in the simulations.

Table 2 Management-based and location-based predictors of GHG emissions.

2.1.3 Correction of data imbalance

The combined dataset was imbalanced because there were differences in the number of soils and scenarios simulated at each case study site, yet each site was equally important for representing the soils, climates, and scenarios in the study. For example, 40% of the simulated data came from the Liebe site compared with 2% from the Wimmera site. These differences needed to be corrected to avoid bias toward sites that were more frequently represented in data. The imbalance was corrected by additional sampling (“oversampling”) records from the underrepresented sites. This technique was used because we wanted to ensure that no predictive value was lost from any of the data. Loss of predictive value could have occurred if problems with imbalanced data were overcome by removing (“undersampling”) records from the overrepresented sites. Oversampling was achieved by increasing the number of records from underrepresented sites through random sampling, with replacement, until equal proportions of each combination was achieved. The oversampling was performed using the caret package (Kuhn, 2022) within the R software environment (R Core Team 2020).

2.2 Analyses

2.2.1 The random forest approach

The random forest approach (Breiman, 2001) was implemented using the randomForest package in R (Liaw and Wiener 2002; R Core Team 2020). In this analysis, two random forests were generated to predict ∆GHG balance averaged over the first 25 and 100 years after the simulation commenced (subsequently referred to as 25 and 100 years). The dataset was randomly divided into training (75%) and testing (25%) datasets, using the rsample package (Silge et al. 2022), to evaluate the predictive performance of the random forest models. The number of regression trees “grown” was set to 1000 and the importance of predictors then computed. Default arguments for the randomForest package in R were used including (a) the number of variables randomly sampled as candidates at each split, which was based on the number of predictors (12/3 = 4); (b) sampling of cases with replacement; and (c) setting the minimum size of terminal nodes to 5, to minimise computation time and avoid overfitting.

2.2.2 Statistics

Statistics for evaluating model performance

The capacity of each random forest to predict ∆GHG balance was evaluated against the testing dataset using the coefficient of determination (r2) statistic to indicate the proportion of data that fit the regression model.

Statistics for ranking predictor importance within random forests

The randomForestExplainer package (Paluszynska et al. 2020) was used in the R programming environment (R Core Team 2020) to calculate the following statistics for ranking predictor importance:

  1. (a)

    Increase in mean squared error

The increase in mean squared error statistic is the average increase of mean squared error of the random forest after a predictor is excluded (Paluszynska et al. 2020). This statistic is a relative measure of importance, and larger values of this statistic indicate that the predictor is of greater importance to the outcome of interest.

  1. (b)

    Times a root and node depth

This statistic refers to the proportion of the 1000 trees in the random forest for which a predictor was identified to split the root (i.e., first) node in the tree (Paluszynska et al. 2020). Predictors that split the data first provide the greatest difference between observations in the dataset.

A related statistic also used here is node depth, which refers to the average number of splits (nodes) at which the predictor occurs in the forest (the first split is node “1” and indicates highest importance).

  1. (iii)

    Increase in node purity

Increase in node purity refers to the decrease in the sum of squared errors accumulated whenever a specific predictor is chosen to split the data (Paluszynska et al. 2020). This is a relative measure and predictors with larger values are more important for predicting the variable of interest.

3 Results and discussion

3.1 Predictive strength of the random forests for ∆GHG balance

The random forest models for prediction of ∆GHG balance explained a high proportion of variation in the test data over both 25 and 100 years (r2 = 0.88 and r2 = 0.86, respectively). These values were acceptable given the high diversity amongst case study farms, so the models were relied upon to identify important predictors of ∆GHG balance.

3.2 Ranking of emission predictors

All statistics identified CropIntensity (cropping intensity) as the most important predictor of ∆GHG balance. The mean squared error (IncMSE) for prediction of ∆GHG balance for 25 and 100 years increased by 95 and 99%, respectively, when cropping intensity was excluded from the model (Fig. 2). This increase was two to four times that which occurred when other predictors of ∆GHG balance were excluded from the model for both time periods. Other statistics also confirmed the importance of CropIntensity for predicting ∆GHG balance for both 25 and 100 years (Table 3): CropIntensity was the predictor used to split the root node (nRoot) 325 and 348 times. This rate of splitting was at least 25% greater than for all other predictors and resulted in an average node depth (Depth) of 1.3 for cropping intensity compared with 1.7 and 1.8 for the next most important predictors. In a related outcome, the node purity statistic (IncPurity) was 3.3 times greater than for the next most important predictor when cropping intensity was selected to split data in the random forest model.

Fig. 2
figure 2

Increase in mean squared error (MSE Increase) when individual predictors of ∆GHG balance were excluded from the random forest. The predictors are defined in Table 2.

Table 3 Increase in mean squared error (IncMSE), times a root (nRoot), node depth (Depth), and increase in node purity (IncPurity) statistics for ranking the cropping intensity (CropIntensity) predictor compared to the next most important predictor for the random forest models predicting ∆GHG balance. Statistics are defined in Section 2.2.2 and presented for both 25- and 100-year simulations.

3.3 Effect of cropping intensity on GHG emissions

Cropping intensity (CropIntensity) was markedly and consistently of greater importance for predicting ∆GHG balance than any other predictor (Fig. 2), so it is valuable to examine the variation in ∆GHG balance and the effects of cropping intensity on the individual GHG that define ∆GHG balance. Mean ∆GHG balance decreased markedly (i.e., abatement occurred) over both 25 and 100 years when cropping intensity was increased (Fig. 3a), although there was a considerable range in ∆GHG balance across the different scenarios, sites, and environments (as indicated by the error bars). The abatement was driven by decreases in ∆CO2e for SOC emissions (i.e., increases in stored SOC; Fig. 3b) that offset the increase in ∆CO2e from N2O emissions (Fig. 3c). Average annual CO2e for SOC was lower for the 100 than 25 year results (Fig. 3b) because the rates of increase in SOC caused by higher cropping intensity declined over time as soils became increasingly saturated in SOC. Nevertheless, even over 100 years, decreases in ∆CO2e from SOC offset increased ∆CO2e from N2O emissions and resulted in net emissions reductions (negative ∆GHG balance). While increasing cropping intensity is a well-established practice for potentially abating GHG emissions (Basche et al. 2016; Kaye and Quemada 2017; Lui et al., 2016; Smith et al. 2014), we showed its contrasting effects on both SOC and N2O and that it has the potential to provide GHG abatement across diverse climates, soils, and crop management occurring the Australian grain production areas.

Fig. 3
figure 3

(a) Mean (circles) and standard deviations (± 1, vertical error bars) of changes in the simulated GHG balance (∆CO2-e) relative to a high emissions baseline under different crop intensities (defined in Table 2) over 25 and 100 years. The GHG balance was the result of (b) changes in SOC and (c) N2O emissions under the different cropping intensities. Positive ∆CO2-e values represent an increase in GHG emissions, negative values represent abatement. (d) Median annual change in gross margins (∆GM) in response to differences in cropping intensity relative to a high emissions baseline under different crop intensities.

The predictor CropIntensity was consistently more important for predicting ∆GHG balance across the cases studies than Nfertiliser (Table 3), despite inorganic N fertiliser use being an important source of agricultural emissions in Australian grain production systems (Commonwealth of Australia 2021; Mielenz et al. 2016; Sevenster et al. 2022). One reason for the relatively better performance of CropIntensity than Nfertiliser as a predictor was that all soils had relatively low SOC (0.5–2.2%; Meier et al. 2022) so there was potential for the increased biomass carbon returned to soils to be stored. The low SOC of many Australian agricultural soils has arisen because the decomposition rate of the original SOC has exceeded the rate of inputs of carbon from crops after the commencement of crop production (Luo et al. 2010). These soils therefore had the potential to build SOC in response to increased carbon inputs from biomass when the cropping intensity increased (Fig. 3a).

In comparison, the Nfertiliser predictor was relatively less important for predicting ∆GHG balance of this dataset because conservative N application rates are already used by many grain producers (Hochman and Horan 2018), and these practices were reflected in the scenarios. So even the scenarios with increased N fertiliser inputs (S3 and S5, Table 1) were relatively conservative and there was little potential to alter GHG emissions from decreasing N fertiliser inputs. The importance of the Nfertiliser predictor could change, however, if a greater number of emissions had been included when calculating GHG balances in the dataset, e.g., emissions embodied in the production and transport of N fertiliser (Sevenster et al. 2022).

Different results would also have been obtained if we considered some N fertiliser management scenarios well outside current common practice. The conservative N application rates common in Australian grain production limit yields, an outcome that farmers generally accept in order to limit the risk of not getting an economic return (through higher yields) on higher N rates in dry years (Hochman and Horan 2018). Higher N fertiliser rates will, on average, increase yields as well as N2O emissions. However, GHG intensity (i.e., emissions relative to yields) of grain production is lower in this situation (Sevenster et al. 2022). The differences between that result and that of our study, which aimed to identify practices that will reduce GHG emissions, illustrate the need for a clear objective when assessing the GHG benefits of different crop management practices or designing policies to support adoption of practices.

3.4 Effect of cropping intensity on gross margins

The increasing cropping intensity predictor included planting a crop at every opportunity in the cropping sequence. Although this produced larger abatement compared to scenarios with lower crop intensity, it was less profitable across the majority of site-soil-start year combinations (Fig. 3d), more so for the 25- than 100-year simulations. The result occurred because the increase in cropping intensity did not consistently increase yield and thus crop income, resulting in the average crop gross margins decreasing by $57 and $28 AUD ha−1 year−1 over 25 and 100 years. Limiting the implementation of cropping intensity practices to favourable opportunities (e.g., only sowing a cover crop when soil water is adequate) can increase average annual gross margins (Whish et al. 2009; Rose et al. 2022), but doing so reduces the abatement achieved.

Maximizing abatement from increased cropping intensity could be helped if farmers could access financial support from carbon markets to compensate for the generally reduced gross margins (Fig. 3d). In Australia, the Clean Energy Regulator administers the Emissions Reduction Fund to purchase carbon credits in exchange for evidence of GHG abatement (CER 2021). The “2021 soil carbon method” administered by the Fund explicitly includes cover crops in its methodology but does not include other ways of increasing cropping intensity, such increasing the frequency of grains crops or crops with larger biomass in the rotation where the environment permits. Thus, a broader definition of cropping intensity to include both the size and number of crops could increase the options for increasing SOC and hence GHG abatement.

4 Conclusions

Achieving GHG abatement in cropping systems requires identification and adoption of practices that are both likely to reduce emissions from various sources of GHG and be widely effective (Nong et al. 2021). Coupling the random forest approach with the locally specific results of case studies enabled us to identify, for the first time, practices likely to provide consistent GHG abatement across the diverse environmental and management conditions of Australian grain production areas. This outcome was remarkable considering that conditions at the case study sites ranged from deep sandy soils in low rainfall Mediterranean climates, to heavy clay soils in subtropical climates with double the rainfall of the Mediterranean sites. The random forest methodology facilitated a redefinition of the management practice scenarios to more general predictors that were common to all case studies. The redefined scenarios could have been used to evaluate GHG abatement in the case studies without a random forest. However, this is not an obvious approach for analysing case study results because the usual purpose of a case study approach is to examine research questions in a local context. By comparison, the methodology used in this study could both make use of the diversity of the case study conditions and provide statistics for the general importance of the different management practice predictors.

The clear finding from our study that increasing cropping intensity could generally provide abatement is consistent with results of experimental studies on practices providing GHG abatement (Lui et al., 2016). It also provides a clearer comparison of the benefit of increasing cropping intensity with other practices than is likely to be the case with experiments. Further, the method used in this study accounts for the contrasting effects of SOC sequestration and N2O emissions on GHG abatement, increasing the generality of the findings (Nong et al. 2021). We also found there was a trade-off between abatement and profitability from increasing cropping intensity, indicating that economic incentives are likely needed to support the widespread adoption of these practices. Current Australian carbon accounting methodologies do not recognise some of the ways to increase cropping intensity and changes to the methodology will be required for Australian grain farmers to earn carbon credits from these. This conclusion illustrates that our approach has the potential to benefit industry by providing information on abatement activities and policies. Similar application of the approach to other problems has clear potential to increase the return on research investment.

The study was based on simulations of the management practice scenarios at each of the case study sites and, as with all simulation studies, further testing is required to build confidence in the predictions. The results can guide efforts put into that testing. The study also only considered effects of SOC sequestration and N2O emissions on GHG abatement. It would be useful to consider other sources of on-farm (e.g., lime application) and off-farm (e.g., emissions embodied in production of nitrogenous fertilizers) emissions in future studies as is commonly done with life cycle assessments. Where those assessments are conducted through case studies (Sevenster et al. 2022), the machine learning approach used here, or similar, could help better generalise the case study results.