1 Introduction

Climate change is presumed to cause a significant impact on forest ecosystems during the XXI century (Kirilenko and Sedjo 2007; Lindner et al. 2008). In recent years, the uncertainties derived from forest growth prediction under climate change have given rise to an intense scientific production. In this regard, the development of growth–environment relationships is a preferred standpoint for adapting traditional empirical growth indicators to a changing climate (Fontes et al. 2010). As site index (SI) is the most popular growth indicator for even-aged forest management (Skovsgaard and Vanclay 2008), the development of site index-environment models has become a general research goal (Wang et al. 2004; Seynave et al. 2005; Monserud et al. 2006).

The models developed for this purpose are usually based on the “direct” prediction of SI as a function of climatic, edaphic and/or physiographic predictors by means of a certain regression technique. Admittedly, these models have been increasingly relying on machine learning approaches, which are becoming an attractive modeling alternative since: (1) they provide methods for automatic variable selection, (2) they usually allow for automatically capture nonlinear response–predictor relationships, and (3) some of them allow for automatically model interactions between predictors (González-Rodríguez and Diéguez-Aranda 2020). Reasons (2) and (3) become especially notable when we consider nonparametric learning techniques, such as the “rule-based” approaches, which have become a frequent resource for SI-environment modeling (Crookstion et al. 2010; Barrio-Anta et al. 2020; Watt et al. 2021).

However, as noted by (Sabatia and Burkhart 2014), there is significant controversy regarding the robustness and interpretability of the rule-based models. Concerning robustness, the high amount of parameters and the model complexity implied by the “nonparametric” approach can be relevant sources of overfitting. Additionally, several authors (Weiskittel et al. 2011; Sabatia and Burkhart 2014) found a significant trend to the observational mean in the predicted values resulting from rule-based SI-environment models. These results could be denoting a low extrapolability power for these approaches, which may be a concerning drawback for studies that aim at producing cartographic outputs of forest productivity at regional scales. Concerning interpretability, the model complexity of rule-based approaches may make response–predictor relationships hard to interpret (Aertsen et al. 2010). In this context, using these models for prediction without an adequate interpretation may lead to ecological inconsistencies regarding the theoretical basis of tree growth. Because of these potential disadvantages of rule-based models, parametric approaches that provide similar modeling benefits (automatic variable selection, nonlinear response–predictor relationships and interactions between predictors) have become an attractive line of research in growth–environment relationships modeling (Watt et al. 20152016; Zhu et al. 2019).

Moreover, the observed underperformance of many SI-environment models, including rule-based approaches, might not be entirely due to shortcomings in the chosen regression techniques but also to potential inconsistencies between SI and the environmental predictors used (as suggested by Bontemps and Bouriaud 2014). These potential inconsistencies might be revealing that the “direct” modeling of SI is not really the optimal approach for linking biophysical site attributes with the underlying ecophysiological processes of tree growth. The reason for this might be that the traditional primacy of SI as productivity indicator in even-aged forestry responds exclusively to practical reasons (Skovsgaard and Vanclay 2008), as it lacks any direct ecological meaning. For overcoming the uncertainties of predicting SI, some studies have successfully modeled it in an “indirect” way, usually relying on other growth indicators or parameters that have a more sounding ecological background. For instance, Swenson et al. (2005) mapped the SI of Douglas-fir forests in the USA as a function of the 3-PG process-based model outputs. In this regard, it is important to note that traditional height growth equations (Hossfeld 1822; Gompertz 1825; Richards 1959) already provide nonlinear parametric forms for describing growth processes over time, which are ecologically meaningful as they rely on theoretical assumptions regarding population dynamics and metabolic ecology. As these nonlinear functions are mainly driven by a certain set of growth parameters, it is reasonable to think that these parameters are significantly correlated with the environmental factors that determine tree growth. Considering that SI is an immediate corollary of any age-dependent height growth equation, analyzing the relationships between climate and these growth parameters might be a consistent approach for “indirect” SI-environment modeling.

Our primary objective in this study was to test whether SI could be effectively predicted using parametric approaches that relate ecologically meaningful parameters from growth equations with climatic factors. For testing this, we compared two different ways of predicting SI of Scots pine stands in the northwest of Spain. On the one hand, we performed a direct SI prediction using rule-based models, in a similar way to previous studies (Weiskittel et al. 2011; Sabatia and Burkhart 2014; Barrio-Anta et al. 2020). On the other hand, we proposed a new method for developing SI-environment models by combining simple parametric models with the nonlinear assumptions behind the Hossfeld growth equation.

2 Materials and methods

2.1 Stand height data

The source of height growth data was a network of permanent research plots established by the Sustainable Forest Management Unit (UXFS) of the University of Santiago de Compostela, Spain. Overall, these plots correspond to pure, even-aged stands of Scots pine (Pinus sylvestris L.) located in the provinces of Lugo and Ourense, in the region of Galicia, and consisted of plantations in communal forests mainly located in mountainous sites (Fig. 1). For this study, we used only the first measurements of the network, carried out in 1996-1997, in which stem analyses of dominant trees was carried out for a set 41 of the plots (González-Rodríguez and Diéguez-Aranda 2021).

Fig. 1
figure 1

Geographic extent and locations of the 41 Scots pine inventory plots where dominant trees were sampled

Diameter at breast height (dbh) of all trees and total height were measured in each plot. Core samples were also bored in order to count the growth rings. The stand age (t), the number of stems in a hectare (N, trees/ha), the basal area (G, m\(^2\)/ha) and the dominant height (H, m; mean height of the 100 largest-dbh trees per hectare) were calculated from the previous measurements. A summary of the stand variables of the 41 plots considered in this study is shown in Table 1.

Table 1 Summary of stand variables for the 41 plot measurements

For each plot, the SI was estimated using the algebraic difference equation developed by Dieguez-Aranda et al. (2006) for this region, which is based on the Hossfeld growth function (Hossfeld 1822). This was done by projecting the observed dominant height (H) at the measurement age (t) to the reference age of the species (\(t_{ref}\) = 40 years):

$$\begin{aligned} H_{2}= \frac{51.39}{ 1- \left( 1-\displaystyle \frac{51.39}{H_1}\right) \left( \displaystyle \frac{t_1}{t_2}\right) ^{1.277}}, \end{aligned}$$
(1)

being \(H_{2}\) the site index when \(t_2\) = \(t_{ref}\), \(H_1\) = H and \(t_1\) = t.

Stem analyses were carried out following the same procedure described by Dieguez-Aranda et al. (2005ab). In each of these plots, a subset of two dominant trees per plot (i.e., a total of 82 trees), with heights and diameters differing, respectively, less than 5% to H and D (dominant diameter of the stand), were destructively sampled. Then, five to ten stem slices per tree were extracted along the trees’ height in order to count the growth rings. This allowed for relating tree heights (h) with ages (t), which were corrected using the method proposed by Carmean (1972) and modified by Newberry (1991). From the h-t dataset obtained through stem analysis, we fitted the Hossfeld growth equation for each plot:

$$\begin{aligned} h(t)= \frac{a}{1+bt^{-c}}, \end{aligned}$$
(2)

being a, b and c the growth parameters for each plot.

Once these basic parameters were estimated, we calculated another six growth-related derived parameters from Eq. 2: the age, height and growth (\(t_{ip}\), \(h_{ip}\), \(g_{ip}\)) at the inflexion point of the h-t curve; and the age, height and growth (\(t_{mmg}\), \(h_{mmg}\), mmg) at the point of maximum mean growth in height. We selected both points in the growth curve because of their potential eco-physiological significance. An example of a fitted growth curve with inflexion and maximum mean growth points represented is shown in Fig. 2. The inflexion point is reached at the age of maximum growth rate, and corresponds to the second derivative of (2). It represents the turning point between “juvenile” growth and “mature” growth. The maximum mean growth occurs at the age when \(\frac{d}{dt}\left( \frac{h(t)}{t}\right) =0\), and represents the moment from which the height/age ratio starts decreasing. The values and the specific methods to calculate each one of the nine alternative growth parameters estimated are summarized in Table 2.

Fig. 2
figure 2

Scatterplot of observed height vs age for one of the dominant trees included in the stem analysis dataset. The line represents the height predicted by Hossfeld growth equation. Filled markers represent the inflexion and the maximum mean growth points

Table 2 Summary of statistics and calculation methods of the nine growth parameters estimated for the dominant trees of the subset of 41 plots. a, b, and c are the growth parameters from Hossfeld Eq. (2). \(t_{ip}\), \(h_{ip}\) and \(g_{ip}\) are, respectively, the age, height, and growth rate at the inflexion point in (2) curve. \(t_{mmg}\), \(h_{mmg}\) and mmg are the age, height, and growth rate at the maximum mean growth point in (2) curve, respectively

2.2 Climatic data

As a source of climatic data, we used the Worldclim 2 bioclimatic dataset (Fick and Hijmans 2017). This dataset consists of a collection of raster maps with a spatial resolution of 1 km of climatic historical means for the period 1970-2000. The variables included in the maps are 19 bioclimatic indicators often used in species distribution modeling because of its biological significance. Some of them represent annual trends, such as the annual precipitation (BIO12), whereas others represent differences between seasons, such as the temperature seasonality (BIO3), or extreme climatic events, such as the minimum temperature of the coldest month (BIO6). From these raster maps, we extracted the values of the 19 bioclimatic indicators corresponding to the 41 inventory plot locations, so each site was characterized by a SI value and a set of potential climatic predictors of forest productivity. A summary of the 19 bioclimatic variables proposed as potential predictors of SI is shown in Table 3.

Table 3 Summary of the Worldclim 2 bioclimatic variables corresponding to the 41 Scots pine plot locations considered in this study. BIO2 is calculated as the monthly mean of \(t_{max}-t_{min}\), being \(t_{max}\) and \(t_{min}\) respectively the maximum and minimum monthly temperatures. BIO3 is calculated as \(100 \frac{BIO2}{ BIO7}\). BIO4 is \(100 sd_t\), being \(sd_t\) the standard deviation of the annual distribution of daily tempreatures. BIO7 is calculated as \(BIO5 - BIO6\). BIO4 is the standard deviation of the annual distribution of daily precipitation

2.3 SI direct prediction

We used three different rule-based learning techniques for directly relating SI to the bioclimatic predictors: Random Forest, Boosted Trees and Cubist. The first and second approaches have been successfully used for SI prediction in other studies (Aertsen et al. 2010; Weiskittel et al. 2011; Sabatia and Burkhart 2014). In contrast, to our knowledge, the Cubist algorithm has never been used before in forest growth modeling. In order to provide a methodological background, we present hereunder a description of the basics and used procedures for fitting these techniques in the current study.

2.3.1 Random forest regression

Random Forest (Breiman 2001) is a rule-based ensemble technique that performs a bagging procedure for developing an unbiased collection of regression trees. At each split of each regression tree, a randomly selected subset of the predictors is used for defining the node, thus granting a unique structure to each tree and avoiding between-tree correlation. This technique has been previously used for SI-environment modeling in other studies (Weiskittel et al. 2011; Sabatia and Burkhart 2014; Barrio-Anta et al. 2020).

In this work, a Random Forest model was fitted using the R package randomForest (Liaw and Wiener 2002). For calibrating the number of trees, we fitted models with trees ranging from 100 to 10000 until we determined that 1000 trees were enough for the model results to be roughly stable, independently on the random seed. We calibrated the predictors’ subset size at each split, mtry, by trying different values ranging from 1/2 to 1/6 of the total amount of predictors. In order to enhance the stability of these fittings, we performed a repeated tenfold cross-validation. For selecting the least amount of necessary predictors, we carried out a recursive variable elimination (RVE) procedure similar to the one applied by Weiskittel et al. (2011). At each step of this procedure, the least important predictor (measured by the decrease in accuracy due to its removal from the model) was dropped off from the set. Then, a new model with the remaining predictors was fitted. This was repeated until there were only two predictors left. The best alternative along the RVE path was considered to be the one that provided a reasonably high predictive performance, subjected to have the least amount of predictors. The criteria we used for this purpose was to look for the model previous to a significant decrease in \(R^2\) (square of the Pearson’s correlation between observed and predicted SI) due to the removal of a certain predictor. Specifically, we selected the model for which dropping off one predictor produced a decrease in \(R^2\) higher than the 90th percentile of all the decreases distribution in the RVE path.

2.3.2 Boosted trees regression

Similarly to Random Forest, Boosted Trees (Valiant 1984) is a rule-based ensemble technique that combines multiple weak learners (regression trees) to enhance performance in the final prediction. Unlike Random Forest, Boosted Trees ensemble is hierarchized, being each additional tree a regressor of predictive residuals of previous trees.

In this study, we fitted a Boosted Trees model using the R package gbm (Greenwell et al. 2019). Firstly, we set the optimum values of the calibration constants through a 10 times repeated tenfold cross validation (maximising \(R^2\)). These constants were: number of iterations (i.e., number of trees), interaction depth (maximum number of levels of nested nodes in each tree), and shrinkage (penalization applied to each tree’s residual prediction). Once the calibration constants were defined, we performed an RVE identical to the one applied to Random Forest for selecting the minimum necessary number of predictors.

2.3.3 Cubist regression

Cubist (Quinlan 1992) is a rule-based technique that performs a boosting-like procedure for enhancing the predictive performance by nestedly re-predicting the residuals throughout a defined number of iterations usually known as “committees”. The significant difference with Boosted Trees is that in Cubist the fitted models for re-predicting residuals are not proper regression trees but a tree-like hierarchized ensemble of linear models. At each tree split, a linear model (which may not include the node predictor) is fitted for predicting the response. As the linear models at the branch ends of trees can make predictions outside the response observed range, a correction factor usually called extrapolation correction can be applied for controlling the coherency of the predicted values. For allowing this approach to be comparable to the Random Forest and Boosted Trees fitted models, we set the extrapolation correction as a fixed value of 100, which means that the predicted values will be strictly forced to be inside the observational range. Cubist algorithm also includes a nearest-neighbor parameter that allows for correcting each prediction based on similarity to observations used in the training set.

We fitted a Cubist model using the R package Cubist (Kuhn and Quinlan 2018). Firstly, we calibrated the number of committees and the number of neighbors for correction through a 10-times repeated 10-fold cross validation (based on \(R^2\) maximization). After this, we performed an RVE procedure similar to the ones applied to Random Forest and Boosted Trees. The importance of each predictor was measured by the usage rate, which indicates the frequency of each predictor as node-ruler or as linear explainer throughout the committees.

The RVE path of the three rule-based models fitted is shown in Fig. 3.

Fig. 3
figure 3

Recursive variable elimination paths for the three rule-based models fitted

2.4 SI indirect prediction

We proposed an alternative indirect methodology for SI prediction based on multivariate linear models. This methodology consisted of two stages: (1) predicting the nine growth parameters we previously estimated (a, b, c, \(h_{ip}\), \(t_{ip}\), \(g_{ip}\), \(h_{mmg}\), \(t_{mmg}\), mmg) as a function of the WorldClim 2 bioclimatic variables, and (2) to estimate SI as a function of the new climate-sensitive predictions of growth parameters (\(\hat{a}\), \(\hat{b}\), \(\hat{c}\), \(\hat{h}_{ip}\), \(\hat{t}_{ip}\), \(\hat{g}_{ip}\), \(\hat{h}_{mmg}\), \(\hat{t}_{mmg}\), \(\hat{mmg}\)).

For the first stage, we used stepwise regression technique using the R language package stats (R Core Team 2018) for carrying out a generalized least squares coefficients estimation (generalized linear model, GLM) and automatic variable selection. This technique has been recurrently used in forestry, and specifically for SI-environment modeling (Codilan et al. 2015; Tange and Ge 2020). For attaining model parsimony, we used Akaike’s information criterion (AIC), so the fitting procedure involved the minimization of the following loss function:

$$\begin{aligned} \mathrm {AIC}=k\cdot p+N\log {\mathrm {MSE}}, \end{aligned}$$
(3)

being k a penalty p the number of parameters in the model, N the number of observations, and MSE the mean square error. For calibrating k, we used a tenfold cross-validation procedure. Within this procedure, we chose the best k for each case that maximized the \(R^2\), submitted to provide significant coefficients at a minimum confidence level of 90%.

Regarding the second stage of indirect SI modeling, we developed different transformations derived from Eq. (2) for extracting SI out of the predicted values of the nine alternative growth parameters. From Eq. (2), we can estimate the site index as:

$$\begin{aligned} SI= \frac{a}{1+bt_{ref}^{-c}}, \end{aligned}$$
(4)

being a, b and c the previously fitted parameters for each plot. Considering this, we could also perform an indirect climate-sensitive SI prediction (\(SI_{abc}\)) as follows:

$$\begin{aligned} SI_{abc}= \frac{\hat{a}}{1+\hat{b}t_{ref}^{-\hat{c}}}, \end{aligned}$$
(5)

being \(\hat{a}\), \(\hat{b}\) and \(\hat{c}\) the growth parameters predicted as a function of the bioclimatic explainers.

Combining Eq. (4) with the calculation methods summarized in Table 2 allowed us for predicting SI from different sets of three growth parameters. We accomplished this by isolating a and b parameters from equations in Table 2 and substituting in Eq. (4). As a result, we developed six different methods for estimating SI from the nine alternative growth parameters predicted, which are summarized in Table 4.

As a “control” method for allowing a reliable comparison between direct rule-based and indirect parametric models, we also fitted a direct stepwise regression (GLM) for predicting SI as a function of the 19 WorldClim predictors using the same procedure described in this section for predicting the nine growth parameters.

Table 4 Summary of the equations used for indirectly predicting SI as a function of the ten growth parameters

2.5 Model evaluation

For testing the robustness of the fitted models we performed a bootstrap validation procedure based on the 632+ rule (Efron and Tibshirani 1997), which we already tested in a previous study (González-Rodríguez and Diéguez-Aranda 2020). We calculated the bootstrap error of each model fitted through a resampling set of 100 realizations per model using the R package boot (Canty and Ripley 2017). Then, we estimated the overall predicitive error of each model by doing a weighted mean of statistics between apparent and bootstrap performance:

$$\begin{aligned} \mathrm {MSE}_{632+}= (1 - w) \mathrm {MSE}_{training} + w \mathrm {MSE}_{bootstrap}, \end{aligned}$$
(6)

where MSE\(_{training}\) is the apparent mean square error (MSE), MSE\(_{bootstrap}\) is the mean bootstrap MSE, MSE\(_{632+}\) is the corrected MSE, and w is the weight parameter that accounts for the observed relative overfitting and the no information rate (i.e., the potential error if observed and predicted values were completely uncorrelated).

Once we carried out the bootstrap validation, we examined the plots of residuals to detect possible patterns of heteroscedasticity or regression to the mean in the fitted models. In addition, we assessed the role of each predictor within the SI model with the best validation performance in order to evaluate its ecological coherence. Considering the potential difficulty of directly understanding the behaviour of the fitted models, we based our interpretation on the visualization of standardized predictors against predicted SI values. To facilitate this task, we preferred to focus on LOESS fitted curves rather than directly analysing the point clouds. This interpretation method gave us an overview of the role of each predictor throughout the different site conditions that exist in the training dataset.

3 Results

A summary of goodness-of-fit statistics for the SI models fitted is shown in Table 5. Regarding model calibration, Random Forest had an optimum mtry value of 2/3 and Boosted Trees had optimum values of 20 for the number of iterations (trees), one for the interaction depth and 0.1 for the shrinkage penalization. The optimum constants for the Cubist model where 20 committees and nine neighbors for correction. The parameter estimates and p values of the stepwise models fitted for predicting the nine growth parameters are presented in the Appendix (Table 6). All the slope parameters estimated for these models were significant at least at 95% level of confidence. The average amount of predictors per models was three, having a maximum of seven for the \(t_{ip}\) model and a minimum of zero for the c model, which resulted a null model.

Concerning model performance, the rule-based approaches provided the highest apparent values of \(R^2\), having the Random Forest model the best performance (\(R^2\)=0.802), followed by Boosted Trees and Cubist. In contrast, the indirect parametric models had significantly smaller values of apparent \(R^2\), with a maximum value of 0.384 for the case of MMG1. The direct GLM had apparent performance slightly higher than MMG1 (\(R^2\)=0.414), but much lower than the rule-based approaches. Overall, the models fitted had a number of predictors ranging from five (MMG1, MMG2, MMG and Boosted Trees models) to nine (IP1, IP2 models). Considering our variable selection criterion, the three rule-based models fitted reached the maximum cross-validated performance using three predictors, which were, in all cases, BIO1, BIO2 and BIO3. The AIC-based variable selection for the direct GLM also produced an optimum number of predictors equal to three, which where BIO7, BIO8, BIO9.

Table 5 Summary of the number predictors and parameters, calibration constants and goodness-of-fit statistics of the models fitted. p is the number of predictors included in the model an R is the relative overfitting rate

Concerning bootstrap validation, rule-based models presented a very high relative overfitting rate, ranging from 54% to 72%. Among these models, Cubist resulted in the model with the best performance both in NRMSE\(_{632+}\) (0.206) and \(R^2_{632+}\) (0.285) and also with the lowest relative overfitting rate (54%). The direct GLM seemed extremely prone to overfitting, with a very low validation performance (\(R^2_{632+}\) = 0.136) and a relative overfitting rate reaching 80%. In comparison, the fitted indirect parametric models produced more variable bootstrap performances, with \(R^2_{632+}\) values ranging from 0.187 to 0.298, and relative overfitting rates (R) ranging from 0.311 to 0.581. Models ABC, MMG2, and MMG3 produced low performances but also had very high overfitting rates. Model IP2 had a poor apparent performance but also showed a low relative overfitting rate. Models MMG1 and IP1 were the ones with the best overall bootstrap performance, presenting a high \(R^2_{632+}\), a low NRMSE\(_{632+}\) and a moderate relative overfitting rate. Considering that the performance estimates of MMG1 were slighltly higher than IP2, we selected this model as the best parametric alternative for subsequent analyses. The final form of MMG1 model, after combining the corresponding equation in Table 4 with the linear models fitted was as follows:

$$\begin{aligned} SI= \frac{a_0 + a_1 BIO3 + a_2 BIO4 + a_3 BIO9}{1 + a_4 (a_5 BIO2 +a_6 BIO4 + a_7 BIO7)^{a_8}}, \end{aligned}$$
(7)

where \(a_{0}\)=-117.677 , \(a_{1}\)= 5.132701, \(a_{2}\)=0.159120 , \(a_{3}\)=-8.09051 , \(a_{4}\)= 0.00283399, \(a_{5}\)=46.4528 , \(a_{6}\)=1.34853 , \(a_{7}\)=-47.092005 , and \(a_{8}\)= 1.69057.

Plots of observed SI vs predicted SI and predicted SI vs residuals for Cubist and MMG1 models are shown in Fig. 4. After analyzing the model residuals of both approaches, we did not found any significant trace of heteroscedasticity. In order to support model interpretation, a plot of LOESS curves that represent predicted SI vs. standardized predictors for MMG1 model is presented in Fig 5.

Fig. 4
figure 4

Plots of: a observed SI vs predicted SI for MMG1; b predicted SI vs residuals for MMG1; c observed SI vs predicted SI for Cubist; and d predicted SI vs residuals for Cubist. The dashed line in plots a and c represents the linear trend of observed vs predicted SI, while the solid line in plots b and d represents the LOESS-smoothed trend of residuals vs predicted SI

Fig. 5
figure 5

LOESS curves of predicted SI vs standardized predictors for MMG1 model. Fore ease of visualization lines are represented in different styles (dark grey + dash-dot line: BIO2, black + solid line: BIO3, black + dashed line: BIO4, grey + dotted line: BIO9, and light grey + dash-dot line: BIO7)

4 Discussion

The fitted models explained from 10% to 80% of the SI variability. The apparent performance range of the rule-based models fitted is similar to the observed in other studies for Random Forest (\(R^2\)=0.68, Weiskittel et al. 2011; \(R^2\)=0.59, Barrio-Anta et al. 2020) and Boosted Trees (\(R^2\) from 0.44 to 0.64, Aertsen et al. 2010). Similarly, the performance of the parametric models fitted (\(R^2\sim\) 0.27-0.38) was on average near the spectrum that can be found in literature (24%-27%, Monserud et al. 2006, 31%-52%, Aertsen et al. 2010, 34%-42%, Sabatia and Burkhart 2014 of explained variability).

Among the rule-based approaches, Random Forest showed the best apparent \(R^2\). However, its bootstrap validation performance was very similar to Boosted Trees and slightly lower than Cubist, which, despite being the model with the lowest apparent performance of the three, resulted in the most robust alternative. Though the differences in \(R^2_{632+}\) between the three tested techniques might be too slight to undoubtedly conclude that Cubist is the most suitable rule-based technique for SI-environment modeling, its lesser tendency to overfitting should be considered an important advantage. However, even considering the better robusticity shown by Cubist, the relative overfitting rates found for rule-based techniques are still very high in comparison with the ones provided by some indirect parametric models. We think that this could be an important concern for using these rule-based models for predicting SI out of the frame of this study. Using a complementary dataset for validation would be a necessary line of action for addressing this uncertainty in further research.

Regarding indirect parametric models, we observed that models based on the combination of growth + age (IP2 and MMG2 models) had lower performance. In contrast, combinations of height + age (IP1 and MMG1) had the highest performance. The fact that MMG1 outperformed the rest of the parametric models, both in apparent and validation statistics, may suggest that \(h_{mmg}\) and \(t_{mmg}\) are more effective than the rest of growth parameters for representing growth–climate relationships. In this context, using \(h_{mmg}\) and \(t_{mmg}\) directly for dominant height projection—as \(t_1\) and \(H_1\) inside Eq. 1—would be a reasonable modeling alternative to explore in further research.

Despite the higher apparent performance of the Random Forest model, the observed robustness of MMG1 may make this parametric approach a better alternative for SI prediction. This finding runs parallel to the results found by Sabatia and Burkhart (2014), where a parametric model outperformed Random Forest, in terms of robustness and reliability. After checking the observed vs predicted SI values (Fig. 4), we found a moderate regression to the mean in both models. Besides, the range of predicted values was slightly narrower for Random Forest than for the parametric approach. This regression to the mean was also found in previous studies (Hamel et al. 2004; Weiskittel et al. 2011), and it has been suggested to be a common characteristic of the SI-environment models (Sabatia and Burkhart 2014).

The comparison between performance statistics of the direct GLM and MMG1 revealed that the indirect approach was more robust, being less prone to overfitting than the direct model. We think this finding might imply that, indeed, the indirect methods could be more effective at capturing the existing relationships between tree growth and climate and, hence, more ecologically consistent than direct SI modeling approaches. However, the proposed indirect method has got also some potential drawbacks. Regarding data acquisition, stem analysis is much more expensive and technically complex than a simple forest inventory of temporal research plots. We think that this is a crucial aspect to consider for practical applications of indirect SI modeling.

Concerning the ecological interpretation of MMG1 model, assessing the role of each predictor in Eq. 7 is not a straightforward task, as some predictors appear both in the numerator and in the denominator of the model form, apparently with a similar behaviour. The analysis of LOESS curves in Fig. 5 revealed that, overall, BIO2 and BIO3 had a strong positive influence on SI. As these variables are proportional to thermal diurnal ranges, and therefore potential indicators of altitude and continentality (Oliver 2005), we propose two possible reasons for their impact on SI: (1) altitude can imply cooler winter and night temperatures, which is positive for satisfying the chilling needs of the species, and (2) continentality may imply the absence of salty sea winds, potentially harmful for Scots pine (Øyen et al. 2006; Savill 2013). A special comment about the chilling effect is needed. Admittedly, certain conifer species, especially those naturally distributed in cold regions, such as Scots pine (Øyen et al. 2006), might suffer from a certain stress on carbon balance due to high respiration rates during mild winters (Pâques 2013; Smith et al. 1995). We already found this to be an important restriction for growth of radiata pine in the same region in a previous study (González-Rodríguez and Diéguez-Aranda 2020). We think this may be a particular feature of the studied region, characterized by a very humid and temperate-warm oceanic climate (predominantly Csb climate with Cfa and Cfb local variations, according to the Köppen-Geiger classification, updated by Kottek et al. 2006).

The predictor BIO9 showed, overall, a very slight positive influence on SI, occurring mostly on the second half of its range. We hypothesize that this slight trend may represent the positive influence of temperatures for growth during the growing season, specially for coldest sites at higher altitudes (corresponding to Cfb local variants). Thought this predictor should also capture the negative effect of summer drought stress on growth, the high precipitation registered in the set of sampled sites—with a minimum of 1246 mm—may make the latter a not significant constraint for growth. BIO4 and BIO7 showed a similar influence on predicted SI, having a maximum at the middle of their ranges, showing, respectively, a positive and negative contribution on growth towards the extremes. In the case of BIO7, low values of this predictor may have a negative influence on SI because of the same reason explained for BIO2 and BIO3. High values of BIO7 may contribute negatively to SI representing the effect of frost stress factors in very contrasted temperature regimes. Finally, concerning BIO4, the positive effect on SI of its left tail may be also related to the influence of frost stress in sites with very regular precipitation regimes, associated with high altitudes (Cfb climates).

5 Conclusion

We fitted a set of rule-based and indirect parametric models for predicting site index (SI) of Scots pine stands as a function of bioclimatic variables. The models fitted explained from \(\sim\)10% to \(\sim\)80% of the response’s variability. The rule-based approaches tested showed very high apparent performance statistics, being Random Forest the one with the highest \(R^2\). However, the bootstrap validation procedure carried out revealed that these techniques were also very prone to overfitting, and besides their actual differences in performance were little. In contrast, indirect parametric models showed a much lower overfitting rate. Two of these models, MMG1 and IP1, had better bootstrap error estimates than the rule-based approaches. Specifically, MMG1, a parametric model derived from height and age at the maximum mean growth point, showed the highest validation \(R^2\) among the set of fitted models, explaining up to 38% of the SI variability. According to this model, SI was mainly conditioned by different measures of continentality, but also by heat and rainfall variables. We concluded that, for the specific scope of our study, the use of an indirect approach based on easily interpretable parametric models was a better modeling alternative than the direct prediction of SI using rule-based techniques.