Machine learning approach for higher-order interactions detection to ecological communities management
Introduction
Extensive experimental and observational evidence indicates that species interactions shape community structure [1]. Identifying species interactions helps us interpret growth processes and population control, which is the basis for establishing useful criteria for natural resource management, conservation, and restoration [1], [2], [3], [4]. Here, we focus on a particular class of species interactions called higher-order interactions (HOI), which have been analyzed from numerous perspectives, from explaining biodiversity levels in ecological communities while considering differences in interspecific competitive ability to modeling competition based on Lotka-Volterra models [5], [6], [7].
Lotka-Volterra competition models are widely employed in experimental ecology [8]. There is evidence both for and against these tools. For instance, in some laboratory experiments, the Lotka-Volterra model fits two-species data well [9], [10]. However, the model does not fit empirical observational data and fails to explain why species may coexist in fluctuating environments [11]. A considerable number of works show that a real multi-species system cannot be explained in terms of simple linear equations such as the Lotka-Volterra model [12]. It is not surprising that the theoretical assumptions of the Lotka-Volterra model do not hold for empirical multi-species data, since they do not account for the indirect effects or HOI that shape multi-species communities [13], [14], [15], and that are undoubtedly important from both a theoretical and an empirical point of view [16].
Because both HOI and indirect effects modify the interactions among species in a system, the two terms, although different, commonly appear in the literature as synonyms. An indirect effect occurs when the end result of a direct interaction between two species is altered by the effect of a third species. A clear example is a dominant species that loses its dominance because a predator prefers it over other species [13], [17], [18], [19], [20]. HOI, on the other hand, occur when a species affects the nature of the interaction between two other species without altering the final result of that interaction in terms of competitive dominance [14]. HOI emerge when a third species modifies the nature of the direct interaction between two species within the community, whether that interaction is competition or a predator-prey relationship. In [21], it is explained that, for this reason, the two terms have wrongly been used interchangeably. In essence, HOI modify the intensity of the competition/predation coefficients, whereas indirect effects alter the final result of the interactions. Also, HOI can act in numerous directions, while indirect effects cannot. One consequence is that HOI may cancel each other out, which makes it appear as if only direct interactions are present. That appearance is misleading, and it arises when only abundances are observed [14].
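To make the distinction concrete, the following is a sketch of one commonly used way to write a three-species competition model with and without an HOI term. It is an illustrative form and not necessarily the one used in this work; the symbols $r_i$ (growth rate), $K_i$ (carrying capacity), $\alpha_{ij}$ (pairwise competition coefficient), and $\beta_i$ (HOI coefficient) are assumptions introduced here.

```latex
% NON-HOI: pairwise Lotka-Volterra competition, i = 1, 2, 3
\frac{dN_i}{dt} = r_i N_i \left( 1 - \frac{1}{K_i} \sum_{j=1}^{3} \alpha_{ij} N_j \right),
\qquad \alpha_{ii} = 1 .

% HOI: the other two species (j and k, both different from i) act jointly on species i
\frac{dN_i}{dt} = r_i N_i \left( 1 - \frac{1}{K_i} \left[ \sum_{j=1}^{3} \alpha_{ij} N_j
  + \beta_i N_j N_k \right] \right),
\qquad j \neq i,\; k \neq i,\; j < k .
```

In this form, the term $\beta_i N_k$ can be read as a density-dependent correction to the pairwise coefficient $\alpha_{ij}$, which matches the idea that HOI modify the intensity of the interaction coefficients rather than the identity of the interacting species.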
The composition of a specific community may be altered by these HOI through changes in species abundance, competition coefficients, predation intensity, or long-term behavior [14], [21]. Community structure and function result from the biotic and abiotic relationships that constitute the community. Failing to identify these relationships leads to mistakes when trying to predict the dynamical behavior of the community from results obtained in subsets of species pairs [16], [21], [22], [23], [24], [25]. The challenge is to find consistent theoretical models that guide the management, protection, and conservation practices of entire communities [26]. Thus, there is a growing need to analyze the resource as a whole and to employ predictive models based on principles from theoretical ecology [16], [27].
Recent studies approach HOI from multiple perspectives. For instance, robustness in mathematical models of ecological communities is addressed in works like [28] and [29]. In [26], the authors explored the mathematical conditions for coexistence in three-species models in the presence or absence of HOI. Also, in [30], the authors used numerical simulations of mathematical models to deduce that HOI can be interpreted as emergent properties of Lotka-Volterra models. Finally, [31] discusses which definition of HOI is most appropriate. Within the long-studied field of indirect effects, HOI research poses interesting open questions. In particular, questions arise regarding where indirect effects are present, how to detect them, and even their expected contribution to community structure [16]. HOI, in contrast, have gone largely unnoticed compared to indirect effects, mostly due to misconceptions in the theoretical, experimental, and mathematical modeling of the concept, which have hampered ecologists' efforts. If a test determines the presence of HOI in sample data from laboratory experiments, the next step is to propose simplified mathematical models that incorporate HOI terms to describe such data.
Recently, machine learning (ML) approaches have been aiding ecological and biomedical modeling [32], [33], [34], [35], [36]. According to [37], ML focuses on identifying patterns in complex data and predicting outcomes for additional data. For instance, in [38] a supervised ML approach based on Lotka-Volterra equations is proposed to derive a new general-purpose classification algorithm. In [39], the authors propose an ML approach that infers trait matching to predict species interactions in plant-pollinator systems. In [40], the authors review ML approaches for reconstructing ecological interaction networks from data. However, to the best of our knowledge, there is a lack of works that employ ML approaches for automatic HOI detection. As pointed out previously, the population dynamics patterns behind HOI and indirect effects in ecological communities are difficult to capture. ML models have been shown to be robust in a wide variety of applications and excel at the automatic identification of patterns. Thus, ML models could be more robust tools than classical statistical tests for detecting the presence or absence of HOI, provided they are trained with enough synthetic data that resembles experimental data.
In this work, we propose an ML approach to detect the presence of HOI in three-species communities. Our approach is based on sampling synthetic data from ordinary and stochastic differential equation models, where both NON-HOI and HOI cases are considered. The sampling is based on Lotka-Volterra parameters estimated from single-species and pairwise-species community experiments. Experimental data from the genus Drosophila are used as a case study. As an additional contribution, we present a novel methodology based on stochastic differential equations to compare classical tests for detecting HOI with our ML approach.
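As a rough illustration of the sampling idea, the sketch below simulates a three-species competition model with the Euler-Maruyama scheme, with and without an HOI term. It is a minimal sketch under stated assumptions, not the models or the R code of this work: the function name `simulate_community`, the HOI coefficients `beta`, the multiplicative noise with intensity `sigma`, and all parameter values are hypothetical choices made for the example.

```python
import math
import random


def simulate_community(n0, r, K, alpha, beta=None, sigma=0.0,
                       dt=0.01, steps=2000, seed=0):
    """Euler-Maruyama integration of a three-species competition model.

    beta[i] (hypothetical) scales an HOI term N_j * N_k acting on species i,
    where j and k are the other two species; beta=None gives the NON-HOI model.
    sigma=0 reduces the scheme to the deterministic Euler method (ODE case).
    """
    rng = random.Random(seed)
    n = [float(v) for v in n0]
    path = [tuple(n)]
    for _ in range(steps):
        updated = []
        for i in range(3):
            j, k = [s for s in range(3) if s != i]
            # Pairwise competitive pressure on species i.
            pressure = sum(alpha[i][s] * n[s] for s in range(3))
            if beta is not None:
                # Assumed HOI term: joint effect of the other two species.
                pressure += beta[i] * n[j] * n[k]
            drift = r[i] * n[i] * (1.0 - pressure / K[i])
            # Assumed multiplicative noise, proportional to abundance.
            noise = sigma * n[i] * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            # Abundances are clipped at zero to stay biologically meaningful.
            updated.append(max(n[i] + drift * dt + noise, 0.0))
        n = updated
        path.append(tuple(n))
    return path


if __name__ == "__main__":
    r = [1.0, 1.0, 1.0]
    K = [100.0, 100.0, 100.0]
    alpha = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
    non_hoi = simulate_community([10, 10, 10], r, K, alpha, sigma=0.05)
    hoi = simulate_community([10, 10, 10], r, K, alpha,
                             beta=[0.002, 0.002, 0.002], sigma=0.05)
    print("NON-HOI final abundances:", non_hoi[-1])
    print("HOI final abundances:", hoi[-1])
```

Calling the function repeatedly with different seeds and labeling each trajectory as NON-HOI or HOI would yield a labeled synthetic data set of the kind a classifier could be trained on. In practice the parameters would come from fitted single-species and pairwise experiments rather than from the placeholder values used here.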
Mathematical background
In this Section, we present the mathematical base model that describes population dynamics in ecological communities in the presence or absence of HOI. Afterward, we introduce classical statistical tests found in the literature that determine the presence or absence of HOI from experimental data.
Case study: Drosophila communities
In this Section, we describe the first method used in this work to detect HOI in the Drosophila data with three-species interactions. We start by explaining the experimental methods. Dynamical aspects of species populations, such as HOI, can be explored experimentally with communities that allow the observation of multiple generations in a short period. Such is the case for multi-species systems of the genus Drosophila [14], [47].
In this work, we obtained experimental data ,
Synthetic noisy sample generation
To compare the predictive results of the classical statistical tests, we first generated synthetic noisy samples that resemble experimental data, Eq. (21), through a stochastic differential equation (SDE) approach [51], [52]. Namely, the SDE system, for generating NON-HOI noisy samples, is the modified Lotka-Volterra model Eqs. (2) to (4), given by
Proposed HOI recognition based on supervised machine learning
In this work, we propose an ML-based approach for detecting the presence of HOI in three-species experimental data. Our main contribution is that this approach could serve as a precursor for more robust tests in assessing the presence of HOI.
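The general workflow of supervised HOI recognition (train on labeled samples, then evaluate with k-fold cross-validation) can be sketched as follows. This is only an illustration under loud assumptions: a nearest-centroid classifier stands in for the models actually used in this work, the two-dimensional Gaussian toy features stand in for features derived from simulated NON-HOI (label 0) and HOI (label 1) samples, and every function name is hypothetical.

```python
import random


def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def fit(X, y):
    """Return one centroid per class label (0 = NON-HOI, 1 = HOI)."""
    return {c: centroid([x for x, lab in zip(X, y) if lab == c])
            for c in set(y)}


def predict(model, x):
    """Assign x to the class whose centroid is closest."""
    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(model, key=lambda c: sqdist(model[c], x))


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(X, y, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, for each fold."""
    results = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = [predict(model, X[i]) for i in test_idx]
        results.append(confusion_metrics([y[i] for i in test_idx], preds))
    return results


def toy_dataset(n_per_class=100, seed=1):
    """Placeholder features: two Gaussian clusters, one per class."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, (0.0, 0.0)), (1, (3.0, 3.0))):
        for _ in range(n_per_class):
            X.append([rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = toy_dataset()
    folds = cross_validate(X, y, k=10)
    mean_acc = sum(f["accuracy"] for f in folds) / len(folds)
    print("Mean 10-fold accuracy on toy data:", mean_acc)
```

The classifier itself is deliberately simple. The point of the sketch is the evaluation loop, where each sample is used for testing exactly once and never appears in the training set of its own fold.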
Discussion
The results of the 10-fold cross-validation showed that our methodology (described in Section 4.1) displayed robustness for automatic HOI detection, since performance on the training and testing sets is similar (Fig. 5). For visualization purposes, we did not include the standard deviation in Fig. 5(a), but the maximum standard deviation was 0.36 (in specificity for GLM) and the minimum was 0.00 (for GBM and C5.0). GBM and C5.0 HOI recognition models presented good performance and
Conclusions
This work presents a novel approach based on machine learning for automatic detection of higher-order interactions in ecological communities. To compare our approach with classical statistical tests found in the literature, we also proposed an in silico methodology for generating noisy test sets. The classical tests failed to preserve robustness across the test sets, showing low performance in identifying higher-order interactions, achieving a mean of 0.58 accuracy for the Bender test and
Code availability
The R code used in this work is available at the following URL: https://github.com/arielcam27/MLHOIs.
Acknowledgment
This work was partially supported by UABC internal project 400/2368. Ariel Camacho would like to thank SEP-SES-PRODEP-UABC for his Post-doctoral Fellowship support. We thank Gerardo Tovar for providing technical support. We also thank the anonymous reviewers, whose comments helped us to improve the article considerably.
References (66)
Diversity in interaction strength promotes rich dynamical behaviours in a three-species ecological system. Appl. Math. Comput. (2019)
Dispersal strategies in patchy environments. Theor. Popul. Biol. (1984)
Competition between two species for two complementary or substitutable resources. J. Theor. Biol. (1975)
Towards a probabilistic understanding about the context-dependency of species interactions. Trends Ecol. Evol. (2020)
Competition between species: theoretical models and experimental tests. Theor. Popul. Biol. (1973)
Competition, disturbance and local diversity patterns of substratum-bound clonal organisms: a simulation. Ecol. Model. (1984)
Robustness of the pollination-herbivory system with high-order interactions to habitat loss. Ecol. Model. (2019)
Defining higher-order interactions in synthetic ecology: lessons from physics and quantitative genetics. Cell Syst. (2019)
Applications of machine learning to ecological modelling. Ecol. Model. (2001)
A novel support vector sampling technique to improve classification accuracy and to identify key genes of leukaemia and prostate cancers. Expert Syst. Appl. (2011)