Original research article
Reducing a model of sugar metabolism in peach to catch different patterns among genotypes

https://doi.org/10.1016/j.mbs.2020.108321

Highlights

  • We propose a model reduction scheme for application to genetic studies.

  • It yields a unique reduced model that is adapted to the whole expected genetic diversity.

  • It maintains network structure and variable identity, to facilitate biological interpretation.

  • The scheme is successfully applied to a kinetic model of sugar metabolism in peach fruit (Desnoues et al. 2018).

Abstract

Several studies have sought to understand the dynamics of primary metabolism in fruit by translating it into mathematical models. An ODE kinetic model of sugar metabolism has been developed by Desnoues et al. (2018) to simulate the accumulation of different sugars during peach fruit development. Two major drawbacks of this model are (a) the number of parameters to calibrate and (b) its integration time, which can be long due to non-linearity and time-dependent input functions. Together, these issues hamper the use of the model for a large panel of genotypes, for which few data are available. In this paper, we present a model reduction scheme that explicitly addresses the specificity of genetic studies in that: (i) it yields a reduced model that is adapted to the whole expected genetic diversity, and (ii) it maintains network structure and variable identity, in order to facilitate biological interpretation. The proposed approach is based on the combination and the systematic evaluation of different reduction methods. Thus, we combined multivariate sensitivity analysis, structural simplification and timescale-based approaches to simplify the number and the structure of the ordinary differential equations of the model. The original and reduced models were compared based on three criteria, namely the corrected Akaike Information Criterion (AICC), the calibration time and the expected error of the reduced model over a progeny of virtual genotypes. The resulting reduced model not only reproduces the predictions of the original one but also presents many advantages, including a reduced number of parameters to be estimated and a shorter calibration time, opening promising new perspectives for genetic studies and virtual breeding. The validity of the reduced model was further evaluated by calibration on 30 additional genotypes of an inter-specific peach progeny for which few data were available.

Introduction

Plants are sessile organisms endowed with the capacity to alter their development, physiology, and morphology depending on the context. Plant phenotype is the result of the interaction between the environment, cultural practices and the plant’s genetic background (genotype). In the context of agronomy, increasing efforts have been made to select varieties that better meet consumers’ expectations. Today it is clear that future breeding should account for complex plant phenotypes, responding to a large panel of criteria, including increased yield, abiotic and biotic stress tolerance, and quality of food products.

Genotype-phenotype models have been considered as the tools of the future to design new genotypes since they can help to test the performance of new genotypes (G) under different Environment (E) × Management (M) conditions. The challenge is to build ecophysiological models that integrate genetic information associated with specific processes (traits). In general, genotypes are defined by a set of parameters, which depend on gene expression or allelic combination, according to the genetic complexity of the considered trait as well as the available information [2]. Genetically improved ecophysiological models can then be used to capture GxExM interactions. They can also be used to design “ideotypes”, i.e. real or virtual plant cultivars expressing an ideal phenotype adapted to a particular biophysical environment, crop management, and end-use [3], [4]. For this, it is necessary to combine the genetically improved ecophysiological model with a multi-objective optimization algorithm to identify the best genotypes for specific conditions [5].

Construction of gene-to-phenotype models is challenging. First, the approach requires that a single model can reproduce the behavior of all genotypes in multiple environments, with the observed diversity being captured by different sets of parameters. Second, calibration of the models for a large number of genotypes is generally difficult, due to a large number of parameters (typically from 50 to 200 in whole-plant ecophysiological models) along with a restricted number of observations [6], [7]. Due to model complexity and non-linearities, evolutionary and bio-inspired algorithms are increasingly used both for parameter estimation and ideotype design. These methods can explore high-dimensional parameter spaces efficiently, but they rely on a large number of model evaluations, which can rapidly increase the computational time required to find a solution. Third, the genetic architecture of complex traits can be very intricate, due to epistatic and pleiotropic effects. In this sense, the presence of biologically meaningful parameters can considerably help the interpretation of the resulting genetic architecture, facilitating the breeding process. Ideally, the closer the model is to omics data, the easier the linkage between the parameters and the underlying physiological processes.

Kinetic modeling has been successfully applied to several metabolic pathways in plants [8], [9], [10]. In this spirit, a kinetic model of sugar metabolism has been developed in [1] to simulate the accumulation of different sugars during peach fruit development. The model correctly accounts for annual variability and the genotypic variations observed in ten genotypes derived from a larger progeny of an inter-specific peach cross. In the long term, the objective of the research is to integrate the genetic control of sugar metabolism in this kinetic model and develop a methodology to design ideotypes by virtual breeding. To achieve this, it is necessary to estimate accurately the values of the influential parameters of the model for the whole progeny of 106 genotypes, for which few data are available. Unfortunately, the size of the parameter space and the non-linearity of the reaction rates make the calibration of the model unreliable and time-consuming.

One way to address these weaknesses is to reduce the complexity of the model [11]. Several reduction and approximation approaches exist in the literature, each one addressing a specific aspect of model complexity [12], [13]. A number of methods, such as the lumping method [14], [15] or the classical quasi-steady-state (QSS) approaches, aim at reducing the number of variables based on chemical or time-scale considerations [16], [17]. Methods from sensitivity analysis may help to reduce the parameter space by identifying non-influential parameters, whose values can be fixed using broad literature data [18], [19], [20], [21]. Last but not least, the structure of the model itself can be simplified. Methods for model decomposition [22], [23], [24] aim to separate the system into sub-networks or sub-models that are easier to analyze and parameterize. The choice of reaction kinetics is also very important for model complexity. In this perspective, the use of simplified enzyme kinetics [25], [26], [27] may be useful to avoid the emergence of numerical and identifiability issues.
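As an illustration of the time-scale argument behind QSS approaches, the following minimal sketch uses a toy two-variable system, not the peach sugar model: a slow variable x with dx/dt = -k x and a fast variable y with eps dy/dt = x - y. Setting eps dy/dt = 0 gives the QSS approximation y ≈ x. All function and parameter names (`full_fast_variable`, `qss_fast_variable`, `k`, `eps`) are hypothetical and chosen for this example only; the closed-form solution is used so that no ODE solver is needed.

```python
import math


def full_fast_variable(t, x0, y0, k, eps):
    """Exact y(t) for the toy system dx/dt = -k*x, eps*dy/dt = x - y.

    Assumes k*eps != 1. The solution is a slow mode exp(-k*t) plus a
    fast transient exp(-t/eps) that vanishes after a few multiples of eps.
    """
    a = x0 / (1.0 - k * eps)  # amplitude of the slow mode
    return a * math.exp(-k * t) + (y0 - a) * math.exp(-t / eps)


def qss_fast_variable(t, x0, k):
    """QSS approximation: eps*dy/dt = 0, hence y(t) = x(t) = x0*exp(-k*t)."""
    return x0 * math.exp(-k * t)


def relative_qss_error(t, x0, y0, k, eps):
    """Relative gap between the QSS value and the exact fast variable."""
    exact = full_fast_variable(t, x0, y0, k, eps)
    return abs(qss_fast_variable(t, x0, k) - exact) / abs(exact)
```

In this toy case, once the fast transient has died out the relative error equals k*eps, so the approximation is only as good as the separation between the two time scales; during the initial transient (t of the order of eps) the QSS value can be far from the exact one.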

Different reduction methods can be combined. In [28] for instance, model decomposition is combined with variable transformation, resulting in a low-dimensional description of the “exterior” part of the system, whereas in [15] time-scale analysis is used to identify a cluster of fast variables to be lumped together.

In the work of Apri et al. [29] different reduction steps (parameter removal, node removal, variable lumping) are sequentially tested following a practical scheme: at each step, if the reduced model, after parameter re-estimation, can reproduce some target outputs, the modification is selected, and rejected otherwise. From the point of view of genetic applications, a major drawback of the approach of Apri et al. [29] is that the selection of acceptable reduction results depends on the specific target dynamics.

As a consequence, different target outputs (i.e. genotypes) can give rise to reduced models with different structures or numbers of parameters, making their comparison difficult in the perspective of genetic studies.

The objective of this work was to provide a method to build a reduced model that is adapted to the specificity of genetic studies in that: (i) it yields a reduced model that is adapted to the whole expected genetic diversity, and (ii) it maintains network structure and variable identity, in order to facilitate the biological interpretation of the reduced model.

Similarly to the approach of Apri et al. [29], our reduction strategy tests different methods in several parallel steps that, if retained, are combined together into a final reduced model (Fig. 1).

First, multivariate sensitivity analysis was attempted to reduce the parameter space [30]. Second, we tried to simplify the structure of the model by reducing non-linearity and time-dependent forcing, and finally, a quasi-steady-state approximation based on time-scale separation was tested to reduce the size of the system. Particular attention was devoted to the systematic evaluation of the different reduction methods. Three main criteria were used to assess the interest of the reduction: (i) the corrected AIC value, evaluating the relative gain between model simplification and loss of accuracy over an experimental dataset, (ii) the calibration time, as a measure of model efficiency, (iii) the expected error between the original and the reduced model over a population of virtual genotypes, as a measure of the reliability of the simplification scheme.
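The first criterion can be sketched as follows. This is a minimal illustration, assuming a least-squares fit with independent Gaussian errors, so that AIC = n ln(RSS/n) + 2k up to an additive constant; the exact likelihood and parameter count used in the paper are not given in this section, and the function name `aicc_least_squares` is hypothetical.

```python
import math


def aicc_least_squares(residuals, n_params):
    """Corrected AIC for a least-squares fit (Gaussian errors assumed).

    residuals: differences between observations and model predictions.
    n_params:  number of estimated parameters k. Some conventions also
               count the error variance, i.e. use k + 1 instead of k.
    """
    n = len(residuals)
    if n - n_params - 1 <= 0:
        raise ValueError("AICc requires more observations than n_params + 1")
    rss = sum(r * r for r in residuals)
    aic = n * math.log(rss / n) + 2 * n_params
    # Small-sample correction; it vanishes as n grows for fixed k.
    correction = 2 * n_params * (n_params + 1) / (n - n_params - 1)
    return aic + correction
```

With few observations per genotype, the correction term grows quickly with the number of parameters, so a reduced model with a slightly larger residual sum of squares can still obtain a lower (better) AICc than the original model.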

As a case study, the proposed reduction scheme was applied to the model of sugar metabolism proposed by Desnoues et al. [1]. The resulting reduced model correctly reproduces data on the original 10 genotypes with only 9 estimated parameters (out of 14 in the original model) and a gain in calibration time of over 40%. In addition, the reduced model was successfully calibrated on 30 new genotypes of the same inter-specific peach progeny, for which fewer data points were available.

The paper is organized as follows. In the next section, we briefly present the original model of sugar metabolism developed by Desnoues et al. [1]. Section 3 is devoted to the description of the individual reduction methods, whereas Sections 4 and 5 present, respectively, the datasets and the numerical methods used for the assessment of the proposed model reduction. The results of the application of our reduction scheme to the model of sugar metabolism are reported in Section 6. A general discussion on the advantages and limitations of our approach closes the paper.

Description of the peach sugar model

The model developed by Desnoues et al. [1] describes the accumulation of four different sugars (sucrose, glucose, fructose, and sorbitol) in peach fruit during its development over a progeny of ten peach genotypes with contrasting sugar composition. The fruit was assumed to behave as a single big cell with two intra-cellular compartments, namely the cytosol and the vacuole. Carbon enters the fruit from the plant sap which is transformed by a metabolic network, including enzymatic reactions and

Model reduction methods

In this section, we present a reduction scheme explicitly dedicated to genetic studies that combines different methods in several parallel steps, as shown in Fig. 1 and explained in the next subsections.

Experimental data

The 106 peach genotypes used in this study come from an inter-specific progeny obtained by two subsequent back-crosses between Prunus davidiana (Carr.) P1908 and Prunus persica (L.) Batsch ‘Summergrand’ and then ‘Zephyr’ [34]. They were planted in 2001 in a completely randomized design in the orchard of the INRAE Research Centre of Avignon (southern France). Experimental monitoring of peach fruit growth and quality was conducted in 2012, as described in [32]. The concentration of different

Mathematical notations

  • $x(t, p^{(k)})$: original model associated to parameters $p^{(k)}$ (i.e. genotype $k$).

  • $\tilde{x}(t, \tilde{p}^{(k)})$: reduced model for the genotype $k$.

Note that the notation $\tilde{x}(t, \tilde{p}^{(k)})$ can apply to different versions of the reduced model, depending on the considered reduction method.

  • $T_S^{(k)}$: set of the $N_S$ simulation times for the genotype $k$.

  • $T_M^{(k)}$: set of the $N_M$ measurement times for the genotype $k$.

  • $X^{(k)}(t_j)$: $N$ experimental observations for the genotype $k$, with $t_j \in T_M^{(k)}$. Note that $N = 4 \times N_M \times r$, where $r$ is the number of replicates

Strategy 1: Identification of low-sensitivity parameters

The objective of the sensitivity analysis was to identify parameters having a significant influence on the outputs of the model, over the whole dynamics and for all tested genotypes. A multivariate sensitivity analysis [30] was used for this purpose. The aggregate generalized sensitivity indices (aGSI) (see Section 3.1) shown in Fig. 4 give a common ranking of model parameters according to their influence on the whole sugar phenotype, as it is made up by the four output sugars (sucrose,

Discussion

Models of metabolic systems are usually very complex. Complexity stems from the number of components and the high degree of non-linearity included in both the network structure and the individual reaction rates. As a consequence, metabolic models usually suffer from numerical and identifiability issues that seriously hamper their application in the context of genetic studies, especially when they have to be calibrated for hundreds of genotypes. In this paper, we present a reduction scheme that

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

HK was funded by a scholarship of the Lebanese government. We would like to thank V. Signoret for her help in maintaining the inter-specific peach progeny. We are grateful to the IE-EMMAH UMR1114 and IE-GAFL UR1052 teams for taking care of the experimental orchard, and to Dr Olivier Martin for his help in statistics.

References (44)

  • D.E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. (1989)
  • E. Desnoues et al. A kinetic model of sugar metabolism in peach fruit reveals a functional hypothesis of a markedly low fructose-to-glucose ratio phenotype. Plant J. (2018)
  • J.W. White et al. Gene-based approaches to crop simulation. Agron. J. (2003)
  • V. Letort et al. Parametric identification of a functional–structural tree growth model and application to beech trees (Fagus sylvatica). Funct. Plant Biol. (2008)
  • B. Quilot-Turion et al. Optimization of allelic combinations controlling parameters of a peach quality model. Front. Plant Sci. (2016)
  • N. Bertin et al. Under what circumstances can process-based simulation models link genotype to phenotype for complex traits? Case-study of fruit and grain quality traits. J. Exp. Bot. (2010)
  • N. Curien et al. The music industry in the digital era: toward new contracts. J. Media Econ. (2009)
  • T. Nägele et al. Mathematical modeling reveals that metabolic feedback regulation of SnRK1 and hexokinase is sufficient to control sugar homeostasis from energy depletion to full recovery. Front. Plant Sci. (2014)
  • B.P. Beauvoit et al. Model-assisted analysis of sugar metabolism throughout tomato fruit development reveals enzyme and carrier properties in relation to vacuole expansion. Plant Cell (2014)
  • M.S. Okino et al. Simplification of mathematical models of chemical reaction systems. Chem. Rev. (1998)
  • A.N. Gorban et al. Model Reduction and Coarse-Graining Approaches for Multiscale Phenomena. (2006)
  • T.J. Snowden et al. Methods of model reduction for large-scale biological systems: a survey of current methods and trends. Bull. Math. Biol. (2017)