figure a

Introduction

In utero exposure to gestational diabetes mellitus (GDM) is a key risk factor of obesity at birth and beyond [1, 2]. In addition to its detrimental effects on offspring adiposity, maternal GDM is also associated with poor metabolic health in offspring later in life, including greater insulin resistance, impaired glucose tolerance or type 2 diabetes, low insulin secretion and alterations in adipokines [3,4,5,6]. The effects of maternal GDM on biomarkers of metabolic risk in offspring have been detected as early as 3 years of age [7] and many studies have been able to establish associations of GDM exposure with metabolic biomarkers independent of the child’s body composition [6,7,8], suggesting a unique effect of maternal GDM on in utero ‘programming’ of the fetus’ future metabolic health.

Metabolomics, the systematic and comprehensive study of low-molecular-weight compounds in biological tissues and fluids, has emerged as a powerful tool to shed light on pathways that link exposures to health outcomes [9]. The majority of published studies on metabolomics in relation to GDM have focused on identifying compounds in maternal fluids (plasma, serum, amniotic fluid, urine) that are associated with and/or predictive of GDM [10,11,12,13,14,15,16,17]. When reviewing the literature, we also identified two studies that compared cord blood metabolite profiles of infants exposed vs those not exposed to GDM [18, 19]. Both studies leveraged metabolomics data assayed on targeted platforms, which provide information on absolute concentrations of metabolites belonging to a specific but limited set of biochemical pathways hypothesised to be relevant to the research question based on a priori knowledge. Given the scanty knowledge on metabolite profiles associated with GDM exposure in offspring, metabolomics studies employing untargeted, data-driven approaches are needed to comprehensively assess GDM-related metabolic alterations in offspring. Moreover, beyond evaluating associations of GDM with the offspring metabolome during infancy, understanding the influence of maternal GDM on metabolite profiles during childhood and adolescence is important, given that these are vulnerable life stages for the development of excess adiposity [20] and metabolic disease risk [21]. A better understanding of the mechanisms and pathways that link early-life exposures to adverse health outcomes during these time frames will aid in identifying modifiable determinants of these pathways.

In this prospective analysis of over 500 multi-ethnic mother–child dyads, we extend current knowledge surrounding the pathways associated with and consequences of in utero exposure to GDM via three research objectives. First, we sought to identify fasting serum metabolite profiles during late childhood (age 6–14 years) that differ with respect to in utero exposure to maternal GDM. Next, we assessed the extent to which the relationship between maternal GDM and the metabolites of interest persisted over 5 years of follow-up, into adolescence (age 12–19 years). Finally, in light of evidence in adults that certain metabolite patterns precede worsening of conventional metabolic-disease risk factors [22], we investigated the associations of GDM-associated metabolites with adiposity and conventional biomarkers of metabolic risk from late childhood through adolescence, a vulnerable life stage for the development of excess adiposity [20] and metabolic disease risk [21]. The conceptual model for our research study is depicted in Fig. 1.

Fig. 1
figure 1

Conceptual diagram of associations among exposure to maternal GDM, fasting metabolite patterns and health outcomes (adiposity and metabolic biomarkers) throughout childhood and adolescence

Methods

Study population

Study participants were from the Exploring Perinatal Outcomes among Children (EPOCH) study, a historical prospective cohort of youth with the original aim of characterising long-term consequences of in utero exposure to maternal diabetes. Details on eligibility and recruitment have been previously published [23]. Between 2006 and 2009 (‘T1’), we recruited 604 participants whose mothers were members of the Kaiser Permanente of Colorado (KPCO) health plan. Of them, we excluded children of seven women who had type 1 diabetes, followed by five without sufficient blood volume for untargeted metabolomics profiling for the present study. The analytic sample comprised 592 youth, aged 6–14 years (mean ± SD age, 10.4 ± 1.5 years; interquartile range [IQR], 9.4–11.5 years), with metabolomics data. Of the 592 participants at T1, 403 returned for a follow-up visit approximately 6 years later, from 2012 to 2015 (‘T2’), when participants were 12–19 years of age (mean ± SD age, 16.7 ± 1.2 years; IQR, 15.9–17.6 years) and had adequate blood volume for metabolomics assays. Figure 2 shows the study population flow. This study was approved by the Colorado Multiple Institutional Review Board (protocol no. 05-0623). All participants provided informed consent.

Fig. 2
figure 2

Study participant flow diagram. T1D, type 1 diabetes

In utero exposure to maternal GDM

Exposure to maternal GDM (yes vs no) was defined as a physician’s diagnosis of gestational diabetes during the index pregnancy. Diabetes status was ascertained from the KPCO perinatal database, an electronic database that links prenatal and neonatal medical records. All pregnant women who are members of the KPCO Health plan are routinely screened for GDM at 24–28 weeks of gestation using the standard two-step protocol [24]. As previously described [25, 26], GDM was diagnosed when two or more glucose values during the diagnostic 3 h 100 g OGTT met or exceeded the criteria for a positive test [24].

Offspring blood collection

At T1 T2, trained research assistants collected an 8 h fasting blood sample from the antecubital vein. All samples were refrigerated immediately, processed within 24 h and stored at −80°C until the time of analysis. These samples were used for untargeted metabolomics profiling and metabolic biomarker assays.

Untargeted metabolomics profiling

Metabolon (Morrisville, NC, USA) carried out untargeted metabolomics profiling in fasting plasma collected at T1 and T2 via a multi-platform mass spectroscopy (MS)-based technique. The procedure identified 1193 unique features at both time points. A key strength of this study is that we sent samples from two research visits for metabolomics profiling at the same time so that technicians were able to balance batches by research visit, thereby enabling comparability of relative metabolite concentrations across the two time points. The electronic supplementary materials (ESM) Methods provides information on sample preparation and laboratory procedures.

We sent participants’ serum for metabolomics profiling in two separate batches: the first batch comprised a pilot sample of 100 participants to assess feasibility and the second batch comprised the remaining participants who were not included in the pilot. Prior to formal statistical analysis, we removed metabolites with ≥20% missing values [27] separately for each batch (as is the current recommendation [28]), then imputed the rest of the missing values using the k-nearest neighbour algorithm (k = 10 neighbours used for imputation) [29]. The first batch of participants had 913 metabolites after removing those with high missingness, and the second batch had 898 metabolites. We then merged the two batches for subsequent data processing. Following the merge, we retained metabolites that were present in both batches (767 compounds) and performed log10-transformation, followed by metabolite normalisation and correction for batch effects (as well as other biological and technical variability) using the ‘remove unwanted variation’ (RUV) method (the number of factors of unwanted variation estimated from the data [k] = 2), which has proven utility for high-dimensional biological data [30]. All metabolite processing was performed using R (Version 3.5.3; Vienna, Austria).

Conventional biomarkers of metabolic risk

Using fasting blood collected at T1 and T2, we assayed fasting glucose enzymatically, and fasting insulin leptin, and adiponectin via radioimmunoassays (Millipore, Darmstadt, Germany). We calculated the HOMA-IR [31]. Fasting insulin, glucose, and estimated insulin resistance are indicators of glycaemic homeostasis in children [31,32,33] and are associated with future cardiovascular risk [34]. Leptin and adiponectin are adipocyte-derived peptide hormones that regulate weight and metabolism through complementary actions [35]. Increasing evidence suggests that altered adipocytokine levels, specifically elevated leptin, may actually predict subsequent weight gain [36].

Anthropometric and body composition assessment

At T1 and T2, trained research staff measured height (m) and weight (kg) of participants in light clothing and without shoes. We used these values to calculate BMI (kg/m2) and evaluated this variable in native units, as well as standardised as an age- and sex-specific z score using the WHO growth reference for children aged 5–19 years [37]. As previously described [26], research staff also measured waist circumference (cm) (a correlate of central visceral adiposity [38, 39]) according to the National Health and Nutrition Examination Survey (NHANES) protocol [40]. They also measured subscapular, tricep and suprailiac skinfold thicknesses using Holtain callipers (mm; Crymych, UK) the sum of which (‘skinfold sum’) was used in the analysis as a proxy for subcutaneous adiposity [39].

A trained technician performed magnetic resonance imaging (MRI) of the abdominal region with a 3 T HDx Imager (General Electric, Waukesha, WI, USA) with the participant in the supine position. A series of spin-lattice relaxation time coronal images were taken to locate the L4/L5 plane. One axial, 10 mm, spin-lattice relaxation time image at the umbilicus or L4/L5 vertebrae was analysed per participant to determine visceral adipose tissue (VAT) and subcutaneous adipose tissue (SAT).

Covariates

We calculated maternal pre-pregnancy BMI (kg/m2) using clinically recorded pre-pregnancy weight from KPCO-derived medical records and measured height at the T1 visit. At T1, the women filled out a questionnaire regarding treatment received for GDM. Treatment was categorised as diet and/or exercise only (n = 61), diet and/or exercise with insulin (n = 20), and insulin only (n = 5). Due to the small sample size for insulin only treatment, we combined this category with diet and/or exercise when using this variable as a covariate in the analysis. At T1, the women also reported on their education level and smoking habits during pregnancy (smoked while pregnant with index child, yes vs no) via a self-administered questionnaire. In the analysis, we categorised maternal education as a three-level variable: less than high school, high school diploma or equivalent, and higher than high school.

We calculated participants’ age at T1 and T2 as the difference between date of each research visit and delivery date. Participants self-reported on their race/ethnicity at T1 as non-Hispanic white, non-Hispanic black, Hispanic and non-Hispanic other. At both research visits, participants reported their pubertal development based on pictorial diagrams of the Tanner stages [41], which had been validated against physician-assessed Tanner staging and puberty-related hormones [42]. We based pubertal status on pubic hair development in boys and breast development in girls. For bivariate analysis, we categorised a child as prepubertal (Tanner stage = 1), pubertal (Tanner stage = 2 or 3), and late/postpubertal (Tanner stage = 4 or 5). In multivariable models, we entered this variable ordinally. We obtained information on the participants’ physical activity levels at T1 and T2 using the 3-Day Physical Activity Recall (3DPAR) Questionnaire, which captures habitual physical activity based on a 3-day reference period [43]. Using 3DPAR, we derived mean energy expenditure (mean metabolic equivalents [METs]/day over a 3-day period). At both visits, participants completed the Block Kid’s Food Frequency Questionnaire (FFQ) [44]. We used these data in conjunction with the United States Department of Agriculture (USDA) Food Composition Database to estimate total energy intake (kJ/day).

Data analysis

Prior to formal analysis, we examined bivariate associations of in utero GDM exposure with maternal, perinatal and child characteristics to identify potential confounders. This step, in conjunction with our a priori knowledge of determinants of metabolic health in youth [45, 46], informed the selection of confounders (variables associated with the exposure and a potential determinant of the outcome) and precision covariates (physiological or lifestyle factors that may account for variability in the outcomes of interest) for multivariable analysis. We then carried out the main analysis in three steps, conforming to our study aims.

Step 1: identification of metabolites during late childhood (T1) that differ by GDM exposure

To reduce dimensionality of the metabolomics data, we used principal components analysis (PCA) to consolidate the 767 metabolites into latent variables, known as factors, that may be interpreted as metabolite patterns. The procedure generates as many factors as there are original metabolites, so we used visual inspection of the Scree plot for a ‘break’ (ESM Fig. 1), and standard criterion of eigenvalues >1 to determine the number of factors to retain (see ESM Methods for details on PCA).

Next, we examined associations of GDM exposure with the retained factor scores at T1, which were operationalised two ways: (1) as continuous outcomes (via linear regression); and (2) as a dichotomous outcome (via logistic regression), categorised as quartile (Q)4 vs Q1–3 of the factor score, to allow for potential threshold effects commonly observed in the analyses of potential biological pathways in relation to exposures and health outcomes (e.g., DNA methylation in relation to adiposity [47]). Given the relatively small number of factor scores retained from the PCA, we considered GDM to be associated with a factor score if its β value or OR coefficient was significant (p < 0.05). In addition, we valued consistency in estimates across multivariable adjustment as this suggests a true relationship between GDM and the factors, as opposed to a spurious association arising from inappropriate covariate adjustment.

We then explored these associations using a series of multivariable models. Figure 3 shows a directed acyclic graph (DAG) of our modelling strategy. Model 1 accounted for key sociodemographic confounders: age, sex, and race/ethnicity. Using this model, we tested for effect modification by sex, for which we observed no evidence (pinteraction > 0.30 for all), so we included both boys and girls in subsequent models. In Model 2, we accounted for covariates in Model 1 plus mother’s smoking habits during pregnancy, an important variable to consider when assessing the impact of the gestational environment on health outcomes in paediatric and adolescent populations [48]. Model 3 further included variables that could affect metabolite composition and thus could account for variability in metabolites (precision covariates): pubertal status, physical activity levels and total energy intake. In Model 4, we further adjusted for GDM treatment. Finally, in Model 5, we included Model 2 covariates plus pre-pregnancy BMI, which is partly a confounder to the relationship of interest but likely also an overlapping exposure due to the shared intrauterine milieu between maternal overweight/obesity and GDM [49].

Fig. 3
figure 3

Directed acyclic graph (DAG) of confounders and precision covariates in relation to Step 1 of the analysis, which involved investigating the associations of exposure to maternal GDM with metabolite patterns at T1 (age 6–14 years)

Step 2: investigation of whether associations of GDM with metabolites measured during childhood (T1) persists into adolescence (T2)

Based on the findings from ‘Step 1’, we examined the associations of GDM with key metabolites (i.e., those with factor loading >|0.50|) within factors of interest at T1. We also examined associations of GDM with the same metabolites, but measured in plasma collected at T2. The rationale behind this step was to assess for consistency of associations of GDM with individual metabolites within a given factor, and to assess whether associations of GDM with metabolites is similar across T1 and T2. For this step, we used multivariable linear regression models that accounted for maternal smoking habits during pregnancy, and child’s age, sex and race/ethnicity. We noted all associations with p < 0.05, but further focused on those that were statistically significant after Bonferroni correction. Findings from this step provided information on inter-individual consistency in associations of GDM exposure with metabolite profiles over time. We also calculated the intra-class correlation (ICC), which assesses intra-individual stability of metabolite concentrations over time.

Step 3: examination of associations of T1 metabolite patterns with adiposity and metabolic biomarkers throughout T1 and T2

Finally, we examined relationships between the GDM-related metabolite patterns derived at T1 with adiposity and metabolic biomarkers from T1 to T2 via mixed-effects linear regression models. The outcomes were the repeated measurements of each adiposity indicator or metabolic biomarker across the two research visits (up to two measurements per outcome per child). The explanatory variables included the metabolite factor score of interest at T1, longitudinal assessments of age, a random effect for the individual study identifier and an unstructured covariance matrix. We chose this approach over evaluating associations of factor scores with health outcomes at T1 and T2 separately, or with change in associations between T1 and T2, for model efficiency (i.e., if a participant only had outcome data at T1, their information still contributed to estimation of standard errors in the model). Thus, all 592 participants were included in this analysis. Using this model, we tested for an interaction with between-factor scores and age and sex and noted several significant pinteraction values with respect to sex, even after Bonferroni correction (ESM Table 1), so we ran subsequent multivariable models separately for boys and girls.

In multivariable analysis, we accounted for confounders followed by precision covariates (Fig. 4). Model 1 included child’s age, sex and race/ethnicity, Model 2 further accounted for pubertal status and Model 3 included Model 2 covariates plus physical activity and energy intake. When assessing results, we considered all associations that were statistically significant, at α = 0.05, but also noted significance after Bonferroni correction to account for the number of outcomes assessed by each set of multivariable models; these analyses were carried out separately for boys and girls. In sensitivity analyses, we also evaluated the impact of adjustment for GDM treatment. Doing so did not change the results (data not shown), thus we did not include this variable in the final models.

Fig. 4
figure 4

Directed acyclic graph (DAG) of confounders and precision covariates in relation to Step 3 of the analysis, which involved investigating the associations of metabolite factors at T1 (age 6–14 years) with health outcomes from T1 (age 6–14 years) to T2 (age 12–19 years)

We carried out all analyses using Statistical Analyses System software (version 9.3; SAS Institute, Cary, NC, USA), unless otherwise stated.

Results

The mean (± SD) age of participants at T1 was 10.4 ± 1.5 years (range, 6.0–13.9 years), and at T2, it was 16.3 ± 1.2 years (range, 12.6–19.6 years). Half of the participants were female, and 48.5% (n = 287) were non-Hispanic white. Approximately 16% (n = 92) of participants were exposed to maternal GDM. Table 1 shows the characteristics of mother–child dyads stratified by GDM exposure status. The GDM-exposed group contained a higher proportion of smokers and non-Hispanic white women vs the unexposed group. With respect to child characteristics, we noted lower fasting glucose (p = 0.006), insulin (p = 0.05) and HOMA-IR (p = 0.02) among GDM-exposed vs unexposed participants at T1. The opposite (albeit, non-significant) trends were observed at T2.

Table 1 Bivariate associations of in utero GDM exposure with background characteristics of 592 EPOCH participants

We retained six factors (metabolite patterns) that accounted for 50% of variance in the original metabolomics data at T1 (Fig. 5), which is similar to other studies of untargeted metabolomics data that used PCA for dimension reduction [50]. When we examined the associations of maternal GDM with continuous factor scores using multivariable linear regression, we did not detect any significant associations at α = 0.05 (i.e., no p values were <0.05). However, when we assessed relationships between GDM and odds of being in Q4 vs Q1–3 of each factor score, we noted a consistent positive association with ‘Factor 4’. In comparison with unexposed participants, GDM-exposed youth had 1.64 (95% CI 0.98, 2.76; p = 0.06) greater odds of being in Q4 vs Q1–3 for Factor 4 after adjusting for age, sex and race/ethnicity (Model 1; Table 2 and Fig. 5). Adjustment for maternal smoking (Model 2) did not materially change this estimate. Further adjustment for pubertal status, physical activity, and total energy intake (Model 3) slightly strengthened this association, such that it was statistically significant at α = 0.05 (OR 1.78 [95% CI 1.04, 3.04]; p = 0.04). As expected, this estimate was attenuated to non-significance after accounting for maternal pre-pregnancy BMI (Model 5) (see Fig. 5 and Table 2), suggesting that a large portion of the relationship between maternal GDM and Factor 4 is accounted for by variability in maternal weight status prior to pregnancy. For the above results, none of the estimates were significant after Bonferroni correction.

Fig. 5
figure 5

PCA was used to consolidate 767 metabolites into latent variables (‘factors’) that may be interpreted as metabolite patterns. Factor scores were selected based on visual inspection of the Scree plot and eigenvalues >1. Six factors were retained for use in analysis. Associations of in utero GDM exposure (yes [n = 92] vs no [n = 500]) with odds (OR) of being in Q4 vs Q1–3 of each factor score at T1 (age 6–14 years) among the 592 EPOCH participants are shown. The variance accounted for by each factor is indicated on the graph. Model 1: adjusted for child’s sex, age and race/ethnicity; Model 2: Model 1 + maternal smoking habits during pregnancy (smoking, yes vs no); Model 3: Model 2 + pubertal status (Tanner stage), physical activity (mean METs/day over a 3-day period) and total energy intake (kJ/day); Model 4: Model 3 + GDM treatment (diet and/or exercise only, diet and/or exercise with insulin or insulin alone, or no treatment); Model 5: Model 2 + maternal pre-pregnancy BMI (kg/m2). The estimates are plotted on a loge scale and the horizontal line represents the null (OR 1.00). *Significant at α = 0.05

Table 2 Associations of in utero GDM exposure (yes [n = 92] vs no [n = 500]) with metabolite factors at T1 (age 6–14 years) among 592 EPOCH participants

The metabolite composition of Factor 4 is shown in Table 3; this comprised phosphatidylcholines, lysophospholipids, diacylglycerols and phosphatidylethanolamines. All top-loading metabolites within Factor 4 were positive, indicating that a higher score corresponds with higher serum concentrations of the component metabolites.

Table 3 Metabolite composition of the GDM-related metabolite pattern (‘Factor 4’) identified at T1

Table 4 shows the relationship of maternal GDM with key metabolites within Factor 4 at T1 (i.e., those with factor loading values >|0.50| in the PCA conducted at T1), as well as associations of GDM with the same metabolites at T2. The β coefficients for GDM in relation to individual metabolites were similar to the association of GDM with the T1 factor score with respect to the direction of effects. Additionally, GDM was related to the majority of metabolites across T1 and T2 in a consistent manner with respect to direction and magnitude of associations. Of the 21 GDM–metabolite associations considered at each time-point, several were significantly different with respect to GDM exposure at both time points using α = 0.05, but only the estimate for palmitoyl-arachidonoyl-glycerol (16:0/20:4) reached statistical significance after Bonferroni correction (0.09 [95% CI 0.03, 0.15]; p = 0.002). We noted that, although the direction and magnitude of GDM–metabolite associations exhibited preservation of inter-individual rank across the two time points, the ICCs were low-to-modest, with a median (range) of 0.21 (0.08–0.38).

Table 4 Associations of in utero GDM exposure with Factor 4 metabolites at T1 (n = 592) and T2 (n = 403)

The associations between Factor 4 at T1 and health outcomes across T1 and T2 within strata of sex are shown in Table 5. In boys, Factor 4 was not associated with adiposity or metabolic risk. In girls, a higher Factor 4 score corresponded with significantly higher adiposity at α = 0.05 according to all six adiposity indicators (BMI, BMI z score, waist circumference, skinfold sum, SAT fat area and VAT fat area) as well as with higher fasting insulin (β [95% CI] per 1-unit factor score: 8.75 [4.97, 12.53] pmol/l; p < 0.0001), higher HOMA-IR (0.27 [0.13, 0.40] units; p = 0.0002) and altered adipokines, as indicated by higher leptin (4.54 [1.96, 7.11] nmol/l; p = 0.0006) and lower adiponectin (−4.00 [−6.76, −1.23] μmol/l; p = 0.005) (Table 5, Model 1). Further adjustment for pubertal status and lifestyle characteristics in Models 2 and 3 did not greatly change these findings. All above-mentioned estimates remained statistically significant after Bonferroni correction (α < 0.05/20 = 0.003) except for adiponectin.

Table 5 Associations of Factor 4 with adiposity and metabolic risk biomarkers from T1 (age 6–14 years) through T2 (age 12–19 years) among 592 EPOCH participants

Discussion

Statement of principal findings

In this study of 592 mother–child dyads, associations of in utero exposure to maternal GDM with offspring metabolite profiles derived from PCA were largely null after correcting for multiple comparisons and following adjustment for maternal pre-pregnancy BMI. However, across several multivariable models that accounted for key perinatal, sociodemographic, biological and lifestyle covariates, we noted that GDM exposure was consistently associated with a fasting serum phospholipid metabolite pattern during late childhood. This metabolite pattern was, in turn, associated with higher adiposity and a worse metabolic risk profile from late childhood through adolescence. We discuss these findings, below.

Associations of maternal GDM with offspring metabolite profiles at age 6–14 years

When assessing offspring metabolite profiles during late childhood, we observed a consistently positive relationship between maternal GDM and a phospholipid metabolite pattern (‘Factor 4’) that persisted after accounting for maternal smoking habits during pregnancy, and offspring pubertal status, physical activity and energy intake. However, none of the estimates qualified as statistically significant after correcting for multiple testing, and the specific relationship between GDM and Factor 4 was attenuated to non-significance after accounting for maternal pre-pregnancy BMI. While it is difficult to derive a concrete reason for the former other than limited statistical power, the latter is likely due to the existence of overlapping biological pathways linking maternal overweight/obesity and GDM to offspring metabolism [51] given that maternal glucose is elevated among overweight/obese women, even if they do not qualify as having overt diabetes [49]. Thus, such adjustment is likely to obscure, rather than isolate, the association of maternal glucose levels with offspring outcomes. Additional mechanistic work is required to untangle the unique influence of maternal blood glucose levels on offspring health outcomes, independent of weight status entering pregnancy.

The top metabolites in Factor 4 included phospholipids of the palmitoleic acid moiety, including 1-palmitoyl-2-palmitoleoyl-glycero-phosphocholine (GPC) (16:0/16:1), palmitoyl-arachidonoyl-glycerol (16:0/20:4), 1-myristoyl-2-arachidonoyl-GPC (14:0/20:4), 1-palmitoyl-2-arachidonoyl-glycosylphosphatidylinositol (GPI) (16:0/20:4), 1-palmitoyl-2-oleoyl- glycerophosphoethanolamine (GPE) (16:0/18:1) and 1-palmitoyl-GPE (16:0). This pattern also captured some diacylglycerides, including palmitoyl-linoleoyl-glycerol (16:1/18:2), which is a common emulsifier used in bakery products, shortening, whipped toppings and other confections [52]. Endogenously, alterations in phospholipid composition could be indicative of either lipolytic or lipogenic activity, both of which have implications for metabolic health, but we are not able to make this distinction in this study. However, we note that palmitate enrichment of fatty acids in this metabolite pattern may be driven by stearoyl-CoA desaturase (SCD), an endoplasmic reticulum enzyme that catalyses the rate-limiting step in monounsaturated fatty acid formation, including palmitoleate from stearoyl-CoA and palmitoyl-CoA [53]. Given that insulin resistance is related to disruptions to the SCD pathway [54, 55], it makes sense that this metabolite pattern differed with respect to status of in utero exposure to GDM.

In a recently-published meta-analysis of four cohorts of diverse ancestry, Lowe et al [18] analysed targeted metabolomics data from cord blood of 1600 mother–infant pairs in relation to maternal fasting glucose, glucose at 1-h and 2-h post OGTT, and insulin sensitivity. Key findings included associations of higher 1-h post-OGTT glucose with higher cord blood levels of the ketone body 3-hydroxybutyrate and its carnitine ester, as well as glycerol and 3-hydroxy-decanoyl carnitine, even after adjustment for maternal BMI at the time of OGTT. Additionally, lower maternal insulin sensitivity was differentially associated with cord blood levels of branched chain amino acids and acylcarnitines. In another study of mother–infant pairs (n = 119 from the PREOBE cohort in Spain), Shokry et al [19] leveraged targeted metabolomics data and identified associations of maternal GDM with elevated levels of cord blood hexoses, and lower levels of free carnitines, acylcarnitines, long-chain non-esterified fatty acids, phospholipids, Kreb’s cycle metabolites and β-oxidation markers. While the metabolites in Factor 4 do not markedly overlap with those identified in either of the above-mentioned studies (with the exception of the relevance of compounds on phospholipid pathways to GDM in Shokry et al [19]), it is difficult to reconcile findings from untargeted platforms, which capture relative concentrations of all detectable metabolites in a given biotissue or fluid, with those of targeted platforms selected based on an investigator’s a priori hypothesis of specific biological pathways. Further, it may not be appropriate to compare metabolites in fasting serum of youth with that of metabolites in cord blood, the latter of which are affected by a multitude of factors surrounding circumstances of delivery, and variability in metabolite concentrations across the maternal/fetal unit.

Associations of GDM with metabolites throughout late childhood and adolescence

GDM exposure was associated with key metabolites within Factor 4 at a similar direction, magnitude, and precision of estimates at T1 and T2, indicating a persistent and consistent inter-individual effect of GDM on metabolite profiles from late childhood through adolescence. On the other hand, within-individual stability of the metabolites across T1 and T2 was modest at best, as indicated by relatively low ICCs. Our results suggest that, while the metabolites we examined may be informative for the study of disease aetiology given that the inter-individual rank of exposure/metabolite associations was preserved over time, concentrations of the specific metabolites are not likely to be reliable biomarkers of disease risk in youth within the age range of this study population due to high intra-individual variability.

Associations of metabolite patterns with adiposity and conventional metabolic risk biomarkers

We found that a higher score for Factor 4 corresponded with greater adiposity, reduced insulin sensitivity, higher leptin and marginally lower adiponectin from late childhood through adolescence among girls. Furthermore, all associations except for that of GDM with adiponectin passed Bonferroni correction.

In a recent study in non-human primates, Polewski et al identified differences in plasma phospholipid composition, most notably perturbations in diacylglycerols that are desaturation products of palmitic acid, as markers of worsening insulin resistance and onset of the metabolic syndrome [56]. While the specific metabolites identified differed from those of the present study, phospholipids are integral components of the cell membrane and plasma and their metabolism influences numerous biochemical pathways, including those involved in glycaemic regulation [57]. The compounds of particular interest in Factor 4 include diacylglycerol derivatives of palmitoleic acid (e.g., palmitoleoyl-linoleoyl-glycerol [16:1/18:2]), an unsaturated monounsaturated fatty acid that has been found to exhibit similar detrimental effects to palmitic acid on cardiovascular traits in adult men [58]. Although, this lipokine has also been linked to beneficial effects on other aspects of metabolism (e.g., amelioration or prevention of insulin resistance and diabetes) [59].

Given the established literature on the puberty-related increase in adiposity [60] and insulin resistance [61], the female-specific nature of our findings is likely to be related to an association of maternal GDM with earlier adrenarche and pubarche in daughters. Recent cohort studies, including one in EPOCH [62], demonstrated that in utero exposure to GDM was associated with earlier puberty onset, with stronger [62] or sole effects among female offspring [63,64,65,66]. This phenomenon may operate through a programming effect of maternal hyperglycaemia on offspring adrenal hormone production [67]. Future studies with longer term follow-up are necessary to establish whether the relationships of GDM exposure with adiposity and metabolic risk in the offspring persists beyond puberty. Additionally, studies should further investigate potential sex-specific programming effects of in utero GDM exposure on offspring adiposity and metabolic health.

Strengths and weaknesses

Our study has several strengths. First, we were able to examine long-term effects of intrauterine exposure to maternal GDM on fasting serum metabolites at, not just one, but two time points, from late childhood through adolescence. Our repeated metabolomics measurements enabled us to assess both inter-individual preservation of the effect of GDM exposure over time, as well as intra-individual variability of metabolites, of which the former is informative for understanding disease aetiology, while the latter has important implications for use of metabolites as prognostic biomarkers [9]. Second, we had a large sample size, especially in comparison with other studies of metabolomics in youth, where N is less than 300 [50, 68,69,70,71,72]. Third, we were able to use rich data on sociodemographic and lifestyle characteristics to account for variables that may confound associations of interest. Finally, we had a diverse study sample, with ~50% of participants of racial or ethnic minorities.

This study also has several limitations. First, we are not able to make inference on metabolic flux given that the repeated metabolic assessments were separated by a 5-year period. Second, the metabolite factor scores were based on metabolites assayed at the same time as the first measurement of adiposity and metabolic biomarkers and, thus, estimates could be driven by cross-sectional relationships rather than being true prospective associations. Third, missing values and recall errors in self-reported covariates can introduce bias into estimates of association. Fourth, removal of metabolites with high missingness (≥20%) from our dataset resulted in retention of ~64% of the original features detected by the untargeted platform. Since unsupervised dimension reduction techniques, like PCA, rely on the intercorrelations among variables in the dataset, removal of features is likely to have had an impact on the composition of the factor scores in ways that we are not able to directly assess given that many of these procedures (including PCA) also require that there are no missing values in the input dataset. Fifth, GDM status was based on a clinical diagnosis of GDM following an OGTT, for which we do not have access to actual blood glucose concentrations. Thus, clinical misdiagnoses of GDM status could introduce non-differential bias into estimates of association pertaining to the relationship between maternal GDM and the metabolite factor scores. Sixth, while EPOCH participants were asked to arrive for the research visit having fasted for at least 8 h, we do not know for certain how long they had truly been fasting; this is another source of non-differential bias, but this time with respect to the associations of factor scores with adiposity and metabolic risk from childhood through adolescence. Seventh, we cannot discount the possibility of false-positive findings given the large number of models tested, especially given that our initial assessment of the relationship between GDM exposure and offspring metabolite patterns was largely null (with the exception of the association of GDM with Factor 4 when dichotomised as Q4 vs Q1–3 using a nominal α = 0.05). However, we put forth precautionary measures to prevent type 1 error, including: (1) use of PCA as a data reduction method to identify metabolite factors in offspring that differed with respect to maternal GDM using a nominal p value cut-off before assessing associations of GDM with individual metabolites; and (2) employment of more stringent p value cut-offs in analyses of associations between GDM and metabolites, and between the metabolite pattern factor score and health outcomes. Finally, as with all cohort studies that span a long period of time between assessment of the exposure and measurement of the health outcome, there is potential for residual confounding due to unmeasured lifestyle characteristics (e.g., nutritional factors, more nuanced assessments of physical activity and energy expenditure), and shared behavioural and environmental characteristics between the mother and child.

Conclusions and future directions

In the EPOCH cohort, we identified a phospholipid metabolite pattern following intrauterine exposure to maternal GDM that was detectable from late childhood through adolescence. This metabolite pattern was associated with higher adiposity, worse insulin sensitivity and altered adipocytokines across the adolescent transition among girls exposed to in utero GDM vs those unexposed. Future research is required to evaluate the extent to which these metabolite patterns are clinically relevant and meaningfully associated with health outcomes beyond the adolescent transition.