Introduction

The World Health Organization (WHO) predicts that infertility will be the third most serious disease worldwide in the 21st century.1 In the past decade, assisted reproductive technology (ART) has gradually become a common practice in reproductive medicine clinics, and over 1.0% of children in China are conceived through ART.2 Although ART has been widely used, emerging evidence has suggested that children conceived through ART have an elevated risk of adverse perinatal outcomes3,4,5,6 and congenital diseases.3,7 However, the underlying reasons remain largely unknown.

Germline de novo mutations (gDNMs) are new mutations that have occurred within one generation, provide offspring with genetic variants in addition to those inherited from their parents,8 and account for nearly 90% of total DNMs.9 Multiple studies have reported a considerable role of deleterious DNMs in unfavorable perinatal outcomes as well as in congenital and developmental diseases,10,11,12,13,14 such as congenital heart defects (CHDs).15,16,17,18 Recent evidence has shown that DNMs can accumulate with DNA replication during spermatogonium division or the double-strand break repair process of oocytes.8 Thus, it is necessary to evaluate whether ART, which includes the artificial manipulation of both gametes and early embryos, contributes to the occurrence of DNMs and the risk of subsequent diseases. However, such evaluation is challenging because the risk factors of infertility may also increase the occurrence of DNMs. For example, parental age has been linked to the risk of infertility19 and the accumulation of DNMs.9,20,21,22,23 Thus, the effect of parental factors on DNMs should be carefully evaluated and controlled in the analyses of association between ART and DNMs.

Here, we performed whole-genome sequencing (WGS) of 1137 individuals from 365 families (including 407 offspring) from the China National Birth Cohort (CNBC) to investigate the effect of ART-related elements on gDNMs and their associations with CHD.

Results

General description of DNMs

Among 365 parent-offspring families, 202 offspring (including 86 twin offspring from 44 families; Supplementary information, Fig. S1) were born from ART pregnancies (ARTP) and 205 from spontaneous pregnancies (SP). The information of all offspring and their parents were listed in Supplementary information, Table S1. After stringent quality control (QC) of WGS data (see Materials and Methods), we identified a total of 15,862 high-confidence point autosomal gDNMs in all offspring for the subsequent analysis, with an average of 38.97 mutations per offspring. Then, we defined the parental origin of the DNMs by tracing the read pairs near the phased variants, resulting in a total of 4879 gDNMs (30.76%), including 3785 (23.86%) paternal DNMs and 1094 (6.90%) maternal DNMs in all offspring. The standardized DNM number of each parental origin was estimated by dividing the number of phased DNMs by the proportion phased in each trio and the callable base number, and then the number was standardized to the whole genome length (2.70 × 109) for the following association analyses.

Offspring conceived through ART showed significantly more gDNMs than those conceived spontaneously

We found that the ARTP offspring carried significantly more gDNMs than the SP offspring (ARTP vs SP, total gDNM: Beta = 7.29, P = 2.23 × 10–11; paternal DNM: Beta = 5.43, P = 1.17 × 10–7; maternal DNM: Beta = 1.90, P = 1.48 × 10–3; Fig. 1). As parental age at conception was the well-known risk factor of gDNMs9,20,21,22,23 as well as infertility,24 we investigated the association between parental age and DNMs and re-evaluated the effect of ART after controlling for parental age. In line with previous studies, the number of paternal DNMs specifically increased with paternal age (Beta = 0.91 per year, P = 2.96 × 10–18; Supplementary information, Fig. S2). The effect of maternal age on maternal DNMs was significant but smaller (Beta = 0.39 per year, P = 1.03 × 10–6; Supplementary information, Fig. S2). After adjusting for the parental age at conception, the ARTP offspring still carried significantly more DNMs than the SP offspring (ARTP vs SP, total gDNM: Beta = 4.67, P = 1.52 × 10–6; paternal DNM: Beta = 3.37, P = 4.99 × 10–4; maternal DNM: Beta = 1.29, P = 0.032).

Fig. 1: Offspring conceived through ART showed significantly more total (left), paternal (middle), and maternal (right) DNMs than those conceived spontaneously.
figure 1

ART assisted reproductive technology, ARTP assisted reproductive technology pregnancies, SP spontaneous pregnancies, DNMs, de novo mutations.

Parental lifestyle behaviors were associated with gDNMs of their offspring

Parental lifestyle data obtained from the baseline survey were presented in Supplementary information, Table S1. After adjustment for parental age and conception type, paternal drinking behavior was significantly associated with an increased number of paternal gDNMs, while paternal exercise was associated with decreased paternal gDNMs (Supplementary information, Table S2).

In the multivariable model, fathers’ drinking behavior was associated with increased paternal DNMs in the offspring (Beta = 1.88, P = 0.047), while fathers’ exercise habit was associated with decreased gDNMs (Beta = –2.20, P = 0.048, Table 1). We observed that there is no heterogeneity of the effects in the ARTP and SP groups (Cochran’s Q-tests for heterogeneity, P > 0.05; Supplementary information, Fig. S3). Paternal DNMs were specifically associated with paternal alcohol consumption and exercise, suggesting that these environmental factors might influence generation of the gDNMs during the spermatogenesis process. In our cohort, we observed a low proportion of maternal smoking (1.37%) and drinking (11.78%). Thus, the effect of maternal smoking and drinking behaviors could not be accurately evaluated in the current study.

Table 1 Multivariable regression of parental age at conception, lifestyles, and conception type in all offspring.

We noticed that the effect of ART remained significant in the model (ARTP vs SP, total gDNM: Beta = 4.59, P = 2.38 × 10–6; paternal DNM: Beta = 3.32, P = 6.52 × 10–4; maternal DNM: Beta = 1.26, P = 0.039; Table 1). Similar results were seen in the model with the correction for parental age as categorical variables (Supplementary information, Table S3). The results were consistent if we adjusted parental singleton mutation loads, which represent mutation accumulation in recent generations and parental germline mutation rates,25 as covariates in the multivariable model (Supplementary information, Table S4). Because the age of ARTP parents was much higher than that of parents in the SP group (Supplementary information, Table S1), we conducted a matching on the parental age, which resulted in 192 ARTP offspring and 152 SP offspring. In the matched datasets, there was no significant difference in parental age in ARTP and SP groups (Student’s t-test, Ppaternal age = 0.14, Pmaternal age = 0.30; Supplementary information, Fig. S4). The effect of ART in the matched data was similar to that in Table 1 (Supplementary information, Table S5).

Characterized mutation spectrum of gDNMs in the offspring

In all offspring, the classification of gDNMs by mutation type revealed that gDNMs predominantly comprised C>T and T>C transitions and that the relative frequencies of C>T transitions at the non-CpG sites were greater in maternal than paternal transmissions (Fisher’s exact test, PC>T at non-CpG = 2.30  ×  10–5), while C>T transitions at the CpG sites and C>A transversions were rarer (Fisher’s exact test, PC>A = 3.56 × 10–5, PC>T at CpG = 8.15 × 10–5; Fig. 2a). Moreover, we found that paternal C>T transitions at the CpG sites were significantly elevated in the ARTP group after Bonferroni correction (Padj C>T at CpG = 2.00 × 10–4; Fig. 2b). Interestingly, the ART effect on paternal gDNMs was also significant for C>T transitions at the CpG sites (Beta = 1.52, P = 0.001; Fig. 2c). These results suggested that the increase in paternal DNMs in the ARTP offspring was primarily characterized by C>T transitions at the CpG sites, which were mutation hotspots induced by cytosine deamination damage. For maternal DNMs, we did not observe a significant difference in the proportion of mutation classes between ARTP and SP groups (Fig. 2b).

Fig. 2: The influence of ART on the mutational spectrum.
figure 2

a Relative frequency of mutational classes by parental origins. b The odds ratio of proportion comparison for specific mutational classes between the ARTP and SP groups. The P value and odds ratios were adjusted by Bonferroni methods. c The association (mutation rate ratio) between specific mutation class and ARTP/SP in the multivariable models. d The odds ratio of proportion comparison for specific functional groups between the ARTP and SP groups (IG, intergenic; NG, nearby genes; IN, intron; S, synonymous; ML, missense mutations with CADD ≤ 15; MH, missense mutations with CADD scores >  15). e The proportion of C>T mutations at the CpG sites in each functional group. *P < 0.05.

The functional enrichment of paternal C>T mutations at CpG sites

We classified all mutations into three major classes: functional coding gDNMs (Functional), other gDNMs near genes (Gene near), and intergenic mutations (Intergenic). The majority of the gDNMs were annotated as Gene near (9787/15,862 = 61.7%) and Intergenic (5898/15,862 = 37.2%), while only 177 mutations (1.1%) were classified as Functional. We observed that the mutations in the Functional group showed 1.46-fold enrichment in the ARTP group (proportion of functional DNMs in ARTP group: 112/8606 = 1.30%; in SP group: 65/7256 = 0.90%; Fig. 2d). Importantly, we found that the loss-of-function mutations (LoFs) had the highest odds ratio (OR = 2.11), and the ratio increased with the functional CADD score of missense mutations (Fig. 2d). Interestingly, C>T mutations at the CpG sites were significantly enriched in the Functional class (Functional vs Intergenic enrichment, OR = 3.53, P = 2.52 × 10–13; Fig. 2e). The enrichment ratio increased in the analyses of paternal DNMs (Functional vs Intergenic enrichment, OR = 5.34, P = 6.02 × 10–7; Fig. 2e). We observed a similar enrichment trend for paternal DNMs in the ARTP and SP groups, and the enrichment ratio was higher in the ARTP (Functional vs Intergenic enrichment, OR = 6.25, P = 1.05 × 10–5) group than in the SP group (Functional vs Intergenic enrichment, OR = 3.95, P = 0.016; Fig. 2e).

The association between gDNMs and CHDs

We further investigated the associations between gDNMs and offspring diseases based on the cohort follow-up at age 1 year and information from the disease registry. CHD is the most common type of birth defect, with seven offspring from ARTP families (3.5%) and two from SP families (0.98%). We did not find any large copy number alterations (≥10 Mb) or pathogenic variants inherited from parents in the reported genes for CHDs. Interestingly, we found that paternal C>T gDNMs at CpG sites (Supplementary information, Table S6), which were enriched in the Functional mutation class, significantly increased the risk of CHD (OR = 1.20, P = 0.002). Functional mutations in known CHD-related genes contributed to a higher risk of CHD (OR = 22.95, P = 0.012; Table 2). Given that 98.2% gDNMs were located in the non-coding regions of the genome, we further determined a total of 128 potential pathogenic non-coding DNMs in CHD based on a deep learning method26 (Supplementary information, Table S6). The number of these DNMs was significantly associated with CHD (OR = 2.96, P = 0.005; Table 2). Among these mutations, 33 had defined parental origin and the majority of these mutations (29/33, 87.9%) were originated from the father. Additional structural equation modeling suggested significantly indirect effects of paternal C>T gDNMs at CpG sites and non-coding functional gDNMs as mediators between ART and the risk of CHD (paternal C>T gDNMs: OR = 1.17, P = 0.008; non-coding functional gDNMs, OR = 2.76, P = 0.016), and the indirect effects were independent (Fig. 3). When we included parental age in the model, we observed a similar indirect effect of paternal C>T DNMs at CpG sites in the association between ART and CHD (Supplementary information, Fig. S5a). Thus, we concluded that offspring conceived by ART carried more paternal C>T DNMs at CpG sites, which greatly increased the risk of CHD. The effects of ART and parental age on non-coding functional DNMs were not significant (Supplementary information, Fig. S5b), suggesting that we cannot determine the causal factors of non-coding functional DNMs with the current sample size.

Table 2 The association between CHD and the number of DNMs.
Fig. 3: Structural equation models showed that paternal CpG>TpG DNMs and non-coding functional DNMs acted as independent mediators between ART and risk of CHD.
figure 3

The solid lines denoted significant (P < 0.05) relationships and dashed lines non-significant. *P < 0.05. CHD congenital heart defect, OR odds ratio, CI confidence interval.

The relationship between ART-related elements and gDNMs

We further investigated whether ART-related elements, including twin pregnancy, ART procedures, and parental infertility reasons were associated with gDNMs. With correction for parental age and lifestyle factors, we found that paternal infertility reason and usage of follicle-stimulating hormone (FSH) were specifically associated with gDNMs (Supplementary information, Table S7) and thus were entered into the multivariate analyses. We also included controlled ovarian stimulation protocols, embryo transfer strategies, fertilization methods, and trigger strategies in the following models because they were important procedures of ART. We observed that usage of both recombinant and urinary FSH (Beta = 2.69, P = 0.023) and usage of high-dosage human chorionic gonadotropin (hCG) trigger (Beta = 2.76, P = 0.032) were independently associated with an increased number of maternal gDNMs (Table 3). Similar results were seen in the model with the correction for parental age as categorical variables (Supplementary information, Table S8). In addition, we found that offspring with a father diagnosed with oligoasthenoteratozoospermia carried significantly more paternal gDNMs (Beta = 2.91, P = 0.047; Table 3). The model without the correction for parental lifestyles showed similar results (Supplementary information, Table S9).

Table 3 Multivariable regression including parental infertility-related diseases and detailed information on ART procedures.

Discussion

By using WGS data in families from a prospective cohort, the present study showed that the application of ART was associated with the risk of CHD through accumulating functional gDNMs. However, the majority of the mutations were originated from the fathers rather than the mothers. The evidence suggested that ART procedures may not be the major reason for the increased number of DNMs that contribute to the risk of CHD, even though specific operations were associated with the increased trends of maternal gDNMs.

The present study found that the mutation class with the greatest increase in ART offspring was paternal C>T mutations at CpG sites, which are commonly generated by cytosine deamination damage during DNA replication. However, the sperm genome cannot replicate after meiosis until fertilization.27 It is possible, therefore, that the increase in paternal gDNMs may not be related to subsequent ART procedures but to paternal spermatogenesis abnormalities in fathers who underwent ART. Importantly, paternal gDNMs in the ARTP group are frequently located in the functional sites of genes and are associated with an increased risk of CHDs, suggesting the association between paternal genome and offspring diseases. Similarly, a recent study revealed the connection between paternal age, gDNMs, and offspring diseases,28 proposing the hypothesis that the number of mutations in the sperm increases with paternal age and the chance that a child would carry a deleterious mutation that could lead to diseases (e.g., autism or schizophrenia) increases significantly.23 Thus, careful evaluation of the sperm genome of fathers might be important to prevent disease-causing mutations from passing to their children through ART. Previous studies have widely reported an increase in sperm aneuploidy levels in infertile men.29,30 In line with these studies, our results suggested that paternal DNMs increased in the infertile father, which partially explained the increased paternal DNMs in the ARTP group. However, we should not ignore other mechanisms besides parental genetic causes. For example, a recent study found that the association between CHD incidence and ART pregnancies is largely mediated by twinning, which provided additional potential mechanisms.31 Since there is only one CHD case diagnosed in the families with twins in this study, a cohort with increased sample size is warranted for the evaluation of the mediation effect of twinning and DNMs in the association between ART and CHD. It is also interesting that some paternal behaviors, such as drinking and exercise, show the trends to influence the number of paternal gDNMs, providing the evidence for the association between environmental factors and the number of DNMs,32 though the associations observed in the current sample size require further validation. Recent proteomic analyses suggested that plasma protein patterns could be greatly altered by lifestyles, such as smoking, drinking, and exercise.33 Thus, the influence of parental lifestyles could be complicated and warranted further investigation in the cohorts with multi-omics data followed by solid experiments. Taken together, we emphasize the importance of the fathers’ prepregnancy factors, which may help prevent the accumulation of paternal DNMs.

Similarly, maternal DNMs also increased in the ARTP group, but we did not find any characterized mutation class. Thus, ART procedures may not generate maternal DNMs through a different mechanism. Specific ART procedures, such as ovarian stimulation and trigger protocols, indeed contributed to the number of maternal DNMs. Both processes stimulated the oocyte meiosis resumption34,35 and maturation.36 Consistently, a recent study found that meiotic recombination could play a role in the formation of maternal DNMs.20 However, we did not find evidence of the association between maternal DNMs and CHD. Although the non-coding functional mutations were important mediators between ART and CHD, maternal DNMs accounted for only a small proportion of these mutations. Thus, ART procedures may not directly contribute to the risk of CHD through the mechanisms of gDNM accumulation.

The primary strength of our study is the use of a well-designed birth cohort. First, our cohort recruited each family as a unit, and then prospectively and longitudinally collected comprehensive health information as well as biospecimens from parents and offspring, which enabled the study on the association between pre-pregnant exposures, parental diseases, ART procedures, and gDNMs. Second, the CNBC cohort was applied with the Cloud Based Cohort Management System, which provided the paperless questionnaire survey and intelligent follow-up arrangement. Thus, the efficiency of data collection, QC, and cohort participant management were greatly improved, promising the follow-up with high quality as well as high response rate. In addition, we applied identical follow-up schedules to ART and spontaneous births by investigators and physicians without knowing their conception mode to avoid bias during data collection and CHD diagnosis. With complete information on nuclear families and high-quality follow-up, we were able to elucidate the association between pre-pregnant exposures, parental diseases, ART procedures, gDNMs, and CHD in the offspring step by step. Third, our study emphasized the contribution of the ignored paternal genome in ART families to the risk of congenital diseases, and thus offered new insight into the evaluation of the safety of ART from the aspects of genomic landscapes.

This study has several limitations. First, we cannot provide a full explanation for the elevation of DNM burden in the ART children because of the limited sample size, though we found several important factors associated with DNMs in addition to paternal age. The completion of CNBC with a much larger sample size will allow a comprehensive evaluation of the effect of many other environmental factors (e.g., air pollution and other exposure factors before and during pregnancies) on gDNMs and their association with CHD as well as other birth defects with low incidence.37,38,39,40 It should be mentioned that our cohort may not be suitable for the evaluation of the effect of maternal smoking and drinking behaviors because of the low rate in our population. Second, a relatively low rate of origin inferences from haplotypes of DNMs by short-read sequencing also limited the power of the study. Advances in long-read sequencing41 in the cohort with a larger sample size will allow for more accurate origin inference of DNMs and enable a more comprehensive analysis to determine additional factors responsible for the increase of gDNMs in the ARTP group, and eventually guide the prevention of gDNMs and CHD in the clinics.

In summary, the increased gDNMs in offspring conceived by ART were primarily originated from fathers, indicating that ART itself may not be a major reason for the accumulation of gDNMs. These results emphasized the importance of evaluating the germline status of the fathers in ART families and provided new insight into the prevention of congenital diseases in the ART offspring.

Materials and methods

Study design and participants

All participating couples who had live-born infants conceived with and without ART between December 2014 and March 2018 were enrolled in two hospitals (i.e., the Women’s Hospital of Nanjing Medical University and Suzhou Municipal Hospital). Reproductive endocrinologists, obstetricians, or professional health workers explained the objectives, procedures, potential benefits, and confidentiality issues of the study to all eligible couples. Written informed consent was obtained from those who agreed to participate. All methods and protocols for information collection were approved by the institutional review board of Nanjing Medical University.

This prospective cohort was one of the sub-cohort of CNBC, and only families with live-born infants were studied (Supplementary information, Fig. S1). We included 181 families from the ARTP group and 231 families from the SP group. In the ARTP group, 19 were excluded from the sequencing experiments (18 were lost to follow-up and 1 was in vitro fertilized with donated semen). In the SP group, 21 families were excluded because they were lost to follow-up. Because only two SP families with twins participated in the cohort, they were also excluded from the study to avoid their influence on the following regression model. In two ARTP families with twins, blood was collected from only one child of the twins (Supplementary information, Fig. S1). Finally, we conducted WGS on 1152 individuals from 370 families, including 412 offspring.

Baseline information was collected with standardized and structured questionnaires before pregnancy, and blood specimens of the parents were collected on the day of recruitment. Clinical information, including infertility diagnoses, specific ART procedures at each cycle, and health status during pregnancy were extracted from medical records. All cohort children were offered free child health examinations from their infancy to early childhood as scheduled. Fingertip blood samples were collected at the children’s 1st year clinic visits.

Once reaching 1 year of age, we followed up with all participants and their infants received systematic physical examination at child health care clinics. All CHD cases were diagnosed by echocardiography and confirmed by two pediatric cardiologists.

Variables

ART was the main exposure in the study. Parental lifestyles (smoking (yes/no), drinking (yes/no) and exercise habit (yes/no)), infertility diagnoses in the ART group (oligoasthenoteratozoospermia, ovulatory dysfunction, tubal factors, endometriosis), and ART procedures were studied in the study. Participants who reported drinking at least once a month were defined as having drinking behavior. Those who reported exercising at least three times a week were defined as having exercise habits. Ovulatory dysfunction is typical of producing eggs abnormally and can be further sorted into polycystic ovarian syndrome (PCOS), diminished ovarian reserve (DOR), premature ovarian insufficiency (POI), and others. This diagnosis was evaluated by three senior chief physicians independently. Parental infertility diagnoses in the ART group were directly included in the multivariable model in Table 3.

ART procedures included: (1) controlled ovarian stimulation (COS) protocols: COS protocols aim to stimulate the ovaries to produce the available oocytes in a controlled manner, thus the protocols may influence the number and quality of oocytes. We classified COS protocols used in the clinics into four major classes: antagonist, short agonist, long agonist, and microstimulation. The dosage and type of drugs used were different across protocols. The detailed dosage of the drug (i.e., gonadotrophin-releasing hormone agonist (GnRHa), antagonist, gonadotropin (Gn)) and the selection of FSH, luteinizing hormone (LH), growth hormone, and trigger drug were also studied. hCG trigger with a dosage over 6000 IU was defined as high hCG usage. (2) Fertilization methods: in vitro fertilization (IVF) and intracytoplasmic sperm injection (ICSI). In IVF, the oocyte and sperm are left in a petri dish to fertilize on their own. In ICSI, one sperm is directly injected into an oocyte. The process included the manipulation of both sperm and egg. (3) Embryo transfer (ET) strategy: ET is a step to place embryos into the uterus of a female. A fresh ET cycle: embryos were selected and transferred immediately after the oocyte retrieval, fertilization and 2–7 days of embryo culture. In contrast, when embryos are frozen and thawed for transfer when the proper condition is available, this is referred to as a frozen ET cycle. The COS protocols, ET strategy, fertilization methods, and trigger strategies were directly included in the multivariable model in Table 3.

WGS

The genomic DNAs of the offspring and their parents were extracted from blood samples and sequenced with the Illumina HiSeq X Ten System, and these DNA samples were prepared for sequencing according to the manufacturer’s instructions (Illumina). Briefly, DNA was extracted and sheared into fragments that were then purified by gel electrophoresis. DNA fragments were ligated with adapter oligonucleotides to form paired-end DNA libraries. To enrich the libraries for sequencing with 5′ and 3′ adapters, we used ligation-mediated PCR amplification with primers complementary to the adapter sequences. The DNA libraries were sequenced to generate 150 bp pair-end reads and targeted ~30× coverage for each sample.

Alignment and variant calling

The FastQC package (http://www.bioinformatics.babraham.ac.uk/projects/fastqc) was used to assess the quality-score distribution of the sequencing reads. Read sequences were mapped to the human reference genome (GRCh37) using the Burrows-Wheeler Aligner (BWA-MEM v0.7.15-r1140)42 with the default parameters. The quality metrics (including coverage, median insert size, and percentage of chimeric reads) and contamination estimation of “Bam” files were obtained by Picard (v1.70) (http://broadinstitute.github.io/picard) and VerifyBamID2 (v1.04) (https://github.com/Griffan/VerifyBamID), respectively. Duplicates were marked and discarded using Picard (v1.70). Before variant calling, we applied Canvas43 for the identification of large copy number changes (≥10 Mb) in all trios.

Single-nucleotide and short insertion/deletion variants were jointly called using the Genome Analysis Toolkit (GATK v3.8.1)44 Best Practices for germline SNVs and indels. Briefly, samples were called individually using local realignment by HaplotypeCaller in the gVCF mode with default read filtering parameters. Individual gVCF files were jointly genotyped for high-confidence alleles using GenotypeGVCFs for all autosomes and X chromosome of genomes. We performed relatedness and sex examination based on a temporary list of high-confidence variants: biallelic, high-call rate (> 0.99), common SNVs (allele frequency > 1%). We estimated pairwise relatedness by KING (v2.2.4)45 and inferred the chromosomal sex of the samples based on the inbreeding coefficient (F) for common variants on X chromosome, excluding pseudo-autosomal regions.

QC

Sample QC

We excluded individuals with (1) average coverage < 15×; (2) median insert size < 250 bp; (3) excess chimeric reads: pct_chimera > 0.05; (4) high contamination: freemix > 0.05; (5) unmatched relatedness; and (6) inconsistent chromosomal sex and gender information. Families that failed to satisfy the “trios” design after sample QC were excluded from subsequent analyses. We excluded five families with members who failed in sample QC, and eventually included 1137 individuals from 365 families with 202 offspring born from ARTP and 205 from SP. The general information and detailed ART procedures for these families are listed in Supplementary information, Table S1. The detailed information of all offspring and their parents in our study and read-level QC metrics (i.e., PF HQ ERROR RATE and STRAND BALANCE) are listed in Supplementary information, Table S10.

Callable region

Because not all positions in the genome could be sequenced and this fraction varied across the trios, we defined the callable region as those that had a minimum depth ≥12 for each individual and the callable base number as the total base of the intersect callable region for each of the trios. We found that there was no significant difference in the callable base number between the ARTP and SP groups (ARTP mean: 26.11 Gb, SP mean: 26.14 Gb, Student’s t-test, P = 0.69). The callable base number was used as the offset in the subsequent Poisson regression model.

Variant QC

We only included point DNMs in this study. GATK variant quality score recalibration (VQSR) was used to filter variants. The SNP VQSR model was trained using HapMap 3.3 and 1000 Genomes Omni 2.5 SNP sites, and a 99.6% sensitivity threshold was applied to filter variants. We excluded variants if they (1) failed to pass VQSR filtering criteria; (2) had an inbreeding coefficient < –0.3; (3) were located in segmental duplication or simple tandem repeats downloaded from UCSC genome browser (http://genome.ucsc.edu/); and (4) were out of the callable regions.

Identification and filtering of gDNMs

We restricted our DNM analysis to the SNVs in the autosomes and called DNMs using DeNovoGear (v1.1.1)46 and the GATK Genotype Refinement workflow44 for each offspring-parents trio. Consistent DNMs called from two pipelines were further filtered by the following criteria: (1) minor allele frequency (MAF) <0.01 in the total gnomAD control samples; (2) MAF < 0.01 in a custom-curated WGS database with 6600 Chinese individuals; (3) MAF = 0 in all parents included in this cohort; (4) the offspring must be an alternative allele carrier; (5) minimum depth ≥12 for parents and offspring; (6) no more than one read supporting the alternative allele in the parent; (7) maximum allelic fraction for the parent of 0.05; and (8) minimum allelic fraction for the offspring of 0.20.

We evaluated our pipeline based on public data from 70 individuals of three-generation families.9 We collected a total of 4176 DNMs in this study. After we excluded (1) indel variants, (2) variants in X chromosome, (3) variants in segmental duplication/simple repeats, (4) variants with MAF ≥ 0.01 in the total gnomAD control samples, and (5) variants with MAF ≥ 0.01 in a custom-curated WGS database with 6600 Chinese individuals, we curated a dataset including 3807 DNMs as a set of true DNMs. We identified 3433 DNMs by our protocols and a detection rate was 90.2%. To assess the accuracy of our DNM calls, we randomly selected 92 DNMs (0.5% of total DNMs, including 15 paternal DNMs and 4 maternal DNMs) and sequenced both parents and offspring by Sanger sequencing. Of these, 86 DNMs were successfully validated (True positive: 93.47%) and all DNMs with clear parental origin were validated. We noticed that all 6 DNMs that failed in the validation had the problem of strand bias and homologous mapping, thus we applied a custom script to filter the called DNMs. After that, all 6 failed DNMs and 1 validated DNM were excluded. After applying the filtering to all the DNM mutations, we evaluated the sharing DNMs in a monozygotic twin family. We identified 59 and 60 DNMs for two offspring, respectively, 56 of which (94.9% and 93.3%) were shared.

To diminish the influence of post-zygotic mutations, we further excluded mutations with allelic fraction lower than 0.35. We evaluated the threshold in a publicly available database (CEPH/Utah pedigrees study),9 which included DNMs validated by three-generation pedigrees. We downloaded genotype data from dbGaP with accession number phs001872.v1.p1. The validated DNMs were obtained from the GitHub repository from a previous study (https://github.com/quinlan-lab/ceph-dnm-manuscript). The threshold excluded the majority (75%, 303/404) of post-zygotic DNMs at a cost of 7.54% (287/3807) germline mutations. We listed variant-level QC metrics of all DNMs in Supplementary information, Table S11.

The standardized DNM number of each parental origin was estimated by dividing the number of phased DNMs by the proportion phased in each trio and the callable base number, then the number was standardized to the whole genome length (2.7 × 109). The DNM number of specific mutation type was directly standardized to the whole genome length (2.7 × 109).

We evaluated the association between the standardized DNM number and other QC metrics that might influence the mutation rate. DNM number was not significantly associated with PF HQ ERROR RATE of offspring and their parents (offspring: Pearson’s correlation coefficient r = –0.09, P = 0.061; father: r = 0.00, P = 0.925; mother: r = –0.07, P = 0.150; Supplementary information, Fig. S6). There was no significant difference between ARTP and SP groups for PF HQ ERROR RATE of offspring and their parents (Student’s t-test, Poffspring = 0.052, Pfather = 0.112, and Pmother = 0.491; Supplementary information, Fig. S7). All data were strand balance (STRAND BALANCE = 0.500). In addition, MQ, FS, and GQ of variants identified in ARTP and SP groups were comparable (Student’s t-test, PMQ = 0.479, PFS = 0.760, and PGQ = 0.552; Supplementary information, Table S11).

We defined singleton variants25 of the parents if the SNV (1) passed the general QC for variants; (2) MAF = 0 in the total gnomAD control samples; (3) MAF = 0 in a custom-curated WGS database of 6600 Chinese; (4) was successfully transmitted to their offspring. Then, the singleton loads of the parents were divided by the callable base number of each of the trios and rescaled to the Mb level.

Phasing and origin classification of gDNMs

We applied a haplotype assembly strategy similar to that used in the previous studies9,20,21,22,23 to trace the origin of the gDNM allele. Briefly, we considered that the human autosomes were diploid and that a gDNM affected only one allele from either the father or the mother, and thus the other informative variants (uniquely attributable to a single parent) on the same haplotype could help distinguish the origin of the allele of the gDNM. Besides, we also applied a pedigree-based haplotype inference method47 to increase the inference rate of informative variants. For 2295 mutations with origin inferred from both informative variants themselves and pedigree-based haplotype, we compared the inference results. The consistent rate was 99.9% (2293/2295). In our cohort, we obtained the origin information for 4879 DNMs out of a total of 15,862 (30.76%). Among these gDNMs, 3785 (77.58%) were of paternal origin, and 1094 (22.42%) were of maternal origin. The percentages are comparable to those of phased mutations reported in recently published studies.9,20,21,22,23

Mutation spectrum

Similar to the approach used in the previous studies,9,20,21,22,23 we grouped DNMs into six classes according to nucleotide substitution. Besides, we considered dichotomized C>T substitutions depending on the CpG context, i.e., C>T on CpG sites and C>T on non-CpG sites, resulting in a total of seven classes. Then, we calculated the enrichment of the mutational classes between different (DNM origin or ARTP/SP) groups. For each mutational class, we performed Fisher’s exact test on a 2 × 2 table, tabulating the numbers of DNMs by group and by the mutational class. For the comparison between paternal and maternal DNMs, which have been reported by previous studies, we presented original Odds ratios and P values. To account for multiple tests of the comparison between ARTP and SP groups, we applied Bonferroni correction by dividing the obtained P values by the number of statistical tests.

Variant annotation and functional classification

We used SnpEff (v4.3.1, http://snpeff.sourceforge.net/) to annotate all sequence variants and their potential mutational effects on the associated genes, and used vcfanno (v0.3.2)48 to annotate CADD score v1.442 for each mutation. Mutations annotated as UTR variants and up- and downstream gene variants were reannotated as other mutations near genes. Then, we grouped all mutations into three major classes: (1) functional coding DNMs (Functional): LoFs, missense mutations with CADD scores > 15 (Missense High, MH), (2) other DNMs near genes (Gene near): missense mutations with CADD ≤ 15 (Missense Low, ML), synonymous mutations, intron mutations and other mutations of nearby genes; and (3) intergenic mutations (Intergenic). Similar to the analysis for the mutational spectrum, we calculated the enrichment of the functional classes for the ARTP and SP groups by using Fisher’s exact test. We also annotated the DNMs with ClinVar database (version 20191021, https://www.ncbi.nlm.nih.gov/clinvar/) and none of the variants was annotated as high-confidence (with two or more stars) pathogenic or likely pathogenic (P/LP) variants.

We also applied the HeartENN model, which predicts molecular effect differences between any two alleles for every regulatory feature by using convolutional neural networks, to predict the potentially pathogenic non-coding DNMs.26

Statistical analysis

The Student’s t-test was used to compare the differences of continuous baseline characteristics between the ARTP and SP groups. Fisher’s exact test was used to compare the differences of categorical baseline characteristics between ARTP and SP groups and investigate the enrichment of the mutational classes between different (DNM origin or ARTP/SP) groups. We fit linear mixed models (LMM) to evaluate the association of ART (or other exposures) with standardized gDNM numbers (or DNMs with specific origin), accounting for the twin information in the cohort. We fit logistic regression models to estimate the association between standardized gDNM numbers (or DNMs with a specific origin) and the risk of CHD. Structural equation modeling was conducted using the piecewiseSEM package49 according to the models constructed by nlme50 and lme4 packages.51 Two-sided P values < 0.05 were considered significant. All analyses were conducted in R (v3.6.0).