Introduction

In 1974, Nelson and Bunge1 noted an increase in the number of men consulting for infertility at their university hospital in Iowa City, USA, and were the first to question whether alterations were occurring in semen quality. In 1981, Leto and Frensilli2 reported a gradual decline in sperm count in donors at their sperm bank in Washington DC between 1973 and 1980. By contrast, MacLeod and Wang3 examined sperm production in three similar populations of men with infertility who were assessed at the same laboratory in New York in 1951, 1966 and 1976, and concluded that no substantial change had occurred.

In the late 1970s and early 1980s, the first studies connecting changes in semen quality with environmental exposure to pollutants were published. These mostly focused on occupational contexts; for example, two pioneering studies in men who had applied pesticides showed that the duration of exposure to the nematocide dibromochloropropane was associated with a decrease in sperm count4,5.

As research moved into the early 1990s, a Danish study by Carlsen et al.6 investigated a potential decrease in sperm production using an approach based on a linear regression of the mean values of sperm concentration of healthy men from 61 studies published between 1940 and 1990 (ref.6). The study concluded that sperm production had significantly decreased by about 50%, from 113 × 10⁶/ml in 1940 to 66 × 10⁶/ml in 1990 (P < 0.0001). The authors claimed that their results “may reflect an overall reduction in male fertility,” creating unprecedented concern about a possible decline in male fertility in the general population. Although the study did not show any evidence of a consequential effect on human fertility or fecundity, this conclusion was considered plausible by part of the scientific community with considerable coverage in the media.

Several subsequent studies confirmed that human fecundity might positively and linearly depend on sperm concentration; however, this relationship was shown only for sperm concentrations of <40 × 10⁶/ml among 430 first pregnancy planners7 or <55 × 10⁶/ml among 942 couples in four European cities who conceived without medical intervention8.

Since then, the number of studies examining both geographical and temporal variations in semen quality has increased considerably. The most methodologically rigorous study to date is a 2017 systematic review by Levine and colleagues9 that investigated the temporal trend in sperm counts, which was based on a meta-regression analysis using 244 estimates of sperm concentration and total sperm count from 185 studies worldwide in 1973–2011. The study reported a significant temporal decline in both characteristics in men from the Western world unselected for fertility status, with a decrease in sperm count of 1.6% per year and overall decline of 59.3% in ~40 years (P < 0.001). This study had a considerable impact in the scientific community, accruing >750 citations after 4 years according to Google Scholar.

The issue of a possible global decline in semen quality has generated considerable debate among fertility professionals10 and received extensive media coverage, amplified by the internet, which might have led to distorted conclusions and undue attention. However, assessment of the literature on human semen quality trends from the past four decades shows clear heterogeneity in the design of the studies as well as in the populations studied. Some studies have more limitations than others and, therefore, yield contrasting and less solid results. Furthermore, during the same period, numerous studies have shown possible effects of environmental and lifestyle factors on human semen quality11,12, without considering whether these factors might explain the geographical contrasts or the temporal variation in sperm production.

In this Review, we consider the numerous studies reporting spatial and temporal trends in human semen quality over the past half century, provide an in-depth critical analysis of their methodological choices and criteria and highlight the findings of the studies with optimal designs and methodologies. Only by understanding the intricacies of the studies can we draw conclusions about trends in semen characteristics worldwide. This Review is, therefore, based on a critical analysis of the methodology of existing studies and does not cover the causes or risk factors possibly responsible for these contrasts, for which the literature is still modest.

Conceptual and methodological diversity

The literature reporting and discussing spatial and temporal trends in human semen quality is extremely heterogeneous in terms of study design and methods. Studies use various approaches, from comparisons of previously published data with contemporary data — sometimes in dissimilar study populations — to more homogeneous studies in terms of the groups of men studied, periods studied, data collection and statistical methodology. Three main characteristics can help to categorize their study designs: first, the retrospective or cross-sectional nature of semen data; second, the use of data from a single centre or from several centres from different regions; and third, the use of individual data or aggregated values such as means, medians or estimated values from semen characteristics (Fig. 1). Analysis of the literature must take this methodological heterogeneity into account; thus, this Review discusses the different types of study separately. These discussions take into account several quality criteria, such as the homogeneity of the populations of men studied (selected or not according to their fertility status — for example, fertile sperm donors, partners of pregnant women or men consulting for infertility), the appropriateness of the methods used to assess semen characteristics, and in particular, whether they follow the recommendations in the WHO manual for the standardized assessment of human semen quality (originally published in 1980 (ref.13) and updated in 1987 (ref.14), 1992 (ref.15), 1999 (ref.16) and 2010 (ref.17)), as well as the data analysis methodology. For example, ideally, studies that examine possible geographical variations in semen quality should include comparable groups of men and use a similar methodology for assessing semen in the different regions as well as a common period of study, to disentangle the temporal from the geographical dimensions.

Fig. 1: Study designs examining spatial and temporal trends in human semen quality.

Several different study types have been used to analyse trends in sperm quality. Each of these is subject to benefits and drawbacks, for example, according to the type of data collected and the population included.

The literature investigating temporal trends in semen quality is mainly composed of retrospective studies based on semen data collected in a single centre, a smaller number of multicentre studies that applied regression models to semen data from individuals or to mean, median or estimated values, and only a few cross-sectional studies. Multicentre studies based on aggregated data provide the greatest amount of information; however, they have substantial limitations if heterogeneity between studies (including spatial heterogeneity) is not carefully taken into account. Most studies investigating temporal trends in semen quality actually examine characteristics of sperm production, namely sperm concentration and/or total sperm count. By contrast, fewer studies have examined putative temporal trends in seminal volume and qualitative characteristics such as the percentage of motile spermatozoa and the percentage of morphologically normal spermatozoa.

Spatial and geographical differences in semen quality

In 1977, Smith and Steinberger18 introduced geography as a possible factor in observed differences in mean sperm count from comparable groups of partners in infertile couples in Iowa City, Houston, Philadelphia and New York. In the same period, MacLeod and Wang3 also noted marked contrasts in mean sperm counts from hundreds of men undergoing pre-vasectomy assessment in several North American cities.

Subsequently, in 1992, Carlsen and colleagues6, noting the number of countries represented among the publications selected for regression analysis examining a possible temporal trend in sperm production, presented historical data on mean sperm concentration. These data showed a wide range of mean sperm concentration values according to geographical origin: for example, a difference of 80 × 10⁶/ml between the maximum mean value reported in Finland and the minimum mean value reported in India. A subsequent 1996 opinion paper19 reconsidering the work of Carlsen and colleagues6 was the first article to consider a possible confounding role of the geographical origin of the data in the temporal trend reported. Shortly thereafter, in 1997, a retrospective study investigating semen quality of comparable groups of male candidates for sperm donation with proven fertility in several French regions provided the first evidence of true geographical contrasts in semen quality within a country20.

At the time of writing, 27 studies over the past 25 years have scrutinized possible geographical variations in human semen quality; most considered primarily sperm production characteristics, sperm concentration and total sperm count20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46 (Supplementary Table 1). These studies compared cities at various geographical scales, with magnitude of geographical distance between the areas studied defined as continental or subcontinental when >700 km, as national when around 200–700 km and as regional within a country when <200 km. In-depth examination of these studies shows high heterogeneity in terms of the populations of men studied and/or the methodology design used, affecting the reliability of the contrasts reported.
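
The distance thresholds used above to define geographical scale can be written as a simple rule. The following Python sketch is illustrative only: the function name `classify_scale` is hypothetical, and because the text gives "around 200–700 km" for the national scale, the handling of distances of exactly 200 km or 700 km is an assumption.

```python
def classify_scale(distance_km: float) -> str:
    """Map the distance between two study areas to a geographical scale.

    Thresholds follow the text: >700 km, around 200-700 km, <200 km.
    Treating exactly 200 km and 700 km as 'national' is an assumption.
    """
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    if distance_km > 700:
        return "continental or subcontinental"
    if distance_km >= 200:
        return "national"
    return "regional"


# Two distances quoted elsewhere in this Review:
# Antwerp and Peer are 75 km apart; Poznan and Lublin are ~400 km apart.
print(classify_scale(75))   # regional
print(classify_scale(400))  # national
```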

Studies comparing cities on a continental scale

The first cross-sectional investigation of geographical differences in semen parameters was reported in 2001 (ref.21). This study involved 1,082 fertile male partners of pregnant women from four European cities (Copenhagen, Denmark; Paris, France; Edinburgh, Scotland; and Turku, Finland). Semen parameters were assessed using standardized methodology, with inter-laboratory quality control and centralized assessment of sperm morphology. The raw data indicated that Danish men had the lowest sperm concentrations and total counts, followed by French and Scottish men; Finnish men had the highest sperm counts. By contrast, men from Edinburgh had the highest proportion of motile spermatozoa, followed by men from Turku, Copenhagen and Paris. When the results were corrected for confounding factors (including age, abstinence period and season), differences in sperm concentration were found between Turku and Copenhagen (P = 0.00002), Turku and Paris (P = 0.0008) and Copenhagen and Edinburgh (P = 0.03). Differences in total sperm count were also observed between Turku and Copenhagen (P = 0.0001), Turku and Edinburgh (P = 0.001), Turku and Paris (P = 0.0001) and Copenhagen and Edinburgh (P = 0.03), while differences in percentage motility were shown between Turku and Paris (P = 0.003) and Edinburgh and Paris (P = 0.002). Percentage of normal sperm was not different between cities. In addition, this study highlighted seasonal variations in semen variables; over all the four cities studied, sperm concentrations in summer were only 70% of sperm concentrations in winter and, accordingly, total sperm count in summer was just 72% of that seen in winter. 
Seasonal sperm concentrations in a ‘standardized’ man (30 years old, fertile, sexual abstinence of 96 h) across all four cities were 132 and 93 × 10⁶/ml for winter and summer, respectively, in Turku; 119 and 84 × 10⁶/ml for winter and summer, respectively, in Edinburgh; 103 and 73 × 10⁶/ml in Paris; and 98 and 69 × 10⁶/ml in Copenhagen. Seasonal variations have also been observed in other studies with the lowest sperm counts detected during the summer season and highest during either autumn or winter season47. Several factors could contribute to these differences, including environmental temperature, pesticides and air pollution.
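
As a quick arithmetic check, the summer-to-winter ratio can be computed for each city from the ‘standardized’ man values quoted above. This is a minimal sketch: the variable and function names are hypothetical, and only the eight concentration values come from the text.

```python
# Winter and summer sperm concentrations (x 10^6/ml) for the 'standardized' man,
# as quoted in the text for the four European cities.
seasonal = {
    "Turku": (132, 93),
    "Edinburgh": (119, 84),
    "Paris": (103, 73),
    "Copenhagen": (98, 69),
}


def summer_to_winter_ratio(winter: float, summer: float) -> float:
    """Return summer concentration as a fraction of winter concentration."""
    return summer / winter


ratios = {city: summer_to_winter_ratio(w, s) for city, (w, s) in seasonal.items()}
for city, ratio in ratios.items():
    print(f"{city}: summer is {ratio:.1%} of winter")
```

Each ratio falls between 0.70 and 0.71, which is consistent with the reported overall summer value of 70% of the winter concentration.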

A year later, a second cross-sectional study from the same group reported geographical differences in semen data in the Nordic–Baltic area22. In total, 968 men aged 17–19 years who were being recruited into military service participated in the study: 324 men in Turku, Finland; 104 in Tartu, Estonia; 240 in Oslo, Norway; and 300 in Copenhagen, Denmark. All men answered questionnaires and collected semen according to the protocol and methodology of the group’s previous study21. Multivariable regression analysis accounting for abstinence period showed no difference between semen volumes of men from the four countries, but sperm concentrations and total sperm counts differed between the centres (all P < 0.0005). The Finnish and Estonian men had an adjusted median sperm concentration of 54 × 10⁶/ml and 57 × 10⁶/ml, respectively, and the Norwegian and Danish men both had an adjusted median sperm concentration of 41 × 10⁶/ml. Corresponding total sperm counts in all four cities were 185 × 10⁶, 174 × 10⁶, 133 × 10⁶ and 144 × 10⁶ in Turku, Tartu, Oslo and Copenhagen, respectively, with statistically significant differences (P < 0.05) observed between men from Turku and those from Copenhagen, Turku and Oslo, Tartu and Copenhagen, and Tartu and Oslo; the differences between Tartu and Turku and Oslo and Copenhagen were not statistically significant. The percentages of motile sperm and morphologically normal sperm differed between the centres (both P < 0.005). Men from Tartu (74%) were found to have the highest percentage of motile sperm, followed by men from Copenhagen (66%), Turku (65%) and Oslo (64%). By contrast, men from Turku and Tartu had the highest percentage of morphologically normal sperm (9.2%) (P < 0.005).
Overall, the authors concluded that an East–West gradient exists in the Nordic–Baltic area with regard to semen characteristics in these men, who are considered to represent the general population of young men in these locations as they were recruited to the study from individuals attending a compulsory medical examination and were not selected for known fertility issues or semen quality, with most of the participants having no prior knowledge of their fertility status. However, participation rates within the overall groups of men attending for their military medical examination were low (Turku: 13%, Tartu: 19%, Oslo: 17% and Copenhagen: 19%), with possible implications for the representativeness of the study population. Notably, the level of sperm production in the young military conscripts in this study was markedly lower than in the 25–40-year-old partners of pregnant women from the same European region, without any clear explanation21. For example, the mean total sperm count was 412 × 10⁶ in Turku and 276 × 10⁶ in Copenhagen in partners of pregnant women versus 221 × 10⁶ and 172 × 10⁶ in young men, respectively.

A cross-sectional study in 1,165 military conscripts aged 16–29 years (median age 19.8 years) reported semen data of men recruited in Estonia (n = 573; 301 men of Estonian origin and 272 men of Russian origin), Riga, Latvia (n = 278) and Kaunas, Lithuania (n = 314)23. Participation rates were low at 17% (Estonia), 13% (Latvia) and 15% (Lithuania). Semen volume, total sperm count and percentage of progressive motile spermatozoa and morphologically normal spermatozoa adjusted for age and sexual abstinence statistically differed between the groups studied (P = 0.035, P = 0.02, P < 0.001, P = 0.002, respectively). However, the authors concluded that semen quality among men from the neighbouring Baltic countries did not differ meaningfully, probably owing to the low participation rate (which introduces selection bias), lack of quality control and the high percentage of men who had a history of cryptorchidism.

In 2003, the first US study was published, a cross-sectional study that included 493 healthy male partners of pregnant women recruited through prenatal clinics in four cities across North America (Columbia, n = 176; New York, n = 38; Minneapolis, n = 155; and Los Angeles, n = 124) during 1999–2001 (ref.24). The study used identical protocols across centres as well as standardized methods and strict quality control of semen assessment. Semen specimens were assessed for seminal volume, sperm concentration and motility at the centres themselves, whereas sperm morphology was centrally assessed. Mean sperm concentration was significantly lower in Columbia than in New York, Minneapolis or Los Angeles (58.7, 102.9, 98.6 and 80.8 × 10⁶/ml; median: 53.5, 88.5, 81.8 and 64.8 × 10⁶/ml, respectively). The total number of motile sperm was also lower in Columbia than in other cities: 113 versus 196, 201 and 162 × 10⁶, respectively. However, semen volume and the percentage of morphologically normal sperm did not differ appreciably between centres. Observed inter-centre differences remained even with multivariable models that controlled for abstinence time, semen analysis time, age, race, smoking, history of sexually transmitted disease and recent fever (all P < 0.01). On the basis of these data, the authors suggested that sperm concentration and motility might be reduced in US semirural and agricultural areas relative to more urban and less agriculturally exposed areas. Although no explanation was provided by the authors, this finding highlights additional types of spatial contrast in sperm quality.

The US Study for Future Families (SFF) investigated semen parameters in men across the USA, recruiting partners of pregnant women who attended prenatal clinics in Los Angeles, Minneapolis, Columbia, New York City and Iowa City25. Semen samples were collected on site from 763 men (73% white, 15% Hispanic and/or Latino, 7% Black and 5% Asian or other ethnic group) using strict quality control and well-defined protocols. Analysis of the full cohort confirmed the findings of Swan et al.24 on US geographical differences in sperm production, which showed that sperm parameters were reduced in semirural and agricultural areas compared with urban and less agriculturally exposed areas. Mean sperm concentrations for men living in New York City, Minneapolis, Iowa City, Los Angeles and Columbia were 85, 72, 62, 55 and 48 × 10⁶/ml, respectively (P < 0.0001 for difference between centres). Corresponding total sperm counts were 261, 264, 244, 176 and 167 × 10⁶ (P < 0.0001). Of note, Black men had significantly lower sperm concentrations than white and Hispanic and/or Latino men.

The first study to examine possible geographical differences in semen quality in Japan26 was a cross-sectional study that compared semen parameters of 324 fertile male partners of pregnant women from the Kawasaki and Yokohama area with the published semen data for similar populations from four European cities21 carried out during the same period and according to the same protocol. After adjustment for confounding factors such as period of sexual abstinence and age, the lowest sperm concentrations were detected in men from Kawasaki and Yokohama, followed by men from Copenhagen, Paris, Edinburgh and Turku, but only the differences between men from Kawasaki and Yokohama and men from Edinburgh and Turku were statistically significant (P = 0.0008 and P < 0.0001, respectively). Total sperm count, percentage of motile sperm and percentage of normal sperm observed in Kawasaki and Yokohama were significantly lower (P < 0.02, P < 0.0001 and P < 0.0002, respectively) than those reported in all European centres, except for motile sperm in men from Paris. Japanese fertile men had semen quality of the same level as Danish men, which was reported to be the lowest among men studied in Europe.

A retrospective study of South American men compared semen characteristics of fertile men from Medellín, Colombia (n = 113) and Petrópolis, Brazil (n = 84)27 and evaluated the records of fertile men before vasectomy during the same period. All partners of the participants had given birth in the preceding year, with a time to pregnancy (TTP) of ≤12 months. Individuals with testicular alterations, testicular trauma, leukocytospermia, bacteriospermia, diabetes, hypertension, drug use or acute illness were excluded from the study, and the same method for semen assessment was used in both cities. Men from Medellín had a seminal volume lower than those from Petrópolis (P < 0.0001), whereas individuals from Petrópolis had a percentage of total progressive motility lower than in Medellín (P < 0.0001); no difference was found in sperm concentration.

Most published studies examining geographical differences in semen quality included men who were long-term or permanent residents of an area. However, over the past few decades, population mobility has been increasing. Changes in patterns of reproductive health among migrants or mobile populations might, therefore, reveal the influence of environmental factors and lifestyle on semen quality. For example, in China, all military personnel are posted to a region at some distance from their home province. As all these soldiers share a comparable living environment and lifestyle, they can be regarded as representative of a migrant population within China. Based on this assumption, a cross-sectional study comparing semen data from 1,194 Han Chinese military personnel aged 18–35 years at the time of inclusion, who had been in the ordinary land forces for more than 1 year was undertaken in six cities that are geographically representative of the country’s regional characteristics (Beihai, Lhasa, Germu, Xinzhou, Huhehaote and Mohe)28. Participation rates in the study were high, with little variation between the different regions and, across all regions, semen samples were assessed according to WHO 1999 guidelines16, with multivariable regression analysis used to account for possible confounders, initially for all six groups combined and then by individual location. In this study, despite the controlling of overall lifestyle and environment, seminal volume differed between cities (P < 0.0001) and the median value of the total sperm count for all the men studied differed between the six regions investigated (P = 0.006; unadjusted total sperm counts in millions: 169 for Beihai, 84 for Lhasa, 116 for Germu, 164 for Xinzhou, 113 for Huhehaote and 107 for Mohe). By contrast, sperm concentration and sperm motility were not significantly different between the six areas. 
The authors postulated that these geographical differences in sperm production might reflect current environmental conditions rather than lifelong influences, as the men were not born and raised in the regions under study. Interestingly, seminal volume, total sperm count and sperm motility in participants from Lhasa, at 3,700 m altitude, were lower than those of the other centres, which are not at altitude (P < 0.05). According to the authors, hypoxia and ultraviolet light exposure could possibly explain these results.

Comparing cities within a country

The first study to provide evidence of regional differences in human semen quality included semen data from 4,710 fertile French men who were candidates for sperm donation recruited at eight French regional sperm banks (CECOS)20. Semen data acquired using similar methods in each centre were analysed, accounting for covariates including age, sexual abstinence and centre, using Paris as the reference for comparison with the other cities. By comparison with Paris, the seminal volume was higher in Caen, Normandy (P < 0.001) and lower in Toulouse (P < 0.01), and the total number of spermatozoa was higher in Lille (P < 0.001) and lower in Toulouse (P < 0.05). A difference of 71 × 10⁶ spermatozoa per ejaculate was found between men from Lille (which had the highest regional values) in northern France and men from Paris, only ~200 km away, and 139 × 10⁶ spermatozoa per ejaculate between men from Lille and men from Toulouse (which had the lowest regional values obtained), which is <800 km further south. By comparison with Paris, the percentage of motile spermatozoa was higher in Bordeaux and lower in Tours (both P < 0.001).

A German cross-sectional study29 comparing semen quality of 791 military recruits raised in Leipzig (former East Germany, n = 457) and Hamburg (former West Germany, n = 334) used the same research protocol and method for semen assessment (with a centralized assessment of sperm morphology) as the Nordic–Baltic study22; statistically significant possible confounding factors were accounted for in multivariate regression analyses comparing the two German groups of men. No statistically significant differences were observed in adjusted sperm concentration and total sperm count (median 46 versus 42 × 10⁶/ml and 154 versus 141 × 10⁶ for men from Hamburg versus Leipzig, respectively). By contrast, the adjusted semen volume, sperm motility and morphology differed between men from the two areas. A higher percentage of morphologically normal spermatozoa (9.4% versus 8.4%, P = 0.005) and higher seminal volume (3.4 versus 2.8 ml, P < 0.0005) were observed in the Hamburg group versus the Leipzig group but, conversely, the frequency of motile spermatozoa was higher in the Leipzig group than the Hamburg group (81% versus 67%, P < 0.0005). According to the authors, Hamburg represents a typical urban West European area, and the region of Leipzig was characterized by a heavily polluted environment.

In 2013, a cross-sectional study was published that investigated semen quality of volunteer students aged 18–24 years from four Japanese cities — Kawasaki, Osaka, Kanazawa and Nagasaki30. Both the study participant and his mother had to have been born in Japan. In total, 9,374 leaflets were taken by the students and 1,559 young men (16.6%) participated. By city, the participation rate was 14.5% from Kawasaki, 11.7% from Osaka, 21.9% from Kanazawa and 33.3% from Nagasaki. Sperm concentrations did not differ between men from the four cities, but semen volume for men from Kanazawa was higher than in other centres (P < 0.0001); consequently, total sperm counts were also higher for these men, but this difference was only significant in the pairwise comparison with men from Kawasaki (P < 0.02). Percentages of motile spermatozoa differed significantly overall because men from Nagasaki had higher frequencies of motile spermatozoa than men from other centres (adjusted medians 64–75%). The percentage of morphologically normal spermatozoa for men from Nagasaki was higher than that for men from Osaka (P < 0.0001 in pairwise comparison).

A second Japanese cross-sectional study31 investigated semen data from 792 fertile male partners of pregnant women (who had conceived naturally) with a median age of 31.4 years in four Japanese cities (Sapporo, n = 264; Osaka, n = 222; Kanazawa, n = 266; and Fukuoka, n = 276) during 1999–2002. The adjusted median seminal volume was significantly different between cities (P = 0.006), the highest in Kanazawa (3.2 ml) and lowest in Fukuoka (2.6 ml). The adjusted median sperm concentration was significantly different between cities (P = 0.04), the highest in Sapporo (95 × 10⁶/ml) and the lowest in Osaka (76 × 10⁶/ml). Although adjusted total sperm count did not differ between the four cities, the adjusted percentages of motile spermatozoa and morphologically normal spermatozoa did differ between locations (both, P < 0.0001). Overall, the authors concluded that semen quality of fertile Japanese men is comparable to that of the optimum parameters in fertile European men21. However, the results might be limited by low participation rates in the four cities (18.8% for Sapporo, 8.8% in Osaka, 16% in Kanazawa and 7.1% in Fukuoka).

A cross-sectional study in Poland32 examined semen quality of men aged 18–35 years in Poznan (n = 113) and Lublin (n = 89), two industrial cities ~400 km apart. Men in Poznan were recruited by the Andrology Unit of the University of Medical Sciences and through media notices, whereas men in Lublin were recruited through private infertility clinics. Semen assessment was performed according to the WHO 1999 guidelines16. Comparisons revealed differences in seminal volume (3.5 ml in Poznan versus 3.1 ml in Lublin, P = 0.003), sperm concentration (50 versus 41 × 10⁶/ml, respectively; P = 0.04), total sperm count (209 versus 121 × 10⁶, respectively; P = 0.003) and percentage normal sperm morphology (32% versus 35%, respectively; P = 0.0004).

A 2019 Swiss cross-sectional study33 included 2,523 military conscripts from all regions of Switzerland. Data on seminal volume, sperm concentration, percentage sperm motility and normal sperm morphology were analysed using standardized methods. Men were stratified into groups according to where they lived in the three geographical regions characteristic of the country (Jura, n = 142; Plateau, n = 1,892; and Alps, n = 489), and data were corrected for duration of sexual abstinence. Disparities in semen quality across the different regions were limited: only the adjusted medians of percentage sperm motility (58%, 53% and 47%; P = 0.02) and percentage normal sperm morphology (5.7%, 5.0% and 4.7%; P = 0.03) differed between Jura, Plateau and Alps, respectively. The authors concluded, therefore, that only slight differences exist in semen quality of young Swiss men. However, they also stressed that the average sperm concentration was among the lowest observed in Europe, with only 38% of men having sperm concentration, motility and morphology values that met WHO semen reference criteria17. Of note, despite the low participation rate of only ~5%, the sample size achieved was large (n = 2,523), which might prevent major selection bias.

Comparing cities within a region

Studies have also assessed potential differences in sperm characteristics in cities within the same region of a country. For example, a French study34 compared mean and median total sperm counts assessed using the same method in the same centre between districts of the Paris and Ile de France region, for comparable populations of healthy fertile men. Total sperm count was shown to be 80 × 10⁶ higher in the administrative districts of residence furthest from central Paris compared with Paris itself and its adjacent administrative districts (P < 0.001). This single-centre study using adjusted sperm production data evaluated according to a standardized semen analysis methodology suggests that geographical contrasts in semen production might exist even at a regional level, in this case the extended Ile de France region, which covers ~50,000 km². A similar conclusion was drawn from a prospective study of two cities within the Flanders region of Belgium35, the urban area of Antwerp and the rural area of Peer, which are 75 km apart (Supplementary Table 1). In this study, young men aged 20–40 years were selected randomly from the two municipal population registries to receive a short questionnaire. Overall, the mean total sperm count corrected for confounding variables was lower in Peer than in Antwerp (80 × 10⁶ versus 136 × 10⁶ spermatozoa, P = 0.02) as was the percentage of normal spermatozoa (12% versus 18%, P < 0.001). The authors noted a relatively low response rate of 30%, suggesting that their study was at risk of selection bias.

Overall, of the 27 published studies identified in this Review on geographical trends, 16 met the minimal quality criteria (Supplementary Table 1) that we discuss in the next section. In summary, of these 16 studies identified, 13 provide evidence of a statistically significant geographical contrast in sperm production, 7 of 8 at a continental and/or subcontinental level, 4 of 6 at a national level and 2 of 2 at a regional level (Supplementary Table 1). Furthermore, 12 of 13 studies showed a spatial contrast in qualitative semen characteristics (Supplementary Table 1), percentage sperm motility and/or morphology, 6 of 7 at a continental or subcontinental level, 5 of 5 at a national level and 1 of 1 at a regional level.
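
The per-scale counts in this summary can be checked against the stated totals with a short tally. The sketch below is only a consistency check on the numbers quoted in the paragraph above; the dictionary and function names are hypothetical.

```python
# (studies showing a contrast, studies assessed) per geographical scale,
# as reported in the summary paragraph above.
sperm_production = {"continental": (7, 8), "national": (4, 6), "regional": (2, 2)}
qualitative = {"continental": (6, 7), "national": (5, 5), "regional": (1, 1)}


def totals(by_scale: dict) -> tuple:
    """Sum the positive and assessed study counts over all scales."""
    positive = sum(p for p, _ in by_scale.values())
    assessed = sum(n for _, n in by_scale.values())
    return positive, assessed


print(totals(sperm_production))  # (13, 16)
print(totals(qualitative))       # (12, 13)
```

Both tallies agree with the text: 13 of 16 studies for sperm production and 12 of 13 for qualitative characteristics.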

Limitations of studies of geographical differences

In many studies, the term ‘infertile men’ actually means ‘men from infertile couples’, rather than men with an identified cause of infertility. This approach can be problematic when considering spatial (or temporal) trends in semen quality, as the data presented might actually include results of semen analyses performed as part of a couple’s infertility and/or assisted reproductive technology (ART) management, men with already proven infertility (for example, those with azoospermia), men whose fertility status is unknown and men with normal sperm characteristics, some even with proven previous fertility. The development of modern ART approaches, including in vitro fertilization (IVF) in the 1980s48 and intracytoplasmic sperm injection (ICSI) in the 1990s49, is an important, often uncontrolled, covariate in most spatial and temporal studies performed during this period. Some studies36,37,38,39 were based on such potentially biased populations and might also introduce uncontrolled heterogeneities into the study population. Other limitations in studies of geographical contrasts are related to a hybrid study design, for example, comparing historical data in one place and period with cross-sectional data in another place and period40,41,42. Furthermore, some studies compare male populations with different fertility status in different locations42,43. Some studies include several of these limitations and/or do not use standardized methods for assessing semen between the compared groups, making them suboptimal38,39,40,41,42,44,45. One study compares regional temporal changes in sperm count but not the actual sperm count values46.

Ideally, proper appraisal of reported spatial variations in semen quality should rely only on appropriately designed studies, either prospective or retrospective, with the following quality criteria: inclusion of homogeneous and comparable groups of men in each area studied, carried out within a common period of time, using standardized methods for assessing semen samples in each area, and following a standardized protocol, if possible accounting for known cofactors, such as age or sexual abstinence. However, even when these criteria are met50, the conclusions of such studies might not be totally devoid of potential biases.

In studies of fertility, only small proportions of men are usually willing to volunteer, which might introduce selection bias, with some men having specific social and/or reproductive backgrounds that prompt them to participate (or not)51,52,53,54. In addition, the location for sample collection is likely to affect the participation rate (for example, collection at home versus a laboratory), and the conditions of semen collection might affect arousal and, in turn, semen quality, although the literature regarding this specific issue is conflicting55,56,57,58,59.

Findings of some studies might be limited by modest sample sizes24,27,28,32,35, as the normal ranges for semen data, particularly sperm production, are large. However, the limitation of a modest sample size might be balanced by the benefit of a homogeneous and controlled study design, such as in the Belgian study that included only 50 men per group, but was restricted to nonsmokers with lifelong residency in the same areas35.

Some authors concluded that young men have markedly low sperm production representing an indirect sign of a recent deterioration in human semen quality23,30,60. However, this assertion must take into account that the WHO reference values for human semen data61 (which are usually used for comparison) were obtained from partners of pregnant women, which is a different population of men who have proven fertility and are, on average, >10 years older than the young men included in these studies21,22,23,24,25,29,30,31,33,35. Accordingly, studies of sperm characteristics in young adults, students or military conscripts aged ~17–20 years report markedly lower sperm production than studies of older men, typically a total sperm count in the range 100–200 × 106 (refs.22,23,29,30,33,35) compared with 200–300 × 106 in 30–40-year-old partners of pregnant women21,24,28,31 or a median of 255 × 106 in young fathers aged 31 ± 5 years, who provide the WHO reference population for normal semen data61. That the processes of spermatogenesis and sperm maturation are simply not yet optimal in younger men emerging from adolescence cannot be ruled out, although only a few studies partly support this conclusion. For example, Schwartz et al.62 reported that the percentages of normal and motile spermatozoa peaked at 30–35 years in fertile candidates for sperm donation, and total sperm count was reported to increase markedly with age in a population of sperm donors in a separate study63, from a mean value of 263 × 106 in men aged 20 years to 431 × 106 in men aged 34 years. Seminal volume, total sperm count and sperm motility have also been reported to be lower in men <21 years old than in male partners aged 21–50 years in infertile couples64. 
A longitudinal follow-up study in young men reported only slight differences in semen quality with age for the age range 18–22 years65, and a second longitudinal follow-up study66 with more data and a wider age range (19–29 years) showed that the percentages of motile and morphologically normal spermatozoa increased significantly during the 10-year follow-up period, although data concerning the change in sperm production were conflicting. Overall, these data provide some evidence to suggest an increase in sperm production as well as quality (motility and morphology) in the years after adolescence. Thus, the age of participants in semen quality studies must be taken into account when considering the relevance of the data.

Geographical contrasts in fertile men assessed using WHO guidelines

In 2010, updated, standardized and evidence-based procedures and recommendations for the examination and processing of human semen were described in the fifth edition of the WHO laboratory manual17; the methods described in this manual should be applied to semen studies in a clinical or research setting. When the WHO manual was updated in 2010, no solid reference data regarding human semen quality were available; thus, a study was performed to determine reference intervals for semen characteristics assessed using the WHO standardized methodology61, and distributions of semen characteristics generated from data from fertile men whose partners had a TTP of ≤12 months in 14 countries on four continents were subsequently endorsed by WHO. These reference values were updated in 2021 to include semen data from >3,500 participants from five continents to provide updated distributions and 5th centile threshold values for seminal volume, sperm concentration, total sperm count, percentage motility and percentage normal morphology67. Since the WHO guidelines were published, several studies have reported distributions of semen characteristics of fertile men or their distribution in a given geographical area following the WHO recommended methodology for semen assessment. The similarity in design of these studies to the reference study67 provides a basis for assessing possible contrasts in semen quality by geographical area (Table 1).

Table 1 Geographical variations in semen values (5th percentile – median) in fertile men assessed following WHO guidelines (2010)

Since 2010, four studies have reported semen data obtained using WHO guidelines in fertile male partners of pregnant women with a TTP of ≤12 months. A US study22 in 763 men showed 5th percentile thresholds of sperm concentration, total sperm count and percentage motility lower than the 2021 WHO references67 (12 versus 16 × 106/ml, 32 versus 39 × 106 and 28% versus 42%, respectively). Similarly, a Japanese study31 in 792 men found a 5th centile threshold lower than the reference values67 for seminal volume and normal sperm morphology (1.0 versus 1.4 ml and 1.5% versus 4%, respectively) whereas an Egyptian study68 in 240 men reported a 5th centile for total sperm count notably lower than the reference values and percentage motility higher than reference values67 (30 versus 39 × 106 and 50% versus 42%, respectively). By contrast, Chinese semen data collected in 1,213 men69 did not reveal marked contrasts with reference data67. However, extensive data from China accounted for more than one-third of the reference data in the 2021 WHO study, suggesting that these data are actually more representative of Chinese men than other populations67 (Table 1).

Overall, robust evidence supports geographical contrasts in semen characteristics, even over short distances within the same country. Most of the studies observing these spatial contrasts speculate on a possible role of environmental exposures or lifestyle factors, although ethnic differences related to genetic variations or combinations cannot be ruled out.

Temporal trends in human semen quality

In the 1970s and early 1980s several articles raised the possibility of a temporal deterioration in human semen quality1,3,70,71,72; however, these articles were not based on a true sequential analysis of semen. The first studies analysing semen data over time were published in the early 1980s and, since then, 87 articles have reported analysis of temporal trends in human semen quality. These studies used different designs and methodologies (Fig. 1): five were repeated cross-sectional studies, 68 were single-centre retrospective studies, five were multicentre retrospective studies based on individual data, and nine were multicentre retrospective studies based on mean, median or estimated values and not on individual data (Supplementary Table 2).

Most studies focused on the temporal evolution of sperm production (that is, sperm concentration and/or total sperm count), the assessment of which is, by nature, more objective than the assessment of sperm motility or sperm morphology. A much smaller number of studies also reported trends for other semen characteristics, including seminal volume, percentage of motile spermatozoa and morphologically normal spermatozoa.

Single-centre studies with repeated cross-sectional data

Only five studies have examined temporal trends in semen quality using repeated cross-sectional data, four in military conscripts from Scandinavia supposed to represent the general population and one in US male partners from infertile couples60,73,74,75,76 (Supplementary Table 2).

A Swedish study73 assessed 295 young men (age 17–20 years; median 18 years) born and raised in Sweden who were being recruited to military service. The participants delivered an ejaculate during 2008–2010 and their semen characteristics were compared with those of a similar cohort of Swedish military recruits aged ~18 years (n = 216) recruited in 2000–2001. Linear regression analyses estimated mean differences with 95% confidence intervals (CIs) between the two cohorts, with abstinence time (five categories), smoking status and BMI included in the models as potential confounders. No significant changes were found between 2000–2001 data and 2008–2010 data in sperm concentration (78 × 106/ml versus 82 × 106/ml; P = 0.54), semen volume (3.1 ml versus 3.0 ml; P = 0.26) or total sperm count (220 × 106 versus 250 × 106; P = 0.18). The proportion of progressively motile spermatozoa also remained unchanged.

A separate Finnish study74 in 858 volunteer young men (participation rate 13.4%) during 1998–2006 examined temporal trends in semen quality, using lists of Finnish young men who were required to attend a medical examination when they were 18–19 years old, irrespective of whether they were fit for military service. Participants had to live in the Turku area and their mothers had to be born in Finland. Semen samples were assessed using standardized methodology by a single technician across all study years. Temporal trends according to investigation period or birth cohort were tested by linear regressions adjusted for several confounders. Results showed a decrease in sperm concentration, total sperm count and percentage normal morphology compared with earlier time periods (P = 0.02, P = 0.03 and P = 0.03, respectively).
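The role of confounder adjustment in trend analyses of this kind can be illustrated with a minimal sketch. The data below are entirely hypothetical and noise-free (they are not taken from any study discussed here), the single confounder (abstinence time) is an assumption chosen for illustration, and the `ols` helper is a simple normal-equations solver rather than the software used by any of the cited authors. The sketch shows how a crude regression on calendar year can point in the opposite direction to the adjusted trend when a covariate drifts over the study period.

```python
# Hypothetical, noise-free data: none of these numbers come from the studies above.
years = list(range(2000, 2010))
abstinence_days = [2, 3, 2, 4, 3, 5, 4, 6, 5, 6]  # assumed to drift upward over time

# Assumed data-generating rule (illustration only): concentration falls by 0.5 per
# calendar year and rises by 2 per additional day of abstinence.
concentration = [60 - 0.5 * (yr - 2000) + 2 * d
                 for yr, d in zip(years, abstinence_days)]


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    n, k = len(rows), len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(rows[r][i] * rows[r][j] for r in range(n)) for j in range(k)]
         + [sum(rows[r][i] * y[r] for r in range(n))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


t = [yr - 2000 for yr in years]  # centre the time axis on the first study year

# Crude model: intercept + year
crude = ols([[1.0, float(ti)] for ti in t], concentration)
# Adjusted model: intercept + year + abstinence time
adjusted = ols([[1.0, float(ti), float(d)] for ti, d in zip(t, abstinence_days)],
               concentration)

crude_slope = crude[1]
adjusted_slope = adjusted[1]
print(f"crude slope per year:    {crude_slope:+.3f}")
print(f"adjusted slope per year: {adjusted_slope:+.3f}")
```

In this constructed example the crude slope is positive whereas the adjusted slope recovers the assumed decline of 0.5 per year, which is why the choice and completeness of covariates matters when comparing trends across studies.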

A similar cross-sectional study75 examined temporal changes in semen data in 4,867 Danish military conscripts with a median age of 19 years from 1996 to 2010. Inclusion criteria were place of residence in the Copenhagen area and that both the man and his mother had to be born and raised in Denmark. Seminal volume, sperm concentration, total sperm count, sperm motility and sperm morphology were assessed using a standardized method and study participants were divided into three groups according to the investigation period: 1996–2000, 2001–2005 and 2006–2010, with temporal trends tested by linear regression adjusted for confounders. Over the 15 years, median sperm concentration increased from 43 × 106/ml in 1996–2000 to 48 × 106/ml in 2006–2010 (P = 0.02) and total sperm count from 132 × 106 to 151 × 106 (P = 0.001) in the same periods. The median percentage of motile spermatozoa and abnormal spermatozoa were 68% and 93%, respectively, and did not change during the study period. However, the authors highlighted that the seminal volume, sperm concentration, total sperm count, total number of morphologically normal spermatozoa and percentage of normal spermatozoa were all lower in this group of young men (all P < 0.0005), than in a previously examined group of 349 fertile Danish men who had a median age of 31 years42.

This Danish study was later extended to a total of >6,000 young Danish men recruited from the same population of military conscripts, using the same protocol and methodology, during a 21-year study period (1996–2016)60. Overall, no major changes were seen in adjusted semen data except for percentage motility, which significantly increased (P < 0.001) between 1996 and 2016. Differences in semen parameters over the study period were small and similar in unadjusted and adjusted models.

A US cross-sectional study76 included men aged 18–56 years from couples seeking infertility treatment at the Massachusetts General Hospital between 2000 and 2017. The primary aim of the analysis was to identify environmental determinants of fertility, but the design of the study enabled examination of temporal trends in semen quality. Of note, semen quality did not differ between men who ultimately enrolled in the study and those who did not. The final study sample included 936 men who provided a total of 1,618 semen samples, and sperm concentration and motility were assessed using a computer-aided semen analyser with quality control and monitoring from andrologists who were trained in semen analysis. A multivariable generalized linear mixed model was used to estimate the differences in semen parameters, adjusting for abstinence time. Sperm concentration, total sperm count, percentage motility and percentage morphologically normal sperm decreased significantly over the study period: sperm concentration and total sperm count declined by 2.6% per year and 3.1% per year, respectively, corresponding to an overall decline of 37% and 42%, respectively, between 2000 and 2017. Trends towards a decrease were also observed for percentage motility and morphologically normal spermatozoa, with percentage declines of 15% and 16%, respectively, over the 17-year study period. Seminal volume remained stable over the study period. Of note, this particular study in male partners from infertile couples differed from other studies in the field as it included certain physical and reproductive factors as well as a set of data on environmental exposure parameters such as urinary concentrations of bisphenol A, parabens and phthalates. 
Interestingly, the negative temporal trends were found to be attenuated when examining the simultaneous changes in reproductive characteristics and urinary phthalates during the study, but, unfortunately, the lack of data on all potential predictors in all study participants during the study period prevented simultaneous evaluation of the possible combined role of all potential contributors to semen quality trends.
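The relationship between the annual and overall declines reported for the US study can be checked with a short calculation. The following is a minimal sketch, assuming that the annual percentage decline compounds over the 17 years between 2000 and 2017; the function name `cumulative_decline` is introduced here for illustration, and the original authors' exact model-based calculation might differ.

```python
def cumulative_decline(annual_rate_pct, years, compound=True):
    """Overall percentage decline implied by a constant annual percentage decline.

    compound=True  -> each year's decline applies to the previous year's value.
    compound=False -> each year's decline applies to the baseline value (simple sum).
    """
    rate = annual_rate_pct / 100.0
    if compound:
        return 100.0 * (1.0 - (1.0 - rate) ** years)
    return 100.0 * rate * years


STUDY_YEARS = 2017 - 2000  # 17-year study period stated in the text

# Annual declines reported in the text (percent per year)
reported_annual = {"sperm concentration": 2.6, "total sperm count": 3.1}

overall = {
    name: cumulative_decline(rate, STUDY_YEARS)
    for name, rate in reported_annual.items()
}

for name, value in overall.items():
    print(f"{name}: about {value:.0f}% overall decline under compounding")
```

Under this compounding assumption the implied overall declines are about 36% and 41%, close to the 37% and 42% reported, whereas simply multiplying the annual rate by 17 would give noticeably larger values.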

Strengths and limitations of these studies

The principal strength of most of these repeated cross-sectional studies is their study population; most are based on young military recruits. In practice, these studies benefited from access to lists of young men invited to undergo a physical examination for military service, who are considered to represent the general population. However, participation rates of such men in these studies are quite low (typically <25%), as is frequently the case for volunteers. Thus, the question arises as to whether these populations can still be considered representative of the general population54,61,77. Volunteer military conscripts are unlikely to have prior knowledge of their fertility potential, which means that this is unlikely to be the main determinant for their participation, which is essential to avoid introducing selection bias. Assessment of testosterone levels in men who agree or refuse to voluntarily give a semen sample has been proposed as a means of assessing possible participation bias78, but similar concentrations of testosterone have been reported in volunteer military recruits agreeing or refusing to give a semen sample, suggesting no or minimal participation bias79. Of note, volunteers for research studies are often slightly better educated than those who do not volunteer80; although educational level itself is not directly relevant to sperm parameters, it can be related to other factors, such as diet or smoking, which are likely to be relevant to sperm parameters.

The participation rates in studies of partners of pregnant women or occupational studies (32–54%21,81,82,83) are often higher than those reported for military recruits (typically <20%). For some studies, high participation rates might, at least partly, result from the home collection of semen samples, which have been reported to be of better quality than samples collected in a clinic or a laboratory56,57,58. In partners of pregnant women, agreement to provide a semen sample is not associated with age, socio-professional status, TTP, financial compensation or history of urogenital disease52. However, the possibility that social or reproductive history or sexual behaviour influences participation cannot be excluded.

Single-centre studies with retrospective data from individuals

Overall, 68 retrospective single-centre studies, using data from individuals, have been published across various geographical areas. Historically, most studies have come from the Western world (41% Europe, 13% North America), but subsequently, studies have been published from other parts of the world (21% Asia, 9% South America, 7% Middle East, 7% Oceania and 1% North Africa), although data are still lacking for Russia and Sub-Sahelian Africa.

North America

To date, nine single-centre retrospective studies using data from individuals have examined temporal trends in semen quality in the USA: three in sperm donation candidates with unknown fertility status, five in mixed populations and one in infertile men.

The first, pioneering study in the field2 was published in 1981 and examined temporal trends in semen quality from US sperm donors of unknown fertility status during 1973–1980. All potential and accepted donors were requested to collect a semen sample after 3 days of sexual abstinence, and samples were assessed using a standardized method throughout the study period. The study reported a temporal decrease in sperm concentration when comparing data from 1977–1980 with data from before 1977 (P < 0.05). By contrast, percentage motility remained remarkably constant over the years. Percentage of normal spermatozoa decreased significantly from 1977 to 1980 (P < 0.05). However, the trends reported in this study were questionable, as the study design mixed intra-individual and inter-individual data from accepted and rejected donors over the study period.

A subsequent study in potential sperm donors of unknown fertility status from Wisconsin over a 10-year period (1978–1987)84 failed to detect any significant change over time for sperm concentration and percentage motility but the methodology of this study was poorly described, rendering the data unreliable. However, a separate study of 1,283 men who banked sperm before vasectomy in US sperm banks in Roseville, New York and Los Angeles over a 25-year period (1970 to 1994)44 showed a slight, but significant, increase in mean sperm concentration for the total population (P = 0.04) as well as by individual centre in New York (r = 0.15, P = 0.002) and Roseville (r = 0.11, P = 0.006) but not in Los Angeles (r = 0.003, P = 0.06) after controlling for age and abstinence. No change in motility and a slight decrease in seminal volume (r = −0.07, P = 0.001) were found for the total population of the three centres over the 25-year period.

Semen data from 510 healthy adult men in Seattle and Tacoma, Washington area, were analysed between 1972 and 1993 (ref.85). Sperm concentration was measured by Coulter counter with a validated method and serial samples were collected from each individual, usually at 2-week intervals. Linear regression of mean sperm concentrations indicated a slight increase with time (P = 0.014) as well as slight, but significant, increases in seminal volume, total sperm count and percentage of normal spermatozoa.

Similarly, a retrospective study of 551 semen analysis records reported trends in semen characteristics in New England from 1972 to 1993 (ref.86). After age adjustment, sperm concentration showed a small upward trend of 0.2 × 106/ml per year (P < 0.01), and the authors also reported a 2.3% per year increase in percentage sperm motility and a 0.3% per year decrease in morphologically normal spermatozoa; however, no P values were reported.

By contrast, a decrease in sperm concentration was reported in a study of semen data for all men who applied to be a sperm donor in the Boston metropolitan area during 2003–2013 (ref.87). A total of 489 young adult men and 9,425 specimens were included in the analysis; specimens were collected by masturbation in a private room at the facility and were analysed using a standardized methodology. A general linear mixed model was used to evaluate the yearly trends, showing a statistically significant decrease in sperm concentration (−3.6 × 106/ml per year; 95% CI −4.9 to −2.2; P < 0.001) and total sperm count (−11 × 106 per year; 95% CI −16.0 to −5.5; P < 0.001), as well as a significant decrease in percentage motility of −1.2% per year (95% CI −1.7 to −0.8). When analysed according to the individual’s year of birth, statistically significant declines were also seen in sperm concentration (β = −1.1, 95% CI −1.6 to −0.7; P < 0.0001), total sperm count (β = −3.6, 95% CI −5.7 to −1.5; P = 0.0008) and motility (β = −0.2, 95% CI −0.4 to −0.07; P = 0.005), suggesting a possible decrease in sperm quality in association with both birth cohort and time period.

The temporal trend in total motile sperm count (TMSC) was evaluated using semen analyses of 119,972 subfertile men who presented to selected infertility centres in New Jersey in the USA and Valencia in Spain between 2002 and 2017 (ref.88). Semen analyses were categorized into three clinically relevant groups (group 1: TMSC >15 × 106; group 2: TMSC 5–15 × 106; group 3: TMSC <5 × 106), and relationships between male age, TMSC, trend and TMSC group by year were assessed. Overall, the proportion of men in group 1 was found to have declined approximately 10% over the past 16 years in the analysis that combined data from both centres. However, the choice to separate men into three groups is questionable, and the authors acknowledged that several unknown factors might have influenced the findings.
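The three-group categorization described above can be expressed as a short sketch. The thresholds are those quoted in the text; the formula used for TMSC (seminal volume × sperm concentration × fraction of motile spermatozoa) is the conventional definition and is an assumption here, as the text does not state how the cited study computed it. The function names are hypothetical.

```python
def total_motile_sperm_count(volume_ml, concentration_million_per_ml, motile_pct):
    """TMSC in millions, assuming the conventional definition:
    volume (ml) x concentration (10^6/ml) x fraction of motile spermatozoa."""
    return volume_ml * concentration_million_per_ml * motile_pct / 100.0


def tmsc_group(tmsc_million):
    """Assign the group described in the text from a TMSC value in millions.

    group 1: TMSC > 15; group 2: TMSC 5-15 (inclusive); group 3: TMSC < 5.
    """
    if tmsc_million > 15:
        return 1
    if tmsc_million >= 5:
        return 2
    return 3


# Hypothetical example values, not taken from the cited study
example_tmsc = total_motile_sperm_count(volume_ml=3.0,
                                        concentration_million_per_ml=20.0,
                                        motile_pct=50.0)
example_group = tmsc_group(example_tmsc)
print(f"TMSC = {example_tmsc:.1f} million, group {example_group}")
```

Because the grouping collapses a continuous measure into three bins, a man whose TMSC moves from just above to just below a threshold changes group even if the absolute change is small, which is one reason why trends in group proportions should be interpreted with caution.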

South America

The first published study on semen trends in a non-Western country came from Venezuela89. Semen volume and sperm concentrations of 2,313 men from infertile couples from Merida between 1981 and 1995 were categorized in four groups according to sperm count. The frequency of azoospermia and oligozoospermia did not change over the 15 years of study. However, when an analysis of mean sperm concentrations was made in each group separately, a significant decrease was seen in men with high sperm counts (>200 × 106/ml) (P < 0.05) and a significant increase in men with sperm counts between 20 and 200 × 106/ml (P < 0.01).

Excluding the azoospermic group, the analysis of pooled data did not show a significant change in the mean sperm concentration through time.

A study in Sao Paulo, Brazil, analysed semen data from 182 sperm donors during 1992–2003 (ref.90). Semen analyses were performed by the same three laboratory technicians during the whole 10-year period, and the same laboratory methods were used to perform the semen analysis. Using multiple linear regression to evaluate the relationship between the year of semen collection and each seminal parameter controlling for potential confounders, sperm concentration was found to decrease (P < 0.0001) as did percentage normal sperm morphology (P < 0.0001) regardless of whether the semen sample analysed was the first or second donated sample. The seminal volume showed a slight increase (P = 0.038), whereas percentage motility did not change (P = 0.38). A second study in Sao Paulo91 reviewed semen data from 2,300 male partners from subfertile couples attending an assisted fertilization centre during 2000–2002 (n = 764) and 2010–2012 (n = 1,536). In this study, mean sperm concentration decreased significantly from 62 × 106/ml in 2000–2002 to 27 × 106/ml in 2010–2012 (P < 0.001). Mean total sperm count also decreased significantly over the same time period from 183 × 106 to 83 × 106 (P < 0.001) as did the mean percentage of morphologically normal spermatozoa, from 4.6% to 2.7% (P < 0.001). In addition, the incidence of severe oligozoospermia and azoospermia significantly increased from 16% to 30% (P < 0.001) and 4.9% to 8.5% (P = 0.001), respectively.

Also in Sao Paulo, semen data from 23,504 infertile men were evaluated over 7.5 years from 2010 to 2017 according to WHO 2010 guidelines17,92. A decreasing trend of 0.05 ml in seminal volume was observed over the period, alongside a tendency towards reduction in sperm concentration by 1 × 106/ml over the 7.5 years (mean of 34.3 × 106/ml). Over the entire study period, percentage sperm motility decreased by 0.7% (mean, 47.3%) and the percentage of morphologically normal spermatozoa decreased by 0.33% (mean, 2.8%), although no P values were reported.

A 2020 study reported temporal trends in semen characteristics in men admitted for infertility testing between 1995 and 2018 at Campinas University93. Only the first semen sample collected for each man was analysed (n = 9,267), and the data were analysed using linear regression for the median values. In line with the previous study, a significant decrease was observed overall in the total motile sperm count (−2.8 × 106 per year, P < 0.001) and in the median percentage of normal spermatozoa (−0.52% each year, P < 0.001).

A Uruguayan study collected semen data from 317 healthy sperm donor candidates in Montevideo between 1988 and 2019 (ref.94). Semen samples were obtained by masturbation after 3–5 days of sexual abstinence and analysed according to the WHO 1980 and 2010 guidelines13,17, before linear regression and multiple regression analyses were used to calculate changes in sperm concentration and total sperm count per year. Similarly to the Brazilian data, sperm concentration decreased significantly over the 30 years by 0.9 × 106/ml per year, but total sperm count was unchanged (P = 0.11)94. A significant change was also seen in percentage normal morphology over the study duration, but the other semen characteristics remained unchanged.

Scandinavia

In 1984, a study in 185 men from Malmö, Sweden95 examined semen quality in 1980–1981 and compared these data with semen analyses of age-matched control men from 1960–1961. By comparison with the earlier data, mean seminal volume and sperm concentration decreased from 3.8 ml in 1960–1961 to 3.4 ml in 1980–1981 (P < 0.05), and from 125 × 106/ml to 78 × 106/ml (P < 0.001), respectively, suggesting a decrease in semen parameters over the 20 years between the sample collections.

A subsequent study from Stockholm96 compared temporal changes in semen data of partners in infertile couples recorded in 1956 (n = 141), 1966 (n = 201), 1976–1979 (n = 219) and 1986 (n = 224) excluding azoospermia samples. In accordance with the previous work, mean total sperm count decreased from 467 × 106 in 1956 to 305 × 106 in 1986 (P < 0.0001), and percentage of morphologically normal spermatozoa also decreased, from 53% in 1956 to 37% in 1986 (P < 0.0001).

Considering the 1985–1995 period, a study from Lund, Sweden97, investigated semen quality in 718 male partners of infertile couples. Time-related changes were analysed using linear regression. In contrast to the previous Swedish studies, this analysis showed a significant increase in mean sperm concentration from 46 × 106/ml in 1985 to 64 × 106/ml in 1995 (P < 0.001); mean percentage of morphologically normal spermatozoa also increased from 58% to 66% (P < 0.001). Mean total sperm count did not change, whereas mean seminal volume decreased significantly during this period, from 3.6 ml to 2.7 ml (P = 0.002).

A study of 5,481 Finnish men from infertile couples in Turku98 examined changes in sperm count during a 28-year period, 1967–1994. Mean semen volume, sperm concentration and total sperm count in normal men were 3.3 ml, 134 × 106/ml and 397 × 106, respectively; multiple linear regression analysis revealed a significant decrease in semen volume (P < 0.001), whereas sperm concentration and total sperm count did not change. Of note, no change in sperm count was associated with the men’s year of birth.

In Denmark, a study of 1,055 men born between 1950 and 1970 in Odense99 reviewed semen data at the time of their female partner’s first IVF cycle between 1990 and 1996. These men were assumed to represent a random sample of the Danish male population of fertile age. Semen analyses were performed by the same six technicians using the same counting chambers throughout the study period, minimizing both the intra-assay and inter-assay variations. Mean sperm concentration was 183.7 × 106/ml and mean semen volume was 3.9 ml but, although considerable variation in both parameters was found from year to year, no significant change occurred in either parameter throughout the entire period. When men were stratified according to their birth year, a later year of birth was not associated with any change in sperm concentration or semen volume.

Another study from Denmark investigated whether semen quality changed between 1977 and 1995 in a group of 1,927 unselected semen donor candidates from Copenhagen47. Donors were recruited through advertisements in student periodicals and had to be between 18 and 35 years old, but no other selection criteria were specified. Multiple linear regression analysis using year, sexual abstinence and season as covariates, showed a significant increase in mean sperm concentration from 53 × 106/ml in 1977 to 72.7 × 106/ml in 1995 (P < 0.0001) and in mean total sperm count from 166 × 106 to 228 × 106 (P < 0.0001) and these data showed significant variation between seasons (P < 0.0001 for both parameters). However, the authors indicated that they were unable to control for variation in donor age and, therefore, cannot exclude the possibility of selection bias, whereby participants were accepted as donors by other semen donor services in Copenhagen.

In Norway, potential secular trends in semen characteristics from men of Bergen were assessed according to previous or subsequent paternity during the period 1975–1994 (ref.100). Samples were collected from men under investigation for infertility: 1,108 men who had fathered at least one child before the analysis, 1,786 men who had at least one child after the analysis and 2,286 men with no children registered. When analysed by year of evaluation, registered childless men had a significant decrease in sperm concentration (P < 0.015) and total sperm count (P < 0.001) over the study period. Likewise, the group with subsequent children had a significant temporal decrease in sperm concentration (P < 0.015) and total sperm count (P < 0.047), whereas no significant changes were found for the group with previous children. Analysed by year of birth, a significant decrease in sperm concentration (P < 0.025) and total sperm count (P < 0.003) was found for the childless group and for the group with subsequent children (P = 0.012 and P = 0.015, respectively). Otherwise, no significant trends were found.

In a separate study, semen analysis records were studied for all men (n = 5,739) who attended the fertility clinic of Tromsø from 1993 to 2012 (ref.101). Semen samples from men who all resided in the Northern region of Norway were assessed following WHO 1987, 1992 and 2010 recommendations14,15,17. Using multiple regression models accounting for the effect of men’s age and calendar year on semen characteristics, a gradually decreasing trend of mean total sperm count per ejaculate was observed during the study period (P < 0.001), and mean sperm concentration and seminal volume were also found to significantly decrease.
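Adjusting a time trend for age matters when the age of the men attending a clinic drifts over the study period. The sketch below is an illustration only, not the analysis used in the Tromsø study: it fits an ordinary least-squares model with and without age as a covariate on synthetic, noise-free data. All numbers, the data-generating rule and the helper names solve and ols are assumptions made for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic data (assumption): total count falls by 2 units per calendar year
# and by 1.5 units per year of age, while attendees get older over time.
records = []
for year in range(1993, 2013):
    t = year - 1993
    for age in (28 + 0.25 * t, 33 + 0.25 * t, 38 + 0.25 * t):
        count = 300.0 - 2.0 * t - 1.5 * (age - 30.0)
        records.append((t, age, count))

y = [c for _, _, c in records]
crude = ols([[1.0, t] for t, _, _ in records], y)            # year only
adjusted = ols([[1.0, t, age] for t, age, _ in records], y)  # year + age

print(f"crude slope per year:    {crude[1]:.3f}")
print(f"adjusted slope per year: {adjusted[1]:.3f}")
```

In this constructed example the crude yearly slope is steeper than the age-adjusted one, because part of the apparent calendar-year effect is carried by the ageing of the attendees.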

Germany

The first German study to assess sperm parameters102 was published in 1997 and included 187 young male volunteers from Munster, who were recruited via bulletin boards in universities and local newspapers. Samples were collected by masturbation after a requested period of abstinence ranging from 2 to 7 days and analysis was performed as recommended in the WHO manual (1980, 1987, 1992)13,14,15. In this study, no obvious trend over time was observed for sperm concentration, total sperm count or total motility.

By contrast, a subsequent study103 investigated mean sperm concentration and motility of 5,149 men in Magdeburg from 1974 to 1994. The laboratory methods used and the criteria applied to analyse sperm count and motility did not change during this 20-year period and participants were not preselected. Between 1974 and 1976 the mean sperm concentration was 48 × 106/ml, decreasing by 2.1% per year to 26 × 106/ml between 1992 and 1994 (P < 0.001). Likewise, the mean percentage of motile spermatozoa decreased from 38% to 22% and the mean percentage of morphologically normal spermatozoa from 64% to 42% in the same period (both, P < 0.001).

A later study from Leipzig, which assessed characteristics of the first semen specimen obtained from 3,432 patients aged 24–35 years who had attended the Department of Andrology during 1975–2000, showed mixed temporal trends in sperm parameters104. Notably, the population studied was characterized by very low geographical mobility and relocation because of the social and political situation in East Germany. Semen analyses were performed using a standardized method that remained unchanged during the study period. No changes in sperm count or percentage motility were found when analysed by year of semen analysis or age at time of examination; by contrast, sperm concentration and total sperm count showed a significant negative correlation with the year of birth between 1958 and 1968 (both, P < 0.01).

Scotland

A 1996 study of men in Scotland provided early evidence of deteriorating semen quality, using semen data from 577 men from a sperm donation programme in Edinburgh between 1984 and 1995 (ref.63). All samples were analysed in one laboratory according to a standardized method, and relationships between variables were examined using linear and stepwise multiple linear regression. In addition, donors were divided into four roughly equal cohorts of 5 years according to year of birth. Ejaculate volume did not correlate with either year of birth or age at donation. By contrast, sperm concentration decreased by 2.1% per year and total sperm count by 2.0% per year. Overall, motility was weakly positively correlated with a later year of birth, increasing by 0.18% per year. No relationship was observed between the year of donation and any measures of semen quality except overall motility, which increased by 1.2% per year. The median sperm concentration (× 106/ml) among donors born in the 1950s was 98, falling to 78 among those born in the 1970s (P = 0.002). The overall percentage of motile sperm did not show any change from the 1950s’ to the 1970s’ birth cohorts.

A second study in the northeast of Scotland — an area where migration rates are low and where andrology services for a population of 500,000 are centralized — examined population-based trends in semen quality between 1994 and 2005 in a cohort of 4,832 men with a sperm concentration of >20 × 106/ml attending for routine semen analysis at the Aberdeen Fertility Centre105. Data adjusted for age and period of abstinence showed a decreasing trend in sperm concentration during the study period (P = 0.017), but no such trend was seen in sperm motility or motile density (total count of motile spermatozoa (millions/ejaculate)). The authors indicated that this trend should be interpreted with caution owing to fluctuations in semen parameters, population bias and the retrospective nature of the analysis.

Belgium

A 1996 study reviewed semen data from 416 candidate sperm donors at University Hospital, Ghent during 1977–1995 (ref.106). The men were recruited through advertising in local journals and student periodicals and most were students or paramedical personnel who had not fathered any children. Semen was analysed using conventional techniques described in the WHO Laboratory Manual (1987)14, and most analyses were performed by the same technician using the same method. A slight, but not significant, decrease in sperm concentration was observed (P = 0.08), alongside a slight, nonsignificant linear increase in ejaculate volume over time (P < 0.06), whereas the total sperm count did not change. Percentage sperm motility and percentage normal morphology decreased significantly (r = −0.42, P < 0.0001 and r = −0.23, P < 0.0001, respectively).

A 2021 study analysed semen data from 439 candidate donors in Antwerp over a 23-year period (1995–2017)107. Over the entire study period, a temporal decrease was observed only for normal sperm morphology (P < 0.0001), whereas all other parameters remained largely unchanged. The mean clinical pregnancy rate per effective donor recruited (n = 104) did not change according to year of donation, as the donors recruited had normal sperm parameters.

France

A 1995 study analysed 1,351 candidate semen donors in a Parisian university sperm bank between 1973 and 1992 to investigate a possible temporal trend108. The men were all healthy, unpaid volunteers who had previously fathered at least one child and all samples were assessed following a standardized methodology. Ejaculate volume did not change during the study period. However, sperm concentration decreased by 2.1% per year, from 89 × 106/ml in 1973 to 60 × 106/ml in 1992. The percentages of motile and normal spermatozoa decreased by 0.6% and 0.5% per year, respectively (P < 0.001 for both). In addition, multiple regression analyses after adjustment for age and sexual abstinence revealed a 2.6% yearly decline in sperm concentration and a 0.3% and 0.7% yearly decline in the percentages of motile and normal spermatozoa, respectively, associated with each successive calendar year of birth (all P < 0.001).
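A per-year percentage decline can be read either as a linear rate relative to the starting value or as a compound rate. As a rough illustration only (the 2.1% figure above came from the study's own regression, not from this calculation), the sketch below converts the two reported endpoint means into both kinds of rate. The function names are hypothetical.

```python
def linear_annual_change(start, end, years):
    """Average yearly change as a percentage of the starting value."""
    return (end - start) / start / years * 100.0


def compound_annual_change(start, end, years):
    """Constant yearly percentage change that takes start to end."""
    return ((end / start) ** (1.0 / years) - 1.0) * 100.0


# Endpoint means reported for the Paris donors, in 10^6/ml: 89 (1973), 60 (1992).
start, end, years = 89.0, 60.0, 1992 - 1973

linear = linear_annual_change(start, end, years)
compound = compound_annual_change(start, end, years)

print(f"linear:   {linear:.2f}% per year")
print(f"compound: {compound:.2f}% per year")
```

For these endpoints the compound rate is close to 2% per year and the linear rate is somewhat smaller, so annual rates from different studies are comparable only if they were derived in the same way.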

A separate study that used similar recruitment modalities and procedures for semen assessment was subsequently carried out in Toulouse109. The study assessed the first ejaculate from 302 fertile candidates for sperm donation whose semen was collected between 1977 and 1992. Linear regression analysis between sperm count and year of donation, adjusted for donor age, did not reveal any changes in this variable (r = 0.09, P > 0.05).

Another French study also reported temporal trends in semen quality in 1,114 fertile male candidates for sperm donation from the sperm bank in Tours between 1976 and 2009 (ref.110). Only the first semen sample was taken into account, and semen was assessed according to 1980, 1987, 1992 and 1999 WHO guidelines13,14,15,16. A weak decline in total sperm count (r = −0.12, P < 0.0001) was observed, as well as a decline in percentage motility (r = −0.45, P < 0.0001). The results for the percentages of normal spermatozoa and the location of morphological defects were split into two periods, 1976–1997 and 1998–2009, as the method for assessing sperm defects was modified in 1997. Analysis showed a decrease in the percentage of normal spermatozoa from 1976 to 1997 (mean decrease of 2.9% per year; r = −0.69, P < 0.0001), followed by a more stable rate of −0.7% per year in 1998–2009 (r = −0.24, P < 0.0001).

A large 2012 study assessed temporal trends in semen quality in 10,932 infertile men who underwent infertility work-up in a university laboratory in Marseille during 1988–2007, with semen samples obtained after 3–6 days of sexual abstinence111. The mean seminal volume did not change over the 20-year study period. However, decreases in adjusted mean sperm concentration and mean total sperm count were observed over the study period (from 74 to 57 × 106/ml, and from 232 to 166 × 106, respectively, both P < 0.001); thus, these two characteristics decreased by 1.5% and 1.6% per year, respectively. The mean percentage of motile spermatozoa declined from 1988 to 2007 (from 57% to 52%, P = 0.008) and the percentage of spermatozoa with normal morphology declined between 1988 and 2002 from 43% to 35% (P < 0.001) with a decrease rate of 2.2% per year.

Austria

A 2005 study examined 7,780 semen samples collected by masturbation at home between 1986 and 2003 and analysed at the Andrology Clinic of the Medical University of Vienna112. Semen analyses were performed by two trained technicians according to 1987, 1992 and 1999 WHO guidelines14,15,16, with sperm concentration and percentage motility measured using computer-aided sperm analysis (CASA). An overall decline in sperm concentration was observed during the study period (P = 0.0001); by contrast, the percentage of motile sperm and the percentage of morphologically normal spermatozoa increased during this period (P = 0.001 and P = 0.0001, respectively).

Slovenia

A Slovenian study assessed semen data from 2,343 men who were partners of women with tubal infertility and who were included in an IVF–embryo transfer (ET) programme at a university hospital in Ljubljana from 1983 to 1996 (ref.113). To avoid bias due to the increasing proportion of IVF–ET procedures performed for male factor subfertility, only the population of normozoospermic men was studied. Whole population data were analysed, alongside four subgroups of men according to their year of birth, and all semen samples were analysed in the same laboratory according to standardized methods throughout the study period by the same four technicians. The mean volume of seminal fluid did not change significantly during the study period; however, the year of birth influenced semen volume, which increased by 0.018 ml per year. The mean sperm concentration of 81.1 × 106/ml in the study population did not change significantly with time and total sperm count (mean of 273 × 106) did not decrease significantly during the study period. However, percentage sperm motility (which was only analysed from 1988) decreased by 0.94% every year overall, but was also affected by the year of birth of the men (0.13% increase per year). The authors suggested that the dramatic political events between 1987 and 1994 might have induced stress known to alter sperm motility.

Spain

A large retrospective study in 20,411 infertile men assessed changes in semen quality in Barcelona, Spain between 1960 and 1996 (ref.114). Multiple linear regression models were used to assess the effect of independent variables on semen characteristics revealing a 0.2% (P < 0.001) yearly decline in semen volume. No significant changes were seen in sperm concentration and total sperm count over the 36-year period, but the percentage of motile spermatozoa increased significantly by 0.4% during this period (P < 0.001) and the percentage of normal spermatozoa declined significantly in the same period (P < 0.001).

A subsequent population-based study assessed possible temporal trends in semen quality over the previous 30 years (1978–2007) in Salamanca, Spain115. Semen data from 612 consecutive healthy individuals with normospermia attending a single andrology unit for andrological evaluation were analysed by a single highly experienced technician. In this study, seminal volume and sperm counts were found to decrease, whereas percentage sperm motility increased over this period.

A study of semen data in southern Spain pooled data from 488 university students aged 18–23 years in the Murcia region during 2010–2011 with semen data from a previous study116 in Almeria during 2001–2002 to analyse temporal trends in semen quality for this Spanish region42. Semen samples were assessed following standardized procedures and multiple linear regression analyses controlling for appropriate covariates assessed a year-of-birth effect over the combined study period (2001–2011). Notably, sperm concentration and total sperm count were significantly lower in Murcia study participants than in the men from Almeria; however, other semen variables did not differ significantly between the two cities. Even so, adjusted sperm concentration and total sperm count on pooled samples from the two cities declined significantly with year of birth (β = −0.04 and β = −0.06, respectively, both P < 0.01), whereas no temporal trend was found by year of birth for sperm motility or morphology.

Finally, semen data from 992 vasectomy candidates who had fathered at least two children in a single centre in Madrid over three decades (1985–2009) were evaluated in a 2015 study117. Semen samples were analysed using a standardized procedure, and cryopreserved before surgery. Semen characteristics were analysed for the periods 1985–1990, 1990–2000 and 2000–2009. All parameters showed a decline over the study periods: the corresponding mean sperm concentrations were 27.7, 20.7 and 20.1 × 106/ml, respectively (P < 0.0001); mean percentage motility for each period was 53.2%, 47.2% and 40.6%, respectively (P < 0.0001); the mean percentages of morphologically normal spermatozoa for each successive period were 67.7%, 58.8% and 51.0%, respectively (P < 0.0001). Multivariate analysis revealed significant decreasing trends of sperm concentration, progressive motility and the percentage of morphologically normal spermatozoa (P < 0.01 for all).

Italy

A 1996 retrospective study analysed 20-year trends in semen data from 4,518 infertile Italian men from 1975 to 1994 (ref.118). Semen samples were analysed by the same two technicians over the 20-year period and participants were divided into three groups according to year of sample collection: 1975–1979 (n = 1,492), 1983–1986 (n = 1,506) and 1991–1994 (n = 1,520). Over the study period, mean seminal volume slightly decreased from 3.2 to 2.9 ml, mean sperm concentration decreased from 72 to 65 × 106/ml and percentage motility decreased from 50% to 32%. P values were not reported and no adjustment factors were considered, preventing conclusions based on these findings.

In a subsequent study from 2021, changes in semen characteristics were assessed in men at an andrology reference centre in Catania (Sicily) during 2011–2020 (ref.119). During this 10-year period, 1,409 semen analysis reports were randomly selected, and data on sperm concentration, total sperm count, percentage sperm motility and percentage of normal spermatozoa were analysed. Slight but nonsignificant declines were observed in total sperm count (−2.3 × 106 per year, P = 0.07) and in the percentage of spermatozoa with normal morphology (−0.08% per year, P = 0.06). By contrast, the mean percentage sperm motility significantly increased (+0.28% per year, P = 0.008).

Greece

A 1996 study reported temporal trends in semen quality in men living permanently in the Greater Athens area over the period 1977–1993 (ref.120), using retrospective analysis of records from three andrology laboratories using the same method for semen evaluation. Of 23,850 men being assessed for couple subfertility, 2,385 (10%) were randomly selected. Mean total sperm count decreased from 154 × 106 to 130 × 106 (P < 0.01), and multiple regression analysis of seminal volume and total sperm count adjusted for age and year of assessment revealed a significant decline in both characteristics over the 17 years of the study (P < 0.05 and P < 0.0001, respectively).

Africa

The only study of semen quality trends in Africa came from Sfax, southern Tunisia121 and investigated temporal trends in semen characteristics for the period 1996–2007 in 1,835 men from infertile couples. Semen analysis was performed by a specially trained laboratory technician according to the standardized procedures recommended by 1992 WHO guidelines15. Linear regression analysis of semen data adjusted for age and sexual abstinence revealed a decrease in mean total sperm count of 6.0 × 106 spermatozoa per year, from 328 × 106 in 1996 to 260 × 106 in 2007 (P = 0.0004) and a decline in the percentage of normal sperm by 2.6% per year, from 43% to 17% (P < 0.0001). During the same period, the mean percentage motility and seminal volume did not change.

Israel

Changes in semen characteristics in 188 young healthy sperm donors were measured between 1980 and 1995 in a sperm bank in Jerusalem122, using linear regression analysis to assess the changes in semen characteristics. This study did not reveal any significant changes in sperm concentration or percentage motility during the study period; however, the mean semen volume increased by 0.1 ml (5.1%) per year (P < 0.0001) and the percentage of morphologically normal spermatozoa decreased by a mean of 1.04% per year during the entire period (P < 0.0001).

A second study, also in Jerusalem, investigated whether semen quality changed during 1990–1999 among infertile men123. Although the study claimed to have performed both cross-sectional and longitudinal analyses, this was a retrospective study based on semen data from 2,638 male partners in couples who underwent infertility treatment by intrauterine insemination. Linear regressions of each of the continuous outcome measures (count and motility) by year of examination indicated that total sperm count decreased by 5.2 × 106 (P < 0.0001) each year and percentage motility declined by 0.5% each year (P = 0.0003).

A third study in Jerusalem analysed temporal trends in sperm concentration and percentage motility using 2,812 semen samples collected on a weekly basis from 58 young, healthy, fertile, university-educated, paid donors in 1995–2009 (ref.124). Over the study period, the mean of all sperm characteristics studied declined, from 106 to 68 × 106/ml for mean sperm concentration (P < 0.0001), from 79% to 66% for sperm motility (P < 0.0001) and from 66.4 to 49 × 106 (P < 0.005) for total motile sperm count (TMSC) per ejaculate.

A study in Tel Aviv assessed trends in semen characteristics of 1,833 men who underwent semen analysis between 1991 and 2010 at the Andrology Laboratory at the Institute for the Study of Fertility125. The study group was heterogeneous in terms of age, place of residence, reason for referral, health status and aetiology of fertility concern. Mixed models analyses were used to describe changes in each characteristic as a function of follow-up time and the men were divided into three groups according to sperm concentration, overall sperm motility and percentage of normal sperm. Sperm concentration and motility declined significantly over time in the group that initially had the best sperm characteristics, that is, men whose semen values were originally normal according to the WHO 2010 criteria41 (P < 0.001). The percentage of morphologically normal spermatozoa was also significantly reduced over time in this group (P < 0.001).

Iran

A 2020 study is the only published work to have assessed temporal changes in semen quality among Iranian men, in this case men referred for infertility in Yazd between 1990 and 1992 (n = 707) and also between 2010 and 2012 (n = 1,108)126. During the study period, a significant increase in mean sperm concentration from 84 to 96 × 106/ml (P < 0.0001) was observed, as well as a decrease in the mean percentage of normal spermatozoa from 62% to 44% (P < 0.0001); no change was seen in seminal volume or percentage motility.

South Korea

A large retrospective study in 22,249 men presenting with infertility investigated whether semen quality changed in South Korea between 1989 and 1998 (ref.127). Interestingly, the mean sperm concentration was 60 × 106/ml for the entire study period and no significant changes in sperm concentration were found, and neither the seminal volume nor percentage sperm motility changed. Furthermore, no significant association was found between either age or year of birth and semen quality.

Japan

Changes in semen quality of Japanese men were reported in a 2001 study that involved normal healthy volunteers who lived in the Sapporo area in 1975–1980 (n = 254) and 1998 (n = 457)128. No change was observed in semen volume between 1975–1980 and 1998, nor did mean sperm concentration change significantly during the study period (70.9 × 106/ml in 1975–1980 versus 79.6 × 106/ml in 1998). Furthermore, rates of individuals with oligozoospermia and azoospermia were the same in both periods.

India

The first report to examine temporal trends in semen quality in India was based on semen data of men (n = 1,176) attending a fertility clinic (men with azoospermia or severe oligospermia were excluded) in New Delhi during 1990–2000 (ref.129). No significant decline in sperm counts was observed in any year during the entire study period. A separate study investigated temporal trends in semen quality in 7,770 South Indian men evaluated at a university infertility clinic in Manipal during 1993–2005 (ref.130). Semen samples were assessed according to the WHO 1992 and 1999 recommendations15,16. Unlike the New Delhi study, comparison of mean sperm concentration in 2004–2005 with 1993–1994 indicated a significant drop, from 38.2 × 106/ml to 26.6 × 106/ml. Changes were also observed for percentage sperm motility (47% versus 61%) and the percentage of morphologically normal spermatozoa (20% versus 41%), which both decreased. Regression analyses exploring the relationship between semen characteristics and time period showed an inverse (linear) association with sperm count (r = −0.14), motility (r = −0.20) and morphology (r = −0.58). Of note, no P values were reported.

A 2010 study investigated semen quality of male partners of couples attending an andrology laboratory for infertility-related problems in Kolkata, India, one of the most polluted cities in the world, and compared the periods 1981–1985 (n = 1,752) and 2001–2006 (n = 1,977)131. Only men with an initial sperm concentration of >20 × 106/ml were selected for the study, and analysis was carried out according to WHO 1980, 1987, 1992 and 1999 guidelines13,14,15,16. This study showed a significant decrease in mean seminal volume and mean percentage sperm motility in the 2000s compared with the 1980s (3.0 ml versus 2.7 ml and 65% versus 58%, respectively; P < 0.0001). By contrast, mean sperm concentration did not show any significant change between the two decades (84 versus 87 × 106/ml).

Bangladesh

Longitudinal changes in semen characteristics of 13,810 men aged 18–64 years who sought care for general sperm quality or updates on fertility status at an infertility clinic in Dhaka, Bangladesh between 2000 and 2016 have been investigated in a study that adjusted for age and duration of abstinence at testing132. Adjusted total motility declined by 20% from the maximum recorded values at the end of the study (P < 0.0001), whereas sperm concentration lacked clear trends and was unaffected by adjustment. Prevalence of azoospermia increased by 18% between the 2000–2010 and 2011–2016 participants.

China

Several studies have investigated semen trends in Chinese men.

A 2013 study assessed changes in semen quality in a population of 28,213 men aged 20–40 years who attended for fertility examination in Sichuan between 2007 and 2012 (ref.133). Semen analysis was performed according to the WHO 1999 criteria16. During the 5-year duration of the study, sperm concentration and the percentage of sperm with normal morphology decreased from 66 to 49 × 106/ml and from 14% to 5%, respectively.

A separate study, this time in Shandong province, assessed semen data from 5,210 sperm donors between 2008 and 2014 (ref.134). Semen analysis was performed according to WHO 1999 guidelines16, and statistical analyses controlled for appropriate covariates. Decreases in mean values were observed for semen volume, sperm concentration, percentage motility and total sperm count (R2 = 0.563, P = 0.052, β = −0.012; R2 = 0.848, P = 0.003, β = −0.032; R2 = 0.829, P = 0.004, β = −0.008; and R2 = 0.796, P = 0.007, β = −0.045, respectively), although the unadjusted trend for semen volume did not reach the conventional 0.05 significance threshold. Moreover, after adjustment for age, BMI, duration of abstinence and season, all of these variables (semen volume, sperm concentration, sperm forward motility and total sperm count) also showed a tendency to decrease with calendar year (β = −0.012, P < 0.001; β = −0.031, P < 0.001; β = −0.006, P < 0.001; and β = −0.045, P < 0.001, respectively).

Another Chinese study assessed semen data for a total of 30,636 young adult sperm donors at the Hunan Province Sperm Bank of China in 2001–2015, with all specimens assessed according to WHO 1999 recommendations16,135. Study participants were divided into three groups by investigation period: 2001–2005, 2006–2010 and 2011–2015. Sperm concentration, total sperm count, normal sperm morphology and progressive sperm motility declined significantly over the 15-year observation period. For example, median total sperm count decreased from 177 × 106 in 2001–2005 to 137 × 106 in 2006–2010 and 114 × 106 in 2011–2015. Median sperm concentration decreased from 64 × 106/ml (2001–2005) to 60 × 106/ml (2006–2010) and 50 × 106/ml (2011–2015). Similarly, median sperm motility decreased from 31% (2001–2005) to 24% (2006–2010) and finally 20% (2011–2015).

A study from Wuhan, central China, carried out over a similar time period136 also reported a significant decline in sperm concentration from a median value of 53.0 × 106/ml in 2010 to 45.0 × 106/ml in 2015. Total sperm count also decreased during the study period, by 3.76 ± 0.20 × 106 per year.

A 2019 retrospective cross-sectional study in Xiangya, Hunan, analysed semen data from 71,623 men with infertility between 2011 and 2017 (ref.137). The standard WHO 2010 guidelines for semen analysis17 were followed during the 7-year study period, and all semen analyses were performed by the same four technicians. Unlike other Chinese studies over a similar time period, no significant changes in semen quality were found.

By contrast, a study from 2020 that assessed semen data from 23,936 sperm donor candidates at the Henan Sperm Bank of China between 2009 and 2019 did report temporal semen changes138. Multiple linear regression analyses accounting for potential confounders (age, BMI, sexual abstinence) suggested that sperm concentration decreased from 62 × 106/ml in 2009 to 32 × 106/ml in 2019 (P < 0.001), with an average annual rate of 3.9%. Similarly, total sperm count decreased from 160 × 106 in 2009 to 80 × 106 in 2019 (P < 0.001), with an average annual rate of 4.2%, progressive motility decreased from 54% in 2009 to 40% in 2019 (P < 0.001), with an average annual rate of 2.5% and total motility decreased from 60% in 2009 to 46% in 2019 (P < 0.001), with an average annual rate of 1.9%.

A subsequent study139 of infertile men in Wenzhou, China used data obtained from 38,905 patients during 2008–2016. The annual mean percentage motility and percentage of spermatozoa with normal morphology decreased linearly with slopes of −2.6 (P < 0.01) and −0.70 (P < 0.05), respectively. Data on sperm production were not reported.

Taiwan

In a 2016 report, semen quality of 7,187 northern Taiwanese men recruited from a reproductive medical centre was analysed140. The mean age was 36.9 years (range 26–57 years) and semen analysis was performed following WHO guidelines. The data indicated an annual reduction in sperm concentration, seminal volume, total sperm count, percentage motility and the percentage of morphologically normal sperm of 1.01 × 106/ml, 1.02 ml, 1.03 × 106, 1.02% and 1.02%, respectively.

Australia

The first study from Australia to examine temporal trends in semen quality141 was published in 1997 and included semen data from 509 fertile healthy men volunteering for sperm donation in Sydney during 1980–1995. Overall, no significant difference was observed in sperm concentration over time, between years or according to year of birth, regardless of whether the first semen sample was analysed individually or grouped by year of ejaculation.

A second study, also from Sydney142, reviewed semen data from the first ejaculates of 448 males volunteering for sperm donation over the 18-year period from 1983 to 2001. Participants were not selected for fertility or marital status, but had to be aged between 18 and 40 years; samples were assessed following WHO 1980, 1987, 1992 and 1999 guidelines13,14,15,16. Similar to the first Australian study, no change was observed in total sperm count during the study period (P = 0.17) using a linear regression model and the seminal volume did not change; however, an increase in sperm motility was found (P < 0.001).

In Melbourne, semen data from infertile men attending a fertility centre were compared during two distinct periods: 1977–1981 (n = 309) and 1997–1998 (n = 559)143. The same standardized methodology was used during the study period and only men with sperm concentrations >5 × 106/ml were included in the final analysis. A small, but statistically significant, drop was observed in mean seminal volume between the first and second study periods (3.9 versus 3.6 ml, P = 0.015), but no significant difference in median sperm concentration was found (88 versus 92 × 106/ml). The small decrease in seminal volume had no effect on total sperm count, with no change observed (321 versus 313 × 106).

New Zealand

A 2008 two-centre study revealed declining sperm quality in New Zealand over a 20-year period. Semen data from the first sample delivered by 975 candidates for sperm donation presenting at fertility clinics in Auckland (1987–2007) and Wellington (1992–2007)144 showed that the mean concentration of sperm decreased from 110 × 106/ml in 1987 to 50 × 106/ml in 2007 (P < 0.001), an average reduction of 2.5% annually. The volume of semen also fell significantly from 3.7 ml to 3.3 ml (P < 0.001). Duration of abstinence did not change between periods.

In a 2015 follow-up study, which assessed sperm quality from 2008 to 2014 (ref.145), 285 further participants for sperm donation were added to candidates already included in the previous analysis144, with donors recruited from the same clinics and semen samples assessed using the same methodology. The new data were compared with previous results from 1987 to 2007 and, interestingly, the decline in semen volume and sperm concentration observed between 1987 and 2007 was not found to continue during the period 2008–2014. When the data were analysed as a whole over the period 1987–2014, no significant change was seen over the total period studied, although seminal volume and sperm concentration decreased significantly (P = 0.05 and P = 0.001, respectively), and sperm motility declined significantly (−8%) in the later period 2008–2014.

Limitations of retrospective single-centre studies

At first glance, the overall results of these retrospective studies from numerous centres worldwide do not support the notion of a general worldwide temporal trend, or even a general trend in the Western world, in sperm production and quality. For example, with respect to trends in sperm concentration (or total sperm count when concentration was not determined), 57% of studies reported a decrease in sperm production over time: 83% of South American studies, 64% of European studies, 60% of Middle Eastern studies, 50% of Asian studies, 40% of Oceanian studies and 33% of the North American studies. In fact, 29% of all studies reported no change and 12% indicated an increasing trend.

However, whether the conclusions of the various published studies are equally reliable, and therefore whether they truly show a predominant trend in global human sperm production, remains debatable. Critical analysis of the published studies does not allow such a conclusion to be drawn.

Analysing retrospective data in a single centre might be a useful approach for detecting temporal trends in semen quality, avoiding the potential confounders of spatially heterogeneous populations and enabling more homogeneous methodology for semen analysis than in multicentre studies. However, these conditions are not necessarily met in all published single-centre studies.

In retrospective single-centre studies, many factors contribute to the quality of the study’s conclusions. Among the factors contributing to the strength of the trends reported is the type of population studied, its degree of homogeneity at baseline and its maintenance over the course of the study. Several other factors, such as the sample size, period covered, semen analysis procedures and statistical methodology can also affect the degree of confidence in the temporal trends reported. Thus, meaningful interpretation of the conclusions from retrospective single-centre studies of temporal trends in semen quality should consider all these factors carefully.

Male populations selected

The single-centre studies considered were performed in various populations (Supplementary Table 2). Semen data from male partners in infertile couples were assessed in 47% of studies, from candidates for sperm donation with unknown fertility status in 22%, from mixed populations in 21%, from candidates for sperm donation who were already fathers in 6%, from male partners of women with tubal obstruction with unknown fertility status in 3%, and from candidates for vasectomy who were already fathers in 1%. Thus, most studies came from laboratories in which semen analyses of men from infertile couples were accumulated over many years. Such a population is heterogeneous by nature, and characteristics of these populations might be unstable over the study period, which cannot be easily controlled. Importantly, the progressive development of modern ART approaches — including IVF from the 1980s and ICSI in the early 1990s — means that infertile couples in which the male partner has very poor semen quality are likely to have been increasingly included, which represents an uncontrolled covariate in most temporal studies in infertile men encompassing these years. Indeed, if a reproductive laboratory is associated with an IVF and/or ICSI centre that did not exist or was not fully operational in the first years of the study, this might introduce a major bias in the reported trends, as indications for IVF and, even more so, for ICSI, in terms of poor semen quality have steadily increased since these methods were introduced.

Studies in populations of men volunteering for sperm donation, either fertile or of unknown fertility, might also suffer from uncontrolled selection bias. Men who are already fathers who are candidates for sperm donation in an insemination programme without compensation might constitute a relatively homogeneous population as long as the recruitment modalities remain the same over time, although they are not representative of the general population. However, unpredictable factors might bias this population — for example, doubts about paternity of specific children — and cannot be ruled out. Several studies have suggested that the offer of financial compensation is not a strong motivation for the volunteers participating in semen studies, at least in the Western world52,80, indicating that populations of students or young men with unknown fertility status, volunteers in paid sperm donation programmes, clinical or research studies are likely to suffer from minimal bias.

Overall, studies in mixed populations of men or considering varying populations at different times do not offer a high degree of confidence in the temporal trends found.

Finally, partners of women with tubal obstruction undergoing IVF, who were included in just two studies, might constitute a population that is closer to the general population146,147.

Sample size

The largest sample sizes were in studies that included male partners of infertile couples; by contrast, many of the studies involving other populations were based on small populations. More than 20 studies (about half of the studies considered in this Review) that reported trends in sperm production were based on a relatively small sample size for the study period, averaging <50 values per year; ten studies reported trends based on <20 values per year. These small numbers are a serious drawback, as the normal range of values for human sperm production is wide, typically from 0 to hundreds of millions for sperm concentration and thousands of millions for total sperm count. Thus, studies with a small sample size increase the risk of reaching incorrect conclusions about temporal trends in sperm production, which has been demonstrated by data modelling34. Similarly, half of the studies that did not report changes in sperm production were also based on a relatively small sample size.
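The effect of sample size on the precision of an estimated trend can be illustrated with a small simulation. The sketch below is not the modelling of ref. 34. It assumes a log-normal distribution of sperm concentration with hypothetical parameters (median 60 × 10^6/ml, log-scale SD 0.8), no true temporal trend and a 20-year study period, and compares the spread of fitted slopes when 20 or 200 men are sampled per year.

```python
import math
import random
import statistics

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def simulated_slopes(n_per_year, n_years=20, n_sim=200, seed=1):
    """Fitted slopes (10^6/ml per year) from repeated studies with NO true trend."""
    rng = random.Random(seed)
    mu, sigma = math.log(60.0), 0.8  # hypothetical log-normal parameters
    slopes = []
    for _sim in range(n_sim):
        years, values = [], []
        for year in range(n_years):
            for _man in range(n_per_year):
                years.append(year)
                values.append(rng.lognormvariate(mu, sigma))
        slopes.append(ols_slope(years, values))
    return slopes

small = simulated_slopes(n_per_year=20)
large = simulated_slopes(n_per_year=200)
print("SD of fitted slopes, 20 men/year: ", round(statistics.stdev(small), 3))
print("SD of fitted slopes, 200 men/year:", round(statistics.stdev(large), 3))
```

Under these assumptions, the fitted slopes scatter much more widely around zero with 20 men per year than with 200, so an individual small study can report a sizeable apparent increase or decrease purely by chance.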

Study period

The period over which data are collected can affect the reliability of the conclusions. For example, 19% of studies covered periods of <10 years, raising the question of whether a temporal trend can really be observed over such a short period. Temporal studies covering longer periods, ideally more than two decades, would be better for depicting a temporal trend, as the overall data will be less affected by unexplained short-term fluctuations in the recorded values. Moreover, analyses of sufficiently long periods for characteristics with very wide distributions, such as sperm concentration and total sperm count (from a sufficiently large number of men) increase the probability of finding meaningful changes. Overall, in all fields in which societal or behavioural factors might be associated with temporal changes, the longest periods possible should be analysed to be able to reach meaningful conclusions about the trends observed.
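How much a longer observation period helps can be seen from the textbook standard error of an ordinary least-squares slope, σ/√(n·Σ(t − t̄)²), where σ is the between-man standard deviation and n is the number of men per year. The sketch below applies this formula under simplifying assumptions (equal n every year, constant σ, independent observations); the values σ = 75 × 10^6/ml and n = 50 are hypothetical.

```python
import math

def slope_standard_error(sigma, n_per_year, n_years):
    """SE of an OLS slope for n_per_year men sampled in each of n_years consecutive years."""
    mean_year = (n_years - 1) / 2
    sxx = n_per_year * sum((t - mean_year) ** 2 for t in range(n_years))
    return sigma / math.sqrt(sxx)

for n_years in (8, 15, 25):
    se = slope_standard_error(sigma=75.0, n_per_year=50, n_years=n_years)
    print(f"{n_years:>2} years: slope SE = {se:.2f} x 10^6/ml per year")
```

Because Σ(t − t̄)² grows roughly with the cube of the number of years, extending the period narrows the uncertainty on the slope faster than adding the same proportion of men per year.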

Methods used

The methods used for semen analysis might also affect the reported temporal trends. Unlike many other laboratory tests, which are automated and calibrated, semen analysis relies almost entirely on manual procedures. Consequently, semen analyses must be based on well-defined and standardized procedures as well as continuous internal and external quality controls.

At the beginning of the 1980s, the WHO recommended standardized approaches for assessing human semen13, which have evolved and been updated over time13,14,15,16,17. However, some of the discussed studies began before the 1980s, that is, before the WHO guidelines were introduced. In addition, many studies did not describe their methods precisely or did not follow the WHO guidelines at all. Furthermore, several studies stated that the WHO-recommended procedures were followed, but the description of the procedures used suggests noticeable deviations from them. Others even reported changing methods during the study period, for example for sperm count analysis44. Changing a procedure can be a notable confounding factor if the periods before and after the change are not analysed separately. For example, counting spermatozoa in a haemocytometer, a single-use calibrated chamber or a Makler chamber does not produce the same count148. Similarly, changing the procedure for assessing normal sperm morphology from old WHO guidelines to the more recent WHO recommendations, which are based on stricter criteria, results in markedly different percentages of morphologically normal spermatozoa149. Finally, fewer than half of the discussed studies considered inter-observer variability, including both occasional and intrinsic variability within the same pool of technicians and the changes in these pools over the years. In addition, few studies mention the existence of concomitant internal quality controls that are necessary to maintain satisfactory intra-individual and inter-individual homogeneity in semen assessment over time150,151.

Overall, understanding the validity of the temporal trends reported in studies is difficult when the reports themselves do not include sufficient information on the methods used for semen analysis, the technical staff involved throughout the study period and/or the quality control schemes applied.

Data analysis

Many of the included studies failed to check the main validity criteria of the linear regression model, such as the normal distribution of residuals, which might require transformation of the explained variable (for example, sperm concentration), or the assumption of a linear relationship between the explained variable and the quantitative covariate. This omission might be a source of bias in the estimated trend152. In addition, more than half of the studies reported results from statistical analysis of semen data unadjusted for known and major confounders such as age or sexual abstinence61,108. Furthermore, some studies used semen data available from only two or three distinct periods, enabling only a simple comparison of means instead of an estimate of the continuous change in the semen parameters over time. Most studies used the year of semen collection instead of the men’s year of birth; however, year of birth is the more relevant stratifying factor for assessing parallel changes in the environment. Of the six studies investigating the link between birth cohort and semen parameters, five reported a decrease in sperm production (sperm concentration and/or sperm count) over time, covering the years 1973–2019 (refs.42,63,87,108,113,136). The most important factors that might reduce the reliability of these studies are those related to changes over time in the characteristics of the men studied, for example, changes in recruitment criteria. A 2018 study from China clearly illustrates the difficulty of controlling for this key parameter136, as the characteristics of semen donors with unknown fertility status changed drastically over the study period. The percentage of non-students included increased significantly and influenced the temporal trend in sperm concentration, as seen by comparing students and non-students in a stratified analysis, and the education level of the men in this study declined over time. By contrast, the 1997 Parisian study is one of the few to have been able to ascertain that the recruitment pattern of sperm donation candidates did not change over the study period34.
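A change in the make-up of the study population can produce an apparent trend even when nothing changes within any subgroup. The sketch below uses entirely hypothetical numbers (they are not taken from ref. 136): students and non-students each keep a constant mean sperm concentration, but the share of non-students rises from period to period, so the pooled mean declines while the stratified means do not.

```python
# Hypothetical, constant mean sperm concentration (10^6/ml) in each stratum.
MEAN_STUDENTS = 70.0
MEAN_NON_STUDENTS = 50.0

# Hypothetical share of non-students among donors in three successive periods.
non_student_share = {"period 1": 0.10, "period 2": 0.40, "period 3": 0.70}

def pooled_mean(share_non_students):
    """Mean of the mixed population for a given share of non-students."""
    return (1 - share_non_students) * MEAN_STUDENTS + share_non_students * MEAN_NON_STUDENTS

pooled = {period: pooled_mean(share) for period, share in non_student_share.items()}
for period, value in pooled.items():
    print(f"{period}: pooled mean = {value:.1f} "
          f"(students {MEAN_STUDENTS}, non-students {MEAN_NON_STUDENTS})")
```

In this toy setting, a stratified analysis would show flat trends in both groups, whereas the unstratified analysis would report a decline of 12 × 10^6/ml between the first and last periods.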

Temporal changes in these factors and in the characteristics of the men studied might be relevant to sperm quality, directly or through other environmental factors, leading to possible bias due to uncontrolled confounding.

Overall, the studies with the most relevant design were those that were based on a standardized and controlled assessment of sperm production throughout the study period; relied on a relatively homogeneous group of men throughout the study; covered a sufficiently long period; had a sufficient cross-sectional sample size, in view of the wide distribution of human sperm production; and accounted for the main covariates known to modulate sperm characteristics. In total, and on the basis of these criteria considered together, six studies can be considered to provide relatively robust data on the trends reported: Zorn et al.113 in partners of women with tubal obstruction in Ljubljana (Slovenia), Auger et al.108 among fertile candidates for sperm donation in the Paris area (France) and the studies of Centola et al.87 in Boston (USA), Gyllenborg et al.47 in Copenhagen (Denmark), Shine et al.144 in New Zealand and Liu et al.138 in Henan (China), the latter four studies being carried out among unselected candidates for sperm donation with unknown fertility status (see Supplementary Table 2). Five of these six studies reported a decrease in sperm production (sperm concentration and/or sperm count) over time, covering overall the period 1973–2019.

Retrospective multicentre studies based on individual values

Over the past 30 years, several multicentre studies have reported temporal trends in sperm production from the individual data of men whose semen was collected in geographically distinct areas (Supplementary Table 2).

France: first data from a national IVF register

Following the two studies carried out in single centres, which reached different conclusions on temporal trends in sperm production in the Paris and Toulouse regions108,109, the French register on IVF (a large database that contains details of 90% of all cycles of IVF in France and has recorded sperm concentrations since 1989) was used to examine the possibility of a temporal trend in sperm production on a national scale153. To avoid any bias introduced by the increasing use of IVF for male infertility, only couples with pure tubal infertility in which the husband’s semen was normal (sperm count >20 × 106/ml, total sperm motility at 1 h >40% and normal sperm morphology >40%) before the IVF attempt were selected. In total, 19,848 sperm concentrations in 7,714 men were included in the study and analysed with a generalized linear model, using both the crude variable of sperm count and its logarithmic transformation. Sperm counts varied with the year of birth (P < 0.0001), with fairly stable mean values for men born before 1950 and a regular decrease in the mean value for each 5-year stratum from 1950 to 1965; this decrease was observed regardless of the year of semen collection (1989–1994) and the results were similar when analysis was restricted to the first ejaculates. Of note, the number of centres involved in the study was not mentioned and semen data were not adjusted for the period of sexual abstinence, which was not recorded in the register.

France: 126 IVF centres

A further study using the French IVF database examined the previously identified semen quality trends since 1989 but expanded the study to include a general population from across the whole country147. The database recorded the ART attempts of couples from the entire French metropolitan territory (126 main ART centres across France) between 1989 and 2005, covering 17 years and the geographical diversity of France (n = 26,609). The results of two semen analyses for each man were provided, ensuring a control for intra-individual variation. Only male partners of women who had both tubes noted as absent or blocked (and who were, therefore, definitely infertile) were included. A regression model controlling for men’s age was applied, and adjustment for the ART centre was included in a sensitivity analysis to confirm that no particular centre influenced the trends.

A significant 32.2% (95% CI 26.3–36.3) decrease in sperm concentration was observed over the whole 17-year study period; projected concentration for a 35-year-old man went from 73.6 × 106/ml (95% CI 69.0–78.4) in 1989 to 49.9 × 106/ml (95% CI 43.5–54.7) in 2005 with an average decrease of 1.4 × 106/ml per year (1.9%). The motility percentage was stable between 1989 (49.5% (95% CI 48.2–50.8)) and 1994 (49.6% (95% CI 49.2–50.1)), increased to 52.4% (95% CI 51.9–52.9) in 1998 and then stabilized again until 2005 at 53.6% (95% CI 52.0–55.2). A significant 33.4% (95% CI 29.7–37.2) decrease in the mean percentage of morphologically normal spermatozoa was observed from 60.9% (95% CI 58.8–62.9) in 1989 to 52.8% (95% CI 52.0–53.5) in 1995: an average decrease of 1.3% morphologically normal spermatozoa per year. Adjusted trends showed that the observed decreases in sperm concentration and normal morphology were not due to inclusion of infertile men owing to the advent of ICSI.
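The headline figures for sperm concentration can be checked with simple arithmetic from the two projected values quoted above. The snippet below is only a worked calculation; dividing by the 17 calendar years of the study period is an assumption about how the annual average was derived.

```python
start, end = 73.6, 49.9   # projected concentration (10^6/ml) in 1989 and 2005
n_years = 17              # assumption: 1989-2005 counted inclusively

relative_decline = (start - end) / start * 100   # percent of the 1989 value
annual_decline = (start - end) / n_years          # 10^6/ml per year
annual_percent = annual_decline / start * 100     # percent of the 1989 value per year

print(f"overall decline: {relative_decline:.1f}%")
print(f"average decline: {annual_decline:.1f} x 10^6/ml per year "
      f"({annual_percent:.1f}% of the 1989 value)")
```

With these inputs the calculation reproduces the reported 32.2% overall decline and the average decrease of 1.4 × 10^6/ml (1.9%) per year.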

France: eight metropolitan areas

In a 1997 study, semen quality was retrospectively assessed using data from 4,710 healthy unselected fertile men, who were candidate semen donors in eight different French metropolitan areas during 1973–1993 (ref.20). All the men were referred under the same guidelines and all semen samples were analysed using similar methodologies. Differences were found between centres for several semen characteristics; however, the multiple regression model only detected a temporal trend in semen characteristics after appropriate variable transformations according to the year of semen collection and accounting jointly for age, sexual abstinence and centre. These analyses identified a negative temporal trend for seminal volume (β = −0.002, P < 0.05), sperm concentration (β = −0.036, P < 0.01), total sperm count (β = −0.106, P < 0.001) and percentage motility (β = −106, P < 0.001).

Denmark: four regions

A Danish study that included four medical centres154 examined temporal trends in semen quality during the period 1968–1992 from medical records of 8,608 infertile men born between 1925 and 1971. Semen characteristics were analysed as a linear function of year of birth, centre, season and calendar year at time of semen examination, sexual abstinence and lifestyle factors. Effects of age were accounted for by restriction and stratified analysis. Sperm concentration declined with increasing year of birth at two of the four centres, but this association disappeared when confounders were adjusted for. Within the subset of men born after 1950 (n = 5,650), a decrease in the average sperm count by 1.9 × 106/ml (95% CI 1.45–2.27) per one advancing year of birth was found. This finding was consistent across centres, and after adjustment for covariates. The proportion of morphologically normal sperm cells changed in parallel with the sperm count, and seminal volume did not decline in any time period.

Canada: 11 regions

Infertility clinics in 11 centres across Canada participated in a study examining the possibility of a temporal trend in semen quality at the national level36. All semen data were aggregated regardless of the reasons for analysis, which ranged from outpatient referrals to men who attended fertility clinics for infertility work-up or ART procedures. Semen samples (n = 51,101) were collected by masturbation after >3 days of sexual abstinence and assessed following WHO guidelines during 1975–1996. In this study, only sperm concentration was collated from all centres, excluding azoospermic samples and samples of >800 × 106/ml, which were considered outliers. Trends were determined by linear regression. Multiple regression analysis was performed to account for the mean basal differences between centres. Overall, there was no significant temporal trend in sperm concentration (P = 0.397). In the 1984–1996 group of 48,968 samples with more than 1,600 per year, there was a downward trend (P < 0.0001), representing a decline of 1.4% per year (−1.60 × 106/ml, 95% CI −1.37 to −1.84) over 13 years. However, this trend and the distribution by centre accounted for only 3.1% of the variability in sperm concentration.

Limitations of retrospective multicentre studies based on individuals

All these studies reported decreasing trends in sperm concentration. However, only one20 of the discussed investigations considered known relevant covariates. Although this type of study, which combines large sample size with use of meta-regression techniques, is able to somewhat compensate for absent or inappropriate adjustment, each single centre might have a high degree of heterogeneity in its participants, owing to the various types of couple seen in ART programmes and the long time period covered, and, more importantly, in its methods for assessing semen samples and its quality control procedures. For example, data from a 2012 national quality control programme in Germany, involving hundreds of centres, showed that only half used the chamber recommended by WHO for counting spermatozoa and only 30% used the WHO method for diluting the sample before counting155. Such disparate practices create variability and, if they are not considered in the meta-regression studies, raise major concerns about the findings.

Geographical variability might also be important, but has not always been taken into account or adequately discussed in temporal studies. The contribution of each centre to the overall sample might have been unequal across the time periods considered. For instance, semen data from some centres might be missing for a specific time period for various practical reasons, leading to bias if these centres have contributed particularly high or low mean semen values. In addition, not all studies have taken into account the possible modification introduced by this ‘centre’ factor36. For example, although a temporal trend at the national level has been suggested in the multicentre Canadian study36, the authors reported a significant downward trend for six centres, an upward trend for two centres and a slight, but insignificant upward trend for one centre.

Overall, and given the strict criteria we set out above for an optimal design of a retrospective single-centre study, these limitations cast doubt on the reliability of general trends reported in these multicentre studies.

Retrospective multicentre studies based on mean, median or estimated values

Since 1992, several studies have investigated worldwide or continent-wide temporal trends in semen quality using aggregated data of sperm production, such as mean values.

The first study of this type, by Carlsen and colleagues6, was published in 1992 and investigated global changes in semen quality over the previous 50 years. The authors selected publications using keyword searches in the Cumulated Index Medicus for the period 1930–1965 and in the Medline database for 1966–1991. Publications in infertile men, those referred for oligozoospermia or some genital abnormality, populations selected for either a high or a low sperm count, and studies in which sperm concentration was assessed by computer-assisted system or flow cytometry were excluded, so the final analysis was based on a total of 61 papers, which included data on 14,947 men. Numerous countries from different areas of the world were represented among the selected publications, but almost half of the data included originated in the USA. Linear regression analysis showed a marginally significant decrease in seminal volume between 1940 and 1990 from 3.40 ml to 2.75 ml with an estimated regression coefficient of −0.013 ml per year (P = 0.027) and a significant decrease in mean sperm concentration weighted by number of individuals in each publication between 1940 and 1990, from 113 × 106/ml to 66 × 106/ml; the estimated regression coefficient was −0.934 × 106/ml per year (P < 0.0001). Thus, both mean seminal volume and mean sperm concentration decreased during the study period. Separate analysis of the publications that referred only to men with proven fertility showed a regression coefficient for mean sperm concentration of −0.852 × 106/ml per year (0.185, P < 0.0001) and a separate subanalysis of the US data found a similar negative trend. When mean age was included as an additional covariate the trend was essentially unchanged.
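The kind of analysis described above, a linear regression of study means on year weighted by the number of men in each publication, can be sketched as follows. This is a generic weighted least-squares fit, not the authors' code, and the four (year, mean, n) records are invented for illustration.

```python
def weighted_linear_fit(years, means, weights):
    """Weighted least-squares intercept and slope of means on years."""
    w_total = sum(weights)
    year_bar = sum(w * x for w, x in zip(weights, years)) / w_total
    mean_bar = sum(w * y for w, y in zip(weights, means)) / w_total
    sxx = sum(w * (x - year_bar) ** 2 for w, x in zip(weights, years))
    sxy = sum(w * (x - year_bar) * (y - mean_bar)
              for w, x, y in zip(weights, years, means))
    slope = sxy / sxx
    intercept = mean_bar - slope * year_bar
    return intercept, slope

# Hypothetical study-level records: year, mean concentration (10^6/ml), number of men.
studies = [(1940, 110.0, 100), (1965, 95.0, 400), (1985, 80.0, 50), (1990, 68.0, 1500)]
years, means, sizes = zip(*studies)

intercept, slope = weighted_linear_fit(years, means, sizes)
print(f"weighted slope: {slope:.3f} x 10^6/ml per year")
print(f"fitted value in 1940: {intercept + slope * 1940:.1f}, "
      f"in 1990: {intercept + slope * 1990:.1f}")
```

Note that each publication enters the fit as a single point: the weights change how much each point counts, but the regression is still estimated from the number of studies, not the number of men.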

This pioneering study generated considerable discussion and criticism, particularly for its design. For example, some critics suggested that poor or highly variable data invalidated any inference about trends in sperm count, whereas others questioned the statistical methods used in this analysis and pointed to bias due to the different populations studied or unaccounted-for confounding factors such as age, abstinence time and BMI, all of which can influence sperm quality156,157,158,159,160.

Thus, in 1997, after controlling for abstinence time, age, percentage of men with proven fertility and specimen collection method, a re-analysis of the studies included in this analysis was conducted and found significant declines in sperm density in the USA, Europe and Australia161. Furthermore, a subsequent study by the same authors162 included an additional 47 English-language studies published in 1934–1996 to those analysed previously. Using a methodology similar to the first two multicentre studies6,163,164, the average decline in sperm concentration reported was virtually identical to that reported previously6 (slope = −0.94 versus −0.93). The slopes in the three geographical groupings (USA, Europe, other countries) were also similar to those reported earlier, although the slope reported for data from North America was somewhat less steep than the slope previously found for the USA (−0.80 (95% CI −1.37 to −0.24)), as was the decline reported in Europe (−2.35 (95% CI −3.66 to −1.05)). As before, studies from other countries showed no trend (−0.21 (95% CI −2.30 to 1.88)), leading the authors to conclude that the results were consistent with those of Carlsen et al.6 and their previous results161, and suggesting that the reported trends are not dependent on the particular studies included in the initial analysis, but that the trends previously reported for 1938–1990 are also seen in data from 1934–1996.

Temporal trends in semen quality from men of the Indian subcontinent were analysed using a similar study design6,162,165, retrieving papers from the Medline database published before 2011. Only studies in Indian participants were short-listed; publications based on men with predefined sperm count limits or men from infertile couples were excluded, as were those reporting sperm concentrations analysed using flow cytometry or a computer-assisted semen analyser. Mean values of semen characteristics from 40 studies were retrieved, representing 19,734 healthy men from different parts of India over a period of 33 years (1978–2011). Weighted linear regression analysis showed a statistically significant decline in sperm motility during the study period, whereas all other semen characteristics showed no significant change. Regression models for sperm concentration, sperm motility and sperm morphology after controlling for the age of the participants also showed a statistically significant increase in sperm concentration during the study period.

Another study also investigated temporal trends in semen characteristics in Indian men over a period of 37 years from 1979 to 2016 (ref.166). Mean values of semen data were retrieved from 119 studies corresponding to 6,466 men who were presumed normal and from 63 studies corresponding to 7,020 infertile men. In pooled analysis for all individuals, statistically significant declines in sperm concentration and normal morphology were found (P < 0.05 and P < 0.001, respectively). However, when each group of men was analysed separately, the declines did not reach statistical significance.

The 2017 contribution from Levine et al.9, based on a systematic analysis and a meta-regression analysis of worldwide data trends in sperm production, sperm concentration and total sperm count, is the most comprehensive study to date and generated conclusions that increased interest in the topic. The study was stratified according to the fertility status of the men as known or unknown, and their categorization as either Western (men from North America, Europe, Australia and New Zealand) or not (men from South America, Asia and Africa, labelled ‘Other’). Overall, 185 studies covering the period 1973–2011 were selected, including 244 separate estimates of sperm production. A statistically significant decrease in sperm concentration was seen between 1973 and 2011 for the Western men of unknown fertility status (−1.3 × 106/ml/year) and fertile Western men (−0.68 × 106/ml/year) but no statistically significant trend was observed for the other groups. The authors concluded, therefore, that a 50–60% decline in sperm production in men not selected by fertility from North America, Europe, Australia and New Zealand occurred between 1973 and 2011.

A 2017 analysis assessed semen concentration data from original research articles from African countries published in English167, selecting from various electronic databases and following the Meta-analyses of Observational Studies in Epidemiology (MOOSE) guidelines and Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) checklist to extract the data. Final analysis was based on 14 studies published between 1965 and 2015 and identified a time-dependent decline in sperm concentration in Africa (r = −0.597, P = 0.02) reflecting an overall 73% decrease in sperm concentration from 1965 to 2015.

The same authors also used a similar methodology to examine temporal trends in sperm concentrations in Europe168. The analysis was based on 54 European studies published during 1965–2015, and mean sperm concentrations were analysed with linear regression weighted by number of participants included in the individual publications. This study also identified a time-dependent decline in sperm concentration between 1965 and 2015 (r = −0.307, P = 0.02), an overall 32% decrease in mean sperm concentration.

Temporal trends in sperm concentration in fertile Asian men were also assessed by these authors, who analysed all the available reports published in English from 1965 to 2015 (ref.169). The study used randomized selection criteria for reports on sperm concentration, with similar predefined standards for inclusion and exclusion to the study on African data to identify 38 studies published between 1965 and 2015. Exclusion of studies such as those in infertile men, or those that did not include sperm count, for example, meant that the final analysis was based on 13 original studies. A declining trend in sperm concentration of −0.44 × 106/ml per year (95% CI −0.65 to −0.23; r = −0.473, P = 0.040) was identified, which accounted for an overall 22% decrease in mean sperm concentration in Asian men over the 50 years.

The question of whether human sperm concentration and total sperm count have declined in China remains controversial. A 2021 investigation of the long-term trends in sperm quality of healthy Chinese men analysed reports beginning in the 1980s170 and followed exactly the same protocol and statistical methodology as that of Levine et al.9. Of the 5,731 papers originally retrieved, only 111 met all the inclusion criteria and included 327,373 men who provided semen samples between 1981 and 2019. Linear regression analysis weighted by sample size across the whole period identified a time-dependent statistically significant decline in sperm concentration (slope = −0.75, P = 0.005) representing a 0.94% decline per year, and an overall decline of 37% between 1981 and 2019. A similar downward trend was also significant for total sperm count (slope = −2.07, P = 0.032) with a decline of 0.72% per year and overall by 28%. However, no statistically significant change was seen in seminal volume (slope = 0.01, P = 0.23). Similar decreasing trends were found using models weighted by their standard error or adding covariate adjustment in the models. Taking into account collection year, fertility status and region group, the authors showed continuous declines in sperm concentration among men from northern China in both the fertile group (slope = −2.27, P = 0.009) and the unselected group (slope = −0.84, P = 0.003), and decreases in total sperm count were seen in the fertile group (slope = −9.68, P = 0.01). However, in southern China, only fertile individuals demonstrated significant downward trends in sperm concentration (slope = −1.014, P = 0.009) and total sperm count (slope = −3.22, P = 0.042).

Limitations of multicentre studies based on aggregate values

Studies with such a design share common features, including the consideration of data from huge areas, either consolidated or regrouped in continental or subcontinental regions (for example, USA and/or North America, Europe, the Indian subcontinent and Africa) and examination of long periods of time (typically several decades).

The major inherent weakness of this type of study, which is based on aggregated data such as mean or median values or estimates of sperm production from the selected studies, is its reliance on the assumption that the difference of the means is equal to the mean of the differences. This assumption does not hold if the means are not measured in similar populations. This type of study does not exploit the longitudinal variability within some of the selected studies and represents an ecological study design, for which the weight of evidence is very modest.
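The point about aggregated means can be illustrated with a minimal sketch. All values below are hypothetical and do not come from any study cited here: two source populations each have a perfectly stable mean sperm concentration, but the early reports are assumed to come from the higher region and the later reports from the lower one. A regression on the published means alone then shows a decline that no individual population experienced.

```python
# Hypothetical illustration of an ecological (aggregate-data) trend analysis.
# All numbers are invented; none come from the studies discussed in the text.

def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    return sxy / sxx

# Two populations whose true mean sperm concentration (10^6/ml) is constant.
TRUE_MEAN = {"region_high": 100.0, "region_low": 60.0}

years = list(range(1980, 2000))
# Assumption: reports before 1990 come from region_high, later ones from region_low.
source = ["region_high" if y < 1990 else "region_low" for y in years]
published_mean = [TRUE_MEAN[s] for s in source]

# Pooled regression of published means on year, ignoring the source population.
pooled_slope = ols_slope(years, published_mean)

# Regression within each source population.
within_slopes = {}
for region in TRUE_MEAN:
    xs = [y for y, s in zip(years, source) if s == region]
    ys = [m for m, s in zip(published_mean, source) if s == region]
    within_slopes[region] = ols_slope(xs, ys)

print(f"pooled slope: {pooled_slope:.2f} per year")
print("within-region slopes:", within_slopes)
```

In this constructed case the pooled slope is about −3 × 10^6/ml per year, whereas the slope within each region is zero. The apparent decline is produced entirely by the change in which population contributed the estimates.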

Most of these multicentre studies claim that their findings were based on a very large number of men from many regions of the world. However, in reality, the trends reported in these studies are based on only a small number of values — not a large number of men studied — that do not account for underlying demographic, biological and geographical diversity of the source populations studied. Examples include n = 101 studies162, n = 61 studies6, n = 50 studies in the Indian165 or European regions168, and even n = 14 studies in an analysis that theoretically considered the entire continent of Africa167, and n = 13 studies in the 2018 analysis of Asian men169.

This type of study can take into account only a few confounding factors, often as a simple and sometimes imprecise category-based variable. For example, in Levine and colleagues’ multicentre study9, the authors could have used three categories for the age variable: all men aged ≤40 years versus some men aged >40 years versus no information, but instead they used a categorization with a cut-off at 40 years, which is questionable because changes in semen characteristics are associated with age well before men turn 40 (refs.62,63,64). Although the existence of possible bias caused by residual confounding is not well substantiated in this type of multicentre study, control for confounding is likely to be insufficient.

Multicentre studies using simple linear regression models weighted their statistical analysis by sample size of the studies and, therefore, did not account for the variability of sperm parameter data within studies. The most recent studies using meta-regression have more appropriately weighted their analysis, choosing to weight by the standard error. In these reports, a considerable number of studies include only small absolute numbers of men and are, therefore, likely to produce imprecise means; these small studies are also not equally distributed across the study period. In Levine and colleagues’ multicentre study9, 27% of the estimates included (40% of estimates before 1998 and 21% of estimates from 1998 onwards) were based on fewer than 30 individuals, which could distort the final estimate despite the statistical weighting procedure.
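The difference between the two weighting schemes can be sketched as follows. This is a minimal illustration on invented study-level data, not a reproduction of any published analysis; the function name `weighted_slope` and all values are assumptions. Weighting by sample size uses only the number of men, whereas weighting by the inverse squared standard error also reflects the spread of values within each study.

```python
import numpy as np

def weighted_slope(x, y, w):
    """Slope of the weighted least-squares line y = a + b*x with weights w."""
    x, y, w = (np.asarray(v, dtype=float) for v in (x, y, w))
    x_mean = np.average(x, weights=w)
    y_mean = np.average(y, weights=w)
    return np.sum(w * (x - x_mean) * (y - y_mean)) / np.sum(w * (x - x_mean) ** 2)

# Hypothetical study-level summaries (not taken from any cited study).
year = np.array([1980, 1985, 1990, 1995, 2000, 2005])
mean_conc = np.array([110.0, 90.0, 105.0, 85.0, 82.0, 80.0])  # 10^6/ml
n_men = np.array([25, 400, 30, 200, 150, 300])
sd = np.array([60.0, 90.0, 55.0, 50.0, 48.0, 52.0])  # within-study SD

# Standard error of each study mean.
se = sd / np.sqrt(n_men)

slope_by_n = weighted_slope(year, mean_conc, n_men)           # weight = n
slope_by_se = weighted_slope(year, mean_conc, 1.0 / se**2)    # weight = 1/SE^2

print(f"slope weighted by sample size: {slope_by_n:.3f}")
print(f"slope weighted by 1/SE^2:      {slope_by_se:.3f}")
```

Because 1/SE² equals n/SD², the two schemes coincide only when all studies have the same within-study standard deviation. Neither scheme removes the imprecision of means based on very few men; it only reduces their influence.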

By design, these multicentre studies aim to assess the average temporal trend in sperm production observed among the selected studies without investigating the possible heterogeneities between studies, as they are assumed to be similar in terms of population source and methodology.

Overall, all the considered studies, apart from the two in Indian men165,166, concluded that sperm production has decreased at a continental or worldwide level. However, suggesting that the reported decline is valid for each of these regions is misleading, and is not without consequence when repeated in the media as such. Numerous studies published since the mid-1990s have provided evidence of geographical contrasts in human sperm production, but these contrasts were not appropriately accounted for in subsequent multicentre studies examining temporal trends in sperm production, which did not use sufficiently narrow geographical groups. For example, in the study by Levine et al.9, the data were split ‘geographically’ into only two categories: Western versus non-Western. However, studies within countries on the same continents — both Western and ‘Other’ — show noticeable regional differences in sperm production. In fact, in Levine and colleagues’ study9, if the estimates of sperm concentration by subcontinental area are considered separately, several issues arise (Fig. 2). This study reported temporal trends for aggregated data from North America, Europe, Australia and New Zealand (Fig. 2g), despite the fact that, for instance, Europe and Oceania are among the regions farthest apart on the globe with obviously different genetic, climatic, social and environmental conditions. Only 25 data points are available for North America (Fig. 2a) and no data are available for the periods 1989–1996 and 1999–2009. Some large differences are seen in the estimates for the years that do have reports, and several data points for the period before the 1990s come from a single centre (Fig. 2a). The Australian and New Zealand data (Fig. 2b), which were also aggregated, included only 25 estimates of sperm concentration, none of which was from before 1987 or after 2004, and which presented huge disparities in the mean value of sperm concentration for the years with data.
Furthermore, these Australian and New Zealand data came from only two centres. The continental European data in this study9 comprised 59 sperm concentration estimates covering the period 1978–2009. The aggregation of these European data is problematic as estimates included in the study come from different European regions for which sperm production levels are known to differ widely and significantly61. The temporal distribution of the estimates according to the commonly used division of Europe into four main continental subregions (northern Europe or Scandinavia, western, southern and eastern Europe) generates 27 data points covering the period 1977–2009 for Scandinavia (Fig. 2c), but only 13 estimates for western Europe (France, the UK, Germany, Holland, Belgium; Fig. 2d), ten for southern Europe (Italy, Greece, Spain; Fig. 2e) and eight for eastern Europe (Fig. 2f). Notably, the estimates from most of the Scandinavian studies reported in the 1990s came from a single centre, which found that sperm concentration actually increased during the study period, whereas the data points for 1995–2009 did not indicate any obvious trend for Scandinavia or western Europe. Those for southern Europe covered only the past 18 years, and estimates for eastern Europe covered only the period 1993–2007; no evidence of a decreasing trend in sperm production can be observed (Fig. 2e,f). Finally, the marked geographical contrasts in sperm production levels, combined with the temporal distribution of data by subcontinental regions, raise the important question of the potential bias introduced by geographical differences within countries or continents. If intrinsic geographical origin is not treated as a major effect-modifier in this type of multicentre study, no firm conclusion can be drawn about a possible global (or regional or continental or subcontinental) temporal trend in human sperm production, whether stable, decreasing or increasing.

Fig. 2: Subcontinental temporal distributions of sperm concentration estimates.

Data are separated by colours according to subcontinental regions: USA (part a), Australia and New Zealand (part b), Scandinavia (part c), western Europe (part d), southern Europe (part e), eastern Europe (part f). Part g shows an amalgamated representation of temporal distribution including all data from the Western world, merging all subcontinental region data. The dotted lines in parts a, b, c and f link successive estimates from a single centre. The figure uses original data provided by Levine et al.9, with a correction of the mean value of sperm concentration reported in the Rubes et al. study171, which was ~140 × 10⁶/ml instead of the ~40 × 10⁶/ml mentioned in the provided dataset.

The same limitation is also seen in a 2021 study of healthy Chinese men170, in which the mean values of sperm production from many parts of China were analysed according to only two arbitrary geographical categories, north China versus south China. However, several studies covering different cities or regions of China have found significant contrasts in sperm production38,69. Finally, owing to the use of large and arbitrary geographical categories, conclusions from these multicentre studies seem to contradict one another; for instance, Levine and colleagues9 reported no significant trends among their ‘Other’ group (which included South America, Asia and Africa), whereas other multicentre studies based on similar aggregated data concluded that sperm concentrations were declining in China170, in African countries166 or in India166.

In addition, in such multicentre studies, the choice of some selection and/or inclusion criteria might also limit the results reported. For example, in Levine’s study9, semen collection method is categorized as masturbation versus incomplete information, whereas ideally all samples in such a study should be collected by masturbation and, if the collection method is in doubt, the study should not be included. The same principle applies to the sperm counting method, which is categorized as haemocytometer versus incomplete information. Again, unless sperm concentration was definitively assessed by haemocytometry according to WHO guidelines, the study should not be included.

Finally, studies based on means, medians or estimated values retrieved from publications are susceptible to transcription errors during data collection, which could have important consequences for the statistical analyses owing to the relatively low number of studies included. For example, in Levine et al.9, a transcription error attributed an incorrect value to the sperm concentration data of the study of Rubes et al.171; the real mean value should be ~140 × 10⁶/ml and not ~40 × 10⁶/ml as was shown. Another transcription error arose in the study by Sengupta et al.166 and could be important owing to the large sample size of the included study by Sheriff and colleagues172, which included 1,500 pre-vasectomy candidates as a source of African data. Although Sheriff’s centre was located in Libya, the men were actually recruited in Salem, India, as the author confirmed in a later article173.
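The sensitivity of a regression on a small number of study means to a single transcription error can be sketched as follows. Apart from the 140 versus 40 discrepancy described above, every value here is hypothetical, including the years and the other study means; the sketch shows the mechanism only and does not reproduce any published dataset.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    return sxy / sxx

# Hypothetical publication years and study means (10^6/ml).
years = [1985, 1988, 1991, 1994, 1997, 2000, 2003, 2006]
correct_means = [82.0, 79.0, 85.0, 77.0, 80.0, 76.0, 140.0, 78.0]

# Same dataset with one value mis-transcribed (140 recorded as 40).
recorded_means = list(correct_means)
recorded_means[6] = 40.0

slope_correct = ols_slope(years, correct_means)
slope_recorded = ols_slope(years, recorded_means)

print(f"slope with correct value:  {slope_correct:.2f} per year")
print(f"slope with recorded value: {slope_recorded:.2f} per year")
print(f"change caused by one entry: {slope_recorded - slope_correct:.2f} per year")
```

With only eight data points, one erroneous entry late in the series shifts the fitted slope by about 2 × 10^6/ml per year in this constructed example. The shift grows with the distance of the erroneous point from the middle of the study period and shrinks as more studies are included.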

Overall, despite the efforts to understand putative trends in human sperm production, multicentre studies based on means, medians or estimates in continental or subcontinental areas have important weaknesses and high heterogeneity, making them intrinsically questionable. After several decades of work, appropriately measuring and adjusting for all of the potentially relevant variables remains challenging.

Temporal trends in seminal volume, sperm motility and morphology

The methods for assessing human sperm motility and sperm morphology are by their very nature more subjective than the methods for counting sperm. Quality control studies have repeatedly indicated a high level of variability in the assessment of these qualitative semen characteristics owing to the subjective nature of their assessment and, therefore, the need for extensive experience with the methods174. This subjectivity might introduce uncertainty in the reported values in the absence of quality controls and is probably the main reason why only a small portion of the studies retrieved for this critical Review have examined temporal trends in percentages of motile spermatozoa and morphologically normal spermatozoa.

The assessment of seminal volume is based on objective procedures, by measurement from a graduated pipette or by extrapolation from weighing. The reason for the relatively small number of studies interested in this characteristic could be simply that it is often considered as secondary in the semen analysis because it has no crucial role in fertilizing ability. However, temporal modifications in seminal volume might point to possible hormonal causes or exposure factors.

The five repeated cross-sectional studies discussed herein reported stable percentages of motile spermatozoa over time. For the four studies in which seminal volume is reported, this parameter was reported as unchanged in three studies and increasing with time for one. Three studies reported unchanged percentages of morphologically normal sperm, one reported a temporal reduction in this characteristic74, and the others did not consider it (Supplementary Table 2).

Overall, the single-centre retrospective studies that report results for seminal volume, sperm motility and morphology (Supplementary Table 2) show contrasting results. Of the 37 studies that analysed temporal trends in seminal volume, 63% found it unchanged, 32% reported it decreased and two studies reported an increase. Of 43 studies reporting a temporal trend in percentage sperm motility, about half reported a decrease. Only 28 studies reported trends for the percentage of morphologically normal spermatozoa and almost 80% of these reported a temporal decrease.

Some multicentre studies using individual data from men in geographically distinct centres reported trends in the same direction (declining) for quantitative and qualitative semen characteristics20, but intriguing diverging trends were also reported: for example, one French study147 reported a temporal decrease in sperm count, but this trend was concomitant with an increase in sperm motility (Supplementary Table 2).

As shown in numerous studies of quality control of human semen quality assessment, seminal volume is the characteristic with the lowest variability in measurement owing to the quantitative and precise methods used for its assessment17. Thus, the reported trends in seminal volume can be considered to reflect real changes. Overall, most studies report seminal volume to be stable or to decrease over time; the minority that report increases over time thus indirectly suggest geographical contrasts. The possible explanation for this increasing trend is unknown.

As most studies do not report either their experience or their quality control methods for measuring qualitative semen characteristics such as the percentage of motile spermatozoa and the percentage of morphologically normal spermatozoa, the temporal trends that have been reported are less likely to reflect real changes. Changing methods during the study period, for example, for sperm morphology assessment110, might also contribute to the uncertainty of the trends reported. As is the case for sperm counts, several reports have shown that only a minority of centres at a national level follow the WHO recommendations for assessing sperm motility and morphology155,175. Thus, the inter-centre variability induced by highly disparate practices not accounted for in the multiple regression studies casts some doubt about the trends reported for these characteristics. Technical factors might have exacerbated the uncertainty about the trends reported for the percentages of either motile or morphologically normal spermatozoa and, therefore, one cannot conclude that worldwide temporal changes have occurred in either sperm motility or morphology.

Overall, among the numerous studies examining these two characteristics, only the few studies that have optimized their analytical conditions and taken into account the inherent variability in assessment can even suggest — and not confirm — possible spatial and temporal modifications.

Considering the data as a whole

Discussion and analysis of the data from studies with an appropriate design in various world areas indicate the unambiguous existence of geographical contrasts in human semen quality at the continental, national and even regional levels. However, the reasons for differences in human semen quality between regions or cities within a country, between cities in different countries and from one continent to another are currently not well understood.

This Review also highlights the need for circumspection in interpreting the results of retrospective multicentre studies on temporal changes that are based on mean values or estimates at a worldwide level, regardless of the direction of the temporal trend reported. Major fragilities in the data include the neglect of well-established geographical contrasts in sperm production and the uncontrolled heterogeneity of the populations merged.

Theoretically, temporal trends reported in retrospective multicentre studies with a design based on individual data and not means, medians or estimates should be more reliable, assuming that the sampling is well-balanced according to study period and area, and that the area of sampling is appropriately delimited to catch the geographical contrasts. However, the existing studies are limited by the heterogeneity of the semen data, which are collected in many centres with disparate methodologies and often fail to consider major covariates. With the benefit of hindsight, one might think that retrospectively controlling for the many factors that are likely to influence results of multicentre studies is a challenge that is almost impossible to overcome. The critical approach of this Review to the many published single-centre or single-city retrospective studies shows that only a small proportion provided robust findings on the existence of a temporal trend in human semen quality. Most (five out of six) of the few studies meeting the quality criteria set out in this Review have reported negative temporal trends, making the decline in sperm production over time a credible conclusion. However, these centre-specific findings cannot be generalized as a worldwide trend or even a trend across the Western world, even though this is where these few studies were conducted. Likewise, recommending that worldwide studies be performed is unrealistic; thus, our understanding of temporal trends in semen characteristics will have to come from such centre-specific studies. The few carefully designed and challenging cross-sectional studies that have been performed investigating possible temporal trends in semen quality of young military conscripts in specific locations60,73,74,75 so far indicate no or very limited temporal trends60,73,75, or a downward trend for sperm production alone and only for short periods of time74.
However, one cannot necessarily assume that military conscripts who volunteer to provide a semen sample represent an unselected population, even though such groups are often claimed to constitute a sample of the general population: the very low participation rates reported cast doubt on how well the few men who agree to participate actually represent the population. Thus, in such studies, participation rate should be maximized, and any additional information that might relate directly or indirectly to the outcome of interest should be collected from both participants and non-participants to help to characterize potential participation bias and to minimize it with sample-weighting techniques.
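One simple form of sample weighting is post-stratification, sketched below under stated assumptions. The stratifying variable (smoking status), the counts and the sperm concentration values are all hypothetical and serve only to show the mechanism: if a characteristic related to the outcome is known for everyone invited, the participants' stratum means can be re-weighted to the composition of the invited group.

```python
def poststratified_mean(participant_values, invited_counts):
    """Weight each stratum mean among participants by that stratum's
    share of the whole invited group (participants plus refusers)."""
    total_invited = sum(invited_counts.values())
    estimate = 0.0
    for stratum, count in invited_counts.items():
        values = participant_values.get(stratum, [])
        if not values:
            raise ValueError(f"no participants in stratum {stratum!r}")
        estimate += (count / total_invited) * (sum(values) / len(values))
    return estimate

# Hypothetical sperm concentrations (10^6/ml) among the men who participated.
participant_values = {
    "non_smoker": [60.0, 70.0, 80.0],
    "smoker": [40.0],
}
# Hypothetical composition of the whole invited group.
invited_counts = {"non_smoker": 500, "smoker": 500}

all_values = [v for vals in participant_values.values() for v in vals]
naive_mean = sum(all_values) / len(all_values)
weighted_mean = poststratified_mean(participant_values, invited_counts)

print(f"naive participant mean: {naive_mean:.1f}")
print(f"post-stratified mean:   {weighted_mean:.1f}")
```

In this constructed case smokers are under-represented among participants, so the naive mean (62.5) overstates the post-stratified estimate (55.0). The correction works only for characteristics that were actually recorded for non-participants, which is why collecting such information is recommended.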

Finally, the results of a few solid single-centre studies indicating a temporal decline in sperm production87,108,113,138,144, some of which also show an inverse relationship with year of birth (that is, the youngest men produce the fewest sperm)87,108,113, point to a putative causal role for environmental and lifestyle factors. However, owing to the retrospective nature of the datasets, no major evidence of the possible causal factors has yet emerged from these studies.

One hypothesis is that temporal trends in sperm production will parallel temporal trends in their risk factors within each population. These correlations, if they exist, would reflect a link between sperm production and the risk factors at the population level; they would have to be confirmed in individual-level studies and are, by design, limited with respect to causal inference.

The rapidity of the temporal decline in sperm quality observed in some populations makes genetic changes an unlikely cause. The short time periods over which these trends have been observed instead suggest a role for environmental or lifestyle factors, as does the evidence of geographical contrasts in sperm production and quality observed in relatively ethnically homogeneous populations.

Since Carlsen and colleagues’ first multicentre study examining temporal trends in human sperm production6 was published in 1992, a large body of research has examined the relationships between human semen characteristics and environmental or occupational chemicals such as pesticides176 and heavy metals177, with a focus on endocrine disruptors11,178,179, ambient air pollution180, heat181, cell phones182, psychological stress183,184,185, smoking186,187, diet188,189, BMI190, diseases leading to impaired general health191 and sexually transmitted diseases192. These studies have yielded mixed results, ranging from some that suggest major roles to some that suggest none. Among possible emerging risk factors, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection and coronavirus disease 2019 (COVID-19) should be considered as a factor that might affect testis function, including spermatogenesis and maturation193.

Overall, one could hypothesize that the causes of spatial and temporal differences in human semen quality observed in the strongest studies are most likely the result of a combination of various types of environmental factor that exemplify the exposome paradigm194: chemical, physical, biological, socio-economic, socio-cultural, historical and climatic.

Future directions

Further repeated cross-sectional studies would be valuable. The main challenges are to include large samples with harmonized data, reach high participation rates for appropriate representativeness, cover long periods to investigate temporal trends and obtain substantial funding. For example, coupling semen data collection to national biomonitoring surveillance cohorts195,196,197,198, which already exist in numerous countries, would be an ideal approach. On a smaller scale, well-controlled study designs are needed to investigate temporal trends and to minimize the influence of known sources of bias through precise methodological and high-quality statistical planning. For instance, the study centre and/or area must be well-delimited according to possible geographical contrasts, repeated collection of harmonized semen data should be performed to control for within-individual variability, major confounders should be identified and taken into account through restriction or statistical adjustment, and participation rate should be systematically reported with minimal information related to participation motivation or refusal.

Given the complex relationships between humans and their environment, that clear explanations about environment-driven changes in human semen quality are still lacking should not come as a surprise. In contrast to the enormous amount of literature (albeit of varying quality) on temporal and geographical variations of human semen quality, studies that investigate potential links between temporal and geographical trends in semen quality and environmental or lifestyle risk factors are still lacking.

A 2021 opinion paper199 noted that some of the scientific community and general public interpret sperm data over time as a measure of potential male fertility, an indicator of male health and a test of environmental quality, so that a decrease in sperm count over several decades is seen as indicating a decline in male fertility and health and as a sign of a deteriorating environment. This article199, together with the magnitude of temporal declines or geographical contrasts in sperm production reported in the studies scrutinized here, also suggests the possible existence of non-pathological variations in sperm count across populations and time. Of note, among the five single-centre studies identified by this Review as being of solid design that reported a temporal decline in sperm production, four reported a calculated level of sperm concentration at the end of the study periods that is higher than the threshold of sperm concentration at which fertility (likelihood of pregnancy and/or TTP) is affected7,8. Claims of a causal relationship between sperm count and environmental and lifestyle factors require further in-depth investigation: carefully conducted studies must be encouraged to improve our understanding of the relationship between environments and human sperm formation, maturation and fertilizing ability.

Conclusions

The existence of geographical contrasts in human semen quality is unambiguous and is present at various levels: continental, national and, possibly, even regional.

Some evidence from studies meeting a complete set of quality criteria indicates a decline in sperm production over several decades in specific populations. However, these centre-specific findings cannot be generalized to represent a worldwide trend. Despite their attractive design, the existing multicentre studies that rely on compilation of retrospective and aggregated data, such as mean values, have not sufficiently controlled for study heterogeneities, including spatial contrasts or their possible effect-modifier role, and are overall inconclusive.

Although future worldwide studies are, most likely, unrealistic, studies conducted in well-delimited areas, minimizing the well-known biases and combined with the assessment of men’s exposome are recommended to advance our understanding of these interrelated factors.