Introduction

An in-depth appreciation of evolution and the constellation of adaptive forces that have shaped biology is critical to understanding how humans have survived and thrived in different environments1. The human kidneys maintain fluid and electrolyte homeostasis by filtering waste and surplus water from the blood. The structure and function of the human kidneys that we observe today reflect our evolution as a species and are largely the result of legacies of adaptations that have accumulated as our ancestors moved from a marine littoral environment to freshwater to a terrestrial environment. These migrations required the evolution of glomeruli that were capable of supporting a high filtration rate followed by the evolution of tubules capable of reabsorbing the majority of filtrate2,3,4. Selective pressure to maintain a high glomerular filtration rate (GFR) was possibly provided by the need to excrete urea once our ancestors began consuming a meat-rich diet during the Pleistocene epoch5. Since then, other evolutionary pressures have acted to refine human kidney function to its current level.

The nephron is the basic functional unit of the kidney. In humans, nephrogenesis is completed by 36 weeks’ gestation, at which point the nephron numbers for each kidney are established6,7. On average, each fully developed kidney has around one million nephrons, but this number can vary from 0.2 million to 3 million8,9. No nephrogenesis takes place after birth; therefore, the decline in kidney function that occurs as a result of the ageing process or in response to injury leaves those with low nephron endowment at birth (for example, those born prematurely or exposed to nutritional deficiency in utero) vulnerable to kidney disease10,11. When confronted with limited resources in the intrauterine environment, the fetal kidney is vulnerable to compromised development12. By contrast, complex mechanisms have evolved to preferentially protect the fetal brain when challenged similarly13. This hierarchy represents an evolutionary trade-off in resource-limited situations that favours fetal survival, as the effects of compromised kidney development and the consequential reduction in nephron number are generally only observed after reproductive age, when additional pressures such as chronic disorders and environmental insults are present.

Here, we provide insights into the evolutionary genetics of human kidneys that have implications for understanding disease development and progression in children and adults. Our understanding of these processes has been facilitated by the availability of affordable high-throughput genotyping and sequencing technologies, the increasing diversity of populations enrolled in genomic studies, the development of techniques to reliably extract and analyse DNA from ancient genomes and the expansion of statistical toolsets to infer evolutionary pressures acting on the genome. We describe relevant data derived from studies of ancient and archaic genomes and how population migration and genetic admixture have shaped the current landscape of human kidney-associated diseases. We also discuss the impact that infections, environmental toxins, diet and climate may have had on the evolutionary genetics and adaptation of kidneys, and finally, highlight gaps in our knowledge of the evolutionary biology of kidney disease, as well as challenges and opportunities for further research.

Evolutionary forces and processes

Evolution is commonly defined as a change in allele frequencies. It underlies the process by which an organism adapts to its environment to increase the probability of survival and reproductive success. New alleles enter a population through either mutation (that is, a heritable change in the nucleotide sequence of a genome), or migration (that is, from individuals carrying new alleles moving into the population). In addition to mutation and migration, four other forces can drive genetic variation within and between populations: population growth, recombination, random genetic drift and natural selection (Fig. 1). Population growth refers to a change in the number of individuals in the population and can be positive or negative. Recombination is the process by which segments of DNA are broken and recombined; in particular, reciprocal crossing over of genetic material during meiosis can create new combinations of alleles in the offspring. Recombination disrupts genetic linkage and creates new haplotypes. However, neither population growth nor recombination can change the expected frequencies of selectively neutral alleles across generations, and these two processes are therefore not considered further.

Fig. 1: Evolutionary forces and processes.

Common evolutionary forces and processes are represented as acting on alleles within a population. Different alleles at a locus are represented by coloured circles. Exterior rings depict changes in the proportions of the alleles, given the evolutionary processes acting on the original population (middle ring). a | Genetic drift involves a shift in the frequency of alleles as a result of stochastic sampling. b | In natural selection, some alleles affect the success of passing along genetic material to the next generation, causing changes in the distribution of alleles. c | In recombination, different parental alleles combine to form a new allele in the offspring, which contains sections of DNA from each parent. d | Mutations that involve a novel alteration within the DNA have potential to produce a new allele. e | Migration involves the introduction of an allele from another population into a new population. f | Population growth involves the expansion of a population, resulting in a larger population size, but without changes in allele frequencies.

Random genetic drift refers to changes in allele frequencies that occur in finite populations owing to the random transmission of alleles from one generation to the next. Random genetic drift is directionless, meaning that allele frequencies are equally likely to increase or decrease. Natural selection causes allele frequencies to change in a directed manner, such that alleles conferring a fitness advantage increase in frequency. In this context, fitness refers to the likelihood that an allele will be passed on to the next generation, meaning that individuals carrying the allele must survive until reproductive age. The absolute fitness of a genotype is the ratio of the number of individuals with the genotype after selection to the number of individuals with the genotype before selection, whereas the relative fitness of a genotype is the ratio of the average number of surviving offspring with that genotype to the average number of surviving offspring of all genotypes after a single generation.
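The contrast between directionless drift and directed selection can be illustrated with a minimal Wright-Fisher sketch. The model choice, the function names (`allele_freq_after_selection`, `next_generation`, `simulate`) and all fitness values are assumptions made for illustration only; they are not taken from the studies discussed in this Review. The sketch assumes a single biallelic locus, random mating, non-overlapping generations and a constant diploid population size.

```python
import random


def allele_freq_after_selection(p, w_AA=1.0, w_Aa=1.0, w_aa=1.0):
    """Expected frequency of allele A after one round of selection.

    p is the frequency of A before selection; w_* are relative fitnesses
    of the three genotypes (hypothetical values, equal by default).
    """
    q = 1.0 - p
    # Mean relative fitness under Hardy-Weinberg genotype proportions.
    w_bar = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    # Contribution of A-bearing gametes, normalised by mean fitness.
    return (p * p * w_AA + p * q * w_Aa) / w_bar


def next_generation(p, n_individuals, rng, **fitness):
    """Apply selection, then drift, for one generation."""
    p_sel = allele_freq_after_selection(p, **fitness)
    # Drift: draw 2N allele copies at random from the post-selection pool.
    copies = sum(1 for _ in range(2 * n_individuals) if rng.random() < p_sel)
    return copies / (2 * n_individuals)


def simulate(p0, n_individuals, generations, seed=0, **fitness):
    """Return the allele-frequency trajectory over the given generations."""
    rng = random.Random(seed)
    trajectory = [p0]
    for _ in range(generations):
        trajectory.append(next_generation(trajectory[-1], n_individuals, rng, **fitness))
    return trajectory


if __name__ == "__main__":
    # Neutral allele in a small population: frequency wanders without direction.
    print(simulate(0.5, n_individuals=50, generations=20, seed=1)[-1])
    # Advantageous allele (hypothetical fitnesses): frequency tends to rise.
    print(simulate(0.1, n_individuals=5000, generations=200, seed=1,
                   w_AA=1.10, w_Aa=1.05)[-1])
```

With equal fitnesses, the expected frequency after selection equals the starting frequency, so any change in the trajectory is due to sampling alone. With unequal fitnesses, the deterministic step shifts the expectation towards the favoured allele before sampling noise is applied.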

Studies of natural selection often focus on a single locus. With the accumulation of genome sequence data, it is possible to study polygenic adaptation, which focuses on the net effect of small changes in allele frequencies at potentially hundreds or thousands of loci. When studying small changes in allele frequencies distributed across the genome, it is critical to not misinterpret population structure (that is, systematic differences in the frequencies of neutral alleles between subgroups in a population resulting from a restriction of gene flow) as polygenic adaptation14,15. It is also now possible to study polygenic adaptation of archaic alleles in modern humans, although such studies are limited by a lack of data on the genetic diversity of early versus late archaic humans and on the structure of archaic populations that once lived in Asia, Europe and the Middle East.

In contrast to population growth and recombination, natural selection and migration can change the expected frequencies of selected alleles across generations, which means that they can have an effect on survival and reproductive success in subsequent generations. In this Review, we describe how the processes of natural selection and migration have influenced factors that affect kidney health. Of note, the evolutionary context of kidney disease is complex, and much remains to be understood. For example, consider the relationship between natural selection and congenital anomalies of the kidney and urinary tract (CAKUTs), chronic kidney disease (CKD) and kidney failure. Severe CAKUTs usually lead to early mortality, implying that any underlying genetic variants are filtered out of the population by negative selection. On the other hand, most forms of CKD and kidney failure develop after the reproductive years, meaning that their underlying genetic risk variants are not selected against. Evolutionary forces that select for resistance against a phenotype can occur only if the phenotype develops before reproductive age. Therefore, it is likely that loci associated with CKD or kidney failure that are under natural selection have been selected for or against by non-kidney factors (or possibly by effects on the kidney that are beneficial in early life). A good example of this scenario is APOL1, for which selective pressure is exerted by the ability of variants to protect against Trypanosoma brucei infection. Thus, it is important to note at the outset that any kidney disorder (whether primary or secondary) that has an age of onset after the reproductive age is unlikely to be an agent of natural selection. However, as exemplified by APOL1, several loci under natural selection from non-kidney influences often have an effect on the risk of kidney disorders. 
It is important to note that although genetic differences between populations can be driven by evolutionary forces with implications for health outcomes, most between-population differences arise from environmental and social factors such as diet, inequitable access to health care, lack of healthcare infrastructure and social issues such as racism.

Ancient and archaic genomes

The study of ancient and archaic genomes has seen tremendous advances in the past decade. Technological advances coupled with an increase in the number and diversity of archaeological finds of human remains mean that we now know more about the genomes of our ancestors than ever before. Most studies of ancient human specimens have provided only limited insights into the evolutionary genetics of kidney phenotypes. However, the availability of increasing amounts of archaic genome data from other Homo sub-species, in particular Neanderthals (Homo sapiens neanderthalensis), which existed between 400 thousand years ago (Kya) and 40 Kya, and Denisovans (Homo sapiens denisova), who lived between 195 Kya and 52 Kya, has the potential to provide new insights (Fig. 2). Some Neanderthals lived in Europe 300 Kya to 200 Kya, a period characterized by long glacial periods with extreme cold conditions and long winters, during which the main source of food was the hunting of large game. This environment required the inhabitants to adapt to a high-protein, high-fat diet. It has been hypothesized that the large size of the inferior thoracic cavity and the wide pelvis of the Neanderthal were adaptations to provide the space needed for large kidneys that developed following the adoption of a high-protein diet based on meat consumption16,17. Analysis of tooth proteins showed that the rs4236 variant in the MGP gene (which introduces a Thr127Ala mutation into the Matrix Gla protein), which is associated with a reduced risk of nephrolithiasis in individuals of Han Chinese or Japanese ancestry18,19, is also found in Denisovans and Neanderthals. Thus, some genetic variants associated with kidney phenotypes existed in extinct human lineages. It should be noted that greater Neanderthal and Denisovan introgression seems to have occurred in European and Asian ancestry populations than in African populations20 (Fig. 3).
Further studies of the introgressed Neanderthal segments in modern humans indicate that these have a functional impact as indicated by evidence of cis-regulatory effects, and effects of the segments on gene expression changes, the regulation of steady-state gene expression and transcriptional responsiveness to immune (in particular virus) challenges, as well as phenotypic effects on traits, including skin tone, hair colour, height, sleep and mood21,22,23,24,25. Ongoing investigations into the functional consequences of introgressed variants may identify effects on aspects of kidney health.

Fig. 2: Relationships between anatomically modern humans, Neanderthals and Denisovans.

Note that there are probably multiple admixture events from distinct Denisovan populations to various Southeast Asian island populations167.

Fig. 3: Archaic humans and anatomically modern humans.

Fossil archaic human specimens that have yielded sequence data have come from archaeological sites ranging across Asia and Europe168,169. Interbreeding between Neanderthals (red dots on the figure) and anatomically modern humans (AMHs) occurred on multiple occasions, based on observed patterns of divergence and heterozygosity and modelling of gene flow170. The percentage of Neanderthal DNA in AMHs is low, at around 1–4%, in Asians, Europeans, Native Americans, North Africans and Oceanians171. Neanderthal DNA is present in sub-Saharan Africans in even lower amounts, as a result of the gene flow of Neanderthal lineages into ancestral Europeans after the split of Europeans and East Asians, their migration to Africa and subsequent gene flow into Africans20. Denisovans (blue dots) have been identified in archaeological sites in Siberia172 and the Tibetan Plateau169. The Siberian site also has Neanderthal specimens, and a first-generation hybrid has been identified, with a Neanderthal mother and a Denisovan father173. The percentage of Denisovan DNA in AMHs is up to 4–6% in Melanesians and negligible in Africans and Western Eurasians172. Neanderthal DNA sequences in AMHs are dispersed throughout approximately 20% of the human genome, with evidence that some Neanderthal sequences have been positively selected and that some regions of the genome are depleted of Neanderthal sequences, consistent with the notion of purifying (negative) selection174. In particular, two derived mutations in APOL1 confer protection against Trypanosoma brucei, the cause of sleeping sickness, at the cost of an increased risk of kidney disease39. The APOL1 haplotype present in the human reference genome generated by the Human Genome Project175 is shared with both Neanderthals and Denisovans. It may have been reintroduced into the AMH gene pool via gene flow from archaic humans and might have conferred a selective advantage176.
An archaic SLC16A11 haplotype that affects hepatic lipid metabolism may have introgressed from Neanderthals into ancestors of Amerindians and conferred an advantage in the context of diet29. Admixture with Neanderthals also introduced into European genomes multiple variants that regulate immune-responsive gene expression, specifically to viral challenges22. It is conceivable, although unproven, that the kidneys might be affected by these variants.

One example of an archaic genome variant that is indirectly linked to kidney disease is a five-SNP haplotype in SLC16A11, a locus associated with an increased risk of type 2 diabetes mellitus (T2DM) that was first described in Mexican and other Latin American populations26. This diabetes risk haplotype has a frequency of up to 50% in Native Americans and about 10% in East Asians, but it is rare to absent in other populations, including in Africans. Interestingly, evidence indicates that the risk haplotype introgressed into modern human populations outside Africa within the past 250,000 years through admixture with Neanderthals or close Neanderthal relatives26. Hence, one could speculate that introgression potentially occurred in North or East Asia, after out-of-Africa migration. Notably, studies of SLC16A11 gene expression in the HeLa cell line (which does not express SLC16A11) indicate a role for this variant in lipid metabolism. Expression of SLC16A11 led to several changes in lipid metabolites, including an increase in intracellular levels of triacylglycerol, a lipid species that is associated with an increased risk of T2DM and insulin resistance26. T2DM is an important risk factor for CKD and thus we can posit that the SLC16A11 haplotype acquired via archaic admixture has an indirect effect on kidney phenotypes via the development of T2DM (Fig. 4).

Fig. 4: Examples of loci under selection that can influence phenotypes in which kidney disease is a feature or complication.

For example, HBB is a locus under selection owing to falciparum malaria, but is also the locus for sickle cell anaemia, which can be characterized by kidney disease (sickle cell nephropathy).

Other examples of archaic genetic variants that exert indirect effects on diabetic kidney disease (DKD) are those in the genes that encode fatty acid desaturases (FADs). These proteins are important for converting short-chain polyunsaturated fatty acids (SC-PUFAs) to long-chain polyunsaturated fatty acids (LC-PUFAs). The genes that encode FADs are associated with multiple metabolic traits, including T2DM27,28,29. Notably, studies of FADS1 in archaic human genomes showed that the Neanderthal and Denisovan sequences clustered with the ancestral haplotype found in Eurasians and not with a derived haplotype found in high frequencies in Africans. The derived haplotype originally evolved 500–100 Kya on the lineage leading to modern humans after the split from Neanderthals, and present-day geographic distributions of both haplotypes suggest that both haplotypes were present in Africa at the time of the Out-of-Africa migrations30. Unusually, the FAD genes show signatures of positive selection (that is, high frequencies of derived alleles and extended haplotype homozygosity) in multiple populations, including populations of African, South Asian, European and Inuit ancestry29,30,31,32,33,34. Of note, different alleles seem to be under selection pressure in different modern populations, with differences in the direction of the effect of the selected alleles. For example, the selected alleles in the Greenland Inuits are associated with increased concentrations of SC-PUFAs (including linoleic acid) and decreased concentrations of LC-PUFAs, in contrast to non-Inuit European population groups, in which the selected alleles are associated with a decrease in serum levels of the SC-PUFA linoleic acid and an increase in arachidonic and eicosapentaenoic acids29,31. These effects reflect the adaptation of each population to their historical diets.
The Inuit diet is characterized by very high levels of LC-PUFAs from fish and marine mammals but a low intake of some SC-PUFAs, whereas the variants present in Europeans likely arose because of a change in the dietary composition of fatty acids following the transition to agriculture. Thus, dietary adaptations in different populations have placed the FAD genes under strong selection pressure. How these variants contribute to metabolic disorders, such as T2DM, which is a major risk factor for CKD, remains to be explored.

APOL1 evolution

The geographic migration of peoples and genetic admixture with other populations have played major roles in the evolutionary history of humanity. The impact of these forces on kidney disease is most clearly observed in terms of the effects of APOL1 variants on the risk of various kidney diseases35,36 in people of African ancestry37,38,39. At the global level, the APOL1 recessive risk haplotype (two APOL1 risk alleles) is found in appreciable to high frequencies in West Africans and populations of admixed West African ancestry, including Hispanic Americans, Afro-Caribbeans (for example, Jamaican, Barbadian, Grenadian, Trinidadian, Panamanian, Honduran, Haitian) and Brazilians from Salvador, but is absent from populations of European and Asian ancestry40,41,42. For individuals with two APOL1 risk alleles, the lifetime attributable risk of idiopathic focal segmental glomerulosclerosis (FSGS) is ~4% and of HIV-associated nephropathy is ~50%, in contrast to 0.2% and 2.5%, respectively, for individuals without the risk alleles40. Although natural selection at the APOL1 locus has been attributed to protection against sleeping sickness caused by infection with T. brucei subspecies43, the frequency distributions of and levels of protection afforded by APOL1 variants do not match the present-day geographic distributions of known trypanosomal subspecies; moreover, some evidence suggests that the protection provided by the APOL1 variants may extend to other infectious agents such as Leishmania parasites44. The present-day global distribution of APOL1 variants is a result of the forced migration of Africans during the trans-Atlantic slave trade and their admixture with other populations42. As a consequence, alleles that arose through natural selection in Africa currently represent the most important genetic risk factor for kidney failure in North America.

The APOL1 gene is one of six genes in the APOL family45. Of the six APOL proteins, APOL1 is the only one that is secreted into the bloodstream, where it circulates as a component of lipoproteins. Expression of APOL1 mRNA is high in placenta, liver and lung, and low in heart and kidney46. Circulating APOL1 exists as a complex with HDL — specifically the densest subfractions HDL3c and HDL3b — and APOA-I, the major apoprotein of HDL47,48. In humans, APOL1 is localized to the cytosol, where it acts to facilitate cell death through autophagy49. In the kidneys of mice, podocytes show a high level of autophagy under basal conditions; glomerular injury triggers autophagy to prevent glomerular disease whereas inhibition of autophagy leads to late-onset glomerulosclerosis50. Furthermore, podocyte-specific expression of the APOL1 risk variant blocks autophagic flux and reduces endosomal trafficking to induce kidney disease51.

Of note, APOL1 risk variants are not associated with an increased risk of DKD. In liver, high levels of glucose induce expression of the insulin-sensitive transcription factor USF1 and promote USF1-mediated transactivation of genes involved in glucose and lipid metabolism, such as LIPC, which encodes hepatic lipase52,53,54. LIPC is involved in the delipidation of HDL2 to HDL3 (refs55,56). Although HDL levels are reduced in individuals with T2DM compared with levels in non-diabetic individuals, this reduction reflects reductions in levels of HDL2; by contrast, HDL3b and HDL3c levels are elevated57,58. USF1 also transactivates HNF4A53, which in turn transactivates APOA1, APOL1 and HPR59. Given the ability of APOL1 risk variants to reduce autophagic flux51, the absence of an association between APOL1 variants and DKD may suggest a protective effect of hyperglycaemia on autophagic flux.

Genetic factors and CKD

Mendelian kidney diseases

The explosion in high-throughput next-generation sequencing technologies and improvements in analytical pipelines over the past decade have led to the discovery of multiple Mendelian causes of CKD60,61,62,63,64. Broadly speaking, these discoveries can be divided into five categories. First are variants that cause CAKUTs, including renal dysplasia or aplasia, multicystic dysplastic kidneys and obstructive uropathies; most of the genes in this category are important in kidney development62. Second are variants that cause cystic kidney diseases such as autosomal recessive and autosomal dominant polycystic kidney diseases and microcystic kidney disease such as nephronophthisis60,61; this category is enriched for genes associated with cilia, which are evolutionarily conserved organelles found on the surface of most mammalian cells. Third are variants that cause glomerular diseases that are associated with proteinuria; most of the genes in this category localize to the podocyte63,64. Fourth are variants that cause kidney tubule defects61; and fifth are variants that cause metabolic disorders that result in kidney disease such as hyperoxaluria. A catalogue of these genes and the clinical implications of identifying the genetic defects can be found in recent reviews61,65,66. In evolutionary terms, it is important to note that many of the causal mutations are filtered out of the population by natural selection because the associated disorders are not compatible with successful reproduction and transmission of the alleles to the next generation. For those disorders that are compatible with a near-normal lifespan, the underlying mutations may persist subject to genetic drift, admixture and other population genetic processes.

Complex kidney diseases

The study of genetics of complex kidney diseases has benefited immensely from genome-wide association studies (GWAS), designed to test the association between genotype and phenotype. GWAS are powered to find associations for common alleles that tend to exist at similar frequencies across populations. By contrast, admixture mapping is powered to detect alleles that differ in frequency between populations. Differentiated alleles are generally divided into two types: alleles that are evolutionarily old and are common, meaning that they are present in nearly all populations across the globe, but with large differences in allele frequency; and evolutionarily young alleles that have low to rare frequencies and tend to be population specific. If an allele is deleterious and has a large effect size, then the causal allele will tend to be eliminated. However, a rare allele has little impact from a population or public health perspective because the fraction of disease attributable to an allele is a function of the square of the allele’s frequency.
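The dependence of the attributable fraction on allele frequency can be made concrete with a short sketch. It uses Levin's population attributable fraction, a standard epidemiological formula that is not stated in the text above, and it assumes Hardy-Weinberg genotype proportions. Under a recessive model, the at-risk genotype frequency is the square of the risk-allele frequency. The function names and the relative-risk values are hypothetical and chosen only for illustration.

```python
def at_risk_genotype_frequency(q, model="recessive"):
    """Frequency of at-risk genotypes for risk-allele frequency q.

    Assumes Hardy-Weinberg proportions (an assumption of this sketch).
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    if model == "recessive":
        # Only carriers of two risk alleles are at risk.
        return q * q
    if model == "dominant":
        # Carriers of one or two risk alleles are at risk.
        return 1.0 - (1.0 - q) ** 2
    raise ValueError("model must be 'recessive' or 'dominant'")


def attributable_fraction(q, relative_risk, model="recessive"):
    """Levin's population attributable fraction for a risk genotype."""
    exposed = at_risk_genotype_frequency(q, model)
    excess = exposed * (relative_risk - 1.0)
    return excess / (1.0 + excess)


if __name__ == "__main__":
    # Same hypothetical relative risk, different allele frequencies.
    for q in (0.01, 0.1, 0.3):
        print(q, round(attributable_fraction(q, relative_risk=10.0), 4))
```

For a fixed relative risk, the attributable fraction falls steeply as the risk allele becomes rarer, which is the point made above about the limited public health impact of rare alleles.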

Many GWAS have attempted to elucidate the genetic architecture of kidney traits, and over 260 loci have been associated with estimated GFR67. Although actual causal genes have been identified for only a few such loci68, it should be noted that these studies represent a major advance towards the goal of identifying the complete catalogue of genes that influence kidney traits. Of the 276 genes and/or loci that are associated with CKD69 in the National Human Genome Research Institute–European Bioinformatics Institute GWAS Catalogue (version 12-18-2019), only 5 (BSND, C9, CCDC57, PLG and SP3) show evidence of positive selection in simian primates — an infraorder of mammals that includes 5 hominoids (human, chimpanzee, gorilla, orangutan and gibbon), 1 New World monkey (marmoset) and 3 Old World monkeys (macaque, baboon and vervet)70 (Table 1). On the other hand, 84 CKD-associated loci show signatures of selection in one or more of the 26 modern human populations in the 1000 Genomes Project71, and 48 of these 84 loci (57%) show signs of selection in only one population. This observation suggests that most of the selective pressure exerted on CKD-associated loci occurred after the human–chimpanzee divergence72. These findings illustrate that natural selection events can occur over a variety of timescales and demonstrate how our increasingly rich knowledge of genetic associations can facilitate the identification of selected loci that confer disease risk.

Table 1 CKD-associated loci under positive selection in primates

Of the 84 CKD-associated loci that show signatures of selection in modern human populations, 66 (79%) are associated with either positive selection (that is, an increase in new advantageous alleles in the population) or purifying selection (that is, the selective removal of alleles that are deleterious; also known as negative selection). However, about one-fifth (21%) are associated with more than one form of selection or have more than one region in the gene showing signatures of selection (Supplementary Table 1). These observations suggest that CKD-associated loci in modern human populations have been the target of a wide variety of evolutionary forces. For example, we have found that among CKD-associated loci that are under selection, there is statistical over-representation of members of the class C/3 (metabotropic glutamate/pheromone receptors) reactome pathway (C.N.R., unpublished observations). This pathway is mainly involved in signal transduction by G protein-coupled receptors (GPCRs). GPCRs are involved in nearly all physiological processes, including the regulation of water and electrolyte transport in kidney tubules, the maintenance of acid–base balance, and the regulation of kidney blood flow and filtration73,74. Several kidney disorders are associated with dysregulated GPCR signalling, including kidney fibrosis and CKD. The class C/3 GPCRs include the metabotropic glutamate receptors and taste receptors75. Although these genes show evidence of natural selection and past demographic events (such as bottlenecks and population expansions), they also have functions in several systems beyond their involvement in taste, thereby making it difficult to speculate about the factors responsible for the selective pressures76.
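Statistical over-representation of a pathway among selected loci is commonly assessed with a one-sided hypergeometric test. The sketch below shows one way such a test could be computed. It is not the method used for the unpublished observation cited above, which is not described in the text. The background of 276 loci and the 84 selected loci come from the text, whereas the pathway counts are hypothetical placeholders and the function name is illustrative.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_in_population,
                           selected, pathway_in_selected):
    """One-sided hypergeometric P value: P(X >= pathway_in_selected).

    population            total number of background loci (N)
    pathway_in_population background loci annotated to the pathway (K)
    selected              loci showing signatures of selection (n)
    pathway_in_selected   selected loci annotated to the pathway (k)
    """
    upper = min(pathway_in_population, selected)
    total = comb(population, selected)
    # Sum the probability mass for k, k+1, ..., min(K, n).
    tail = sum(
        comb(pathway_in_population, i)
        * comb(population - pathway_in_population, selected - i)
        for i in range(pathway_in_selected, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # 276 background loci and 84 selected loci are taken from the text;
    # the pathway counts (6 and 5) are hypothetical.
    print(hypergeom_enrichment_p(276, 6, 84, 5))
```

In practice, the choice of background set strongly affects the result, and P values from many pathways would need correction for multiple testing.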

Gene expression studies have shown differences in the transcriptional profiles of specific nephron compartments. For instance, tubule-specific genes are enriched for genes associated with the regulation of serum metabolites, whereas glomerulus-specific genes are enriched for CKD-GWAS significant hit variants and variants associated with blood pressure68. Gene network analyses of tubules have also provided insights into the mechanisms by which genetically regulated functional pathways can drive disease, for example, implicating a role for the endo-lysosomal pathway in the development of CKD68 and T cell and collagen pathways in tubulointerstitial fibrosis77. As of yet, no evidence exists, either from studies of individual loci or from studies of many loci in aggregate, for a role of natural selection in adaptive changes in the expression of genes in the kidney.

Immune factors

The human immune system is responsible for distinguishing self from non-self and is a critical defence against disease-causing microorganisms. The adaptive immune system is continuously challenged and has been shaped by exposure to infectious agents78 and introgression from archaic humans21. Therefore, it is not surprising that immune genes have been a target of evolutionary selection. Nowhere is this more apparent than in genes in the MHC region, particularly the HLA genes. Kidney diseases that have associations with immune genes generally fall into three broad categories. First, those associated with autoimmunity, in which an immune response is mounted against kidney-specific tissue or tissue components (for example, Goodpasture disease and membranous nephropathy) or affects multiple organ systems (for example, systemic lupus erythematosus). Second, those kidney diseases for which no evidence of autoimmunity currently exists but that have recognizable immune components in their aetiopathogenesis, including infection-related glomerulonephritis, sepsis-induced acute kidney injury, minimal change disease, primary FSGS and DKD. Third, drug-induced kidney diseases associated with commonly prescribed drugs, including abacavir, allopurinol, sulfonamides, penicillins, penicillamine and NSAIDs79, which can induce inflammation in various kidney tissue components, including the glomerulus, proximal tubules and the interstitium. For example, drugs such as rifampicin and NSAIDs can induce acute interstitial nephritis, whereas lithium and calcineurin inhibitors can cause chronic interstitial nephritis. Carriers of the HLA-B*57:01 allele are at a high risk of hypersensitivity reactions to treatment with the antiretroviral agent abacavir, which often damages the kidney80,81.

Strong purifying selection has been observed for variants of some immunity genes including those in the Toll-like receptor (TLR) pathway (for example, TLR3, MYD88 and IRAK4) owing to its essential role in protecting the host against life-threatening infections82. On the other hand, positive selection has been observed in nearly 200 other genes that mediate differences in disease susceptibility or outcome. Well-known examples include variants in HBB, G6PD and DARC that confer protection against malaria associated with Plasmodium falciparum and/or Plasmodium vivax and, as mentioned earlier, variants in APOL1 that confer resistance to T. brucei. Pathogen-driven selection at these immune loci has probable implications for kidney disease. Studies over the past decade indicate that humans have acquired genetic diversity from archaic humans at several immune genes, including HLA genes (such as the HLA-B*73 allele), TLR1 and the oligoadenylate synthetase (OAS) cluster of genes83,84. A 2016 study showed that regulatory variants that affect steady-state gene expression and transcriptional responsiveness to immune challenges were preferentially introduced into European genomes via admixture with Neanderthals and that some of these variants may have conferred a selective advantage to modern European populations22. Therefore, pathogen-driven selection at immune genes and introgression of functional variants from Neanderthals into modern humans likely have an evolutionary role in kidney disorders through either direct or indirect mechanisms.

Exposure to environmental factors

Over the course of human history, our species has been challenged by marked shifts in environmental exposures, and adaptation to those challenges has been key to our success in populating the diverse ecosystems of the world. In the context of kidney health, environmental factors that have been particularly challenging include infectious agents, toxins, dietary components and climate. Each of these factors is considered below.

Infectious agents

The kidneys are affected in many diseases caused by exposure to bacterial, viral, fungal and parasitic agents85. In some infections, kidney involvement is minor; however, kidney failure can occur in some instances. Bacterial infections can lead to septicaemia or toxic shock, with symptoms including inflammation due to systemic infection or the release of toxins, ischaemia or hypoperfusion. Infections caused by Staphylococcus aureus and Leptospira interrogans, and to a lesser extent Mycobacterium tuberculosis, Legionella and Rickettsia, can localize to the kidneys. Viruses such as arenaviruses, flaviviruses and hantaviruses are associated with haemorrhagic fever and have variable but generally low kidney involvement, typically presenting as vasomotor nephropathy. However, hepatitis B virus, hepatitis C virus, and HIV are clearly associated with kidney disease, which can be severe. For example, HIV-associated nephropathy is characterized by both glomerular and tubular damage and classically presents as collapsing FSGS. In immunocompromised individuals, cytomegalovirus and polyomaviruses are also associated with nephropathy.

Infections with coronaviruses have also been reported to involve the kidneys. Various case series of patients infected with the novel SARS-CoV-2 virus — the cause of the COVID-19 pandemic — have demonstrated kidney involvement manifesting as proteinuria, haematuria and acute kidney injury (AKI)86,87. SARS-CoV-2 preferentially infects the respiratory tract but some evidence suggests that it might infect the kidney, along with a broad array of other organs and tissues88. Viral RNA and protein have been detected in all compartments of the kidney, with preferential localization to glomerular cells. The associated histological features include diffuse proximal tubular injury and collapsing FSGS89,90. Although the development of AKI in patients with COVID-19 is associated with poor prognosis and high mortality87,91, SARS-CoV-2 is unlikely to have a major role as an agent of natural selection as the resulting disease — COVID-19 — disproportionately affects the elderly (that is, individuals beyond reproductive age).

Among parasitic diseases, malaria has known associations with glomerular disease, whereas other diseases such as schistosomiasis, filariasis and leishmaniasis (caused by Schistosoma, Filarioidea and Leishmania parasites, respectively) can lead to nephritis or nephrosis. Fungal infections typically affect the kidneys only in immunocompromised individuals. Amyloid deposition in the kidney can occur in the context of primary amyloidosis or in chronic conditions such as rheumatoid arthritis, ankylosing spondylitis, inflammatory bowel disease and chronic infection. The immune response to such insults, including the release of inflammatory mediators and the trapping or deposition of immune complexes, involves a number of genes in which mutations may modify the severity of kidney damage.

One example of an infectious agent with evolutionary implications for the kidney is the protozoan parasite Plasmodium falciparum, which is responsible for most malaria deaths worldwide. The distribution of the sickle cell mutation (rs334, c.20A>T) in the HBB subunit of haemoglobin, which arose ~7,300 years ago, is the result of balancing selection between protection against P. falciparum infection and mortality from sickle cell disease92. Consistent with the malaria hypothesis, the geographical distribution of the sickle cell mutation includes the Arabian Peninsula, parts of the Mediterranean (including Greece and southern Turkey) and parts of India93. Sickle cell nephropathy is a well-known complication of sickle cell anaemia (SCA) and accounts for the 5–18% prevalence of kidney failure observed in patients with SCA94. As would be expected, SCA and sickle cell nephropathy are common in West and Central Africa where the selection pressure from P. falciparum malaria is highest. However, both conditions are also observed in North America, Brazil, the Caribbean and other regions where Africans have migrated through voluntary or involuntary means. The sickle cell mutation has been associated with increased albumin-to-creatinine ratio in self-identified Hispanics and Latinx in the USA95, and with eGFR in non-Hispanic Black individuals in the Million Veteran Program96. Thus, the currently observed distribution of sickle cell nephropathy is a result of geographic migration and admixture following natural selection of a malaria-associated allele.
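The balancing selection described above can be illustrated with the classic heterozygote-advantage model from population genetics. The following Python sketch is a minimal illustration under stated assumptions: the function names and the selection coefficients (s_normal and t_sickle) are hypothetical and are not estimates from the studies cited here. It assumes random mating, an infinitely large population and constant malaria pressure.

```python
def equilibrium_frequency(s_normal: float, t_sickle: float) -> float:
    """Equilibrium frequency of the sickle allele under heterozygote advantage.

    Assumed relative fitnesses (illustrative, not values from the text):
      AA = 1 - s_normal  (no sickle allele; more vulnerable to malaria)
      AS = 1             (carrier; protected against severe malaria)
      SS = 1 - t_sickle  (sickle cell disease)
    """
    if s_normal <= 0 or t_sickle <= 0:
        raise ValueError("both selection coefficients must be positive")
    return s_normal / (s_normal + t_sickle)


def next_generation(q: float, s_normal: float, t_sickle: float) -> float:
    """Sickle allele frequency after one generation of selection."""
    p = 1.0 - q
    w_aa, w_as, w_ss = 1.0 - s_normal, 1.0, 1.0 - t_sickle
    # Mean fitness of the population under Hardy-Weinberg genotype proportions.
    mean_w = p * p * w_aa + 2 * p * q * w_as + q * q * w_ss
    # Sickle alleles come from half of the AS carriers and from all SS survivors.
    return (p * q * w_as + q * q * w_ss) / mean_w


def simulate(q0: float, s_normal: float, t_sickle: float, generations: int) -> float:
    """Iterate the one-generation recursion starting from frequency q0."""
    q = q0
    for _ in range(generations):
        q = next_generation(q, s_normal, t_sickle)
    return q


if __name__ == "__main__":
    s, t = 0.1, 0.8  # hypothetical selection coefficients
    print("predicted equilibrium:", equilibrium_frequency(s, t))
    print("simulated from a rare allele:", simulate(0.001, s, t, 500))
```

With these illustrative coefficients the allele settles at s/(s + t) = 1/9, or roughly 11%, whether it starts rare or common. This shows how an allele that is harmful in homozygotes can persist at a stable intermediate frequency when carriers are protected.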

A 32-base pair deletion in the CC-chemokine receptor 5 gene (CCR5Δ32) is another example of evolutionary pressure on a locus that has consequences for the kidney. CCR5 is expressed in immune cells (including T cells, macrophages, dendritic cells and eosinophils) and is required, along with CXCR4 and CD4, for HIV-1 to gain entry into cells. Individuals who are homozygous for the CCR5Δ32 mutation are resistant to HIV-1 infection97. Interestingly, this mutation is present in up to 10% of individuals of European ancestry but is virtually absent in populations of African, East Asian and indigenous American ancestry98,99. This difference probably accounts in part for the disparities in HIV infection and, by extension, HIV-related comorbidities such as HIV-associated nephropathy in different populations (although it should be noted that most of these differences are due to public health measures and access to health care). The mutation has been dated to at least 1,500–2,000 years ago and, given its geographic distribution, must have arisen after the out-of-Africa migration99,100,101,102. As the mutation arose long before the emergence of HIV, the selection pressure that allowed it to spread to such a high frequency within an evolutionarily short time frame must have been an agent other than HIV. The selective agent may be Yersinia pestis (the bacterium responsible for the bubonic plague), as the CCR5Δ32 mutation also protects against Y. pestis infection and would have conferred survival benefits during the bubonic plague epidemics of the Middle Ages. It has been suggested that other infectious agents, including the variola virus, which causes smallpox, and flaviviruses, may be better candidates than Y. pestis to explain global frequencies of CCR5Δ32, as the geographical distribution and clinical effects of these agents are a better fit with the current geographical distribution of the CCR5Δ32 mutation103.

Environmental toxins

Owing to their role in blood filtration, the kidneys are exposed to myriad toxins, most of which are efficiently removed through the urine by glomerular filtration, passive diffusion or active transport processes. However, some toxic substances are difficult for the kidneys to efficiently filter and may accumulate, causing damage. Particularly challenging substances are those to which humans have been exposed only in modern times, such as glyphosate-based herbicides. These herbicides have been associated with AKI following exposure to high doses104, with CKD following occupational exposure among individuals in Sri Lanka105, and with kidney damage in animal studies106,107. Other substances that are toxic to the kidneys are present in a variety of sources in the modern environment. Exposure to lead, cadmium, mercury and uranium can occur in the context of poor infrastructure whereby these substances are incorrectly disposed of; exposure to cadmium, mercury and melamine can occur during industrial practices; and exposure to arsenic, cadmium and aristolochic acid can be traced to food contamination108.

In the context of evolution, the relationship between the kidneys and arsenic may be of most interest. Exposure to high levels of arsenic is associated with a variety of serious effects, including kidney damage and reduced kidney function109,110,111,112,113. High mortality from kidney diseases has been observed in areas with high arsenic levels in drinking water, including Utah and Michigan, USA114,115. Despite the observed toxic effects of arsenic exposure, some indigenous Andean populations live in areas with arsenic-contaminated drinking water but do not experience the expected harmful outcomes. Among these populations, efficient arsenic metabolism has been reported, such that inorganic arsenic is rapidly metabolized to the less toxic and more readily excretable dimethylarsinic acid116,117. For example, protective variants of AS3MT (which encodes the metabolic enzyme arsenite methyltransferase and is associated with efficient arsenic metabolism) are frequent (68%) in the Camarones people of Northern Chile, the region with the highest arsenic levels in the Americas118. High frequencies of protective variants are also found in indigenous Argentinean populations, with allele frequencies inversely associated with distance from arsenic exposure119. A GWAS of mono- and dimethylated arsenic levels in urine samples from an Argentinean Andes population identified AS3MT as the major gene influencing arsenic metabolism in humans116. The high frequency of protective AS3MT variants among individuals with a long ancestral history in this region with arsenic-contaminated drinking water and the evidence of a selective sweep around AS3MT suggest natural selection among these indigenous populations for efficient arsenic metabolism116, perhaps dating from an ancient Andean common ancestor118. This example represents the first known human adaptation to a toxic chemical116. As arsenic is a cause of kidney damage, this finding also represents an example of a positively selected locus that confers protection from nephrotoxicity via reduced susceptibility to the toxic effects of arsenic.

Diet

The need to excrete urea increased once our human ancestors began consuming a meat-rich diet following a transition from a diet of plants and fruits5, which probably placed selective pressure on the kidneys to maintain a high GFR. Interest in the ancestral diets of humans has grown in recent years, based on the premise that chronic disease might in some instances stem from a mismatch between modern dietary trends and that which our ancestors evolved to eat120, which primarily comprised wild animals and uncultivated plants121. However, the concept of a single ancestral diet is flawed, in that much of the success of our species stems from an ability to adapt to a wide range of environments and concomitant dietary diversity122. Nevertheless, the over-riding consensus is that industrialization has facilitated the current high intakes of animal protein, refined grains and processed foods (with associated high sodium intake), whereas fruit and vegetable intake has decreased. These shifts are consequential in terms of the risk of chronic diseases, including kidney disease, particularly as related to maintaining the acid–base balance in the body.

The relationship between dietary acid load and kidney disease risk is supported by epidemiological data123,124,125,126,127,128. Dietary sodium intake predicts acid–base status independently of dietary acid load, with a predicted effect of similar magnitude129. US adults have an average dietary sodium intake of 3,584 mg per day130; the majority of salt intake comes from processed foods, whereas table salt contributes only a minor proportion of daily sodium intake131. The systematic use of salt has been dated to the Neolithic period132, the onset and duration of which varies by geographic region but encompassed the period ~10,000 BCE to 2,000 BCE. As a comparison, the average sodium intake of pre-Neolithic individuals is estimated to have been 400–1,200 mg per day133. A meta-analysis of dietary sodium and CKD risk found an association between the highest versus lowest sodium intake and increased risk of CKD (relative risk 1.09 [95% CI 1.01–1.19]), but with inconsistent results across geographic groups or study type134. The increased contribution of dietary salt resulting from a reliance on processed foods, coupled with the replacement of base-producing fruits and vegetables with acid-producing or neutral animal foods and grains, has resulted in a modern diet that chronically challenges the kidneys, leading to increased risk of functional decline. Although the kidney undergoes physiological changes in response to these challenges, maladaptation does not necessarily lead to mortality before the reproductive age and thus any underlying genetic variants that may influence this process are subject to minimal or no negative selection. The important role of the kidney in regulating body sodium levels is highlighted by the genetics of SLC12A1. This gene encodes Na–K–Cl cotransporter 2 (NKCC2), which is central to the reabsorption of sodium, potassium and chloride from the urine. Rare homozygous variants in SLC12A1 lead to neonatal Bartter syndrome, a disease characterized by an excess loss of salt in the urine, with accompanying polyuria, polydipsia, hypercalciuria and hypokalaemia. Heterozygotes for disease-causing SLC12A1 variants experience reduced blood pressure, presumably through reduced blood volume secondary to increased salt excretion135. Loss-of-function variants in SLC12A1 are under purifying selection136, which is consistent with the importance of renal sodium and potassium reabsorption in maintaining homeostasis, particularly in the African savanna, the home of early humans.

Climate, latitude and temperature

The African environment in which modern humans arose is characterized by a hot and wet climate. With salt scarcity being the norm, the thermoregulatory mechanisms needed for survival included a high capacity to sweat, efficient renal conservation of sodium and rapid vascular reactivity to blood volume changes. Indeed, the evolutionary prediction is that natural selection on the African continent would have favoured salt avidity and effective heat loss137. As humankind spread to other regions with different climates, these mechanisms are thought to have become maladaptive in certain environments. Indeed, evidence suggests that heat adaptation is associated with latitude, temperature and rainfall138. Of note, geographic latitude, ambient temperature, solar radiation and other similar measures are often confounded. In relation to health and disease, these phenomena have often been studied in relation to blood pressure regulation and hypertension. In this context, it is important to note that genes associated with climate adaptation, including ATP1A1, AQP2, CSK, AGTR1, AGT, CYP3A5 and GNB3, are both under natural selection and associated with blood pressure138,139,140,141,142. ATP1A1 and AQP2 (encoding the sodium/potassium-transporting ATPase subunit α1 and aquaporin 2, respectively) are involved in osmoregulation, whereas functional alleles in the other genes affect volume avidity (that is, water retention to maintain cardiovascular volume) and cardiovascular reactivity (that is, changes in heart rate, blood pressure and other cardiovascular measures in response to external stressors), and correlate with geographic latitude, rainfall and temperature. A 2016 study from Chile suggested that hypertension is more strongly associated with geographic latitude than with solar radiation and ambient temperatures143. This finding suggests that genetic variants influencing hypertension risk may be a characteristic of populations in those regions, rather than being due to climate adaptation-related genes or to physiological adaptation.

Insights into the effect of the environment on human kidney function in our evolutionary past might be gleaned from studies of men in Central America working in heavy manual labour occupations in hot environments, in whom the incidence of CKD is high144,145,146. This form of kidney disease has been termed Mesoamerican nephropathy or chronic interstitial nephritis in agricultural communities and is characterized by an absence of traditional risk factors such as hypertension, diabetes and proteinuria147,148,149. Environmental exposures to nephrotoxic pesticides and herbicides (glyphosate, organochlorine, alachlor, atrazine, metolachlor, pendimethalin, paraquat) have been implicated in its aetiology150. In addition, these individuals are commonly exposed to extreme heat stress, with repeated episodes of dehydration151. One longitudinal study showed the rapidity and impact of these environmental conditions, with young men (aged 17–38 years) developing a marked reduction in kidney function, characterized by a 20% increase in serum creatinine level and a 9% reduction in estimated GFR, after only 9 weeks152. In evolutionary terms, it is possible that the human kidney is maladapted to conditions of repeated dehydration and heat stress and that over time, genetic variants that confer protection from such stress will be selected for in the exposed populations. To our knowledge, the role of genetic risk variants is currently unknown.

Insight into renal adaptations to the environment can also be derived from populations that currently live subsistence lifestyles in deserts and similar environments, characterized by heat stress and water scarcity. These populations include the KhoeSan peoples of Southern Africa and Australian Aborigines. Notably, studies of kidney function in Australian Aboriginal populations have demonstrated much higher rates of CKD and kidney failure than in non-Aboriginal Australians153,154. Although risk factors such as cardiovascular disease, metabolic disorders and reduced access to health care contribute to this observed disparity, autopsy studies have demonstrated that kidneys of adult Australian Aborigines have on average 30% fewer glomeruli than those of non-Aboriginal Australians, associated with an observed compensatory increase in glomerular volume153. Analyses of kidney biopsy samples have been unable to identify a single histological lesion that explains these findings, implying that the excessive rates of CKD and kidney failure among Australian Aborigines may be due to low nephron endowment at birth. Small body size and low nephron endowment are a consequence of poor nutritional environmental conditions. However, whether these changes are examples of acclimatization or adaptation in Australian Aborigines is unclear. Current evidence suggests that these changes may reflect acclimatization, given the particularly high rates of kidney disease among Australian Aborigines living in remote or very remote areas155, which likely reflects inadequate access to care, and given that nephron number was positively associated with height156, highlighting the link to nutritional status. However, the genetic influence on nephron number remains unclear and until genetic studies are available, we cannot rule out a possible role for loci under natural selection that may underlie these changes.

Conclusions and opportunities

The evolutionary context of kidney disease is complex and much remains to be understood. Major forms of CAKUT often lead to early mortality, and thus responsible genetic variants would be filtered out of the population by negative selection. On the other hand, most forms of CKD and kidney failure have an age of onset after the reproductive years, meaning that their underlying genetic risk variants are probably not selected against (that is, they are not under purifying selection). Therefore, it is likely that kidney-associated loci that are under natural selection have been selected for or against by non-renal factors, as in the case of APOL1. An unusual feature of CKD and kidney failure is that they are complications, comorbidities, or end points of several other disorders (such as diabetes mellitus and hypertension), which are often more prevalent in the population. Therefore, apart from loci directly associated with CKD and kidney failure, other loci may be indirectly linked through their intermediate phenotypes (Fig. 4). Currently, more data exist on the evolutionary context of the latter set of loci than on directly associated renal loci. Although the available evidence on the role of specific genes in evolutionary nephrology is presented here, it should be noted that the effect on disease risk may involve more than a single mutation or gene. Nonetheless, some selected variants in specific genes have potential clinical application. For example, APOL1 high-risk variants are potential markers not only for FSGS but also for other forms of kidney disease such as sickle cell nephropathy and lupus nephritis157, whereas the CCR5Δ32 mutation has been associated with reduced inflammation and lower cardiovascular mortality in a European study158 and a clinical trial of CCR5 inhibitors is currently underway in kidney transplant recipients. It is important for such studies to be as diverse as possible, not just to avoid race-based assumptions about allele frequencies but also to facilitate better generalizability of findings and ensure equity in realizing the benefits of the research159.

A number of gaps exist in our current knowledge of evolutionary genetics and nephrology. The pathogenesis of many apparently non-genetic kidney disorders is not yet fully understood and thus the contribution of any specific risk factor to disease is difficult to estimate at the present time. Our understanding of genetic factors that influence variation of nephron number remains limited. Similarly, the genetics of blood volume is under-studied. Another major challenge in the study of kidney health and disease is the lack of clinical data from large numbers of individuals for variables other than GFR estimated from serum creatinine level. This paucity of data limits the genomic and genetic studies that can be done. Measures or markers of proteinuria or albuminuria and tubulointerstitial fibrosis that can be easily and inexpensively assessed at the population level are needed before large-scale genetic studies can be performed. The availability of tools and methods for single-cell sequencing and single-cell transcriptional profiling presents an opportunity for the monitoring of gene expression at the single-cell level. These technologies will provide greater insights into kidney cell types and cell-type specific gene expression patterns, potentially enabling the mapping of genetic diseases of the kidney to single cell types160,161 and further enriching our knowledge about the evolutionary genetics of kidney disease.