Introduction

In order to meet the grand challenge of human genetics, that is, to understand what causes disease and translate this knowledge to improve health outcomes, we need to know the number and population frequency of disease variants, the magnitude of their effects on phenotype and gene–gene and gene–environment interactions. In other words, we first need to understand genetic architecture [1]. For monogenic disorders, the genetic architecture is simplified as disease variants are, by definition, highly penetrant and environmental and gene–gene interactions are minimized. We believe that the very characteristics that make genetic isolates disproportionately valuable for disease mapping (i.e., reduced genetic heterogeneity, shared ancestry, environment, and lifestyle), along with their small size, also make them practical choices for studying genetic architecture.

Stargardt disease 1 (STGD1, [MIM: 248200]) is an autosomal recessive form of inherited juvenile macular degeneration with an estimated prevalence of 1 in 8000–10,000 [2,3,4,5] and usually presents as progressive bilateral central vision loss, often to the point of legal blindness. Although it has been 20 years since the ABCA4 gene was cloned [6], research is more intense than ever because the functional consequence of the >1000 ABCA4 variants associated with STGD1 [7] and a broad spectrum of other retinal dystrophies, including cone-rod dystrophy (arCRD) [8,9,10,11], retinitis pigmentosa (arRP) [8, 12,13,14,15], and age-related macular degeneration (AMD) [15,16,17,18,19], is largely unknown. Patients with ABCA4-related disease have pathogenic variants in the ABCA4 gene which encodes a large, 2273 amino acid photoreceptor-specific transporter involved in the removal of toxic retinoid compounds from photoreceptors [20]. ABCA4-related retinal dystrophies have a major impact on quality of life and clinical interventions including stem cell therapy, gene replacement therapy, and pharmacological agents are currently being developed [21]. In the case of STGD1, diagnostic screening is challenging. Direct sequencing of the entire open reading frame, the most comprehensive approach, reveals biallelic variants affecting function in 65–70% of patients [22]. The aggregate population frequency of ABCA4 alleles affecting function is high (1 in 20) [23]. The vast majority of patients are compound heterozygotes and may carry three or more causative alleles, making phenotypic attribution to specific ABCA4 variants difficult [24, 25]. Clinical heterogeneity both within and between families is thought to depend on the unique combination of ABCA4 variants in the genome of individual patients. 
Traditionally, ABCA4 variants that are frequent in the population were thought to be benign, and variants incapable of precipitating the STGD1 phenotype in the homozygous state were categorized as “mild” (hypomorphic), although recent studies are challenging this classification schema [7]. Phasing of ABCA4 variants, identifying two causal ABCA4 variants in trans in the germline of patients, is essential to confirm the clinical diagnosis, to provide prognostic and recurrence risk assessments and to determine eligibility for therapeutic trials [23, 26]. The utility of using whole exome/genome approaches for screening purposes in clinical settings is currently being explored [27, 28].

We set out to determine the genetic architecture of STGD1 in the well-defined genetic isolate of Newfoundland (NL), Canada. This is a prospective study, recruiting patients and their extended families over a period of 40 years, and includes serial clinical investigations and a comprehensive molecular approach. Although we had no expectations as to the frequency of STGD1 disease in NL, we anticipated reduced allelic heterogeneity, with the presence of one or more recurrent variants affecting function due to genetic drift. We also anticipated insights into genotype–phenotype correlations for specific ABCA4 alleles present in the homozygous state in STGD1 and STGD1-like patients.

Subjects and methods

Study population

According to archeological records, the Canadian province of NL, perched on the extreme northeast coast, has been inhabited for longer than any other place in North America. The first indigenous peoples (Maritime Archaic, Palaeoeskimo, and Beothuk) arrived 18,000 years ago, before the disappearance of the retreating Laurentide ice sheet from the last ice age. Recent mtDNA studies suggest that the relationships between these culturally distinct peoples predate their arrival by land across the continent of North America [29]. More recently, fisherfolk from ports in England and Ireland came to the island to prosecute the summer cod fishery. By 1775, English and Irish outports were established and 12,000 people overwintered on the island. By the late 19th century, hundreds of outports dotted the rugged, extensive 900-mile coastline, each isolated for long periods due to distance and rough seas. The island population expanded naturally (little immigration) so by the 1930s, more than 95% of residents were native born and the descendants of ~20,000 English and Irish settlers [30]. Both geographical and religious isolation (between Roman Catholics and Protestants) resulted in high kinship correlations, equivalent to that of the Bedouin of Kuwait in some communities [31]. The island population is a young genetic isolate (15–20 generations) and stratifies into three clusters, Protestant, Roman Catholic, and a small cluster of North American Indians [32].

We frequently observe unrelated patients on the island with rare monogenic diseases, including autosomal dominant, recessive, and X-linked disorders, that share the same pathogenic variant on an extended disease-associated haplotype (identical-by-descent) due to a common ancestor. Our most striking example is that of a particularly lethal form of sudden cardiac death due to Arrhythmogenic Right Ventricular Cardiomyopathy (ARVC). We have uncovered hundreds of NM_024334.2: TMEM43 c.1073C>T; p.(S358L) carriers across numerous unrelated families, some pedigrees extending back nine generations, that all share the 2 Mb disease-associated haplotype due to a founder effect [33], the variant having been imported from Europe and dating back to the medieval period (400–700 AD) [34].

Patient recruitment

A population-wide recruitment effort on the island of NL (population 501,251) with the goal of identifying all patients with hereditary eye disease, including STGD1 and STGD1-like cases, began in 1978 with the establishment of an Ocular Genetics Clinic at the General Hospital in St. John’s and in select rural hospitals. A complete review of the provincial records of the Canadian National Institute for the Blind (CNIB) in the early 1980s determined the number and geographic distribution of registered CNIB clients with a possible hereditary form of blindness and also identified clients to be recruited to the study [35, 36]. STGD1 and STGD1-like patients from the island underwent comprehensive clinical examinations, including visual acuity, central and full visual fields, color vision testing (Ishihara and Farnsworth D-15), dark adaptation and/or full-field electroretinogram testing, retinal photographs and intravenous fluorescein. Extended family histories, including historical religious affiliations and community of origin, were obtained by family interviews and from the public archives. As STGD1 is a recessive disorder, we also noted consanguinity. All efforts were made to clinically reexamine patients in order to document the progression of retinal disease. This longitudinal study was approved by the institutional review board protocol # 02.116 (Human Research Ethics Board) and the Research Proposals Approval Committee of Eastern Health in St. John’s NL, Canada.

Full ABCA4 gene sequencing

So far as we know, all disease-causing variants for STGD1 reside in the ABCA4 gene. Of the 29 STGD1 or STGD1-like families enrolled in the study, five families were subsequently molecularly diagnosed with achromatopsia (S10, S12, S14, S15, and S20) and excluded from further investigation (Fig. 1). Bidirectional Sanger sequencing of the 50 exons and flanking intronic sequences of the ABCA4 gene was performed on 41 STGD1 cases from 24 families (Fig. 1). Genomic DNA was isolated from peripheral blood using a standard salting out procedure [37]. PCR amplification of the ABCA4 gene (GenBank NM_000350.2) was performed according to Azarian et al. [38], except for exon 29, where primers were redesigned using Primer 3 software [39]. PCR products were size fractionated (1% agarose gels), stained (ethidium bromide or SYBR© Safe, Life Technologies, USA), documented (Kodak GEL 200; Mandel Scientific, Canada) and subjected to analysis according to standard protocols (ABI PRISM 3130XL or 3700 DNA Analyzer; Applied Biosystems, Foster City, CA; Mutation Surveyor Software, SoftGenetics LLC State College, PA 16803). Primer sequences are available upon request. Data from this study have been submitted to the public Leiden Open Variation Database.

Fig. 1: Pedigrees of the 46 cases with Stargardt macular dystrophy or Stargardt-like disease recruited over three decades in Newfoundland.

Filled symbols represent relatives clinically diagnosed with Stargardt or Stargardt-like disease. An asterisk symbol indicates DNA was available for study.

Variant interpretation and STGD1-associated haplotypes

From the sequencing data, ABCA4 variants of interest were identified consisting of both novel and pathogenic or likely pathogenic variants as reported in the primary literature, public databases (e.g., Retinal Information Network, RetNet; Human Gene Mutation Database, HGMD; Leiden Open Variation Database, LOVD; ClinVar), and in silico prediction software (e.g., Alamut). Variant classification based on the American College of Medical Genetics and Genomics criteria is provided (Supplementary Table 1). In each family, we genotyped variants of interest in parents and extended relatives when available to test for co-segregation with retinal disease consistent with a recessive mode of inheritance.

For autosomal recessive disorders and especially for STGD1, phasing is critical as two causal variants on the same chromosome (in cis) predict an unaffected carrier of a complex allele. To determine parent-of-origin for the purpose of phasing, intragenic single nucleotide polymorphisms (SNPs) that were identified in probands were also genotyped in extended family members to construct STGD1-associated haplotypes. To determine if recurrent pathogenic variants were due to founder effect, STGD1 haplotypes were compared between families. Recurrent variants may result from genomic hotspots (multiple events) or from single variants passed down through generations from a common ancestor. Despite the extreme allelic heterogeneity of ABCA4, we anticipated, given the nature of the peopling of NL, that we would identify the same (recurrent) pathogenic variant in two or more unrelated patients due to a shared common ancestor and subsequent amplification in the population due to genetic drift.
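The cis/trans distinction described above can be illustrated with a minimal sketch. The function name `classify_genotype` and the representation of each parental haplotype as a set of variant names are assumptions made for illustration only; this is not the analysis pipeline used in the study, and it presumes that phase has already been established from family genotypes.

```python
def classify_genotype(maternal, paternal):
    """Classify a phased ABCA4 genotype (illustrative sketch only).

    maternal / paternal: sets of pathogenic variant names carried on the
    haplotype inherited from each parent (hypothetical representation).
    """
    if maternal and paternal:
        # At least one pathogenic variant on each chromosome (in trans).
        if maternal == paternal:
            return "homozygous"
        return "compound heterozygous"
    if maternal or paternal:
        # All pathogenic variants sit on one chromosome (in cis):
        # an unaffected carrier, possibly of a complex allele.
        return "carrier"
    return "no pathogenic variant"


complex_allele = {"c.5461-10T>C", "c.5603A>T"}

# Two variants in cis on a single haplotype: carrier of a complex allele.
print(classify_genotype(complex_allele, set()))
# The complex allele in trans with a different pathogenic haplotype.
print(classify_genotype(complex_allele, {"c.2564G>A"}))
```

Counting variants alone would misclassify the first example as affected, which is why the parental origin of each variant has to be resolved before a diagnosis is confirmed.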

Population frequencies of alleles and knowledge transfer to clinic

To provide a targeted gene panel to the local clinic, we designed a custom diagnostic panel including the 16 variants identified in this study using the iPLEX PRO Chemistry and the Agena MassARRAY platform. PCR primers were designed according to the manufacturer protocols (Design Assay Suite and Typer4 software; Agena Bioscience). We determined analytical validity (sensitivity, specificity, accuracy, and reproducibility) with a subset of molecularly characterized (“solved”) patients and partnered with the local genetics clinic to offer genetic counseling to families. Anonymized population control samples from NL and outbred populations (Canada, United Kingdom) were run on the custom panel and minor allele frequencies (MAFs) compared using Fisher’s Exact Test.
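As a minimal sketch of the allele-frequency comparison, the two-sided Fisher's Exact Test on a 2×2 table of allele counts can be written with the Python standard library alone. The function name `fisher_exact_two_sided` and the allele counts in the usage example are hypothetical and are not taken from Table 1; the statistical software actually used in the study is not stated.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: NL controls, non-NL controls.
    Columns: minor-allele count, other-allele count.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x minor alleles in the first row,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed
    # one (the small tolerance guards against floating-point ties).
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts: 12 minor alleles in 1000 NL chromosomes
# versus 2 minor alleles in 1000 non-NL chromosomes.
print(fisher_exact_two_sided(12, 988, 2, 998))
```

An exact test suits this setting because pathogenic alleles are rare, so expected cell counts are often too small for a chi-square approximation.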

Results

The vast majority of STGD1 patients are of English extraction

Of the 29 STGD1 or STGD1-like families enrolled in the study, 41 cases were diagnosed with STGD1 (Fig. 1). Although the current population of NL is nearly equal parts English and Irish, the majority of STGD1 pedigree founders are of English extraction and trace their NL origins to a localized area within Conception Bay, a region of early English settlement in the northeastern part of the island (Supplementary Fig. 1) [40]. One singleton case (S11) originates on the French island of Saint Pierre, located 16 miles off the south coast of NL. Five families (S1, S2, S5, S9, S16) have two or more affected sibships; S9 represents two families connected by marriage and S16 is particularly notable with four consanguinity loops and an affected trio (Fig. 1).

Full gene sequencing and haplotype analysis achieves a high solve rate

Comprehensive molecular characterization of 41 STGD1 cases resulted in 38 solved and three unsolved cases (Families S5, S24), yielding a high case solve rate of 92.7%. Bidirectional Sanger sequencing yielded 16 pathogenic or likely pathogenic variants with an MAF of <0.01 in gnomAD browser (Table 1). We also identified a novel deletion, NM_000350.3: ABCA4 c.67-1delG, in three cases. This variant is located at the 3′ splice acceptor site of intron 1 and in silico analyses predict that it results in the removal of exon 2 from the mature ABCA4 transcript (MaxEnt [41], NNSPLICE [42], and Human Splice Finder [43]). Cascade screening and comprehensive haplotype analyses helped to confirm causality of the 16 ABCA4 alleles. Of the 41 STGD1 cases, 18 cases (43.9%) are homozygotes (Table 1). We found that 38/41 (92.7%) of STGD1 or STGD1-like cases had two pathogenic or likely pathogenic variants identified (Table 2). Most notably, the majority of solved cases (28/38; 73.7%) have at least one of two recurrent ABCA4 alleles (Table 2). We observed phenotypic heterogeneity, both within and between families; however, all diagnoses fell within the spectrum associated with ABCA4.

Table 1 Frequencies of the 16 pathogenic/likely pathogenic ABCA4 (NM_000350.3) variants identified in 24 families (41 cases) in population controls and unsolved retinal disease cases.
Table 2 Disease-causing and likely pathogenic alleles (simple and complex) in ABCA4 (NM_000350.3) and their phenotypic effect in 24 Stargardt families from an isolated population.

Recurrent c.5714 + 5G>A allele is due to a major founder effect

The most frequent ABCA4 allele in NL is c.5714 + 5G>A, which was, surprisingly, detected in 22/38 (57.9%) of solved cases across 13 families and represented the only pathogenic or likely pathogenic variant identified in all coding and exon/intron flanking regions within the gene. The c.5714 + 5G>A splice variant was first described in 1998 and is one of the most common STGD1 alleles currently reported in Northern Europe [8, 44, 45]. Most patients with the c.5714 + 5G>A allele shared the entire disease-associated haplotype (Fig. 2 and Table 2). Comparison of STGD1 haplotypes revealed that patients in 13 families shared a common c.5714 + 5G>A haplotype (Fig. 2). In S9, two families connected by marriage, a c.5714 + 5G>A haplotype was introduced from the left side of the pedigree and a distinct c.5714 + 5G>A complex allele, c.[2588G>C;5714 + 5G>A], was introduced on the right side of the pedigree (Fig. 3a). When compared with the 2019 ProgStar study [46], the c.5714 + 5G>A allele has a much higher frequency in NL cases (0.023 in ProgStar vs 0.707 in NL) and it is significantly increased (p = 0.00001) in the NL population compared with non-NL controls. The occurrence of two haplotypes with the c.5714 + 5G>A splice variant (Fig. 3a: yellow and orange haplotypes) suggests that it was introduced into the population at least twice and likely underwent natural expansion from the Conception Bay area, historically the most densely populated area of the province. We conclude that the c.5714 + 5G>A variant is recurrent in this population due to a major founder effect. In seven homozygous cases, haplotype analysis revealed that the variant was the only pathogenic or likely pathogenic variant on the STGD1-associated haplotype. Furthermore, longitudinal observations show it is associated with a definite STGD1 phenotype with visual loss by middle age (Table 2).

Fig. 2: Examples of STGD1-associated haplotypes identified in Newfoundland.

STGD1 haplotypes are read vertically using intragenic SNPs (5′ to 3′) residing in cis on chromosome 1p22.1 (pathogenic variants bolded). The first two haplotypes represent the recurrent NM_000350.3: ABCA4 c.5714 + 5G>A haplotype (yellow) and the ABCA4 c.[5714 + 5G>A;2588G>C] complex allele (orange), which are both associated with a milder phenotype. The ABCA4 c.2564G>A haplotype (red) is associated with a moderately severe phenotype. The blue haplotype denotes the recurrent ABCA4 c.[5461-10T>C;5603A>T] complex allele associated with the most severe phenotype.

Fig. 3: Progression of retinal phenotype associated with allele-specific haplotypes.

Bars below symbols represent STGD1-associated haplotypes in Fig. 2. a Family S9 shows retinal phenotypes of two different compound heterozygotes. IV-1 has four pathogenic variants arranged into two complex alleles: NM_000350.3: ABCA4 c.[5461-10T>C;5603A>T];[5714 + 5G>A;2588G>C] (orange and blue bars). III-8 has three pathogenic variants arranged into two alleles: ABCA4 c.[5714 + 5G>A;2588G>C];[5714 + 5G>A] (yellow and orange bars). b Family S16 shows severe retinal phenotypes of patients either homozygous (blue bars) or compound heterozygous (blue/red and blue/yellow bars) for the ABCA4 c.[5461-10T>C;5603A>T] complex allele. An asterisk symbol indicates DNA was available for study and symbol “[]” indicates inferred haplotype.

Recurrent c.5461-10T>C allele is also due to a founder effect

The c.5461-10T>C ABCA4 allele was detected in 9/38 (23.7%) of solved cases across three families (S9, S16, S23) and forms a complex allele with c.5603A>T in 9/9 (100%) of STGD1 cases (Table 2 and Fig. 2). First reported in 1999 [10], this intronic splicing variant is the third most frequent ABCA4 allele identified in STGD1 patients of European or African descent, and functional studies reveal it causes skipping of exon 39 (100%) or 39–40, reduction in full-length mRNA and reduced protein levels in homozygotes [44, 45, 47, 48]. The c.5461-10T>C variant in the ABCA4 gene has been reported previously in association with STGD1 disease, both in the homozygous state and in the presence of a second ABCA4 pathogenic variant [45, 49], and has been consistently classified clinically as a severe pathogenic variant. Our nine cases included three homozygous patients, and all nine (100%) shared the STGD1 complex allele c.[5461-10T>C;5603A>T] (Fig. 2). Haplotype analysis revealed that ABCA4 c.5461-10T>C is identical-by-descent and recurrent due to founder effect. Haplotype analysis in S16 (with four consanguinity loops) demonstrated the affected trio to be a case of pseudodominance (Fig. 3b). The affected son (PID V-1) is homozygous for the recurrent ABCA4 c.[5461-10T>C;5603A>T] complex allele, inheriting one copy from his homozygous mother and a second copy from his father. When compared with the 2019 ProgStar study [46], the c.5461-10T>C variant is amplified in NL cases (0.048 in ProgStar vs 0.293 in NL). Although we observe the c.5461-10T>C allele in ethnically matched controls, the frequency is not significantly amplified in NL (p > 0.05; Table 1).

Recurrent c.5603A>T hypomorphic allele

The ABCA4 c.5603A>T; p.(Asn1868Ile) variant was the second most frequent allele observed in this study (Table 1) and there is evidence to support the variant being a hypomorphic allele as it is phenotypically characterized by late onset of symptoms (4th decade) and foveal sparing (85%) only when in trans with a deleterious mutation [7, 50]. Complex alleles involving the ABCA4 c.5603A>T; p.(Asn1868Ile) variant were detected in 13 STGD1 cases, including nine solved cases with c.[5603A>T; 5461-10T>C], two solved cases with c.[5603A>T; 2588G>C], and a third complex allele, c.[5603A>T;2564G>A], identified in two cases in a single unsolved family (Table 2). There is evidence supporting the ABCA4 c.2588G>C; p.(Gly863Ala) variant as being pathogenic, producing less protein and reducing ATP binding and hydrolysis and retinal transfer [51]. In our study, the ABCA4 c.2588G>C; p.(Gly863Ala) variant causes disease only when in cis with ABCA4 c.5714 + 5G>A or c.5603A>T and may not act as a disease-associated hypomorphic variant, but rather as a modifier (Table 2).

Novel pathogenic variant/complex alleles

We identified a novel variant, ABCA4 c.67-1delG, in third cousins once removed in family S2, which seems likely to have been transmitted by one of the pedigree founders four generations back (Fig. 1 and Table 1). The fact that this novel ABCA4 variant was in trans with the recurrent c.5714 + 5G>A allele helped to confirm its pathogenicity (Table 2). As ABCA4 c.67-1delG was also identified in a non-NL control sample it is probably not a private variant; however, we cannot be certain as the identity of the non-NL control is unknown (Table 1). We also identified four complex alleles: c.[5461-10T>C;5603A>T], c.[5714 + 5G>A;2588G>C], c.[2588G>C;5603A>T], and c.[2564G>A;5603A>T] (Table 2).

Natural history of retinal disease attributable to specific STGD1 alleles

Homozygous STGD1 patients are rare [24] due to the extreme allelic heterogeneity in ABCA4 disease, making it difficult to determine accurate genotype–phenotype prognoses when counseling patients and their families. Prospective clinical examinations over several decades in 18/38 (47.4%) of solved cases, which are homozygous, conclusively reveal that ABCA4 c.5461-10T>C variant is the most severe STGD1 allele in this population (Table 2 and Fig. 2). Patients having one or two copies of the ABCA4 c.[5461-10T>C;5603A>T] allele are characterized by the earliest age of onset, diagnosed with STGD1 as young as 8 years of age, and by the second or third decade, all progressed to a severe arCRD or arRP phenotype with atrophic retina and sparse pigment clumping with reduced or extinguished rod and cone function on ERG testing (Figs. 3 and 4). For example, in family S16, the retinal pattern over a 23-year period has progressed to the point where it does not present as STGD1 disease (Fig. 4). In contrast, ABCA4 c.5714 + 5G>A cases experience a mild-moderate vision loss with homozygotes having a less severe and later onset of decrease in acuity than compound heterozygotes (Table 2).

Fig. 4: Progressive course of retinal disease in patients with the NM_000350.3: ABCA4 c.[5461-10T>C;5603A>T] haplotype.

Retinal photos of patient PID IV-2 (Family S16) showing phenotype at: a age 12 and b age 35. The patient presented early with typical STGD1 but by clinical follow-up in their 3rd decade, the retinal disease progressed to the point where the clinical presentation was more typical of cone-rod dystrophy or retinitis pigmentosa.

Founder effects amplify specific alleles but not STGD1 prevalence or carrier frequency

According to the Hardy–Weinberg Law, violating the first assumption of random mating can cause large deviations from the expected frequency of individuals homozygous for an autosomal recessive condition. After complete clinical ascertainment, we identified 41 NL residents with STGD1 or STGD1-like disease (41/501,251; q² = 0.0000818) and calculate the prevalence to be 1 in 12,255, which is lower than the prevalence reported for other populations of 1 in 8000–10,000 [2,3,4,5, 22]. We calculate the expected carrier frequency (2pq) to be 0.178. The actual carrier frequency (summation from Table 1) equals 0.0728, which is also lower than expected. Using a custom diagnostic panel of 16 STGD1 pathogenic or likely pathogenic variants, five variants in the NL controls had a higher MAF compared with non-NL controls (Table 1) and the ABCA4 c.5714 + 5G>A allele is significantly higher (p = 0.00001) compared with other Northern European populations (Table 1). In this genetic isolate, it is likely that the reduced genetic heterogeneity (founder effect) combined with a high degree of kinship correlation and geographical and religious isolation has precipitated the unprecedented rate of homozygous STGD1 patients observed in this study without a corresponding increase in STGD1 prevalence or population carrier frequency (Table 1).
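The Hardy–Weinberg quantities referred to above follow directly from the number of affected individuals and the population size, and the arithmetic can be rechecked with a minimal sketch. The function name `hardy_weinberg` and its return keys are hypothetical; the calculation assumes random mating and that every affected individual carries two disease alleles, assumptions that a consanguineous isolate is expected to violate.

```python
from math import sqrt


def hardy_weinberg(affected, population):
    """Derive Hardy-Weinberg quantities for an autosomal recessive disease.

    Assumes random mating and biallelic disease in all affected individuals.
    """
    q2 = affected / population   # frequency of affected individuals (q squared)
    q = sqrt(q2)                 # aggregate disease-allele frequency
    p = 1.0 - q                  # frequency of all other alleles
    return {
        "q2": q2,
        "q": q,
        "carrier_frequency": 2.0 * p * q,    # expected heterozygotes (2pq)
        "prevalence_one_in": population / affected,
    }


# Case count and island population size as given in the text.
for name, value in hardy_weinberg(41, 501_251).items():
    print(f"{name}: {value:.6g}")
```

The same function can be applied to the observed carrier frequency from a control panel by comparing it with the returned `carrier_frequency` value.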

Discussion

The genetic architecture of STGD1 in Newfoundland, Canada

STGD1 in this young genetic isolate is almost exclusively of English origin and characterized by drastically reduced allelic heterogeneity with the concomitant amplification of specific European alleles due to genetic drift. Sixteen ABCA4 (NM_000350.3) pathogenic variants, including a novel deletion, account for 92.7% of cases, with a disproportionate number of homozygous cases. The common European ABCA4 c.5714 + 5G>A splice variant resides on two distinct disease haplotypes, occurs in 58% (14/24) of families and the frequency is significantly amplified in the NL population. Public archival records, family interviews and comprehensive haplotype analysis reveal that the ABCA4 c.5714 + 5G>A allele was likely introduced by English settlers early in the peopling of the island population. Key genotype-specific insights in homozygous patients include the confirmation of the ABCA4 c.5714 + 5G>A allele as a true pathogenic but mild allele. Despite the presence of recurrent founder variants, the prevalence of STGD1 in NL is similar to that of outbred populations in North America and Europe.

Advantages of a population-based approach in genetic isolates

We believe the high solve rate (92.7%) is due to the triad of population-based clinical ascertainment, comprehensive molecular screening, and the presence of recurrent founder variants. To date, the vast majority of natural history studies on the progression of retinal diseases are retrospective from single centers serving outbred populations, resulting in calls for natural history studies of large cohorts of molecularly proven patients [22]. Our successful approach was serial retinal investigations over four decades on molecularly characterized cases and their extended families who share ancestry and environment, while leveraging key resources including public archives, community-based ophthalmologists and patient advocacy databases (e.g., the CNIB).

The design and execution of successful clinical trials depend on accurate predictions of clinical outcomes for specific alleles. In autosomal dominant disorders, this is relatively straightforward, as each affected person harbors a single disease allele, and the natural history can be gleaned from multiple affected individuals over several generations with the exact same disease allele. In recessive disorders, and particularly for STGD1 in outbred populations, the majority of affected individuals are compound heterozygotes. In the NL population, we have a rare opportunity to track the progression of retinal disease in seven patients homozygous for c.5714 + 5G>A (one of the most common European variants), often reported as a mild allele [52] with mild-to-moderate vision loss and late onset of decrease in acuity, as well as in 15 compound heterozygotes. Full gene sequencing and haplotype analysis confirmed that c.5714 + 5G>A was actually the only pathogenic or likely pathogenic variant on the STGD1-associated haplotype in seven homozygotes.

In this study, we relied on phasing to determine which alleles were in cis to identify the presence and composition of complex alleles, a particular challenge for ABCA4-related retinal diseases. Patients in our study who inherited two copies of ABCA4 c.5461-10T>C, the third most frequent variant associated with STGD1 patients of European or African descent [44, 45, 47], are the most severely affected, presenting with a classic STGD phenotype and rapidly progressing to RP or arCRD with severely reduced or extinguished cone and rod responses on electroretinogram testing as early as 8 years of age.

In genetic isolates, recurrent variants can unmask rare recessive alleles. In this study, the predominance of ABCA4 c.5714 + 5G>A served to unmask ABCA4 alleles in the region of early English settlement. This included a novel ABCA4 deletion pathogenic variant (c.67-1delG) which co-segregated with disease in three cases in a multiplex family and was inherited in trans with the founder ABCA4 c.5714 + 5G>A variant (c.[67-1delG];[5714 + 5G>A]), increasing our confidence that the novel variant affects function. We also uncovered a case of an affected trio with STGD1. Pseudodominant transmission (vertical inheritance of a recessive trait) in STGD1 occurs more often than might be expected, probably because of the relatively high carrier frequency of pathogenic ABCA4 variants in the general population. Huckfeldt et al. [53] recently reported a case of pseudodominant transmission where there were three alleles in the family and Lee et al. [25] reported four alleles in a family from the UK. In our study, the affected trio turned out to be a case where an affected son inherited two copies of the severe c.[5461-10T>C;5603A>T] allele, one from his homozygous mother and the other from his compound heterozygous father. Molecular characterization of this case is not only interesting from a genetic architecture lens, but also clinically significant as it drastically increases the recurrence risk for this family from 25 to 100%, as there are no normal ABCA4 alleles in the parental generation.
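The change in recurrence risk from 25 to 100% can be reproduced by enumerating the four equally likely parental allele combinations. The sketch below is illustrative only: the function name `recurrence_risk` and the allele labels (`"severe"`, `"other"`, `"wt"`) are hypothetical shorthand, and complete penetrance of any biallelic genotype is assumed.

```python
from fractions import Fraction
from itertools import product


def recurrence_risk(mother, father, pathogenic):
    """Fraction of offspring who inherit a pathogenic allele from each parent.

    mother / father: 2-tuples of allele labels, one per chromosome.
    pathogenic: set of labels treated as disease alleles.
    """
    outcomes = list(product(mother, father))  # four equally likely pairings
    affected = sum(
        1 for m, f in outcomes if m in pathogenic and f in pathogenic
    )
    return Fraction(affected, len(outcomes))


disease_alleles = {"severe", "other"}

# Two unaffected carriers: the usual autosomal recessive expectation.
print(recurrence_risk(("severe", "wt"), ("severe", "wt"), disease_alleles))  # 1/4

# Homozygous mother and compound heterozygous father: no normal alleles.
print(recurrence_risk(("severe", "severe"), ("severe", "other"), disease_alleles))  # 1
```

Using exact fractions keeps the Mendelian ratios readable and avoids rounding when the result is passed on to counseling summaries.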

Prevalence of STGD1 in the NL population

At the population level, recurrent variants and increased homozygosity may lead one to assume that the disease prevalence of STGD1 in NL is relatively high. We find empirical evidence to the contrary. In genetically isolated populations, it is more likely that members will marry someone with the same variant and have homozygous children, resulting in a higher proportion of homozygotes, concomitant with a lower number of disease alleles, a feature of recessive conditions in the Hutterites [54]. In the context of STGD1 disease with >1000 variants affecting function worldwide, we identified a mere 16 ABCA4 alleles accounting for all cases in this population, although we do observe a skewed geographical distribution of cases.

Limitations of a population-based approach in a genetic isolate

A major challenge of any STGD1 study is determining the actual age at onset of vision loss, especially for older participants (70–92 years old). As there were no ophthalmologists in rural NL three or more decades ago, age at diagnosis for many of our senior patients was unknown, and a lack of access to health care is still a feature of rural NL. Three cases across two families remain unsolved, representing a particular conundrum for genetic conditions as the lack of family history makes it impossible to determine the mode of inheritance. Retinal disease in these cases could be due to deep intronic or copy number variants in ABCA4 or due to variants in other genes. Although we did not search for deep intronic variants or large deletions, we tracked STGD1-associated haplotypes across multiple generations and established parental contributions, confirming homozygous cases are truly homozygous rather than a point mutation in trans with a deletion. Whole genome sequencing approaches may solve these cases, although first studies suggest that copy number variations (large deletions and insertions) at the ABCA4 locus are rare [23].

With respect to genetic architecture, we limited this study to STGD1 alleles identified in patients. To identify all disease-causing ABCA4 variants in a population would require a random sampling design and a whole exome/genome approach, with the caveat that discerning pathogenic variants that occur in unaffected people would be difficult for novel variants or variants of unknown significance. For example, the dearth of STGD1 and STGD1-like cases on the west coast of the island may mean there is a broader range of STGD1 alleles that have not been unmasked due to paucity of recurrent STGD1 alleles in this localized area. Although there are concerns that disease variants identified in genetic isolates are not relevant to outbred populations, this study reaffirms that the NL population represents a potent amplifier for a subset of Old-World variants. In this way, specific disease variants, expanded due to genetic drift, can be examined with phenotypic depth and breadth in the context of reduced environmental exposures.

Meeting the grand challenge of human genetics—were we successful?

How close did we come to determining the genetic architecture of STGD1 disease in this genetic isolate, and translating knowledge to the clinic? We took STGD1, arguably the most difficult monogenic disease, and over the course of four decades identified all prevalent cases and their extended families. We then used a comprehensive molecular approach and determined the underlying pathogenic variants and their phenotypic impact over the lifespan. Furthermore, we designed and validated a targeted diagnostic panel on a customizable platform and provided genetic counseling to all patients and their families in collaboration with the local clinic, enabling their eligibility for therapeutic trials.

Renewed calls for longitudinal studies emphasize the need for much more effort to be placed on phenotyping patients and family members in our current era of whole genome approaches [54, 55]. Although outside the scope of this project, we now have a unique opportunity in NL to look at the population level for all ABCA4 variants and to examine individual carriers for retinal phenotypes in order to capture age of onset, and mild manifestations of the same variants in the absence of ascertainment bias. In this way, the exact molecular effect of the >1000 ABCA4 variants associated with STGD1 and a broad spectrum of other retinal dystrophies (arCRD, arRP, and AMD) may be better understood. A focus on monogenic diseases in isolated populations with manageable population size, reduced heterogeneity, and local expertise provides unique opportunities for meeting the grand challenge of understanding the causes of genetic disease and translating the information to the clinic.

Web resources

Ensembl Genome Browser, http://www.ensembl.org/index.html

Genome Aggregation Database, https://gnomad.broadinstitute.org

Human Gene Mutation Database (HGMD) http://www.hgmd.cf.ac.uk/ac/

Leiden Open Variation Database (LOVD): ABCA4 Locus database: http://www.lovd.nl/ABCA4 (Individual IDs #00233811 - #00233851)

OMIM, http://www.omim.org/

RetNet – Retinal Information Network, https://sph.uth.edu/retnet/home.htm

UCSC Genome Browser, http://genome.ucsc.edu

ClinVar, https://www.ncbi.nlm.nih.gov/clinvar/