Introduction

Rare genetic neurological disorders (RND; ORPHA:71859) are a heterogeneous group of disorders comprising >1700 distinct genetic disease entities. However, genetic discoveries have not yet translated into dramatic increases in diagnostic yield, and indeed rates of molecular genetic diagnoses have been stuck at about 30–50% across NGS modalities and RND phenotypes [1, 2]. The existence of as yet unknown disease genes, as well as shortcomings of commonly employed NGS technologies and analysis pipelines in detecting certain variant types, is typically cited to explain the low diagnosis rates.

To increase the diagnostic yield in RNDs, one of the four focus disease groups in Solve-RD, we follow two major approaches, which we present and exemplify here: (i) systematic state-of-the-art re-analysis of large cohorts of unsolved whole-exome/genome sequencing (WES/WGS) RND datasets; and (ii) novel-omics approaches. Based on the way Solve-RD systematically organizes researchers’ expertise to channel this approach [3], the European Reference Network for Rare Neurological Diseases (ERN-RND) has established its own Data Interpretation Task Force (DITF) within Solve-RD, which is currently composed of clinical and genetic experts from 29 sites in 15 European countries.

Systematic re-analysis of coding variation

Unsolved WES datasets (fastq) from 2048 families with RNDs were submitted by clinical sites of ERN-RND [4] to the RD-Connect Genome-Phenome Analysis Platform. Genomic data were processed and filtered as detailed in [5]. The Solve-RD SNV/Indel working group reported back 74,456 variants in 2246 individuals, which were ranked according to their likelihood of being causative. A total of 1943 variants in 1155 individuals (average 1.68 variants per individual) were classified as rank 1 (genotype matches OMIM and the variant is (likely) pathogenic according to ACMG).
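The rank 1 criterion and the reported average can be illustrated with a minimal sketch. The `Candidate` record, its field names and the `is_rank1` helper are hypothetical and are not part of the Solve-RD pipeline; the sketch also assumes that "genotype matches OMIM" can be reduced to a single yes/no flag per variant.

```python
from dataclasses import dataclass

# Hypothetical record for one reported variant; field names are illustrative only.
@dataclass(frozen=True)
class Candidate:
    individual_id: str
    genotype_matches_omim: bool  # assumption: reduced to a yes/no flag
    acmg_class: str              # e.g. "pathogenic", "likely pathogenic", "VUS"

RANK1_ACMG = {"pathogenic", "likely pathogenic"}

def is_rank1(c: Candidate) -> bool:
    """Rank 1 as described in the text: genotype matches OMIM AND
    the variant is (likely) pathogenic according to ACMG."""
    return c.genotype_matches_omim and c.acmg_class.lower() in RANK1_ACMG

def variants_per_individual(n_variants: int, n_individuals: int) -> float:
    """Average number of variants per individual."""
    return n_variants / n_individuals

# Figures taken from the text: 1943 rank 1 variants in 1155 individuals.
avg_rank1 = round(variants_per_individual(1943, 1155), 2)
print(avg_rank1)
```

With the figures from the text, the average rounds to the 1.68 variants per individual stated above.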

Based on these results and the work of the RND DITF, 44 cases were solved by this systematic re-analysis approach, which equals 29% of the re-analysed cases for which feedback was available. A first reason why cases could now be solved was that the ClinVar entries of identified variants had been updated between the initial genetic workup and the Solve-RD re-analysis, owing to additional evidence that had become available in the meantime. One example is the re-classification of variants in highly variable genes such as ITPR1 between 2016 and 2020 [6] (Fig. 1A).

Fig. 1: Clinical information and functional variant validation for families 1–3.

A Pedigree and cranial MRI of patient 1 (NM_001168272.1(ITPR1):c.805C>T, p.(Arg269Trp)). Mid-sagittal MRI (T2) shows marked cerebellar atrophy at age 7. B Pedigree and longitudinal MRIs taken from patient 2 (pontocerebellar atrophy; NM_016042.3(EXOSC3):c.395A>C, p.(Asp132Ala)). MRIs demonstrate marked cerebellar atrophy while brainstem volume is not affected. C Pedigree, segregation analysis and functional analysis in family 3. The index case carries two intronic POLR3A variants. Variant c.1048+5G>T is located in intron 7; RT-PCR with primers binding to sequences in exon (forward) and exon 9 (reverse) demonstrates the presence of an aberrant transcript that is absent in controls. Specific amplification of this additional band and sequencing revealed that all 177 bp of intron 7 are included in the transcript. A nonsense codon in intron 7 presumably leads to termination of translation (p.Phe352_Arg353ins(23)Ter). The variant c.1909+22G>A has previously been demonstrated to lead to inclusion of the first 19 nucleotides of intron 14 in the final transcript and consequently to a shift of the reading frame [8].

Second, the use of Human Phenotype Ontology-based phenotypes [7] rather than diagnostic categories, together with the consideration of variant-specific rather than gene-specific phenotypes, enabled detection of functionally relevant variants that had been missed because the initial analysis focused on disease-specific panels. Mis-classification of phenotypes in RNDs is a common problem owing to the considerable overlap between diagnostic categories, especially in phenotypes affecting more than one neurological system. This approach allowed, for example, identification of a causative variant in EXOSC3 (c.395A>C) that is typically associated with a ‘milder’ clinical disease course and lacks the hallmark pontine atrophy characteristic of EXOSC3-associated disease (Fig. 1B).

Analysis of non-coding variation

The relative contribution of non-coding variation to RNDs has not yet been established and will be systematically explored by Solve-RD by combining WGS and RNA Seq. We will evaluate the added value of RNA Seq in early-onset sporadic cases (Trio-WGS) and in multiplex recessive and dominant families.

In the meantime, the exon–intron boundaries commonly covered by WES already allow at least a glimpse into the realm of non-coding variants. Indeed, in an unsolved adult patient with a spastic ataxia phenotype, the systematic Solve-RD re-analysis top-listed a single heterozygous intronic variant in the POLR3A gene (NM_007055.3(POLR3A):c.1909+22G>A, p.Tyr637Cysfs*14). This variant had recently been shown to be a frequent cause of spastic ataxia when in trans with a second loss-of-function POLR3A variant [8]. No second coding POLR3A variant was identified. However, a variant in intron 7 of the POLR3A gene was discovered in the WES data (NM_007055.3(POLR3A):c.1048+5G>T). RT-PCR from whole blood revealed an aberrant transcript that was absent in controls. Specific amplification and sequencing demonstrated the inclusion of all 177 bp of intron 7 in the final mRNA transcript. At the protein level, this change is predicted to insert 23 amino acids coded by intron 7, followed by a stop codon (p.Phe352_Arg353ins(23)Ter) (Fig. 1C).
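The different consequences of the two intronic inclusions follow from simple reading-frame arithmetic: 177 bp is a multiple of three, so the retained intron 7 is read in frame until a stop codon is reached, whereas the 19 nucleotides included by c.1909+22G>A shift the frame. The sketch below illustrates this logic only. The helper names are hypothetical, the sequence is a synthetic placeholder rather than the real POLR3A intron 7 sequence, and the inserted sequence is assumed to start at a codon boundary.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def preserves_frame(inserted_nt: int) -> bool:
    """An insertion keeps the downstream reading frame only if its
    length is a multiple of three."""
    return inserted_nt % 3 == 0

def codons_before_stop(inserted_seq: str):
    """Count the sense codons read from the inserted sequence before the
    first in-frame stop codon; return None if no in-frame stop occurs.
    Assumption: the inserted sequence begins at a codon boundary."""
    seq = inserted_seq.upper()
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

# Synthetic 177-nt placeholder (NOT the real intron 7 sequence):
# 23 sense codons, one stop codon, then filler up to 177 nt.
toy_intron = "GCT" * 23 + "TGA" + "GCT" * 35

print(preserves_frame(177))            # full intron 7 inclusion
print(preserves_frame(19))             # 19-nt inclusion from intron 14
print(codons_before_stop(toy_intron))  # sense codons before the stop
```

In the placeholder sequence the stop codon was deliberately positioned after 23 sense codons to mirror the predicted p.Phe352_Arg353ins(23)Ter; the actual position depends on the real intronic sequence.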

Finding novel variations through novel omics

Scientific rationale drives the application of novel-omics technologies in Solve-RD. From the large variety of omics technologies that will be used by Solve-RD, we here present the example of long-range WGS for ataxias, which has just been initiated. More than 25% of all autosomal-dominant and more than 50% of all autosomal-recessive ataxia patients remain unsolved despite advanced WES analysis [9]. Ataxias are unique insofar as repeat expansions represent the most frequent disease cause: 75% of all known autosomal-dominant ataxia cases and 50% of all known autosomal-recessive ataxia cases are caused by repeat expansions [10]. We thus hypothesize that a substantial number of repeat-expansion disorders remain to be discovered among the large share of still unsolved, WES-negative ataxia cases. Therefore, in Solve-RD we will use long-range WGS in family ‘triplets’ from autosomal-dominant ataxia families, which will be stringently enriched for novel repeat-expansion disorders: only families that are negative not only on WES and frequent SCA repeats but also on short-read WGS, and for which DNA from >2 affected and >2 non-affected family members is available, will be included. In a first round of submission, 20 families with 44 ‘slots’ have been submitted and we are awaiting data in 2021.

Conclusion

This viewpoint presents and exemplifies the approach being taken by Solve-RD to diagnostically solve unsolved RNDs. While re-analysis has so far succeeded in 29% of cases, scientifically rational ‘beyond the exome’ approaches are being implemented to further unravel new RND-causing genes.