Introduction

Lupins are valuable crops appreciated as a source of protein for food and feed, as well as plants enhancing mobilization of soil phosphorus, improving soil fertility through symbiotic nitrogen fixation, and increasing economic payback for succeeding crops (Lambers et al. 2013). The seed of modern white lupin (Lupinus albus L.) germplasm is characterized by high content of protein, around 38–42% on a dry-weight basis (Papineau and Huyghe 2004); moderate content of oil, around 10–12% (Annicchiarico et al. 2014) with outstanding food quality (Boschin et al. 2007); and low alkaloid content (Lin et al. 2009). Moreover, extracts from different lupin species were revealed to have antimicrobial activity (Confortin et al. 2018; Confortin et al. 2017; Confortin et al. 2019; Erdemoglu et al. 2007; Romeo et al. 2018). During the domestication process, germplasm resources with dwarf architecture determinate growth habit and higher cold tolerance have also been selected (Harzic et al. 1995; Huyghe and Papineau 1990; Julier et al. 1993). Moreover, the assay of 121 entries representing the global white lupin germplasm revealed that potentially high-yielding landraces are available for exploitation in breeding programs (Annicchiarico et al. 2010). However, worldwide attempts of white lupin improvement have been hampered by high susceptibility to anthracnose, caused by the pathogenic fungus, Colletotrichum lupini (Bondar) Nirenberg, Feiler & Hagedorn (Nirenberg et al. 2002). Typical symptoms were observed as early as in 1912 in Brazil, but the underlying fungus was identified three decades later (Weimer 1943). Early screening of the resistance revealed some level of resistance in L. angustifolius and L. luteus germplasm and high susceptibility of all L. albus accessions tested (Weimer 1952). The appearance of the disease on white lupins in France (1982) and Ukraine (1983) has challenged dramatically the European white lupin breeders (Gondran et al. 1996). Soon afterwards, anthracnose appeared worldwide in nearly all regions where white lupins are cultivated, including major producers such as Australia, Poland, and Germany (Frencel 1998; Frencel et al. 1997; Sweetingham et al. 1995; Talhinhas et al. 2016). Anthracnose susceptibility is a more important issue in white lupin than in the narrow-leafed lupin, because the latter has several independent sources of resistance already present in improved germplasm (Boersma et al. 2005; Fischer et al. 2015; Yang et al. 2004; Yang et al. 2008). This disease proved to be a critical obstacle for the agronomic improvement of white lupin, as the only source of resistance, located in a mountainous region of Ethiopia, has been identified hitherto (Phan et al. 2007) in the form of several accessions collected in one district (Adhikari et al. 2009). Recent studies revealed that Ethiopian germplasm landraces carry rare alleles and are relatively uniform genetically (Atnaf et al. 2017; Raman et al. 2014).

The lack of modern breeding tools hampered the rate of white lupin genetic improvement. Breeders have spent more than two decades to harness Ethiopian anthracnose resistance alleles with very limited success rate: only a few cultivars showing an incremental improvement of anthracnose resistance were bred (Adhikari et al. 2009; Adhikari et al. 2013; Jacob et al. 2017; Talhinhas et al. 2016). Molecular genomic resources of white lupin include two mapping populations with associated low-density linkage maps (Croxford et al. 2008; Phan et al. 2007; Vipin et al. 2013) and a draft transcriptome assembly (O'Rourke et al. 2013). Some sequence tagged site (STS) markers linked to low alkaloid content (PauperM1) and anthracnose resistance (WANR1, WANR2 and WANR3) (Lin et al. 2009; Yang et al. 2010) were developed, but the recombination rate between these markers and corresponding trait loci was too high for their implementation in marker-assisted selection. The reference recombinant inbred line (RIL) mapping population developed from the cross Kiev Mutant (Ukraine) × P27174 (Ethiopia) segregates for many agronomic traits, including also the resistance to anthracnose inherited from Ethiopian parents (Phan et al. 2007). Trait loci have been localized on the linkage map (Cowley et al. 2014; Phan et al. 2007; Vipin et al. 2013), but low marker density, namely one marker per 10.8 cM (Phan et al. 2007) and one per 4.6 cM (Vipin et al. 2013), impeded quantitative trait loci (QTL) mapping attempts. Recently, genotyping-by-sequencing (GBS) data were exploited to develop a new high-density consensus linkage map of the species based on new, transcriptome-anchored markers (Książkiewicz et al. 2017). Mapping of white lupin QTLs revealed polygenic control of anthracnose resistance and provided a bunch of allele-sequenced markers tagging these QTLs, opening new possibilities for development of tools for molecular tracking of anthracnose resistance. For traits controlled by various genes, an alternative approach to marker-assisted selection is represented by genomic selection, by which major and minor gene effects are taken into account by a statistical model for breeding value prediction that is constructed from the joint analysis of phenotyping and genotyping data of a germplasm sample that represents well the target genetic base (Heffner et al. 2009; Meuwissen et al. 2001). The first investigations of genomic selection for white lupin revealed high ability to predict grain yield in contrasting European environments (Annicchiarico et al. 2019) and several morphophysiological and agronomic traits in Northern Italy (Annicchiarico et al. 2020).

The present paper aims to exploit the molecular information generated by markers (Książkiewicz et al. 2017) to develop PCR-based array tagging two major white lupin anthracnose resistance QTLs, ultimately facilitating preselection of germplasm carrying candidate Ethiopian alleles of anthracnose resistance genes. As an additional objective, we assessed preliminarily the applicability of a genomic selection approach for prediction of anthracnose resistance based on GBS data of a mapping population.

Materials and methods

Transformation of GBS polymorphisms to PCR-based markers

To select reference transcript sequences for primer design, GBS marker sequences from QTL loci (Książkiewicz et al. 2017) were aligned to assembled transcriptomes of Kiev and P27174 lines as well as to reference white lupin gene index LAGI01 (O'Rourke et al. 2013) by BLAST (Altschul et al. 1990) implemented in the Geneious software (Kearse et al. 2012) under an e-value cut-off of 1e−10. Selected transcripts were then aligned to the genome sequence of the narrow-leafed lupin (Hane et al. 2017) under an e-value cut-off of 1e−15, extracting matching regions with context size of 5000 nt. To find exon/intron boundaries, extracted narrow-leafed lupin genome regions were assembled together with white lupin GBS and transcript sequences into contigs using progressive Mauve algorithm (Darling et al. 2004) assuming genome collinearity. Mauve alignments consisting of corresponding markers, L. albus Kiev, P27174 and LAGI01 transcripts and L. angustifolius scaffolds were screened for the presence of polymorphic loci. The primers flanking these loci were designed using Primer3Plus (Untergasser et al. 2007) and L. albus cDNA sequences as templates. DNA was isolated from L. albus Kiev and P27174 using DNeasy Plant Mini Kit (Qiagen, Hilden, Germany) and ethyl alcohol (96%, < 0.1% methanol, Avantor Performance Materials, Gliwice, Poland). Amplification was performed using GoTaq® Flexi DNA Polymerase (Promega, Madison, USA) and Labcycler Gradient thermal cycler (Sensoquest, Göttingen, Germany). Ninety-six-well PCR plates (4titude, Wotton, Surrey, UK) and standard tips (Neptune Scientific, San Diego, USA) were used. Amplicons were purified directly from the post-reaction mixtures (QIAquick PCR Purification Kit; Qiagen) and sequenced (ABI PRISM 3130 XL Genetic Analyzer; Applied Biosystems, Hitachi) in the Laboratory of Molecular Biology Techniques, Faculty of Biology, Adam Mickiewicz University (Poznan, Poland). Nucleotide substitution polymorphisms were resolved by the cleaved amplified polymorphic sequence (CAPS) (Konieczny and Ausubel 1993) or derived CAPS (dCAPS) (Neff et al. 1998) approaches. Restriction sites and dCAPS primers were identified using dCAPS Finder 2.0 (Neff et al. 2002) and SNP2CAPS (Thiel et al. 2004). Restriction enzymes were supplied by New England Biolabs (Ipswich, USA) and Thermo Fisher Scientific (Waltham, USA), depending on price and availability. Restriction products were separated by agarose gel electrophoresis (Wide Range Agarose, Serva, Heidelberg, Germany) with the agarose concentration (1–3%) adjusted to follow the size of the expected digestion products. Standard Tris-Borate-EDTA buffer was exploited (Serva). Electronic expandable multichannel pipette (Matrix, Thermo Fisher Scientific) was used for transfers of samples between PCR plates and gels. Data on developed molecular markers, including primer sequences, annealing temperature, nucleotide sequence identity, and polymorphic loci are provided in Tables 1 and 2 as well as in the Supplementary Table S1.

Table 1 List of developed markers for antr04_1/antr05_1 QTL, with primer sequences, PCR amplification temperature, validated enzymes, and restriction product sizes
Table 2 List of developed markers for antr04_2/antr05_2 QTL, with primer sequences, PCR amplification temperature, validated enzymes, and restriction product sizes

Linkage mapping

Genetic mapping was performed using the reference Kiev x P27174 F8 recombinant inbred line (RIL) population (n = 196) delivered by the Department of Agriculture and Food Western Australia. This population was derived from a cross between the anthracnose-resistant Ethiopian landrace P27174 and susceptible Ukrainian line Kiev Mutant (Phan et al. 2007). Kiev Mutant-like scores were assigned as b, P27174-like scores as a, and heterozygotes as h.

Chi-square (χ2) values for Mendelian segregation in F8 RILs were estimated using the following expected segregation ratios: 0.4961 (Kiev Mutant), 0.4961 (P27174), and 0.0078 (heterozygote). The calculation of probability was based on χ2 and 2 degrees of freedom. L. albus marker segregation files (Książkiewicz et al. 2017; Phan et al. 2007; Vipin et al. 2013), together with those developed in this study, were imported to JoinMap 4.1 (Stam 1993). Multipoint mapping was performed after grouping under independence LOD of 11.0 (parameters in the Supplementary Table S2). Linkage group optimization was performed according to the procedure described by Książkiewicz et al. 2017.

The Pearson product-moment correlation coefficient between averaged anthracnose disease resistance scores (Książkiewicz et al. 2017) and marker allelic phases for the set of 191 mapping population lines was calculated in Excel. The p value (both one-tailed and two-tailed probability) of a Pearson correlation coefficient was calculated using the p value calculator for correlation coefficients (http://www.danielsoper.com/statcalc).

Anthracnose resistance QTL mapping

Data on anthracnose resistance scores (Książkiewicz et al. 2017) and an updated linkage map from this study were used to re-draw QTL loci. As the study was aiming to develop PCR-based markers for preselection of germplasm carrying published candidate anthracnose resistance loci, we did not perform new phenotyping of RIL population. Thus, we used average values (trait antr_avg) across two years of anthracnose resistance screening in Perth, Australia, which encompassed three independent experiments per year (trait antr04, n = 151; trait antr05, n = 191) (Książkiewicz et al. 2017; Phan et al. 2007). Interval mapping (van Ooijen 1992) was performed using MapQTL 6 (Kyazma, Wageningen, Nederlands) (Supplementary Table S3). The LOD threshold of 3.5 was used for QTL determination. QTLs were located at positions with the highest LOD values. Composite interval mapping was performed in Windows QTL Cartographer V2.5 (North Carolina State University, Raleigh, USA) using 20 background control markers window size 10 cM and walk speed 0.5 cM to exactly follow the approach used during the reference study (Książkiewicz et al. 2017). Linkage groups and LOD graphs were drawn in MapChart (Voorrips 2002). Marker sequences developed in this study (LC416306–LC416345) were aligned to the L. albus genome assembly (Hufnagel et al. 2020), using custom BLAST database (e-value 1e−20, 1 best hit) implemented in Geneious, to find genome regions collinear to QTL loci conferring anthracnose resistance. Genes localized in these regions were screened for the presence of typical domains using the Disease Resistance Analysis and Gene Orthology (DRAGO 2) tool at the Plant Resistance Genes database (PRGdb) (Osuna-Cruz et al. 2018).

Assay of marker polymorphism in diversified germplasm

The set of 107 L. albus lines derived from the European Lupin Gene Resources Database maintained by Poznań Plant Breeders Ltd. station located in Wiatrowo was used: 51 primitive populations, 30 landraces, 16 cultivars, 7 cross derivatives, and 3 mutants. These lines originated from 22 countries (Supplementary Table S4). Markers were scored using the same methods as those applied for the RIL population. For binary data similarity analysis, Kiev Mutant-like scores were assigned as 0, P27174-like scores as 1, heterozygotes as 1, and additional alleles as 0. Simple matching (Sokal and Michener 1958) and Rogers-Tanimoto (Rogers and Tanimoto 1960) coefficients were calculated using a Binary Similarity Calculator http://www.minerazzi.com/tools/similarity/binary-similarity-calculator.php.

Genomic selection model

A genomic selection model was built to investigate the possibility of predicting anthracnose resistance scores derived from the RIL population. The set of GBS-derived single nucleotide polymorphism (SNP) markers, after validation via chi-square test, was filtered for growing levels of missing rate per marker (10%, 20%, 30%, 40%, and 50%), resulting in five different genotype matrices with a growing number of markers (631, 1859, 2430, 2833, and 3230, respectively). Following Nazzicari et al. (2016), the remaining missing data-points were imputed using Random Forest Imputation by the R package missForest (Stekhoven and Bühlmann 2012) with parameters ntree = 100, maxiter = 10, and parallelize = variables. The target traits were the two sets of anthracnose disease resistance scores issued from each year of evaluation (traits antr04 and antr05) and the average score across years (trait antr_avg).

For genome-wide predictions, we applied the ridge regression BLUP model (Searle et al. 2008), which displayed comparatively high predictive ability in prior studies on other legume species such as pea (Annicchiarico et al. 2017a), white lupin (Annicchiarico et al. 2019), and alfalfa (Annicchiarico et al. 2015). Ridge regression BLUP analysis was performed through the R software package rrBLUP (Endelman 2011), assessing the model predictive ability as Pearson’s correlation between true and predicted scores. In particular, the mixed model:

$$ y= Zu+e $$

with y being the phenotypes, Z being the genotype matrix, and u a vector of random effects with variance σ2u was solved in a maximum likelihood (ML/REML) framework using the function “mixed.solve” from rrBLUP package. With this approach, the (equivalent in ML of the) ridge parameters is estimated via λ = σ2e/σ2u and thus it can be computed directly from the data without variance component estimation.

The model was trained in a tenfold cross-validation schema, and training was repeated 50 times for each trait and then averaged to ensure numerical stability, using the R package GROAN (Nazzicari and Biscarini 2017).

Results

Twenty PCR-based markers were developed for two major anthracnose resistance QTLs

The region from 0.00 to 4.74 cM on the linkage group ALB02, which carried the antr04_1/antr05_1 anthracnose resistance QTL (explaining approximately 25–28% of phenotypic variance) (Książkiewicz et al. 2017; Phan et al. 2007) and contained 22 sequence-defined markers developed by GBS, was selected. Perfectly matching transcriptome sequences were assigned for 14 markers (64%) using available datasets (Książkiewicz et al. 2017; O'Rourke et al. 2013). Eleven GBS markers were selected for transformation to PCR-based markers (one directly and ten using complementary transcriptome sequences), based on the position on the linkage map and availability of affordably priced restriction enzymes. The summary of this transformation is included in Supplementary Table S5. Comparative mapping to L. albus transcriptome and L. angustifolius genome sequences (Hane et al. 2017) provided coordinates for intron/exon boundaries for ten markers. The list of L. albus transcriptome contigs anchored to GBS sequences is provided in Supplementary Table S6, whereas the coordinates of corresponding L. angustifolius genome regions are provided in Supplementary Table S7. The position of markers confirmed the highly conserved collinearity between the regions of L. albus linkage group ALB02 and L. angustifolius chromosome NLL-20. The WANR1 marker already implemented in marker-assisted selection of antr04_1/antr05_1 QTL in Australian breeding programs (Yang et al. 2010) was also included in the analysis. PCR amplicons were obtained for all primer pair combinations. Restriction enzyme cleavage of these amplicons yielded expected products for all markers except TP254603, which revealed additional unspecific amplification. The list of developed markers with primer sequences, PCR amplification temperature, validated enzymes, and restriction product sizes is given in Table 1.

A second region selected for marker development was that from 115.99 to 128.75 cM on the linkage group ALB04, which carried antr04_2/antr05_2 anthracnose resistance QTL (explaining approximately 15–23% of phenotypic variance) (Książkiewicz et al. 2017; Phan et al. 2007) and included 23 GBS markers. Twelve GBS sequences were chosen for marker development (Supplementary Table S5). Alignment to L. albus transcriptome and L. angustifolius genome sequences provided matching sequences for ten and twelve GBS sequences, respectively, and provided novel evidence for collinearity between the regions of L. albus linkage group ALB04 and L. angustifolius chromosome NLL-02 (Supplementary Table S6 and S7). WANR3 marker used for molecular selection of antr04_2/antr05_2 allele in Australian breeding programs (Yang et al. 2010) was also included in the analysis. PCR amplicons were obtained for 11 primer pairs (85%). Restriction enzyme cleavage confirmed the presence of expected polymorphic loci in all amplicons (Table 2).

PCR-based markers refined mapping resolution at both QTL loci

The segregation pattern of 22 markers in the white lupin RIL mapping population was resolved. Newly developed GBS-derived and previously published simple sequence repeat (SSR)-derived PCR markers revealed high amplification stability, providing on average 98.4% (antr04_1/antr05_1) and 98.5% (antr04_2/antr05_2) reads for the RIL population. Taking into consideration that GBS markers had on average 71.4% RIL data for antr04_1/antr05_1 and 74.4% for antr04_2/antr05_2, this approach improved segregation data by 37.8% and 32.4%, respectively (Supplementary Table S8 and S9). Thus, it contributed to linkage mapping refinement and increased map resolution as visualized by smaller blocks of co-segregating markers in both regions (Fig. 1). The order of markers was highly reproducible, as highlighted by low SD values of relative marker positions calculated for ten mapping runs: from 0.0 to 0.6 for antr04_1/antr05_1 and from 0.0 to 2.2 for antr04_2/antr05_2 loci (Supplementary Table S9). High fidelity of these markers was reflected by high LOD values of linkage to adjacent markers, ranging from 22.3 to 55.1 in antr04_1/antr05_1 and from 18.6 to 56.0 in antr04_2/antr05_2.

Fig. 1
figure 1

Major QTLs for anthracnose resistance in white lupin. Linear plots show LOD values (threshold 3.5); rectangles, LOD-based QTL ranges (LOD 2.0 and 1.0 below the maximum value), whereas bar graphs visualize corresponding linkage group fragments. Names of markers included into PCR-based assay are bold faced. Colors correspond to QTL assays: interval mapping, IM, blue (antr04, the first year) and green (antr05, the second year); composite interval mapping, CIM, pink (antr04) and red (antr05). Linkage groups and LOD graphs are drawn to scale

The updated linkage map was subjected to QTL mapping using phenotype observations from previous studies (Książkiewicz et al. 2017). The presence of two major anthracnose resistance QTLs in linkage groups ALB02 and ALB04, resolving about 45–50% of observed phenotypic variance, was confirmed (Table 3, Fig. 1). Linkage map improvement contributed to higher LOD values compared to the earlier study (Książkiewicz et al. 2017), indicating strengthened statistical significance of QTL mapping results. LOD values for antr04_1/antr05_1 locus were 13.3 in this study vs 10.3 (the most recent linkage map) in interval mapping (MapQTL), and 32.8 vs 22.7 in composite interval mapping (WinQTL Cartographer). The same result was observed for antr04_2/antr05_2, yielding LOD values 5.4 vs 5.2 in interval mapping and 26.6 vs 14.7 in composite interval mapping. The position of the LOD peak for the major resistance QTL (antr04_1/antr05_1) was almost identical for both experiments and methods, fitting within the range of 0.2 cM. The position of the LOD peak for the second QTL (antr04_2/antr05_2) covered the range of 6.2 cM.

Table 3 Two major anthracnose resistance QTLs detected in a recombinant inbred line population of white lupin. PVE - proportion of phenotypic variance explained by QTL

Three markers revealed applicability for Ethiopian allele selection

Pearson correlation between the observed anthracnose resistance phenotype and marker genotype in the RIL population was in the range of 0.49–0.58 for antr04_1/antr05_1 and 0.28–0.39 for antr04_2/antr05_2. The significance (both one-tailed and two-tailed probability) of the correlation coefficients, given the correlation values and the sample size (190 RILs with anthracnose resistance scores), was very high: p values were ~ 0.0 for antr04_1/antr05_1 and below 0.00009 for antr04_2/antr05_2.

Significant linkage disequilibrium decay was observed in both anthracnose resistance QTL regions in a set of 107 white lupin core collection lines (Fig. 2, Supplementary Table S10). As the resistance allele originated from one mountainous region of Ethiopia (Adhikari et al. 2013; Raman et al. 2014), it is uncommon to have it in the germplasm which has not been crossed with any Ethiopian line. Therefore, we expected to have “susceptible” marker scores for all European, African, and Middle East lines except the Ethiopian parent P27174 (distribution ratio like 106 vs 1). Simple matching coefficients were calculated, to compare the marker and expected phenotype. These values ranged from 0.25 to 0.96 for antr04_1/antr05_1 and 0.33–0.89 for antr04_2/antr05_2, indicating high similarity of marker pattern and hypothetical possession of resistant alleles.

Fig. 2
figure 2

Linkage disequilibrium pattern observed for PCR-based markers from ALB02 (a) and ALB04 (b) linkage groups. The set of 107 white lupin lines originating from 22 countries (81 primitive populations and landraces and 26 domesticated) was used to estimate R2 values

We calculated values of the Rogers-Tanimoto (RT) coefficient, to address the putative applicability of markers in the selection of germplasm for further disease resistance assays. This coefficient is a variant of the simple matching coefficient that gives double weight to mismatching variables, thereby resulting more convenient for the analysis of false-positive scores. We found high RT values, with maxima of 0.93 in antr04_1/antr05_1 and 0.80 in antr04_2/antr05_2. This was a considerable improvement over the previous markers, highlighted by RT values as low as 0.59 for antr04_1/antr05_1 (WANR1) and 0.49 for antr04_2/antr05_2 (WANR3). Two newly developed markers for antr04_1/antr05_1 (TP222136 and TP47110) and one for antr04_2/antr05_2 (TP338761) were revealed to be applicable for selection of Ethiopian alleles with > 90% confidence (Table 4, Supplementary Figure S1).

Table 4 Results of PCR marker validation: distance to QTL peak on the linkage map (cM), correlation values between anthracnose resistance phenotype and marker genotype in RIL population, and simple matching (SM) and Rogers-Tanimoto (RT) coefficient values, in lines of a white lupin core collection

Genome regions carrying anthracnose resistance QTLs encode candidate disease resistance genes

Alignment of marker sequences developed in this study (LC416306–LC416345) to the L. albus genome assembly (Hufnagel et al. 2020) revealed high collinearity between the linkage map and genome assembly in both QTLs. The list of genes identified in the regions of white lupin genome carrying anthracnose resistance loci is provided in the Supplementary Table S12. Several candidate genes were identified, namely Lalb_Chr02g0142231 (putative protein ENHANCED DISEASE RESISTANCE 2, EDR2), Lalb_Chr02g0141611 (putative protein kinase RLK-Pelle-LRR-XI-1 family), and Lalb_Chr02g0141701 (putative transferase, protein kinase RLK-Pelle-LRR-II family) for the antr04_1/antr05_1 locus and Lalb_Chr04g0264801 (putative protein kinase RLK-Pelle-RLCK-IXb family) for the antr04_2/antr05_2 locus. Screening of coding sequences using the Plant Resistance Genes database revealed the presence of kinase, coiled-coil (CC), leucine-rich repeat (LRR), nucleotide binding site (NBS), Toll/interleukin-1 receptor (TIR), and transmembrane (TM) domains (Supplementary Table S13).

Genomic selection displayed moderately high ability to predict anthracnose tolerance

Average predictive abilities of whole-genome regression models for the three traits (antr04, antr05, and antr_avg) are reported in Fig. 3. The best predictive ability values were found for antr05 and antr_avg, which showed similar values. The trait antr04 displayed lower predictive ability values, probably because of the lower number of available samples (only 151, compared to 191 available for antr05). When one of the two scores was missing, the average resistance score (antr_avg) was based only on data of the other score (Książkiewicz et al. 2017). The effect of filtering on maximum allowed missing rate for markers exhibited a clear pattern of better predictions with low missing rates. The absolute best performances were achieved for 10% maximum missing rate (implying 631 markers), which resulted in predictive abilities of 0.491, 0.558, and 0.557 for antr04, antr05, and antr_avg, respectively. This can be linked to the effective total missing rate pre-imputation. For the filtering thresholds of 10%, 20%, 30%, 40%, and 50%, the utilized data set resulted in missing rates of 1.8%, 4.8%, 7.2%, 9.8%, and 13.2%, respectively.

Fig. 3
figure 3

Predictive ability of ridge regression BLUP models as measured by Pearson’s correlation between true and predicted values as a function of the maximum allowed missing rate for single SNP markers, for three anthracnose disease resistance scores from the first year (antr04), the second year (antr05), and mean from both years (antr_avg). Values are derived through 10-fold cross-validations and averaged over 50 repetitions

Discussion

Ethiopian sources of anthracnose resistance arise from non-domesticated primitive lines

Following the anthracnose appearance in 1996, all L. albus lines from the Lupin Genetic Collection at the Department of Agriculture and Food Western Australia were phenotyped for anthracnose resistance. From more than 8 hundred accessions tested, originating from 21 countries and four continents, all breeding materials (416 lines) and 97% of primitive populations and landraces were found to be very susceptible to anthracnose (Adhikari et al. 2009). Despite testing of a large seed collection, a significant and reproducible level of resistance was found only in several Ethiopian landrace accessions, P27172, P27174, P27175, P27178, P28512, P28523, and P28538 (Adhikari et al. 2009). The three most resistant lines (P27172, P27174, P27178) were collected in one district, Debre Markos, at altitudes around 2000 m. Genotyping of 94 white lupin accessions with PCR-based SSR and microarray-based Diversity Array Technology markers revealed that Ethiopian accessions formed a separate clade, indicating their close genetic relation and distinctiveness from other germplasm (Raman et al. 2014). Assay based on the analysis of agronomical and phenological traits revealed relatively high similarity of Ethiopian white lupin germplasm grouped into several clusters showing significant genetic distance from the German accession used as an outgroup (Atnaf et al. 2015). Recent SSR-based screening of 212 Ethiopian white lupin landraces confirmed low population differentiation among four major white lupin collection areas in the country (Atnaf et al. 2017). These observations have substantial consequences for breeders, who shall apparently deal with the only genetic source of white lupin anthracnose resistance worldwide, buried in landrace germplasm locally adapted to mountain Ethiopian climatic conditions and carrying numerous undesired traits such as high alkaloid content, late flowering, vernalization responsiveness, and low yield (Adhikari et al. 2009; Kroc et al. 2017; Lin et al. 2009; Phan et al. 2007). Indeed, Ethiopian germplasm displayed poor adaptation to European environments in a multi-environment evaluation (Annicchiarico et al. 2010). The line P27174 as a well-recognized anthracnose resistance donor was crossed with the very susceptible Kiev Mutant to generate an advanced recombinant inbreed line population for mapping studies (Phan et al. 2007). Anthracnose phenotyping in independent experiments, followed by marker development and linkage mapping, provided clear evidence for a quantitative pattern of resistance, indicating the involvement of several unrelated heritable factors (Adhikari et al. 2009; Książkiewicz et al. 2017; Phan et al. 2007; Vipin et al. 2013; Yang et al. 2010). White lupin breeding for anthracnose resistance has been hampered so far by a lack of molecular markers to track resistant alleles. There were only few markers linked to white lupin agronomic traits published hitherto, one for the low-alkaloid pauper locus and three for two anthracnose resistance loci, and the usefulness of these markers was limited by non-negligible ratios of false-positive scores in diversified germplasm (Lin et al. 2009; Yang et al. 2010). White lupin anthracnose resistance donors are late flowering. Both traits are under polygenic control, which makes breeding attempts much more challenging than in the narrow-leafed lupin. Combining anthracnose resistance with early flowering by traditional breeding has been fairly unsuccessful over two decades, and the hypothesis of a linkage between these traits was put forward (Adhikari et al. 2009). This hypothesis was recently confirmed, as the major anthracnose resistance QTL (antr04_1/antr05_1) and one of the major early flowering QTLs (nonv05_1/nonv15_1/nonv16_1) were located in the same linkage group (Książkiewicz et al. 2017; Rychel et al. 2019). Moreover, one of the few minor early flowering QTLs was found just several centimorgans away from the antr04_1/antr05_1 locus.

Availability of marker-assisted selection for lupin breeders

The white lupin PCR-based marker toolbox contains several markers developed for candidate genes (FLOWERING LOCUS T, GIGANTEA, SEPALLATA, and FRIGIDA) conferring QTLs of flowering time (Rychel et al. 2019), a pair of markers tagging low-alkaloid pauper locus, including one anchored in a candidate gene (LaAT), and six markers for anthracnose resistance, including three developed in this study (Lin et al. 2009; Rychel and Książkiewicz 2019; Rychel et al. 2019; Yang et al. 2010).

Unlike white lupin, narrow-leafed lupin exhibited high effectiveness of anthracnose resistance breeding, leading to the development of a large collection of resistant germplasm (Fischer et al. 2015; Yang et al. 2012). Three factors contributed to this achievement: (i) the monogenic inheritance of the resistance, (ii) the presence of resistance alleles in germplasm that had already been subjected to selection for regional adaptation, and (iii) the development and successful implementation of truly selective markers into breeding programs. The resistance to anthracnose in narrow-leafed lupin is controlled by several single dominant genes that were discovered in different germplasm resources, namely Lanr1 in cv. Tanjil, AnMan in cv. Mandelup, and LanrBo in the breeding line Bo7212 (Fischer et al. 2015; Yang et al. 2004; Yang et al. 2008). Interestingly, none of these loci were localized in genome regions collinear to the white lupin regions carrying QTLs for anthracnose resistance (Książkiewicz et al. 2017). Nevertheless, annotation of white lupin genome regions carrying two major anthracnose resistance QTLs revealed the presence of protein domains which are typical for disease resistance genes (Bent 1996; Dangl and Jones 2001; Jones 2001). Moreover, a homolog of the EDR2 gene conferring Arabidopsis thaliana resistance to biotrophic powdery mildew pathogen Erysiphe cichoracearum (Tang et al. 2005) was identified.

Narrow-leafed lupin breeding was also facilitated by the development of sequence-defined SSR-derived markers linked to key agronomic traits. These include soft seediness (Li et al. 2012a), reduced pod shattering (Boersma et al. 2007b; Boersma et al. 2009; Li et al. 2010; Li et al. 2012b), low alkaloid profile (Li et al. 2011), early flowering (Boersma et al. 2007a; Nelson et al. 2017), and resistance to various fungal diseases, including anthracnose (Yang et al. 2004; Yang et al. 2008; You et al. 2005), Phomopsis stem blight (Yang et al. 2002), and lupin rust (Sweetingham et al. 2005). These markers, except those for lupin rust, were subsequently implemented in Australian breeding programs. The development of low-cost high-throughput sequencing methods opened new possibilities for molecular genetics. Mass sequencing has been exploited in narrow-leafed lupin, to provide several markers linked to anthracnose and Phomopsis stem blight resistance genes. Some of these SNPs were revealed to have a lower recombination rate between the particular trait and marker loci than the previously developed SSR-based markers, and were implemented into breeding practice (Książkiewicz et al. 2020; Książkiewicz and Yang 2020; Yang et al. 2015a; Yang et al. 2015b; Yang et al. 2013a; Yang et al. 2013b; Zhou et al. 2018). All of these molecular resources have greatly contributed to the improvement of narrow-leafed lupin as a crop.

Marker-assisted selection in lupin breeding in Europe is currently at an initial stage, contrary to Australia, where it has been commenced for more than 15 years, targeting around 20,000 plants annually (Książkiewicz and Yang 2020; Yang and Buirchell 2008). Various techniques for polymorphism detection were implemented in the Australian breeding program, including 96-well polyacrylamide denaturing gel electrophoresis, CAPS/dCAPS, single-stranded conformation polymorphisms, high-resolution melting, and high-throughput allele-specific nanofluidic array (Boersma et al. 2009; Li et al. 2012b; Yang et al. 2015a; Yang et al. 2015b; Yang et al. 2002; Yang et al. 2013a; Yang et al. 2013b; Zhou et al. 2018). In this study, CAPS/dCAPS markers were developed, which are preferred in small-scale experiments (Shavrukov 2016). Moreover, other markers for white lupin agronomic traits are also based on PCR and electrophoresis (Książkiewicz et al. 2017; Lin et al. 2009; Rychel and Książkiewicz 2019; Rychel et al. 2019; Yang et al. 2010). However, such markers are relatively expensive per data point (about 1–1.5 USD per sample) and have limited capacity. As sequences of developed markers have been publicly released (see accession numbers LC416306–LC416345), selected SNPs can be transformed to the other system allowing the high-throughput approach. Popular medium-scale systems include PCR with TaqMan probes, Kompetitive Allele Specific PCR (KASP) and RNase H2 enzyme-based amplification (rhAmp) (Broccanello et al. 2018). These methods are considerably cheaper than CAPS and dCAPS (0.12–0.41 USD per sample), provided that the number of analyzed samples corresponds to the number of reactions supported by the assay mix (min. 2–10 thousand) (Ayalew et al. 2019).

Genomic selection can assist marker-assisted selection of anthracnose resistance

In this study, QTL mapping was performed to investigate the association of the resistance score to specific parts of the genome. A valid alternative analysis was genome-wide association mapping (GWAS), which can be used to score the association between thousands of SNP markers and the desired trait. Thus, GWAS was recently exploited in L. angustifolius, highlighting novel candidate genes for phenology, growth, and yield traits (Plewiński et al. 2020). However, it was shown that GWAS can lead to false associations if the trait of interest is present in a very small proportion (< 5%) of the population (Sonah et al. 2015). QTL mapping does not present such a limitation and was, therefore, preferred for the trait in the test RIL population.

An additional aim of this study was investigating the ability of a whole-genome genomic regression approach to predict anthracnose disease resistance scores. This modeling approach is typically used for marker-based selection of genetically complex traits (Heffner et al. 2010) relative to crop yield or crop quality, and did prove promising to improve the selection efficiency for crop yield and key quality traits of various legume species (Annicchiarico et al. 2015; Annicchiarico et al. 2017b; Biazzi et al. 2017; Jarquin et al. 2014). This unprecedented attempt to apply genomic selection for a legume crop trait controlled mainly by a few major QTLs revealed predictive ability values in the range of 0.50–0.55. These values are comparable with those observed for several morphophysiological traits such as pod fertility, individual seed weight, plant height, leaf size, and mainstem proportion of seeds and number of leaves in a pioneer study focusing on a world collection of white lupin landraces (Annicchiarico et al. 2020). Results from the same germplasm collection or other material indicated that onset of flowering (also under oligogenic control) is highly predictable genomically (Annicchiarico et al. 2020), whereas the predictive ability of white lupin grain yield was in the range of 0.40–0.51. The current predictive ability values may justify the inclusion of this resistance trait in GBS-based genomic selection programs in which this trait is one of several target traits under selection, possibly combining the predicted values of the different target traits into a selection index. Our genome-wide predictions were obtained using trait-agnostic markers, leveraging the similitude of the selection candidate to a general “resistant genetic profile” to predict future resistance. Besides the possible practical interest of applying genomic selection for the resistance trait by the same operational tool used of other target traits (i.e., without adopting an additional marker-based tool specific for the trait), the whole-genome selection approach may also allow explaining some of the ~ 50% phenotypic variance that was not explained by the two observed QTLs, thereby adding some of the unavoidable missing heritability relative to QTLs (Brachi et al. 2011; Manolio et al. 2009) to the measurable contribution of the genetic background. For example, GBS-based genomic selection may address the possible presence of a third QTL for resistance to anthracnose (antr04_3) whose effect, however, did not reach statistical significance in composite interval mapping (Książkiewicz et al. 2017).

We noted that a lower number of markers (631, at 10% missing rate) resulted in predictive abilities higher than those obtained with more markers (e.g., 3230 at 50% missing rate). While this can be surprising at face value, it must be considered that the information provided by markers with more missing data is more noisy. In other terms, there is a trade-off between accepting more information at the cost of that information being of lower quality.

Breeding white lupin for anthracnose resistance without molecular selection has been an arduous process, as the polygenic control of major traits implies low frequency of desired phenotypes in the progeny. Moreover, anthracnose testing is largely destructive for seed samples, because of the impact of the disease and the lack of full immunity. In the present study, two types of molecular tools were developed: markers to focus essentially on this trait by the PCR-based procedure, and genomic selection to enable future multi-trait selection based on trait-specific models for material genotyped by GBS. These tools could enable the selection of germplasm carrying candidate Ethiopian alleles of two major anthracnose resistance loci what would decrease the number of lines subjected to disease screening. Also, they could prove valuable for locating other putatively tolerant material in germplasm collection. It should be noted that white lupin anthracnose resistance assays were performed using the approach exploiting spreader plants inoculated by spray inoculation, to mimic naturally occurring anthracnose (Phan et al. 2007; Yang et al. 2010) and an isolate C. lupini isolate 96A4 (IMI375715), classified into vegetative compatibility group 2 (Yang and Sweetingham 1998). Other methods, such as direct spraying of plants with spore suspension, inoculation of seeds before sowing, or injection of spores using a hypodermic needle to cotyledons or stems, were also used in lupin studies (Weimer 1952). Moreover, the Kiev Mutant × P27174 mapping population was phenotyped for resistance to anthracnose in Australia in wintertime, which is characterized by relatively low daily temperatures compared to those occurring in European regions where lupins are cultivated as a spring-sown crop. As the pathogenicity of C. lupini depends on the temperature pattern during infections and the strain of the pathogen (Dubrulle et al. 2020; Thomas et al. 2008), candidate lines selected by molecular markers should be subjected to disease resistance screening in local environment using domestic isolates.