Introduction

The genus Secale comprises typical representatives of the Mediterranean flora, and its members are widely distributed from Central Europe and the Western Mediterranean through the Balkans, Anatolia, Israel, and the Caucasus to Central Asia. An isolated population also occurs in South Africa. The genus includes annuals, short-lived and long-lived perennials, wild, weedy, and cultivated taxa (Sencer and Hawkes 1980).

Common rye (Secale cereale L.) is one of the most important cereal crops cultivated in Eastern and Northern Europe. It is characterized by the ability to produce high yields even when grown under environmental stress conditions, i.e., low temperatures, drought, and low soil fertility. The presence of disease resistance genes reduces the need for intensive chemical protection of this crop (Schlegel and Melz 1996; Korzun et al. 2001). Moreover, rye offers high contents of many nutritionally favorable compounds such as a whole suite of minerals (Zn, Fe, P), beta-glucans, resistant starch, and bioactive compounds. Rye products are characterized by a high level of dietary fiber (Andersson et al. 2009) that may contribute to positive health effects (Rosén et al. 2011).

Relatively good progress in rye breeding has been achieved in the case of hybrid cultivars (Geiger and Miedaner 1999). The two heterotic pools Carsten and Petkus were selected as the most promising heterotic pattern. During hybrid breeding, parental inbred lines were developed by recurrent selfing of plants from both pools, or inbred lines were generated by second-cycle breeding (Fischer et al. 2010). These processes contributed to the reduction of genetic diversity within the heterotic pools. The consequences of such narrowing of germplasm diversity in breeding programs are a decrease in selection gain and an increase in susceptibility to biotic and abiotic stresses, coupled with the threat of further genetic erosion (Smith et al. 2004). To prevent genetic erosion, it is important to continuously broaden the genetic base of the established heterotic pools. However, a major difficulty preventing the exploitation of genetic resources in hybrid rye breeding is the prevalence of self-incompatibility. Furthermore, in the absence of any prior knowledge about the relationships within heterotic pools, testcrosses with both pools have to be developed and evaluated (Fischer et al. 2010). The implementation of modern methods based on knowledge of the genetic diversity within the species S. cereale could significantly improve the efficiency of breeding, and thus increase the competitiveness of rye.

Studies on relationships within (Matos et al. 2001; Bolibok et al. 2005; Burger et al. 2006) and between Secale species (Shang et al. 2006; Skuza et al. 2007) have been carried out using a variety of different methods. The molecular tools employed in these analyses include PCR-RFLP (Isik et al. 2007), RAPD (Matos et al. 2001; Bolibok et al. 2005), ISSR (Bolibok et al. 2005), AFLP (Chikmawati et al. 2005, 2012), SAMPL (Bolibok et al. 2005), SSR (Bolibok et al. 2005; Shang et al. 2006; Fu et al. 2010), DArT (Bolibok-Brągoszewska et al. 2009, 2014), and isoenzymatic markers (Matos et al. 2001; Burger et al. 2006). However, in most cases, the number of accessions included in these analyses has been relatively low. One exception is our recent study (Bolibok-Brągoszewska et al. 2014), in which 379 rye accessions, including cultivars, landraces, and wild relatives, were used. The reported results showed that cultivated rye forms, especially modern cultivars, are genetically similar, while relatively high genetic diversity was observed between landraces. Moreover, no correlation was observed between the clustering of improved accessions and their geographic origin. These results suggest that rye accessions from diverse geographic regions have a common genetic background and indicate extensive germplasm exchange. The DArT method employed by Bolibok-Brągoszewska et al. (2014) is widely used in studies on genetic diversity in crops. It permits the simultaneous detection of several thousand DNA polymorphisms throughout the whole genome. However, in the case of rye, which is heterogeneous and characterized by a high level of heterozygosity, some information about genetic diversity may be lost during genotyping using dominant DArT markers.

Simple sequence repeat (SSR) markers, which are amplified microsatellite sequences, are very useful in genetic studies due to their multi-allelic nature, codominant inheritance, and high informativeness (Rakoczy-Trojanowska and Bolibok 2004). Rye SSR markers were first developed over 10 years ago (Saal and Wricke 1999; Hackauf and Wehling 2002) and have also been used in studies on genetic diversity (Shang et al. 2006; Bolibok et al. 2005). Recently, a considerable number of rye SSR markers were developed by Haseneyer et al. (2011), but their utility in genetic diversity studies has yet to be examined.

In the literature, there are several examples of the usefulness of SSRs in a comparison with other marker systems like DArT and SNP (Laidò et al. 2013; Hurtado et al. 2008; Simko et al. 2012). Such analyses were conducted in tetraploid wheat (Laidò et al. 2013), cassava (Hurtado et al. 2008), and sugar beet (Simko et al. 2012). In genetic studies of the genus Secale, SSR data would be useful to confirm results obtained using DArT markers (Bolibok-Brągoszewska et al. 2014) and extend our knowledge about genetic diversity structure.

The main purpose of the present study was to evaluate genetic diversity within a broad population consisting of S. cereale L. accessions using SSR markers. We also wanted to broaden the results of our earlier genetic diversity analysis obtained using the DArT system.

Materials and Methods

Plant Material

The plant material consisted of 367 S. cereale ssp. cereale accessions from different parts of the world: 90 population cultivars, 46 cultivated materials, and 155 landraces obtained from the PAS BG in Powsin (Poland); 27 population cultivars, 12 hybrid cultivars, and 26 breeding strains from European breeding companies; and 11 rye accessions from the collection of A. Lukaszewski from the University of California, Riverside (USA). Out of 367 accessions used in this study, 362 were examined previously using DArT markers (Bolibok-Brągoszewska et al. 2014). The assembled plant material was selected to represent genetic and geographical variation of cultivated rye. Polish rye inbred line L318 and wheat cultivar Chinese Spring were used as references. Detailed information about rye accessions used including source of seeds, improvement status, and region of origin can be found in Online Resource 1.

Plant Propagation and DNA Isolation

Each rye accession used in genetic diversity analysis was represented by 96 individual plants. Plants were grown in multi-trays in an air-conditioned glasshouse. Two-week-old leaves were collected and lyophilized. Total DNA was isolated using a Mag-Bind Plant DNA 96 kit (Omega Bio-Tek, Norcross, USA). Equal amounts of DNA from the 96 plants representing each accession were pooled into one sample. DNA purity and concentration were evaluated using a NanoDrop 2000 spectrophotometer (Thermo Scientific, Waltham, USA).

SSR Assays

Forty-two public SSR markers (Hackauf and Wehling 2002; Saal and Wricke 1999) (Online Resource 2) were included in the genetic diversity study. The initial selection of publicly available markers was made based on literature data. Markers representing all chromosomes and characterized by a high polymorphism level were selected (Bolibok et al. 2005; Shang et al. 2006; Hackauf et al. 2009).

The analysis of genetic diversity was performed in two steps. In the first step, SSR markers were used to analyze six highly divergent rye accessions, selected based on the results of DArT marker analysis (Bolibok-Brągoszewska et al. 2014): cultivars Dańkowskie Nowe and Zima from PAS BG, landrace M1-72-73-321 from PAS BG, cultivar Prima from the collection of A. Lukaszewski, cultivar Palazzo from KWS Lochow, and cultivar Riihi from the company Boreal. Based on the results of this initial screening, 22 SSR markers were selected for the second step of the analysis, which included all 367 rye accessions (Table 1). The marker selection criteria were good product quality and the presence of polymorphism. Moreover, only markers amplifying a single locus were selected, based on the results obtained using the inbred line L318.

Table 1 Parameters characterizing 22 SSR markers used in genotyping

All PCRs were conducted in 15 μl volumes containing 50 ng of genomic DNA, 0.5 U of DreamTaq polymerase (Thermo Scientific, Waltham, USA), 1.5x DreamTaq buffer, 0.2 mM dNTPs, 0.5 mM MgCl2, and 0.2 μM of each primer. Amplification was carried out in a Veriti 96 Thermal Cycler (Applied Biosystems, Foster City, USA) under conditions specified by the high-stringency protocol of Pillen et al. (2000). The amplified products were separated on 6 % denaturing polyacrylamide gels and visualized by silver staining (Benbouza et al. 2006). The consistency of allele sizing across gels and accessions was ensured using 10 bp and 50 bp ladders (Thermo Scientific, Waltham, USA). In the case of amplification failure, analyses were repeated until a clear result was obtained.

Data Analysis

For ease of analysis, we labeled rye accessions according to their geographical origin, source of the seeds, and assignation to one of the heterotic pools. We also conducted separate analyses only for accessions from breeding companies and from PAS BG.

Based on the PCR results, a binary matrix was constructed, where the presence of an amplified product was scored as 1 and the absence of a product as 0. Because the templates used in SSR analysis were pooled DNA from 96 plants and more than two alleles per sample were observed, separate bands were treated as individual alleles. The bands were scored as present or absent, and allele frequencies within individual accessions were not established. The number of alleles, number of polymorphic alleles, proportion of polymorphic products (the number of polymorphic alleles divided by the total number of alleles), and percentage of frequent (appearing in >50 % of rye accessions) and rare (appearing in <5 % of accessions) alleles were counted. Polymorphic information content (PIC) was calculated according to the formula of Roy et al. (2002).
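
The band statistics described above can be illustrated with a short Python sketch for a single marker. The function name and the toy matrix are hypothetical. The PIC expression used here, 1 minus the sum of squared relative band frequencies, is an assumption based on the commonly cited form and should be checked against Roy et al. (2002) before use.

```python
import numpy as np

def marker_stats(bands):
    """Summarise one SSR marker.

    bands: 0/1 array, rows = accessions, columns = bands (alleles) of the marker.
    """
    bands = np.asarray(bands)
    n_acc = bands.shape[0]
    counts = bands.sum(axis=0)            # accessions carrying each band
    share = counts / n_acc                # fraction of accessions per band
    polymorphic = (counts > 0) & (counts < n_acc)
    # Assumption: relative band frequencies that sum to 1 within the marker.
    p = counts / counts.sum()
    return {
        "n_alleles": int(bands.shape[1]),
        "n_polymorphic": int(polymorphic.sum()),
        "prop_polymorphic": float(polymorphic.mean()),
        "n_frequent": int((share > 0.50).sum()),   # bands in >50 % of accessions
        "n_rare": int((share < 0.05).sum()),       # bands in <5 % of accessions
        "pic": float(1.0 - np.sum(p ** 2)),        # assumed PIC form
    }

# Toy example: 4 accessions scored for 3 bands of one marker.
toy = [[1, 1, 0],
       [1, 0, 1],
       [1, 1, 0],
       [1, 0, 0]]
stats = marker_stats(toy)
```

In the toy matrix the first band is present in every accession and is therefore counted as monomorphic, so two of the three bands are polymorphic.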

The total number of alleles, number of alleles with a frequency of <5 %, number of private alleles, number of alleles found in more than 25 and 50 % of the accessions from subgroups created based on source of seeds and improvement status, mean diversity, and unbiased diversity were calculated using GenAlEx 6.501 (Peakall and Smouse 2012).

Based on the binary matrices, genetic similarity (GS) was calculated using Jaccard and Dice coefficients with NTSYSpc 2.1 software (Rohlf 2000). Correlations between two similarity matrices were demonstrated by Mantel’s test applied in NTSYSpc.
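
For readers without access to NTSYSpc, the two coefficients and a permutation-based Mantel test can be sketched in Python as follows. This is a minimal illustration, not a reimplementation of NTSYSpc: the function names, the number of permutations, and the toy data are assumptions, and every accession is assumed to carry at least one band.

```python
import numpy as np

def similarity_matrices(X):
    """Jaccard and Dice similarity between rows of a 0/1 band matrix."""
    X = np.asarray(X, dtype=float)
    a = X @ X.T                          # bands shared by accessions i and j
    n = X.sum(axis=1)                    # bands per accession
    total = n[:, None] + n[None, :]      # bands in i plus bands in j
    jaccard = a / (total - a)            # shared / union
    dice = 2 * a / total
    return jaccard, dice

def mantel(A, B, n_perm=999, seed=0):
    """Pearson correlation of two square matrices with a one-sided permutation P value."""
    A, B = np.asarray(A), np.asarray(B)
    rng = np.random.default_rng(seed)
    iu = np.triu_indices_from(A, k=1)    # upper triangle without the diagonal
    r_obs = np.corrcoef(A[iu], B[iu])[0, 1]
    hits = 0
    for _ in range(n_perm):
        p = rng.permutation(A.shape[0])  # relabel accessions in A only
        r = np.corrcoef(A[p][:, p][iu], B[iu])[0, 1]
        hits += r >= r_obs
    return r_obs, (hits + 1) / (n_perm + 1)

# Toy example: 5 accessions, 6 bands.
X = [[1, 1, 0, 1, 0, 1],
     [1, 0, 1, 1, 0, 0],
     [0, 1, 1, 0, 1, 1],
     [1, 1, 1, 1, 0, 0],
     [0, 0, 1, 0, 1, 1]]
jac, dice = similarity_matrices(X)
r, p = mantel(jac, dice)
```

Because the Dice coefficient is a monotonic function of the Jaccard coefficient (Dice = 2J / (1 + J)), a very high correlation between the two matrices is expected.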

STRUCTURE software (Pritchard et al. 2000) was used to infer population structure. To identify the number of populations (K) capturing the major structure in the data, a burn-in period of 100,000 Markov Chain Monte Carlo (MCMC) iterations was used, with 100,000 run length and admixture model following the Hardy-Weinberg equilibrium. Five independent runs were performed for each simulated value of K, ranging from 1 to 20. The most likely number of K was determined using the DeltaK method in Structure Harvester (Earl and von Holdt 2011) with the Evanno correction (Evanno et al. 2005). Permutations of the output of STRUCTURE analysis were calculated with CLUMPP (Jakobsson and Rosenberg 2007) using independent runs to obtain a consensus matrix. Bar graphs of the population structure results were generated with Distruct (Rosenberg 2004).
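
The DeltaK statistic can be sketched in Python as shown below. The sketch assumes that the ln P(D) values of the replicate runs have already been extracted from the STRUCTURE output. It uses the per-K means of ln P(D) for the second-order difference, which is a simplification; the exact computation in Structure Harvester may differ in detail. The function name and the toy likelihoods are hypothetical.

```python
import numpy as np

def delta_k(lnp):
    """DeltaK per K from replicate ln P(D) values.

    lnp: dict mapping K -> list of ln P(D) values from independent runs.
    DeltaK(K) = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd L(K)
    """
    mean = {k: float(np.mean(v)) for k, v in lnp.items()}
    sd = {k: float(np.std(v, ddof=1)) for k, v in lnp.items()}
    out = {}
    for k in sorted(lnp):
        if k - 1 in lnp and k + 1 in lnp:        # undefined for the smallest and largest K
            second = abs(mean[k + 1] - 2 * mean[k] + mean[k - 1])
            out[k] = second / sd[k]
    return out

# Toy likelihoods for K = 1..4, three runs each (values are invented).
runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-800.0, -801.0, -799.0],
    3: [-790.0, -795.0, -785.0],
    4: [-785.0, -790.0, -780.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
```

DeltaK cannot be computed for the smallest and largest tested K, so with K ranging from 1 to 20 the statistic is available for K = 2 to 19.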

A 1-GS distance matrix was used to construct a dendrogram in MEGA 5.2 (Tamura et al. 2011) using the neighbor-joining (NJ) clustering method. The GS matrix was used to conduct principal coordinate analysis (PCoA) in NTSYSpc 2.1.
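
As an illustration of the ordination step, the sketch below performs classical PCoA (Gower double-centring followed by eigendecomposition) on a distance matrix such as 1-GS. It is a generic textbook formulation written with NumPy; the function name is hypothetical, and the procedure implemented in NTSYSpc may differ in scaling or in how the similarity matrix is transformed.

```python
import numpy as np

def pcoa(dist, n_axes=2):
    """Classical principal coordinate analysis of a symmetric distance matrix."""
    D = np.asarray(dist, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                  # Gower double-centring
    vals, vecs = np.linalg.eigh(B)               # eigenvalues in ascending order
    order = np.argsort(vals)[::-1]               # sort descending
    vals, vecs = vals[order], vecs[:, order]
    keep = vals[:n_axes].clip(min=0)             # ignore negative eigenvalues
    coords = vecs[:, :n_axes] * np.sqrt(keep)
    explained = keep / vals[vals > 0].sum()      # share of positive variance
    return coords, explained

# Toy check: three points on a line at positions 0, 1, and 3.
D = np.array([[0.0, 1.0, 3.0],
              [1.0, 0.0, 2.0],
              [3.0, 2.0, 0.0]])
coords, explained = pcoa(D)
```

For the toy matrix, the first axis alone reproduces the input distances, which is a convenient sanity check before applying the function to a real 1-GS matrix.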

The significance of differences between populations indicated by STRUCTURE, between groups representing different geographical origins, and between groups of accessions from different sources and with different improvement status was tested by analysis of molecular variance (AMOVA) in GenAlEx 6.501.
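
The variance partition behind a single-level AMOVA can be sketched in Python as follows. The sketch assumes that the distance between two binary profiles is the number of mismatching bands (squared Euclidean distance), which is our understanding of how binary data are commonly handled; it is not taken from the GenAlEx source. The function name and the toy data are hypothetical, and the permutation test for significance is omitted.

```python
import numpy as np

def phi_pt(X, labels):
    """PhiPT from a 0/1 band matrix (rows = accessions) and group labels."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    N = len(labels)
    # Assumed distance: count of mismatching bands between two profiles.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    groups = np.unique(labels)
    P = len(groups)
    ss_total = d2.sum() / (2 * N)                # full matrix counts each pair twice
    ss_within = 0.0
    sizes = []
    for g in groups:
        idx = np.where(labels == g)[0]
        sizes.append(len(idx))
        ss_within += d2[np.ix_(idx, idx)].sum() / (2 * len(idx))
    sizes = np.array(sizes)
    ms_among = (ss_total - ss_within) / (P - 1)
    ms_within = ss_within / (N - P)
    n0 = (N - (sizes ** 2).sum() / N) / (P - 1)  # effective group size
    v_among = (ms_among - ms_within) / n0        # among-group variance component
    return v_among / (v_among + ms_within)

# Toy example: two groups that share no bands are fully differentiated.
X = [[1, 0], [1, 0], [0, 1], [0, 1]]
phi = phi_pt(X, ["a", "a", "b", "b"])
```

Significance would be assessed by recomputing the statistic after randomly permuting the group labels many times and comparing the observed value with the permuted distribution.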

For core collection sampling, the maximization (M) algorithm implemented through a modified heuristic algorithm in PowerCore 1.0 was used (Kim et al. 2007).
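
The idea of selecting a small set of accessions that captures most alleles can be illustrated with a simple greedy allele-coverage heuristic. This sketch is a stand-in for illustration only: it is not the M algorithm or the modified heuristic implemented in PowerCore, and the function name, the target parameter, and the toy matrix are hypothetical.

```python
import numpy as np

def greedy_core(X, target=1.0):
    """Greedily pick accessions until a target share of alleles is covered.

    X: 0/1 band matrix, rows = accessions, columns = alleles.
    Returns the chosen row indices and the achieved allele coverage.
    """
    X = np.asarray(X).astype(bool)
    n_alleles = X.any(axis=0).sum()              # alleles present in the collection
    covered = np.zeros(X.shape[1], dtype=bool)
    chosen = []
    while covered.sum() < target * n_alleles:
        gain = (X & ~covered).sum(axis=1)        # new alleles each accession would add
        best = int(gain.argmax())
        if gain[best] == 0:                      # nothing left to gain
            break
        chosen.append(best)
        covered |= X[best]
    return chosen, covered.sum() / n_alleles

# Toy example: 4 accessions, 4 alleles.
X = [[1, 1, 0, 0],
     [0, 0, 1, 0],
     [1, 0, 0, 0],
     [0, 0, 1, 1]]
chosen, coverage = greedy_core(X)
```

A greedy choice is not guaranteed to give the smallest possible core set, but it shows how allele coverage trades off against the number of entries.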

Results

SSR Marker Informativeness

The first step of the analysis, which employed 42 publicly available SSRs, established the length (Online Resource 2) and quality of the amplified products.

For the second step of the analysis, 22 SSR markers were selected. In total, 142 bands were detected in 367 rye accessions, and 132 bands were polymorphic (93 %) (Table 1). The average PIC value for all markers used was 0.57. The highest PIC value (0.93) was obtained for SCM152, and the lowest PIC (0.18) was determined for SCM050. In the case of the markers with the highest PIC value, frequent alleles appeared in ≤40 % of accessions.

Analysis of the total allele pattern within groups of accessions sorted according to the source of the seeds and improvement status (Table 2) showed that the highest number of detected bands (130) was observed for landraces (PAS BG) and the lowest for breeding strains from Danko (93). The number of bands with a frequency of <5 % was the highest for landraces (18) and the lowest for the collection of A. Lukaszewski (0). There were no unique bands in the group of breeding strains from Danko, while the highest numbers of unique bands (4) within groups were observed for the collection of A. Lukaszewski and landraces. Mean heterozygosity was also the lowest for breeding strains from Danko (h = 0.113), while this value was nearly twofold higher for the collection of A. Lukaszewski (h = 0.218).

Table 2 Total allele pattern within groups of rye accessions according to the source of seeds

Genetic Similarity Analysis

Mantel’s test showed a strong positive correlation between Jaccard and Dice coefficient matrices with r = 0.991 and P = 0.001 (Online Resource 3), so only the Jaccard coefficient matrix was used in subsequent analyses. The highest GS (0.932) was observed between breeding strains SZK44 and SZK85 from Danko, whereas the lowest GS (0.255) was observed between cultivar Gonello F1 (KWS Lochow) and cultivar Dańkowskie Złote (Danko). The average GS value was 0.664. The average GS for accessions labeled according to the source of seeds ranged from 0.8 for breeding strains from Danko to 0.63 for the collection of A. Lukaszewski. The table with GS values can be found in Online Resource 4.

Bayesian Model-Based Clustering

After analysis of the population structure of all rye accessions using STRUCTURE with Evanno correction, the peak of ΔK was observed for K = 2 (Online Resource 5). This suggested the presence of two main model-based populations, which were visualized in the graph (Fig. 1). In total, 270 (74 %) rye accessions were assigned to one of the two model-based populations, with more than 65 % of their inferred ancestry derived from one of the respective populations. Populations one (P1) and two (P2) consisted of 144 (39 %) and 126 (34 %) accessions, respectively. The remaining 97 accessions (26 %) had mixed ancestry (P1P2).
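
The assignment rule described above (more than 65 % of inferred ancestry from one population) can be written as a small Python helper. The function name and the label used for mixed ancestry are hypothetical; the input is assumed to be the per-accession ancestry fractions taken from the consensus matrix.

```python
def assign_population(q, threshold=0.65):
    """Assign an accession to a model-based population or to mixed ancestry.

    q: sequence of inferred ancestry fractions, one value per population.
    """
    best = max(range(len(q)), key=lambda k: q[k])   # population with the largest share
    if q[best] > threshold:
        return f"P{best + 1}"
    return "mixed"

# Examples with K = 2.
labels = [assign_population(q) for q in ([0.80, 0.20], [0.40, 0.60], [0.30, 0.70])]
```

With K = 2, an accession whose larger ancestry fraction does not exceed the threshold corresponds to the P1P2 group reported above.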

Fig. 1

STRUCTURE plot showing the population structure of 367 rye accessions with K = 2 clusters, based on 22 SSR markers

Population P1 contained mainly accessions from breeding companies (43 %) and cultivars from PAS BG (34 %), especially from Western and Central Europe. Among the accessions from breeding companies, only cultivar Gonello (KWS Lochow) belonged to the P2 population. Breeding strains from the company Danko were the most homogeneous group and were characterized by an inferred ancestry level of >80 %. The P1 population also contained landraces, cultivated materials from European countries, and accessions from the A. Lukaszewski collection.

Population P2 was comprised mainly of landraces from PAS BG (79 %), while the representation of cultivars and cultivated materials from PAS BG was relatively lower: 8 and 10 %, respectively. The geographical origins of the majority of accessions classified as P2 were Asia, the Balkans, and Southern Europe. The remaining P2 accessions were cultivars and cultivated materials originating from Eastern and Central Europe, and South and North America. Two accessions from the A. Lukaszewski collection originating from North America were also assigned to P2.

Accessions characterized by mixed ancestry were mainly from PAS BG (90 %). The geographical origin of accessions with mixed ancestry was very diverse.

Cluster and Principal Coordinate Analyses

Several clusters were visible in the NJ dendrogram (Figs. 2 and 3). To simplify the description of the results, we distinguished two major clusters, I and II, by separating the longest branch into two, and further divided cluster II into six subclusters: IIa–IIf. Concerning the source of seeds and improvement status (Fig. 2), cluster I comprised modern cultivars, breeding strains, and two accessions from the A. Lukaszewski collection. Cluster II contained mainly accessions from PAS BG. The most dispersed group was landraces, which occurred in five subclusters (IIb, IIc, IId, IIe, and IIf). The majority of cultivars from PAS BG were closely clustered in one subcluster (IIe). Cultivated materials from PAS BG were also grouped mainly into one subcluster (IId). The collection of A. Lukaszewski was located mostly in subcluster IIf; however, relatively longer tree branches indicated higher genetic diversity within these accessions. Labeling according to the geographical origin (Fig. 3) showed that the majority of Asian, South European, and Balkan accessions were grouped in cluster II, although they were dispersed among different subclusters. Accessions from Central Europe were located mainly in cluster I. On the other hand, accessions from Western and Eastern Europe were clustered in subcluster IIe.

Fig. 2

NJ dendrogram showing genetic relationships among 367 rye accessions. Accessions labeled according to the source of seeds

Fig. 3

NJ dendrogram showing genetic relationships among 367 rye accessions. Accessions labeled according to the geographic origin of plants

Two groups could be distinguished in the PCoA plot (Figs. 4 and 5). Accessions from breeding companies were noticeably distinct from accessions from PAS BG (Fig. 4). The most dispersed and diverse group consisted of landraces, whereas cultivated materials and cultivars from PAS BG were more tightly grouped. The collection of A. Lukaszewski was the most distinct group, but two of its accessions were placed close to the cultivars from breeding companies. Labeling based on the geographical origin (Fig. 5) showed that one of the two groups mostly represented rye accessions from Central Europe and the other comprised accessions representing different geographical origins. Accessions from Central and Western Europe were grouped close to each other.

Fig. 4

PCoA of 367 rye accessions based on 22 SSR markers. Accessions labeled according to the source of seeds

Fig. 5

PCoA of 367 rye accessions based on 22 SSR markers. Accessions labeled according to the geographic origin of plants

PCoA comparing cultivars and breeding materials showed clear dissimilarity between accessions from the gene bank and from breeding companies. Two separate clusters were visible both in the dendrogram (Fig. 6) and PCoA graph (Fig. 7). One contained accessions from PAS BG, while accessions from breeding companies were located in the second cluster. Two samples of Dańkowskie Złote and two samples of Dańkowskie Nowe rye cultivars obtained from different sources (one from the Danko company and the second from PAS BG) were not close to each other, occupying separate positions in the PCoA graph and grouped in different clusters in the dendrogram.

Fig. 6

NJ dendrogram showing genetic relationships among 165 rye accessions representing cultivars from the gene bank and accessions obtained from breeding companies

Fig. 7

PCoA of 165 rye accessions representing cultivars from the gene bank and accessions obtained from breeding companies, based on SSR markers

In the PCoA with accessions labeled according to their assignment to one of the heterotic pools (Petkus or Carsten), the majority of accessions with confirmed Petkus pedigree were closely grouped (Online Resource 6). However, the only accession with confirmed Carsten pedigree was grouped next to accessions derived from Petkus. Contemporary hybrid (F1) cultivars were dispersed throughout the PCoA plot.

Analysis of Molecular Variance

AMOVA of the two model-based populations indicated by STRUCTURE showed that 86 % of the variation was due to differences within populations, while 14 % was due to differences between populations (P < 0.001). The pairwise PhiPT value (an analogue of FST for estimating genetic differentiation) was 0.137 and indicated a relatively high degree of differentiation between populations P1 and P2. A much greater proportion of variation within than among groups was also found for groups of accessions from different sources and with different improvement status: 87 and 13 %, respectively (P < 0.001). When accessions were grouped based on geographical origin, these values were 92 and 8 % (P < 0.001). Pairwise PhiPT values estimated between groups of accessions with different source and improvement status indicated a wide range of genetic differentiation, from low (0.073) between cultivated materials and landraces from PAS BG to very high (0.275) between breeding strains from Danko and accessions from the A. Lukaszewski collection. In the case of accessions grouped according to geographical origin, a very low PhiPT (0.003), indicating little genetic differentiation, was detected between accessions from South America and Northern Europe. The highest genetic differentiation (PhiPT = 0.217) was found between accessions from Western Asia and Central Europe (Online Resource 7).

Core Collection Sampling

Out of 367 rye accessions, 25 (6.8 %) were selected as core entries. Core collection entries represented all sources and geographical origins (Table 3). These accessions comprised 93.7 % of all alleles detected within the whole population. The GS value of the core collection ranged from 0.267 to 0.818 (average 0.585). The lowest GS was observed between cultivar Gonello F1 from KWS Lochow and Turkish landrace M1-72-73-194 from PAS BG. The highest GS was found between cultivated material Ceranja de Moreruela from Spain and Lithuanian cultivar Priekulskaja from PAS BG, and these accessions were clustered the closest to each other in the NJ dendrogram (Fig. 8).

Table 3 Accessions selected as core collection entries

Fig. 8

NJ dendrogram showing relationships between rye accessions from the core collection

Discussion

In this study, the genetic diversity within S. cereale ssp. cereale was thoroughly evaluated using SSR markers. The panel of 367 accessions studied included contemporary and historical cultivars, landraces, and breeding materials from various geographical origins and sources, and represented the major portion of the intra-species genetic diversity. Therefore, it was possible to determine whether factors such as geographical origin, source of seeds, and improvement status influence the genetic diversity. The size and variability of this collection helped us to obtain a clearer picture of the rye population structure. Moreover, the SSR markers employed in this study were derived from all rye chromosomes and represented the whole genome. Previous phylogenetic studies on rye have examined far fewer accessions, e.g., 42 in Ma et al. (2004), 47 in Shang et al. (2006), and 114 in Chikmawati et al. (2012). So far, the only other study of equivalent scale to analyze the genetic diversity of the genus Secale is that of Bolibok-Brągoszewska et al. (2014), which included 362 of the 367 accessions used here but employed DArT markers. Although wide genome coverage can be obtained with DArT markers, their dominant biallelic nature limits the range of the analysis to some extent. The allelic diversity is less informative using these markers than it would be with multi-allelic SSRs (White et al. 2008). In the case of an open-pollinated, highly heterozygous, and heterogeneous species like rye, SSRs are considered to be the superior marker system for examining genetic diversity. A disadvantage of SSRs is their low throughput, especially in comparison with DArT markers. On the other hand, SSRs are highly accessible because of the relatively simple and cheap methodology, which does not require specialized or sophisticated equipment (Rakoczy-Trojanowska and Bolibok 2004).

In general, the presented results were consistent with those of Bolibok-Brągoszewska et al. (2014). In the DArT-based study, 379 accessions were analyzed: 362 cultivated rye accessions, which were also used in the present study, and 17 wild Secale accessions that were not analyzed using SSR markers. In the case of cultivated rye accessions, the accessions assigned based on SSR analyses to one of the model-based populations P1 and P2 were largely the same as accessions from populations P3 and P1, respectively, indicated in the DArT-based study (Bolibok-Brągoszewska et al. 2014). The wild accessions were classified mostly as P2 in the DArT-based study. Moreover, PCoA and NJ clustering in both works demonstrated that source of seeds and improvement status contributed significantly to the structure observed in the analyzed set of S. cereale accessions. We also revealed a relatively limited diversity in improved rye accessions, both historical and contemporary, as well as a lack of correlation between clustering of improved accessions and geographic origin, suggesting a common genetic background of rye accessions from diverse geographic regions and extensive germplasm exchange. The resolution of the obtained results was also comparable. The mean GS values calculated in both studies for accession groups formed based on the source of seeds and improvement status were almost identical (data not reported).

In our study, we did not analyze the genetic diversity of rye accessions on a single-plant basis. Instead, a bulking strategy was applied, in which 96 plants of each accession were pooled into one sample. As a consequence of DNA pooling, some information is lost; for example, it is not possible to estimate allele frequencies within individual accessions. While this limitation applies both to the SSR markers used in the present study and to the DArT markers applied earlier (Bolibok-Brągoszewska et al. 2014), there are other methods, such as genotyping-by-sequencing (Davey et al. 2011), which allow for estimation of allele frequencies in pooled samples. Therefore, it will be possible to obtain detailed information on within-accession diversity in the future, using DNA samples prepared for the present study. The multi-allelic nature of SSR markers was an important feature in the analyses of pooled samples. While in the case of DArT markers (Bolibok-Brągoszewska et al. 2014) a score of one (1) or zero (0) was obtained for each accession, in the case of SSR markers we observed an allele (or, in most cases, several alleles) that could be defined and distinguished by its length. Furthermore, for groups of accessions of different origin or improvement status, we were able to identify frequent or unique SSR alleles (Table 2) and the number of alleles detected using a given SSR marker, whereas in the case of DArT markers it was only possible to calculate the percentage of 1 or 0 scores within a given accession subgroup. Thus, in our opinion, the SSR assays gave a fuller picture of genetic diversity in the collection.

There are many examples in the literature of the use of SSR markers in assessing genetic diversity of plant species. Usually, a relatively low number of SSRs is sufficient to accurately reflect genetic structure and diversity among a high number of accessions. For example, 23 SSRs were used in genetic diversity studies of 3342 cucumber accessions (Lv et al. 2012), while 48 markers were used to examine 2945 accessions of chickpea (Upadhyaya et al. 2008). So far, there have been only a few examples where SSR markers have been utilized in studies of genetic diversity in rye. In the study of Shang et al. (2006), the genetic diversity and phylogenetic relationships among 30 Secale accessions and 47 cultivated ryes were evaluated using 24 S. cereale microsatellite markers. Bolibok et al. (2005) utilized 38 SSR markers to assess the genetic diversity of 30 rye inbred lines. Thirteen SSR markers were used in the work of Myśków et al. (2010) to conduct genetic diversity analysis on 48 rye inbred lines. In comparison, the present study used 22 polymorphic SSRs, representing different rye chromosomes, in the phylogenetic analysis of 367 accessions.

The PIC value calculated for SSR markers used for genotyping was relatively high with an average of 0.57 (range 0.18–0.93) which indicates their high informativeness. The average percentage of polymorphic alleles for the set of 22 markers was also high (93 %). This was predictable because the public markers were not selected randomly, but based on the results of analyses using six diverse rye accessions and 42 SSR markers. The high values of the marker informativeness parameters and the congruency of the results reported here with the results obtained based on over 1000 genome-wide DArT markers (Bolibok-Brągoszewska et al. 2014) indicate that the selected 22 SSR markers are well suited for assessment of genetic diversity patterns in rye germplasm.

We used the following methods to demonstrate genetic relationships between the analyzed rye accessions: NJ cluster analysis, PCoA, and Bayesian clustering. The results obtained with all three methods were generally consistent. NJ clustering and PCoA distinguished two major groups within the analyzed accessions, dividing them according to the source of seeds and improvement status, and revealed that cultivars from breeding companies were clearly distinct from those reproduced in the gene bank. In particular, cultivars and breeding strains from Polish and German breeding companies clustered in one group, and were distinct from all other accessions. The separation of currently cultivated accessions may be the result of adaptation of modern cultivars to the requirements of agriculture and environmental conditions in Central and Northern Europe. However, it is more likely that the different methods of plant material reproduction used in breeding companies and gene banks have affected the genetic diversity of accessions from these two sources. It is noteworthy that even samples of the same rye cultivar from different sources of seeds occupied two different positions in both the PCoA graph and the dendrogram (e.g., two samples of Dańkowskie Złote and two of Dańkowskie Nowe, one sample obtained directly from the company Danko and the second from PAS BG). While this result could be attributed to a mistake in the labeling of seed samples, an alternative explanation may be found in the work of Chebotar et al. (2003), where it was suggested that for open-pollinated species, the genetic integrity of an accession may be changed in each regeneration cycle, which can alter the SSR pattern. Genetic changes in seeds of the Dańkowskie Złote rye cultivar induced by long-term storage effects and consecutive regeneration cycles were also identified by the use of AFLP markers (Chwedorzewska et al. 2002) and SSR markers (Boczkowska and Puchalski 2012).

Surprising results were also obtained for the Gonello F1 variety, which turned out to be distinct from other hybrid varieties from the same breeding company (KWS Lochow) and also from population and hybrid varieties from other breeding companies. Similar results for Gonello were obtained earlier using DArT markers (Bolibok-Brągoszewska et al. 2014). As we mentioned previously (Bolibok-Brągoszewska et al. 2014), the pedigree information does not provide support for the observed distinctness of Gonello; hence, further analyses of additional independent samples would be needed to elucidate this problem.

As discussed above, factors such as reproduction or the manner of storage of plant material can greatly influence genetic variation. This may reflect differences in the procedures of plant reproduction employed by gene banks and breeding companies. In the case of open-pollinated plants, it is important to consider regeneration management in ex situ collections, particularly the optimal population size and distance between regeneration plots (Börner et al. 2012). It may be concluded that the lack of a standardized procedure for plant reproduction may have caused variation in the genetic diversity of the same rye accessions obtained from different institutions.

All our analyses demonstrated the high level of similarity between currently cultivated rye forms and breeding materials. Dendrograms and PCoA plots clearly showed that Polish and German accessions, in particular, grouped close together. STRUCTURE analysis assigned strains from Danko and modern cultivars to one model-based population (P1). Genetic similarity within cultivars from breeding companies and within breeding strains from Danko was relatively high. It is also noteworthy that no unique alleles were observed in accessions obtained from Danko and the values for mean diversity (h = 0.113) and unbiased diversity (uh = 0.118) were the lowest in this germplasm group. Evidence indicating low genetic diversity in Polish rye was previously obtained by Rafalski et al. (2002) who analyzed five open-pollinated cultivars. The low diversity level suggested that breeding programs of open-pollinated cultivars had reached their limit of genetic variability (Rafalski et al. 2002). Similarly, in the study of Shang et al. (2006), the level of genetic similarity (GS) in cultivated rye from different parts of the world was found to be insignificant. An AFLP-based study involving a broader range of plant material indicated that the genetic diversity of rye in Europe tended to decrease when moving from East to North (Chikmawati et al. 2012), with the lowest genetic differentiation and genetic distance found between accessions from Eastern and Central Europe. These findings suggest that genetic variation between contemporary cultivars has been reduced during the breeding process. The reason for this is that the majority of rye breeding programs focus on common aims, with the main target traits, besides grain yield, being tolerance to abiotic stresses, performance at high plant density, good kernel development, and resistance to powdery mildew or stem rust. As a consequence, commercial cultivars become more uniform and homogeneous (Chikmawati et al. 2012; Persson and von Bothmer 2000). Additionally, the majority of the open-pollinated rye forms currently cultivated in Western and Central Europe have a common origin and are directly descended from the Petkus population (Fischer et al. 2010). In the present study as well, the contemporary cultivars, which were derived from the Petkus heterotic pool (according to the obtained pedigree information) and originated mainly from Poland and Germany, were closely grouped in the PCoA plot (Online Resource 7). Moreover, most of them (82.5 %) were assigned to the same model-based population (P1) by STRUCTURE analysis. It is also worth pointing out that the accession derived from the Carsten heterotic pool was likewise assigned to the P1 model-based population and grouped next to accessions derived from Petkus in the PCoA plot. However, the available data are not sufficient for a thorough discussion of this issue. Information on genetic background was not accessible for many cultivars, and based on the obtained information, only one accession representing the Carsten gene pool was included in the study. Nevertheless, the results obtained confirm the common genetic background of contemporary cultivars and suggest extensive germplasm exchange.
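The diversity (h) and unbiased diversity (uh) values quoted above can be illustrated with a short sketch. It assumes the commonly used definitions h = 1 − Σp_i² and uh = (n/(n − 1))·h, where p_i are allele frequencies at a locus and n is the number of samples; the exact definitions and software settings used in the study are not given in this section, and the function name and input values below are hypothetical.

```python
def diversity(allele_freqs, n_samples):
    """Return (h, uh) for one locus.

    Assumed definitions (not confirmed by the text):
      h  = 1 - sum(p_i ** 2)
      uh = n / (n - 1) * h
    """
    if n_samples < 2:
        raise ValueError("n_samples must be at least 2 for the unbiased estimate")
    h = 1.0 - sum(p * p for p in allele_freqs)
    uh = n_samples / (n_samples - 1) * h
    return h, uh


# Illustrative values only, not data from the study.
h, uh = diversity([0.9, 0.1], n_samples=20)
print(round(h, 3), round(uh, 3))
```

Because the correction factor n/(n − 1) approaches 1 as n grows, h and uh differ noticeably only for small groups of accessions.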

The current breeding practices result in a decrease in genetic distance and can lead to a reduction in heterozygosity (Falconer and Mackay 1996). This causes a narrowing of the genetic pool of rye cultivars, which is highly undesirable since it can result in the loss of significant features like resistance to different stresses and nutritional value. Such plants may also have problems in adapting to changing environmental conditions. The introduction of landraces into rye breeding programs could extend the genetic pool of currently cultivated accessions. Landraces are early, highly heterogeneous forms of cultivars that were selected for use in subsistence agriculture, so they give a relatively low but stable yield (McCouch 2004). Our results show that the 155 landraces included in the analysis constituted the most divergent group, with the highest within-group variance. Similar results were obtained by Bolibok-Brągoszewska et al. (2014) using DArT markers. Landraces displayed considerable diversity and were distant from accessions obtained from breeders. This finding is not surprising because landraces are closely related to the wild ancestors of rye and embody much more variation than modern high-yielding cultivars (McCouch 2004), and it has been postulated that landraces could represent a valuable source of genetic variation that has been lost in modern breeding (Gailîte et al. 2013; Boczkowska and Traczyk 2013). Landraces may also be a source of potentially useful alleles for rye breeding. The results of Falke et al. (2008) indicated that exotic genetic resources like landraces in rye carry favorable alleles for baking quality traits, which may be exploited for improving elite breeding material by marker-assisted selection (MAS). Nevertheless, breeders are reluctant to introduce genetically distinct accessions like landraces into breeding programs, fearing that some of the introduced alleles may negatively influence breeding values. Thus, the results of the present study could be used to facilitate the selection of material for broadening the genetic diversity within heterotic pools and to promote the exploitation of less adapted germplasm in breeding programs.

In general, the clustering of rye accessions was more weakly correlated with geographic origin than with the source of seeds. However, some accessions grouped according to their geographical origin could be distinguished in the dendrogram and PCoA plot. Similar results were obtained using DArT markers (Bolibok-Brągoszewska et al. 2014). Weak correlations between genetic diversity in rye and geographical origin were identified by Chikmawati et al. (2012), Shang et al. (2006), and Persson and von Bothmer (2000) through analyses of AFLPs, SSRs, and organellar sequences, respectively. In the present study, accessions from Central Europe displayed the most visible genetic separation. However, this observation may result from the fact that the majority of Central European accessions were obtained directly from breeding companies in which the same selection criteria and breeding methods are applied.

Several studies on genetic diversity have shown that growth habit (winter vs. spring) is one of the major determinants of population structure in cereals (Ma et al. 2004; Alheit et al. 2012; Wang et al. 2012). Unfortunately, we were unable to check how this factor influences the genetic diversity structure in rye, since only one variety in our germplasm set represented the spring type (Bojko), and information on the growth habit of landraces was not available. Nevertheless, the Bojko variety grouped closely with other accessions obtained from breeding companies.

Another goal of our study was the selection of accessions to form a core collection. A core collection is a set of accessions derived from an existing collection and selected to represent the widest possible spectrum of genetic variation in a given population in order to minimize the cost of genetic conservation (Brown 1989). The selected core collection constituted 6.8 % of all rye accessions and captured 93.7 % of the SSR alleles identified. The rye core entries were selected from a higher number of accessions than used in any previous study. A limited number of accessions characterized by high genetic variation is a useful tool in the study of diversity within a population. Core collections play a very important role, especially in gene banks, which face significant problems connected with the size and organization of plant germplasm collections. Currently, over 80 rye germplasm collections are maintained and the total number of accessions is estimated to be over 21 000 (FAO 2010). The proposed core collection could be the first step to simplify access to the genetic diversity contained in rye germplasm and to enable its efficient use in basic and applied research. Moreover, our core collection could be treated as a testing panel in evaluating newly developed genetic markers or in studies on sequence diversity of selected genome fragments.
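The idea of capturing most alleles with a small subset of accessions can be illustrated with a simple greedy heuristic. The sketch below is not the procedure used in this study (the selection method is not described in this section); it only shows, under that stated assumption, how allele coverage grows as accessions are added. The function name, the accession labels, and the allele labels are all hypothetical.

```python
def greedy_core(alleles_by_accession, target_coverage):
    """Pick accessions until the requested fraction of alleles is covered.

    `alleles_by_accession` maps an accession name to the set of alleles
    it carries. This is an illustrative greedy heuristic, not the method
    used in the study.
    """
    all_alleles = set().union(*alleles_by_accession.values())
    if not all_alleles:
        return [], 0.0
    remaining = dict(alleles_by_accession)
    covered, core = set(), []
    while remaining and len(covered) / len(all_alleles) < target_coverage:
        # Choose the accession adding the most new alleles; ties are
        # broken alphabetically so the result is deterministic.
        best = max(sorted(remaining), key=lambda a: len(remaining[a] - covered))
        if not remaining[best] - covered:
            break  # no remaining accession adds a new allele
        covered |= remaining.pop(best)
        core.append(best)
    return core, len(covered) / len(all_alleles)


# Toy data for illustration only.
toy = {
    "A": {"a1", "a2", "a3"},
    "B": {"a3", "a4"},
    "C": {"a1"},
    "D": {"a5"},
}
core, coverage = greedy_core(toy, target_coverage=0.8)
print(core, coverage)
```

In this toy example two of the four accessions are enough to reach the requested coverage, which mirrors the general trade-off between core size and allele retention described above.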

Conclusions

In the present study, we conducted a large and comprehensive analysis of genetic diversity in S. cereale. We revealed a relatively limited genetic diversity within contemporary rye cultivars and breeding strains, indicating a common genetic background and germplasm exchange. We also found little correlation between the clustering of rye accessions, including improved cultivars, and their geographic origin. The results highlight the need to broaden the genetic diversity in breeding programs.

We also defined a core collection of rye accessions, which represented the vast majority of the observed genetic diversity. The proposed core collection could help simplify germplasm maintenance in ex situ collections, and could find practical use in basic research involving the testing of newly developed markers and analyses of sequence diversity of selected genes/gene fragments.

The data obtained also confirmed that the set of 22 SSR markers used in the study provided a realistic picture of genetic diversity among 367 rye accessions. Therefore, this marker set can be recommended for further germplasm characterization projects.

The results obtained in our study could also be helpful for breeders considering the introduction of less adapted germplasm into breeding programs.