Introduction

Cucumber (Cucumis sativus L., 2n = 14) is a worldwide important horticultural crop and has served as the research model plant of sex determination of Cucurbitaceae1. Cucumber fruit is a type of fleshy fruit that is usually consumed at immature stage (1–2 weeks after anthesis). In horticultural industry, fruit size and shape are important traits for selling and the determination of its usages2. According to the recently published papers, cucumber can be divided into six market classes or four geographic groups that exhibited extensive variations in shape or size: European long type, European short type, North/South China long type, North American short type, mini cucumber, and Japanese long type2,3.

Fruit size and shape are usually evaluated by the length (L) and diameter (D) of fruit, or its ratio (L/D), and are commonly modulated by both quantitative trait loci (QTLs) and some environmental factors2,4,5. Traditional QTL mapping, a reliable method of determining complex traits, was performed on fruit size and shape by analyzing F2, F3, BC (backcross) populations and recombinant inbred lines (RILs), and a series of expected QTLs were successfully detected6,7,8,9. For example, the first mapping for cucumber fruit shape-related QTLs was carried out by Kennard and Havey10 via using F3 and backcross populations, and 12 QTLs were identified to be involved in regulation of fruit length (FL), fruit diameter (FD) and the ratio of length to diameter (L/D), respectively. By using F2 and F3 populations, Serquen et al.11 detected 11 QTLs that were related with FL, FD, L/D ratio and fruit weight (FW). Recently, Bo et al.12 mapped 11 QTLs responsible for the regulation of FL, FD and FW using RIL populations developed from cultivated × semi-wild Xishuangbanna cucumber lines. Using three QTL models, Weng et al.2 mapped 12 consensus fruit size related QTLs with F2, F3 and RIL populations came from Gy14 (North American short fruit cucumber) × 9930 (North China long fruit cucumber) at multiple developmental stages and environments. By using two segregating populations from WI7200 (cultivated cucumber) × WI7167 (semi-wild Xishuangbanna cucumber), Pan et al.13 detected 21 QTLs that were involved in the regulation of mature fruit length (MFL), mature fruit diameter (MFD), FW and L/D ratio. With the populations originated from WI7238 (long fruit) × WI7239 (round fruit), Pan et al.14 detected two QTLs, FS1.2 and FS2.1, which interacted with each other and exerted major effects on fruit shape. Further analysis revealed that CsSUN might be the candidate gene for FS1.2. Additionally, 10 possible candidate genes were identified in the FS2.1 locus such as CsTRM5, which was an ortholog of TRM5 gene in Arabidopsis and tomato15. More recently, three genes have been functionally validated. CsFUL1, a functional allele FRUITFULL-like MADS-box gene, was identified by analyzing the re-sequenced data of 150 cucumber lines, which regulates cucumber fruit length via exerting negative effects on the expression of CsSUP and auxin transporters16. Using two mutants from ethyl methanesulfonate mutagenesis, two fruit length controlling genes, Short Fruit 1 (SF1) and SF2 were isolated and functionally identified as a cucurbit-specific RING-type E3 ligase and a Histone Deacetylase Complex 1 (HDC1) homologue, respectively17,18.

To date, although five QTL/genes have already been cloned14,15,16,17,18, more QTL/genes need to be isolated and functionally validated2,4,19. Moreover, the molecular regulatory mechanism of cucumber fruit shape remains poorly understood. Conventional mapping populations such as F2, F3 and BC1, which are temporary ones, or RIL, which is permanent one, are usually used to identify QTLs with large effects, whereas minor effect or epistatic QTLs might be masked20,21,22. Therefore, the improvement of mapping populations has attracted continuously increased attentions from horticultural scientists and breeders22,23,24. Chromosome segment substitution line (CSSL), a kind of genetic material that randomly harbor a specific chromosomal segment of donor parent under the recipient genetic background, has been widely applied in a series of crop genetic research such as identification and mapping of QTLs associated with traits of interest20,25. Comparing with F2, F3, BC1 and RIL, CSSLs display a significant advantage that detection capacity of QTLs can be enhanced due to the elimination of blurring effects from multiple or interacting ones26. So far, lot of QTLs/genes of interest have been fine-mapped by using CSSLs in many plants such as maize27,28,29, cotton30,31,32, soybean33,34,35, Brassica rapa22,36, peanut37, wheat38, tomato39,40,41, and rice42,43,44,45.

In current study, a population consisting of 71 CSSLs was successfully built by using backcross progenies that came from a cross of CNS21 (long-stick-fruit) as the donor parent and RNS7 (round fruit) as the recurrent parent. CNS21, the Northern-China type inbred line has long stick commercial fruit (L/D > 10) with the average length of 36.20 ± 3.25 cm, green peel as well as white spines, while RNS7 sets round commercial fruit (L/D ≈ 1) with the average length of 7.30 ± 0.40 cm, white peel and black spines (Supplementary Table S3)19. A total of 114 InDel (insertion/deletions of DNA sequences) markers that showed polymorphisms between CNS21 and RNS7 were adopted in subsequent marker-assisted selection in order to identify fruit shape related QTLs. Totally 21 QTLs associated with fruit shape and 4 QTLs associated with fruit ground and flesh color, and seed size were detected. The results will facilitate the future fine mapping and cloning of these fruit related genes, thus benefiting our understanding about the genetic base of cucumber fruit related traits.

Results

Construction of cucumber CSSLs

The outline for cucumber CSSL construction was schematically illustrated in Fig. 1. To construct cucumber CSSL with RNS7 genetic background, the F1 plants from RNS7 (recurrent parent) × CNS21 (donor parent) were consecutively backcrossed to RNS7 four times in order to yield the BC4F1 generation. As a result, 500 BC4F1 individual plants were successfully obtained from 34 BC3F1 lines. All generations from BC1F1 to BC4F1 were screened using 114 InDel markers (Supplementary Fig. S1; Supplementary Table S4) based on the following criteria: (I) the most of genomes displayed high-level homozygosity with RNS7, except one to two substitutions from CNS21; (II) the selected individuals harbored less CNS21-derived chromosomal segments, which were able to cover the whole genome of CNS21 with overlapping regions between different ones. To obtain the desired CSSLs, 60 BC4F1 lines were self-pollinated and the resulted 1980 BC4F2 plants were further investigated by marker-assisted selection (MAS) based on cucumber 9930 V3.0 draft genome. A total of 71 independent BC4F2 substitution lines were kept as the cucumber CSSL population (Fig. 2).

Figure 1
figure 1

Schematic illustration for the construction of chromosome segment substitution lines (CSSLs) covering the whole cucumber genome. QTLs quantitative trait loci, MAS marker-assisted selection.

Figure 2
figure 2

Schematic illustration for the genotypes of 71 CSSLs based on cucumber 9930 V3.0 draft genome. The white regions represent homologous segments from the recurrent parent, RNS7, and the black regions represent homologous segments from the donor parent, CNS21. CSSLs are indicated on the vertical axis.

Characterization of substituted chromosomal segments in the CSSLs

The 71 CSSLs, which possessed the genetic background of RNS7, totally harbored 76 substituted segments from CNS21. Thus this CSSL population contained 1.07 segments per line and 10.86 segments per chromosome on average (Fig. 3; Supplementary Table S1). Among these 71 CSSLs, 66 lines harbored only one CNS21-derived chromosomal segment, 5 harbored two segments, and none harbored three or more segments (Fig. 3). Numbers of the substituted segments were 20, 12, 12, 6, 10, 7 and 9 in chromosome 1 to chromosome 7, respectively (Supplementary Table S1). The summary length of CNS21-derived chromosomal segments in the CSSL population was approximately 546.96 Mb, which equated to 2.61 times of the sequenced genome size of cucumber (Supplementary Table S1). The length of substitutions in each chromosome ranged from 47.66 Mb (1.78 times of the chromosome size) in chromosome 4 to 140.81 Mb (4.28 times of the chromosome size) in chromosome 1, and was averaged as about 78.14 Mb (Supplementary Table S1). In each substitution line, the CNS21-derived chromosomal segments ranged in length from 1.73 to 19.31 Mb, with an average of 7.19 Mb (Supplementary Table S1). Among these CNS21-derived substitutions, 24 segments were smaller than 5 Mb, 37 were 5–10 Mb, and 15 were over 10 Mb (Fig. 4). The recovery ratio of the 71 CSSLs ranged from 95.63 to 99.03% (Fig. 5).

Figure 3
figure 3

Occurrence frequency of substitution events in CSSLs.

Figure 4
figure 4

Size distribution of substituted chromosomal segments in CSSLs.

Figure 5
figure 5

The recovery ratio of recurrent genome in each CSSL.

Identification of QTLs for fruit shape

To identify the chromosomal segments involved in fruit shape, phenotypic variations of fruit shape related parameters including fruit length, fruit diameter and the ratio of length to diameter were investigated in the CSSL population at anthesis, commercial and mature fruit stages (Supplementary Table S3). The fruit shape was markedly different between the two parents, and the L/D index of CNS21 was consistently greater than that of RNS719. In the CSSL population, the fruit length and fruit diameter segregated significantly at immature fruit stage, ranging from 44 to 111 mm and 36.5 to 80 mm, respectively (Table 1; Supplementary Table S3; Fig. 7). The similar results were found at anthesis and mature fruit stages as well (Table 1; Supplementary Table S3). Of these CSSLs, fruit shape of 10 lines were dramatically different from RNS7 (Table 1). By analyzing the CSSL population, we detected 21 fruit shape related QTLs on chromosomes 1, 2, 3, 5 and 6, which included 2 responsible for ovary length (OL), 2 for ovary diameter (OD), 2 for ovary shape index (OSI), 2 for commercial fruit length (FL), 1 for commercial fruit diameter (FD), 4 for commercial fruit shape index (FSI), 2 for mature fruit length (MFL), 3 for mature fruit diameter (MFD) and 3 for mature fruit shape index (FSI) (Fig. 6). Of these 21 QTLs, 16 were detected in the region of 22.73–28.27 Mb on chromosome 1 and the region of 5.10–14.23 Mb on chromosome 2, respectively (Fig. 6; Supplementary Fig. S4; Table 1). QTLs for OL, OD, OSI, FL, FSI, MFL, MFD and MFSI were identified in the aforementioned regions (Fig. 6; Table 1), confirming the great contributions of loci on chromosomes 1 and 2 to fruit shape (Supplementary Fig. S4). In addition, one QTL for FD was mapped in the region of 16.22–22.78 Mb on chromosome 3 as well (Fig. 6; Table 1). Two QTLs for FSI were mapped in the region of 16.22–22.78 Mb on chromosome 3 and 11.70–17.62 Mb on chromosome 6 (Fig. 6; Table 1). Two QTLs for MFD and MFSI were detected in the region of 0–10.61 Mb on chromosome 5 (Fig. 6; Table 1).

Table 1 Phenotypic comparisons and additive effects of CSSLs carrying QTLs for fruit shape.
Figure 6
figure 6

Chromosomal distribution of 21 QTLs for cucumber fruit shape. DNA markers and their physical locations are indicated on the left side of each chromosome based on cucumber 9930 V3.0 draft genome. Short stripes filled with different hatched regions on right sides of chromosomes represent the locations of different QTLs.

Identification of QTLs for seed shape and fruit color

In addition to fruit shape, the two parents showed significant differences in other fruit traits such as commercial fruit ground color (FGC), fruit flesh color (FLC) and seed shape19. Using CSSL2-7, two QTLs for seed length (SDL) and seed width (SW) were identified in the region of 5.10–14.23 Mb on chromosome 2 (Supplementary Fig. S2; Supplementary Table S2). Using CSSL3-11, two QTLs associated with FGC and FLC were uncovered in the 33.31–40.88 Mb region of chromosome 3 (Supplementary Fig. S3; Supplementary Table S2).

Discussion

Fruit shape/size, an important quality trait in cucumber, is often affected by both genetic composition and environmental conditions. To date, there is little information available on the genetic mechanisms of fruit shape/size. CSSLs are ideal materials to detect QTLs and evaluate their contributions to the trait of interest as a single Mendelian factor. CSSLs have extensively been applied for the identification of genes that control important agronomic traits in rice46,47, maize28,29, Brassica rapa22,36, tomato40,41, and so on. However, thus far, only three sets of cucumber CSSLs have already been constructed. The first set of CSSLs was created through a cross of the wild cucumber PI183967 (donor) and the cultivated line Xintaimici (receptor), providing new resources for utilization of valuable genes from wild cucumber48. The other set was adopted to detect powdery mildew (PM) resistance-related genes49. The third set is in the present study (Fig. 2). Polymorphic marker density across whole genome profoundly influences the quality of CSSLs and thus plays crucial roles in the creation of CSSLs50. The CSSLs constructed by Li et al.48 only contain 31 lines including 10 lines harboring two substitution segments, and their substitution segments were big because of small number of CSSLs and makers used in selection. Although the CSSLs for detecting PM resistance-related genes have 17 families with 499 plants, only two markers, one is associated with dwarf plants and the other with PM resistance, were used in the construction of the CSSLs49. However, 114 InDel markers that were distributed on the 7 chromosomes relatively evenly were adopted for the construction of CSSLs in the current study (Supplementary Fig. S1). Moreover, 66 of the 71 CSSLs contained single substituted segment and the other 5 lines were identified to contain two substituted segments (Fig. 3). The length of substituted chromosomal segments in each line ranged from 1.73 to 19.31 Mb, and the average value of these segments was approximately 7.19 Mb (Supplementary Table S1). So these lines harbored a high recovery rate of 95.63–99.03% of the recurrent parent genome and simultaneously the genetic background noise was tremendously decreased (Fig. 5), thus being considered as a powerful tool to identify, map and validate QTLs of interest.

Using the CSSL population, totally, 21 QTLs responsible for cucumber fruit shape were identified and of which, eight QTLs were detected in the region of 22.73–28.27 Mb on chromosome 1 (Fig. 6; Table 1), where numerous QTLs for OL, OD, OSI, FL, FSI, MFL, MFD and MFSI were detected in previous studies (Fig. 6; Supplementary Fig. S4)19. As the best candidate of FS1.2, CsSUN was located in this region, being a major QTL of fruit shape (Fig. 7a)14,19. Comparing with previously reported regions on chromosome 119,51, the size of estimated QTL region in our research was much smaller (Supplementary Fig. S4). The QTLs for OL, OD, OSI, FL, FSI, MFL, MFD and MFSI were identified at the long arm of chromosome 2 in the present study (Fig. 6; Supplementary Fig. S4). The chromosome 2 region harboring these detected QTLs displayed overlapping, but much smaller than the previous three reports by Weng et al.2, Gao et al.19 and Pan et al.51. This QTL(s) on chromosome 2 was (were) uncovered as (a) major one(s) by Pan et al.14, and our data provided a direct evidence for FS2.1 being a major QTL (Fig. 7b; Table 1). Wu et al.15 reported that there were 10 possible candidate genes in FS2.1 locus such as CsTRM5, an ortholog of tomato TRM5 that was able to balance the OVATE and SlOFP20-mediated cell division patterns to determine the final tomato shapes. It was thus that CsTRM5 was regarded as the most possible candidate gene for FS2.1, but they did not give direct genetic evidence of sequence difference or gene expression between the two parental lines used in their research15. Our resequencing results revealed that some single nucleotide polymorphisms (SNPs) or InDels in CsSUN and CsTRM5 genes, which were just located on the region of 22.73–28.27 Mb on chromosome 1 and the region of 5.10–14.23 Mb on chromosome 2 respectively, between RNS7 and CNS21 (data not shown). We thus speculate that CsSUN and CsTRM5 genes could be the candidate genes for QTLs on chromosomes 1 and 2, respectively. However, more experimental evidence should be provided in future study to support this assumption. In addition, SF1 was localized in the region of 5.10–14.23 Mb on chromosome 2 in which FS2.1 was mapped in our study and previous studies (Supplementary Fig. S4)17,19. However, there is no difference in protein sequence and gene expression of SF1 between RNS7 and CNS21. The identification of major QTLs (R2 > 10%) on shortened regions of chromosomes 1 and 2 in the present study (Table 1; Supplementary Fig. S4) indicated that CSSLs could be an advantageous tool for fine mapping stable QTLs and give more information about these QTLs under different cucumber genetic backgrounds.

Figure 7
figure 7

Fruit phenotypes of CSSLs carrying QTLs for fruit shape. (af) CSSLs carrying QTLs for fruit shape. (g) A CSSL not carrying QTLs for fruit shape. (h) The recurrent parent RNS7. Scale bar = 1 cm.

Furthermore, the CSSLs harboring a single segment substitution make it feasible to mine minor-effect QTLs21,25,52,53,54. In the current study, five minor-effect QTLs (R2 ≤ 10%) associated with fruit shape were detected on chromosomes 3, 5 and 6 (Figs. 6, 7c–f; Table 1) and displayed a relatively complex relationship with previous studies2,12,13,19,51,55. In most cases, the identified minor-effects regions on the three chromosomes were well consistent with those described previously2,12,13,51,55, while the inconsistency was also revealed for FD3.2 and FSI3.3 on chromosome 3 with the detected effect regions on the same chromosome by Wei et al.55 and Pan et al.13, possibly due to the differences in genetic background, traits of interest or environmental conditions (Fig. 6; Supplementary Fig. S4). Up to date, none of them has yet been fine mapped and cloned because that it is scarcely possible to fine map or clone these minor QTLs using the F2, F3, BC or RIL populations. However, the CSSLs that we constructed in this present study provided an opportunity for isolating the minor QTLs related to cucumber fruit shape.

We also detected QTLs that were associated with seed size and fruit color in the present study (Supplementary Table S2). The QTLs for SDL and SW were identified in the same region for OL, OD, OSI, FL, FSI, MFL, MFD and MFSI on chromosome 2, suggesting that FS2.1 might have pleiotropic effects (Supplementary Fig. S2; Supplementary Fig. S4; Table 1; Supplementary Table S2). More recently, two consensus QTLs (CsSS2.1 and CsSS2.2) associated with seed size have been reported on chromosome 2 in a review paper by Guo et al.56, and CsSS2.1 displays overlapping, but larger than the identified QTLs for SDL and SW in the present study. The smaller QTL regions in this study will facilitate the future fine-mapping for genes responsible for seed size. The QTLs for FGC and FLC were observed in the distal region of chromosome 3 (Supplementary Fig. S3; Supplementary Table S2), being consistent with the previous results reported by Liu et al.57,58 and Tang et al.59. The w gene controlling white immature fruit color was localized in this region of chromosome 3, but no difference in coding sequence (CDS) of w was observed between RNS7 and CNS21. It will be very intriguing to reveal more candidate genes responsible for fruit color in future studies. In addition, QTLs related to other agronomic traits could be identified with these CSSLs.

In summary, we created a set of CSSLs that resulted from a cross between RNS7 (a round-fruit line) and CNS21 (a long-stick-fruit line) using 114 InDel markers covering the whole cucumber genome (9930 V3.0). Using these CSSLs, we identified 25 QTLs related to fruit shape, fruit color and seed size. Our study provides a powerful tool to isolate the QTLs for fruit shape, especially the minor ones, and other agronomic trait QTLs.

Materials and methods

Plant materials and growth conditions

Two parents CNS21 and RNS7, were used to construct the CSSL population19. Seeds of two parents were germinated in darkness at 28 °C overnight in petri dishes and grown in a growth chamber that was programed as photoperiod of 16 h, air temperature of 25 °C over light course and of 18 °C over dark course. Cucumber seedlings were transferred to a greenhouse of Shandong Agricultural University when they were grown to two-leaf stage. Standard field managements were carried out over cucumber cultivation course.

Molecular marker development

A total of 114 InDel markers that were distributed evenly throughout the cucumber (Chinese Long) 9930 V3.0 genome (https://cucurbitgenomics.org/organism/20) were developed from the data of sequenced genomes (Supplementary Fig. S1)19. And 18, 16, 19, 14, 19, 17 and 11 InDel markers were located on chromosome 1 to chromosome 7, respectively (Supplementary Fig. S1). The average distance was approximately 1.85 Mb between two neighboring markers on the same chromosome. The primers used in the present study were listed in Supplementary Table S4.

Construction of CSSLs

The schematic illustration for construction of CSSLs was displayed in Fig. 1. The F1 plants were generated from a cross between CNS21 and RNS7. Then consecutive backcross was performed between the F1 plants and RNS7 four times in order to generate the BC4F1. Over the course from BC1F1 to BC4F1, the genotype of each individual was analyzed by marker-assisted selection (MAS), and labelled as ‘B’ if the genotype was the same as ‘RNS7’, or as ‘H’ if the genotype was heterogeneous. The appropriate ‘H’ individuals in each generation were further chosen out based on the criteria of harboring CNS21-derived chromosomal segments as well as these segments covering whole cucumber genome, and finally 60 BC4F1 individuals were selected from 500 BC4F1 plants. Thereafter, BC4F2 population was generated by self-pollination of the selected BC4F1 plants for further MAS analysis based on the following principles: the substitution of chromosomal segment in RNS7 by a single CNS21-derived segment, the maintenance of genetic background at a high-level homozygosity with RNS7, and the existence of partially overlapping between substituted chromosomal segments. 114 InDel markers were applied in the method of MAS during the selection process. Ultimately, 71 BC4F2 lines were chosen from 1980 BC4F2 individuals to create a set of CSSLs for the further mapping of cucumber fruit trait QTLs.

DNA extraction and genotype analysis

Genomic DNAs were extracted from unexpanded young leaves of each plant following the CATB protocol reported by Murray and Thompson60. Then the above-mentioned 114 InDel markers were applied to detect the individuals over foreground and background selections. The target DNA segments were amplified on a ABI PCR machine (Thermo Fisher Scientific, USA) with the correspond InDel markers. The resulted products were separated on a 3.5% (W/V) agarose gel and photographed with a FR-980A image analysis system (Shanghai Furi Science and Technology, China).

Phenotypic analysis

Phenotypic data of CSSLs and RNS7 were recorded in the solar greenhouse of Shandong Agricultural University over three years (2016, 2017 and 2018). Two self-pollinated fruits were allowed on each plant. Fruit length (L), fruit diameter (D) and the ratio of length to diameter (L/D) were determined at three developmental stages: ovary length (OL), ovary diameter (OD) and ovary shape index (OSI) at anthesis; commercial fruit length (FL), fruit diameter (FD) and fruit shape index (FSI) at 10–12 days post pollination (dpp), and mature fruit length (MFL), mature fruit diameter (MFD) and mature fruit shape index (MFSI) at 45–55 dpp. At least 5 biological repeats were performed to collect all data. For each repeat, five to ten typical fruits at anthesis, three to five typical fruits at immature fruit stage, and two typical fruits at mature fruit stage, respectively, were selected for statistical analysis of phenotypic parameters including fruit length (L), fruit diameter (D) and the ratio of length to diameter. Seed length (SDL) and seed width (SW) were collected from at least 20 seeds. Data analyses were performed with statistical algorisms installed in MICROSOFT Excel 2013.

QTL mapping

Given there were significant differences in the average value of a trait between a CSSL and RNS7, the existence of QTLs was further estimated. The detection of QTLs was performed on the basis of the t-test results that were derived from the difference comparison between the mean values of each CSSL and RNS7 (P value ≤ 0.05). The additive effect of individual QTL was evaluated by following the formula below53: Additive effect = 1/2 × (value of CSSL-value of RNS7).

The observed phenotypic variance (R2), a parameter commonly adopted to evaluate the effect strength of a given QTL, was calculated for these detected QTLs by using the QTL IciMapping V4.1 software with previously introduced settings19. The QTLs with over 10% of R2 were defined as major-effect ones and the others were defined as minor-effect ones according to the previous study19.