Cucumber (Cucumis sativus L.) is an annual climbing herb belonging to the Cucurbitaceae family and is one of the most important economic crops worldwide. Recently, breeding efficient cucumber varieties with desirable traits has gained much attention. Exploiting heterosis is an important approach to breeding new cucumber varieties, but its prediction still relies mainly on experience. Moreover, it is laborious: a large number of hybrids must be screened to find promising combinations, which requires a large investment of manpower, material resources, and land. Thus, the selection efficiency is low and the breeding cycle is long. To improve breeding efficiency and reduce breeding costs, parental characteristics can be used to predict the performance of potential hybrids. However, the main yield and quality traits of cucumber are mostly quantitative traits affected by both genetic and environmental factors, which can make prediction more difficult. A genomic selection model can predict the phenotypes of potential hybrids by modeling the genotype and phenotype information of a training population. However, genomic selection is still in its infancy in horticultural crop breeding, especially in cucumber breeding. To our knowledge, there have been no reports on the application of genomic selection models to cucumber breeding. Genomic selection therefore has theoretical and practical significance for efficiently exploring and predicting heterosis in yield and quality traits based on trait inheritance and genome-wide information. In this study, we collected sequencing information for the parental lines and phenotypic information for the training populations in field trials. We then explored the SNP-BLUP model, based on genotype matrices, and the GBLUP model, based on genetic relationship matrices, and discussed the predictive performance of these two models for important agronomic traits and for yield-trait
heterosis of cucumber. The findings of this work are as follows:

1. Genetic diversity and domestication selection in cucumber germplasm: We collected eighty-two representative cucumber germplasms and divided them into six groups based on resequencing results and population structure analysis, i.e., wild type, semi-wild type, European greenhouse type, European and American open field type, South China fresh type, and North China fresh type. According to the grouping and domestication analysis, the domestication process of cucumber is divided into a domestication stage, from wild type to semi-wild type, and an improvement stage, from semi-wild type to the four cultivated groups. The domestication stage experienced more severe selection than the improvement stage.

2. Phenotypic data collection for the training populations: Two training populations, a diallel cross and an NC II design, were created from the existing cucumber germplasms, containing 268 combinations in total. Data on 13 important agronomic traits were collected from plants grown in a plastic tunnel in autumn 2018, spring 2019, autumn 2019, and spring 2020, and yield heterosis was calculated. The results show that the cucumber inbred lines in the European greenhouse group have excellent general combining ability for yield. Compared with the autumn data, the spring data generally show more days from sowing to female flower opening, a higher female flower node rate, and higher yield. Heterosis is more pronounced in cross-ecotype hybrids, but bitter fruit may occur.

3. SNP-BLUP model construction: Based on the phenotype and genotype data of the training population, a total of 11 models including ridge regression, LASSO, Bayes A, Bayes B, Bayes C, Bayesian ridge regression, Bayesian LASSO, RR-BLUP, random forest, the support vector machine with a radial kernel function, and the
support vector machine with a polynomial kernel function were established for additive SNP-BLUP model training. The training results show that different models have similar predictive ability on the same phenotypic data, and that there is a significant positive correlation between trait heritability and model predictive ability. For traits with higher heritability, a smaller training population is enough to obtain accurate predictions. For most traits, about 500 molecular markers are enough to reach the best predictive ability. In our study, the additive-dominance Bayes B model was used as a control to establish a non-additive SNP-BLUP model. We used three model validation strategies to estimate predictive ability: cross-validation within the same population and season, cross-season validation within the same population, and cross-population validation. The results show that, under cross-validation within the same population and season, the additive-dominance Bayes B model was significantly better than the additive Bayes B model, although the improvement for high-heritability traits was not obvious. Most traits performed well under the cross-season and cross-population validation strategies.

4. GBLUP model construction: A GBLUP model based on GCA model theory was established from the phenotypic information and the genetic relationship matrices of the training populations. A GBLUP model based on the orthogonal prior hypothesis of genetic variance components was used to estimate the variance components of each trait in different seasons, namely the additive, dominance, additive-by-additive, and residual effects. The results of cross-validation within the same population and season, cross-season validation within the same population, and cross-population validation show that the GBLUP model with non-additive effects predicts more accurately than the additive GBLUP model for most traits. The predictive ability under cross-season validation within
the same population depends on the correlation of phenotypic data among seasons. For cross-population validation within similar environments, the GBLUP model including non-additive effects has more predictive potential, whereas for cross-population validation across larger environmental differences, the additive GBLUP model was sufficient to obtain the best predictive ability.

5. Heterosis prediction model construction based on SNP-BLUP and GBLUP: The dominance component estimated by the SNP-BLUP model and the SCA estimated by the GBLUP model were significantly positively correlated with cucumber yield heterosis. In contrast, the additive component estimated by the SNP-BLUP model, the GCA estimated by the GBLUP model, genetic distance, and the genetic relationship matrix all had weak or non-significant correlations with cucumber yield heterosis. Three heterosis prediction models, the GD model, the Dominance model, and the SCA model, were developed based on genetic distance, the dominance component, and SCA as predictors, respectively. Taking the predictive ability of the GD model as the control, the increase in predictive ability ranged from 0.15 to 0.74 for the Dominance model and from 0.15 to 0.59 for the SCA model.
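
The link between a dominance component and mid-parent heterosis described in finding 5 can be illustrated with a minimal sketch. It is not the pipeline used in this study: the data are simulated, the parents are assumed to be fully inbred and coded 0/2 per SNP, the ridge penalty `lam` is an arbitrary value, and the function names (`hybrid_codings`, `ridge_effects`, `mid_parent_heterosis`) are hypothetical. Under these assumptions, the additive part of a hybrid equals the mid-parent value, so heterosis is driven by the heterozygous loci captured in the dominance coding.

```python
import numpy as np


def hybrid_codings(p1, p2):
    """Additive (0/1/2) and dominance (0/1) codings of an F1 hybrid.

    Assumes both parents are fully inbred and coded 0 or 2 per SNP.
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    additive = (p1 + p2) / 2.0            # allele dosage of the hybrid
    dominance = (p1 != p2).astype(float)  # 1 where the hybrid is heterozygous
    return additive, dominance


def ridge_effects(X, y, lam):
    """Ridge-type marker effects: solve (X'X + lam*I) b = X'y on centered data."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    lhs = Xc.T @ Xc + lam * np.eye(X.shape[1])
    return np.linalg.solve(lhs, Xc.T @ yc)


def mid_parent_heterosis(f1, p1, p2):
    """Mid-parent heterosis: hybrid value minus the mean of its two parents."""
    return f1 - (p1 + p2) / 2.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_lines, n_snps = 12, 200  # illustrative sizes, not those of the study

    # Simulated inbred parents (0/2) and all pairwise hybrids (half diallel).
    parents = 2.0 * rng.integers(0, 2, size=(n_lines, n_snps))
    i_idx, j_idx = np.triu_indices(n_lines, k=1)
    A, D = hybrid_codings(parents[i_idx], parents[j_idx])

    # Simulated additive and dominance marker effects (assumed values).
    a_eff = rng.normal(0.0, 1.0, n_snps)
    d_eff = rng.normal(0.3, 0.5, n_snps)
    parent_pheno = parents @ a_eff
    f1_pheno = A @ a_eff + D @ d_eff + rng.normal(0.0, 1.0, len(i_idx))
    heterosis = mid_parent_heterosis(
        f1_pheno, parent_pheno[i_idx], parent_pheno[j_idx]
    )

    # Joint additive + dominance ridge fit; the last n_snps effects are dominance.
    b = ridge_effects(np.hstack([A, D]), f1_pheno, lam=10.0)
    dom_component = D @ b[n_snps:]
    print("corr(dominance component, heterosis):",
          np.corrcoef(dom_component, heterosis)[0, 1])
```

The correlation here is computed in-sample for brevity; an evaluation comparable to the validation strategies described above would fit the marker effects on one subset of hybrids and correlate predictions with observed heterosis on another.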