| Cultivated soybean [Glycine max (L.) Merr.] is native to China, and its wild ancestor, the annual wild soybean (Glycine soja Sieb. et Zucc.), is widely distributed in East Asia. As wild soybean spread and adapted to the long days of high latitudes, its photoperiod sensitivity weakened. Cultivated soybean was derived from wild soybean through long-term artificial domestication. Wild and cultivated soybeans are believed to differ clearly, yet there is no absolute boundary between them. Like wild soybean, cultivated soybean had to adapt to the natural environment, but during domestication and spread it also had to adapt to the farming systems of the various eco-regions and to satisfy human needs for quality, yield and use. Cultivated soybean is a major source of oil and protein and plays an important role in human life. As a result, a variety of distinctive phenotypic traits were formed and preserved; they carry rich genomic variation and traces of natural and artificial selection. Studying the genetic composition of the morphological, ecological and quality traits of soybean, and analyzing the genetic mechanisms of evolution under artificial and natural selection, is therefore of significance for the genetic research, improvement and production of soybeans. Using genome-wide SNPs from 182 annual wild accessions (WA), 396 farmers’ landraces (LR) and 446 released cultivars (RC) collected from different eco-regions in China, which together form the Chinese soybean germplasm population (CSGP), 36,952 SNP linkage disequilibrium blocks (SNPLDBs) with a total of 100,092 alleles are constructed. Association analysis is carried out by RTM-GWAS for 10 quantitative traits, such as first flowering and 100-seed weight, and by the chi-square test for 4 qualitative traits. The evolutionary relationship between wild and cultivated soybeans is inferred from the diversity of phenotypic traits and molecular markers and from the spread patterns of alleles. The main results are as follows: 1. Genome
characteristics of the CSGP during evolution. Over its long history WA has accumulated more SNPs and recombination events, giving it high genetic diversity (θw), a high recombination rate (ρ) and a short linkage disequilibrium (LD) distance. RC shows the opposite pattern, and LR is intermediate. Most WA alleles are inherited by LR, while some alleles are lost and some new alleles accumulate in LR. RC is similar to LR, but a large number of alleles are lost, some alleles are obtained from WA by direct immigration, and a few new alleles accumulate. Artificial selection changes the number and frequency of alleles, which leads to loss of genetic diversity (π), population differentiation (FST) and changes in population structure. In addition, there is no absolute boundary between wild and cultivated soybeans. The analysis shows that the genomic characteristics of the sub-populations have been changed by artificial selection: during domestication, the decrease in θw is mainly due to the increase of LR, and the loss of π and the increase in FST are due to the increase in low-frequency alleles, whereas during improvement the massive loss of SNPs and low-frequency alleles is the main cause. 2. Genetic exchanges between wild and cultivated soybeans. The main phenotypic characteristics of WA are purple flowers, brown pubescence, black seed coat, indeterminate stem termination, small 100-seed weight (SW) and long first flowering time (Ff). In LR these change to white flowers, gray pubescence, yellow seed coat, determinate stem termination, greater SW and shorter Ff; RC continues the trend seen in LR, so the gap between WA and RC is even greater. Alleles at loci related to evolutionary traits are widely exchanged between different eco-regions within WA and within LR. The genetic distance (GD) between each WA eco-region and the corresponding LR eco-region is the smallest, and LR is affected by WA from both the north and the south. These results show that the eco-regions of the existing cultivated soybeans and the eco-regions of
WA are linked, and the existing cultivated soybeans are the result of joint domestication in multiple eco-regions. Based on these results, we should recognize that soybean evolution has followed a pattern in which artificial evolution, migration and gene exchange coexist and jointly drive the evolution of soybeans. New alleles in cultivated soybean may arise by mutation and recombination, but the most reasonable source of cultivated soybean alleles is wild soybean. The analysis of allele numbers and the high recombination rate shows that the formation of the large-seed type for 100-seed weight depended on mutation and recombination, whereas phenotypic variation in first flowering time may depend more on recombination. 3. Speculating on the origin of cultivated soybean. The domestication of cultivated soybean is complicated by many factors, such as natural crossing, mutation and propagation. The new alleles of LR suggest that domestication in the southern region may be older and may have spread more widely, so that domestication of cultivated soybean in the northern region occurred later. An analysis of allele migration between WA and LR shows that southern WA has the closest relationship with southern LR. In addition, the average GD and FST between southern WA and southern LR are small, and phylogenetic tree analysis shows that southern wild and southern cultivated soybeans are relatively closely related, which indicates that southern wild soybean is possibly the ancestral population from which cultivated soybean originated. The average GD and FST between southern LR and the LR of other eco-regions are also smaller, indicating that southern LR has the closest relationship to the others and may be the original population. Phenotypic and allele analyses also show that cultivated soybean originated from southern wild soybean. Therefore, the ancestral cultivated soybeans
arose and were domesticated in southern China (eco-regions III and IV) and gradually spread to other eco-regions. 4. Phenotypic variation of the CSGP and genetic diversity of QTLs. Fourteen traits, including morphological, ecological and quality traits, are analyzed, of which 9 quantitative traits vary widely. Differences among sub-populations are significant, and most traits also differ significantly among eco-regions within sub-populations (Duncan's test, p<0.01). Most traits have high broad-sense heritability (66.78%~99.07%). ANOVA shows that most quantitative traits differ significantly among genotypes, environments and genotype × environment interactions. Correlations among most traits are highly significant, but only closely related traits have high correlation coefficients. 100-seed weight and first flowering are moderately but significantly correlated with other traits. Traits under obvious artificial selection (such as 100-seed weight, first flowering and oil content) have higher heritability. There are large differences among eco-regions within WA in first flowering and growth period, within LR in first flowering, growth period and oil content, and within RC in 100-seed weight, oil content and protein content. This indicates that the target traits of domestication differ between acclimation stages. In addition, artificially selected traits differ among sub-populations and among eco-regions. RTM-GWAS of the 10 quantitative traits identifies a total of 654 QTLs involving 3483 alleles. The numbers of QTLs and alleles differ greatly among traits; the phenotypic variation explained is highest for first flowering and 100-seed weight (95.97% and 95.68%, respectively) and lowest for hypocotyl length (60.36%). Directly selected traits have more QTLs and alleles and a higher rate of phenotypic explanation than other traits, for example 100-seed weight, first
flowering and growth period. The number of QTLs and the rate of phenotypic explanation for ecological traits (first flowering, growth period, plant height, number of nodes and so on) are highest on Gm06, while those for quality traits (100-seed weight, oil content and protein content) are relatively concentrated on Gm20. The nucleotide diversity of the QTLs is higher than that of the genome (224), and it is higher in cultivated soybeans (LR 203, RC 194) than in WA (191). Twenty-two QTLs show significant FST among sub-populations, and significant FST among eco-regions within sub-populations increases with artificial selection from WA to LR to RC (14, 39 and 69, respectively). The distribution of most alleles differs significantly among sub-populations (chi-square test, p<0.05), and the significance of differences in QTL distribution among sub-populations increases with artificial selection. Tajima’s D is significantly less than 0 for 21 QTLs and significantly greater than 0 for 33 QTLs. The CSGP has therefore been shaped more by bottleneck effects than by directional selection. In WA, directional selection and bottleneck effects are roughly equal, whereas in LR and RC bottleneck effects outweigh directional selection. 5. Trait correlation and pleiotropy. A total of 72 QTLs tend to be pleiotropic, involving 33 SNPLDBs. Most of these SNPLDBs involve only 2 traits, and only a few involve multiple traits. The number of pleiotropic QTLs differs among traits. Traits under strong artificial selection pressure, such as first flowering time, mature period, reproductive stage and 100-seed weight, have more pleiotropic QTLs. Pleiotropic QTLs are distributed on every chromosome, with the most (6) on chromosome 6. First flowering and mature period share 7 pleiotropic QTLs, pleiotropic QTLs also occur between first flowering and reproductive stage, and the other trait pairs basically share the same QTLs. Most trait pairs with pleiotropic QTLs are highly correlated, such as first flowering and mature period, mature period and
reproductive stage, and plant height and number of stem nodes. A second type reflects the interaction of traits during selection, such as 100-seed weight and hypocotyl length. A third type shows a strong negative correlation, such as plant height and 100-seed weight. A fourth type shows high correlation without a direct selection relationship, such as 100-seed weight and reproductive stage. The analysis suggests that a single allele may have inconsistent effects on two traits, which may offer a new way of thinking about breeding improvement. Coordinating multiple traits is particularly critical in a complex breeding process. For example, when selecting an allele of Gm05BLOCK3783850637338521435, allele-2 can be considered in eco-region I because of its short frost-free period; allele-1 should be used in eco-region II, which has a slightly longer frost-free period; and in the southern eco-regions, which have a long frost-free period and where spring, summer and autumn soybeans coexist, alleles should be chosen according to the sowing and farming system to obtain high yield. 6. Genealogical tracing of specific alleles. The alleles of the 100-seed-weight QTLs are combined with soybean pedigree relationships to analyze the transmission of alleles between WA and RC and the distribution of new alleles among accessions within LR and RC. The sources of new and directly immigrated alleles are clarified. First, directly transmitted alleles may indeed exist. Second, the loss of information on some ancestral parents leaves it unclear whether an allele is new or directly immigrated. Third, analysis and inference indicate that some of the directly transmitted and new alleles originated from landraces, owing to the loss of alleles during breeding in small populations or to the indistinct boundary between LR and RC. 7. Optimal combination of some agronomic traits. Optimal combinations for single and multiple traits are designed using the QTLs
obtained by GWAS. The analysis finds that favorable combinations can be obtained for most traits under the selection conditions. The use of alleles from some wild materials is key to improving protein content. The optimal combination for 100-seed weight is relatively easy to obtain, and the optimal multi-trait combination design for the third eco-region is analyzed in particular. |
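
The diversity statistics cited in results 1 and 4 (θw, π and Tajima's D) have standard textbook definitions. The sketch below shows how they can be computed from a small haplotype matrix. It is an illustration only, not the pipeline used in this study: it assumes biallelic SNPs coded 0/1, no missing data, and totals summed over sites rather than per-site rates. The function name `diversity_stats` and the example matrix are hypothetical.

```python
from math import sqrt


def diversity_stats(haplotypes):
    """Return (S, theta_w, pi, tajima_d) for a list of 0/1 haplotype rows.

    Illustrative assumptions: biallelic sites, no missing data, and
    statistics reported as totals over all sites (not per-site rates).
    tajima_d is None when it is undefined (no segregating sites, or a
    sample too small for the variance term to be positive).
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    n_sites = len(haplotypes[0])

    # Number of copies of allele 1 at each site across the sample.
    counts = [sum(h[j] for h in haplotypes) for j in range(n_sites)]
    # Keep only segregating sites (both alleles present in the sample).
    seg = [c for c in counts if 0 < c < n]
    S = len(seg)

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))

    # Watterson's estimator: segregating sites scaled by the harmonic number.
    theta_w = S / a1
    # Nucleotide diversity: mean number of pairwise differences.
    pi = sum(2.0 * c * (n - c) / (n * (n - 1)) for c in seg)

    # Variance constants from Tajima (1989).
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    var = e1 * S + e2 * S * (S - 1)

    if S == 0 or var <= 0:
        return S, theta_w, pi, None
    return S, theta_w, pi, (pi - theta_w) / sqrt(var)


# Hypothetical sample: 4 haplotypes (rows) by 3 sites (columns).
example = [
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
]
S, theta_w, pi, d = diversity_stats(example)
print(S, round(theta_w, 4), round(pi, 4), d)
```

A negative Tajima's D reflects an excess of low-frequency variants (π < θw), and a positive value reflects an excess of intermediate-frequency variants (π > θw); these are the two directions counted in the QTL-level result above.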