Cotton is the largest cash crop in China, and both its fiber and its seed have important economic value. At present, the main cultivated cotton species worldwide is the allotetraploid Upland cotton (Gossypium hirsutum), and improving fiber quality is one of the key strategies in current Upland cotton breeding. A comprehensive understanding of the genetic variation and population structure of cotton germplasm resources, together with the mining and analysis of loci and candidate genes related to fiber quality, is of great significance for the genetic improvement of fiber quality. In this study, resequencing data from 4,180 cotton accessions were collected to construct a large-population variant genotype map, and the population structure and genetic diversity of different cotton subgroups were systematically analyzed. Genome-wide association study (GWAS) and expression quantitative trait locus (eQTL) results were then integrated through summary-data-based Mendelian randomization (SMR) analysis to mine and analyze candidate loci and genes related to fiber quality traits. The main results are as follows: (1) Construction of a large-scale cotton population genetic variation map. To systematically characterize the variation resources of different cotton subgroups, resequencing data of 4,180 cotton accessions were collected, covering six major cotton species including Upland cotton, with geographic origins spanning hundreds of regions on six continents. Through alignment, quality control and variant identification, the largest cotton population variation map to date was constructed, comprising 12,903,345 SNPs and 1,381,741 InDels, with average densities of 5.67/kb and 0.60/kb, respectively; 13,468 of these SNPs/InDels were large-effect variants, affecting 10,775 genes. (2) Population structure and genetic diversity analysis. Based on the results of population structure analysis together with cotton species and geographic location information, the 4,180 cotton accessions were divided into eight subgroups (G0-G7), including a wild cotton
subgroup (G0), a landrace subgroup from Central America (G1), a landrace subgroup from South China (G2), cultivated and improved Upland cotton subgroups from different geographical regions of China (G3-G6) and a Sea-Island cotton subgroup (G7). Genetic diversity analysis showed significant differences in the level of genetic diversity among subgroups. The wild/semi-wild cotton subgroups (G0-G1) had higher genetic diversity (π: G0 = 1.30×10⁻³, G1 = 1.11×10⁻³) and shorter linkage disequilibrium (LD) decay distances (64.77 kb and 109.97 kb, respectively), whereas the cultivated Upland cotton subgroups (G2-G6) had lower genetic diversity (2.33×10⁻⁴ to 4.94×10⁻⁴) and longer LD decay distances (174.76 to 498.02 kb); the genetic diversity (6.93×10⁻⁴) and LD decay distance (213.29 kb) of the Sea-Island cotton subgroup were intermediate. Fixation index (FST) analysis showed that FST between the cultivated cotton subgroups (G2-G6) and the wild/semi-wild cotton subgroups (G0-G1) was large (0.189 to 0.472), whereas FST among G2-G6 was small (0.008 to 0.048). (3) Mining candidate genes related to fiber quality traits by multi-omics. GWAS using the genotype data of 1,240 cultivated Upland cotton accessions and four sets of phenotypic data related to fiber quality identified three significant loci for fiber elongation rate (FE1-FE3), three for fiber length (FL2-FL4), four for fiber strength (FS1-FS4) and one for micronaire value (qMV_He1782.D11.1), of which FS4 was newly identified in this study. SMR analysis based on the GWAS results and the fiber-development eQTL results detected 10 candidate genes significantly associated with fiber elongation rate, four with fiber length and three with fiber strength. By combining gene function annotation, variant-phenotype association and expression differences between genotypes, three candidate genes for fiber length (GhABS4, GhACD2 and GhRopGEF), one for fiber strength (GhRopGEF) and one for fiber elongation (GhPRR) were identified. In conclusion, by
constructing the most systematic cotton population variation map, this study analyzed the differences in population structure and genetic diversity among cotton subgroups, and combined transcriptome and phenotype data to mine five candidate genes related to fiber quality. These results will help deepen the understanding of genetic variation in different cotton subgroups and provide rich data resources for research on the genetics and breeding of cotton fiber quality.
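
The abstract does not say which software, thresholds or instrument-selection rules were used for the SMR step. As a rough illustration only, the sketch below computes the SMR effect estimate and its approximate test statistic as they are commonly formulated, from the GWAS and eQTL summary statistics of a single top cis-eQTL variant. The function name `smr_test` and all input values are hypothetical and are not taken from the study.

```python
import math


def smr_test(b_gwas, se_gwas, b_eqtl, se_eqtl):
    """Approximate SMR test for one gene using its top cis-eQTL variant.

    b_gwas, se_gwas: variant effect on the trait and its standard error (GWAS).
    b_eqtl, se_eqtl: variant effect on gene expression and its standard error (eQTL).
    Returns (b_smr, se_smr, p_smr). All inputs here are illustrative assumptions.
    """
    if se_gwas <= 0 or se_eqtl <= 0:
        raise ValueError("standard errors must be positive")
    if b_eqtl == 0:
        raise ValueError("the eQTL effect must be non-zero to act as an instrument")

    z_gwas = b_gwas / se_gwas
    z_eqtl = b_eqtl / se_eqtl

    # Estimated effect of gene expression on the trait (ratio estimator).
    b_smr = b_gwas / b_eqtl

    # Approximate chi-square statistic with 1 degree of freedom.
    t_smr = (z_gwas ** 2 * z_eqtl ** 2) / (z_gwas ** 2 + z_eqtl ** 2)

    # Standard error implied by the statistic; undefined when the statistic is 0.
    se_smr = abs(b_smr) / math.sqrt(t_smr) if t_smr > 0 else float("inf")

    # Upper-tail probability of a 1-df chi-square equals erfc(sqrt(x / 2)).
    p_smr = math.erfc(math.sqrt(t_smr / 2.0))
    return b_smr, se_smr, p_smr


if __name__ == "__main__":
    # Hypothetical summary statistics for one variant-gene-trait triple.
    b, se, p = smr_test(b_gwas=0.5, se_gwas=0.1, b_eqtl=1.0, se_eqtl=0.1)
    print(f"b_SMR={b:.3f}, SE={se:.4f}, P={p:.2e}")
```

In practice such a test would be run for every gene with a significant cis-eQTL and corrected for multiple testing. A heterogeneity test is usually added to separate a shared causal variant from linkage, which this sketch does not attempt.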