Identification, Expression And Molecular Evolution Of COL Superfamily Genes In Cotton

Posted on: 2015-03-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: R Zhang
GTID: 1313330512472662
Subject: Crop Genetics and Breeding
Abstract/Summary:
Cotton is one of the world's most important economic crops. The cotton genus currently includes 50 species distributed in arid and semi-arid regions of the tropics and subtropics. Most of these species are diploid (n=13), while five are allopolyploid (AD-genome; n=26). Gossypium tomentosum is endemic to the Hawaiian Islands, G. mustelinum is restricted to a relatively small region of northeast Brazil, and G. darwinii is native to the Galapagos Islands. In addition to these three truly wild species, G. barbadense and G. hirsutum are two cultivated allopolyploid species that have been independently domesticated over a vast geographical area, with a wealth of morphological forms spanning the wild-to-domesticated continuum. Through human-mediated influences and agronomic improvement, domesticated G. barbadense and G. hirsutum have undergone parallel changes and exhibit extraordinary morphological variation, e.g., the loss of photoperiod sensitivity and the transformation from perennial shrubs and small trees to more compact, highly productive annual plants. Studying the photoperiodic control of flowering time improves our understanding of the domestication-related genes and traits that have enhanced cotton adaptation and diversification during the evolutionary process.

The CONSTANS (CO) transcription factor is a central regulator of the photoperiod pathway, mediating between the circadian clock and floral integrators. Several statistics have been developed to test the neutral equilibrium (NE) model and identify selected genes based on different features of sequence data. One feature is the level of nucleotide diversity, which strong positive selection reduces. A second feature is the frequency distribution of polymorphisms: selection can skew the population frequency of genetic variants relative to NE expectations. In addition, linkage disequilibrium (LD) can increase because of selection. Selection of COL homologs in other plants such as rice and maize appears to be common during parallel adaptation and
diversification of plants. Given the conserved function of CO in photoperiodic flowering and the limited information available in cotton, we first identified the candidate COL family genes in cotton and then examined their classification, chromosomal location, expression in different organs, diurnal expression patterns, and molecular evolution. The results are as follows:

1. We identified 23 cotton COL genes encoding both B-box and CCT domains using publicly available data from the draft D5 genome (G. raimondii) and divided these genes into three subfamilies (I, II, and III), as reported by Griffiths et al. There are eight genes in group I (COL1-8), which are predicted to encode two B-boxes and one CCT domain, except for COL8, which encodes a protein with one intact B-box, one incomplete B-box, and one CCT domain. Three genes in group II (COL9-12) were predicted to encode proteins containing one B-box and one CCT domain. The remaining 12 genes (COL13-23), which are in group III, encode one B-box, a second diverged zinc finger, and one CCT domain.

2. Q-PCR analysis was performed to obtain the expression patterns of 20 COL genes in various organs of TM-1. Six of the eight genes in group I were highly expressed in leaves, while COL6 and COL7 were abundantly expressed in flowers, and COL2-5 were also more highly expressed in cotton fiber. The three genes in group II all showed preferential expression in leaves. In general, group III genes were expressed in a wide range of tissues, except COL14, COL20, and COL22, which were abundantly expressed in stems, leaves, and fiber, respectively.

3. To examine the circadian rhythm of the candidate COL genes in cotton, we investigated their expression levels in seedling leaves of TM-1 or H7124, sampled when the third leaf was fully open, under long-day (LD; 16 h light/8 h dark) or short-day (SD; 8 h light/16 h dark) conditions. The diurnal expression patterns of the eighteen COLs suggested a conserved function in regulating the light signaling
pathway in cotton.

4. Previous reports have shown that CO/Hd1 is involved in a conserved pathway regulating flowering in plants. Given this conserved function and the limited information in cotton, we focused on the eight COLs in group I, which cluster in the same group as the Arabidopsis CO and rice Hd1 flowering time loci, and studied their sequence, structure, and molecular evolutionary rate variation in 25 cotton accessions, with Thespesia populneoides (Roxb.) Kostel as a phylogenetic outgroup. The eight genes are highly conserved, and their full-length genomic DNA sequences range from 1,030 bp (COL6) to 1,611 bp (COL1), with the exception of frame-shift mutations caused by 1-bp deletions in a few species. The intron ranges from 77 to 680 bp in length, with the longest introns present in COL1, COL2, and COL8 compared with the other family members. Within the same subgenome across different cotton species, insertion/deletion events occurred in introns or in an exon of COL2, COL6, and COL8, leading to length variation, while the remaining five genes had the same length in the same subgenomes of different cotton species. The structures of the A and D homeologs of each gene were further analyzed. Length differences were present between homeologs of COL4 and COL7, caused by insertions/deletions in one of two exons.

5. Phylogenetic tree analysis revealed that the outgroup Thespesia populneoides (Roxb.) Kostel was the most divergent member and formed an individual clade, while the other members were divided into two principal clades: the A-genome and A-subgenomes comprise one monophyletic clade, and the D-genomes and D-subgenomes comprise another. In 98.37% of comparisons across all tetraploid cotton species examined, Ks values were higher in the A-D and At-Dt comparisons than in the A-At and D-Dt comparisons, and the Pearson's correlation coefficient (r) of Ks between A vs. D and At vs. Dt also showed high positive correlations, with a
correlation coefficient of at least 0.797. Together, these results suggest that the duplicated genes of the A- and D-subgenomes of allotetraploid species have evolved independently after polyploid formation.

6. Pairwise comparisons of nucleotide diversity (π) between subgenomes within each allotetraploid species were performed for the combined sequence of the eight COL genes and for each gene separately. The average π value of the combined sequence in D vs Dt (0.01051) was significantly greater than that in A vs At (0.00586) (P=4.9E-21). In detail, six genes (COL2 to COL5, COL7, and COL8) showed significantly higher nucleotide diversity in the D-subgenome than in the A-subgenome of the allotetraploid species examined. However, COL6 showed significantly higher nucleotide diversity in the A-subgenome than in the D-subgenome. COL1 showed no significant difference between the A vs At and D vs Dt comparisons. These results indicate that the eight COLs in group I harbor different evolutionary rates between homeologs of the allotetraploid species, and that most genes of the D-subgenomes have been evolving more rapidly than those of the A-subgenomes.

7. To further explore the domestication forces acting on allotetraploid species, we divided the tested allotetraploid species into three types: wild tetraploid species, and semi-domesticated and domesticated G. hirsutum or G. barbadense. Nucleotide polymorphism in the wild species was significantly higher than in G. hirsutum (0.00369 vs 0.00139; P=0.001) and G. barbadense (0.00369 vs 0.00035; P=7.3E-7), indicating a genetic bottleneck associated with the domesticated cotton species. The three characteristic domains of the eight COLs exhibit different evolutionary rates: there were no differences in π
between the B-box and Var domains (P=0.285), and these two domains evolved significantly faster than the CCT domain (0.00284 vs 0.00119, P=0.028 for the B-box; 0.00243 vs 0.00119, P=0.014 for the Var domain). These results demonstrate that the B-box and Var domains have been quite variable, while the CCT domain is highly conserved in sequence and function among the 25 cotton accessions.

8. To test for departure from neutrality, Tajima's D (1989) and Fu and Li's D and F (1993) were estimated to determine whether the nucleotide polymorphism data of the eight COL genes fit the neutral model. Both the COL1 A-subgenome and the COL2 D-subgenome in G. hirsutum deviated significantly from the neutral expectation with negative values, indicating an excess of low-frequency alleles; these negative values are consistent with the possibility of recent positive selection in G. hirsutum. Fu and Li's D and F were significantly positive for the COL8 D-subgenome of G. hirsutum at P<0.1, suggesting that COL8 maintains variants at high frequency and may have experienced balancing selection. Taken together, COL1, COL2, and COL8 endured greater selective pressures during the domestication process.
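
As an illustration of the statistics named above, the following is a minimal Python sketch of nucleotide diversity (π) and Tajima's D computed from a set of aligned sequences. It is not the software or pipeline used in the dissertation, which the abstract does not name. The function names and the toy alignment are hypothetical, and the sketch assumes equal-length, gap-free sequences in a consistent letter case; it reports only the D statistic, not its significance.

```python
from itertools import combinations
from math import sqrt


def _check_alignment(seqs):
    # Assumption: sequences are already aligned, gap-free, and of equal length.
    if len(seqs) < 2 or len({len(s) for s in seqs}) != 1 or len(seqs[0]) == 0:
        raise ValueError("need at least two non-empty aligned sequences of equal length")


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all sequence pairs (k)."""
    _check_alignment(seqs)
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / len(pairs)


def nucleotide_diversity(seqs):
    """Per-site nucleotide diversity: k divided by alignment length."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def segregating_sites(seqs):
    """Number of alignment columns with more than one state (S)."""
    _check_alignment(seqs)
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def tajimas_d(seqs):
    """Tajima's D (1989); None when it is undefined (n < 4 or S == 0)."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    k = mean_pairwise_differences(seqs)
    # Difference between the two diversity estimators, scaled by its standard deviation.
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignment: one rare (singleton) variant among four sequences.
toy = ["AAAA", "AAAT", "AAAA", "AAAA"]
pi_toy = nucleotide_diversity(toy)
d_toy = tajimas_d(toy)
```

In this toy alignment the only polymorphism is a singleton, so D comes out negative, which matches the abstract's reading of negative values as an excess of low-frequency alleles.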
Keywords/Search Tags: Gossypium, CONSTANS, Photoperiod, Selection, Molecular evolution