Genome Organization And Evolution Of Duplicated Genes In Yellowhorn (Xanthoceras sorbifolium Bunge.)

Posted on: 2023-12-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Liu
GTID: 1523307292973739
Subject: Tree genetics and breeding
Abstract/Summary:
Genomics is the foundation of modern biotechnology and supports the innovation and utilization of germplasm resources. In-depth analysis of genomic sequences, together with the identification of protein-coding genes, regulatory elements, and major structural features of the genome, is a prerequisite for revealing the biosynthetic pathways and regulatory networks that underlie stress resistance and other important traits. Such studies are fundamental to subsequent functional work and germplasm innovation, including the identification of gene function, genomic selection, and gene modification and editing. Yellowhorn (Xanthoceras sorbifolium Bunge.) is an important woody oil tree unique to northern China. It has outstanding stress tolerance and is therefore an important native afforestation species. Genetic research on this species started only recently, so little is known about its genome organization, the distribution of structural variants and their effect on gene expression, or the key genes involved in the biosynthesis and regulation of very-long-chain fatty acids (VLCFAs).

This thesis assembled a high-quality genome of yellowhorn and compared three yellowhorn genomes, each representing a breeding variety, to characterize genomic structural variation. To support studies of centromere sequence composition and dynamics, it established a new program for centromere identification. Using transcriptome and gene co-expression analysis, it identified the key genes and regulatory relationships involved in the VLCFA biosynthesis pathway and in stress tolerance. Through comparisons among species and functional annotation, it clarified the evolutionary pattern of duplicated genes in yellowhorn and revealed that gene duplication is the genomic basis of stress resistance in yellowhorn. The main results are as follows.

(1) This study assembled a high-quality genome of the yellowhorn breeding variety “JGXP” and annotated its genes. The chromosome-level assembly is 470 Mb in size and contains 22,049 high-quality protein-coding genes. Repetitive sequences account for 66% of the genome; LTR retrotransposons (LTR-RTs) are the most abundant class, accounting for 30% of the genome. The high proportion of LTR-RTs in the yellowhorn genome is maintained by a moderate birth rate and a low removal rate.

(2) This study identified genomic structural variations among the yellowhorn genomes. Structural variation in yellowhorn is common and unevenly distributed along the chromosomes. Inversions are the main type of structural variant, and large inversions generally appear only at chromosome ends with high gene density and recombination rate. These structural variants reduce the expression of the genes they intersect. Among the three cultivars, core genes account for 65–68% of the genes in each genome. The proportion of non-core genes is similar in cultivars “JGXP” and “ZS4” (27–28%) but lower in cultivar “WF18” (14%), which contains a higher proportion of cultivar-specific genes. Core genes are expressed at significantly higher levels than non-core and cultivar-specific genes.

(3) This study established a new computational program that identifies centromeric sequences from chromatin interaction information. Centromeres control chromosome segregation during cell division and play important roles in maintaining the stability of genome structure and the timing of genome replication. The principle of identifying centromeric sequences from chromatin interaction maps has been verified, but existing tools are difficult to apply to real data. This thesis established a Python program for centromere identification that can be widely applied in genomic analysis. The program has been released and is publicly available.

(4) This study identified and characterized the sequence composition of yellowhorn centromeres. LINE1 and Gypsy retrotransposons, rather than tandem repeats, are their main components. This is the first report of centromeres formed by LINE1 retrotransposons in eukaryotes. The centromeric regions of yellowhorn have low gene density, high GC content, and a high density of LINE1, Gypsy, and Copia retrotransposons. Phylogenetic network analysis further showed that LINE1 and Gypsy retrotransposons in the centromeric regions evolved independently of those in other genomic regions. The centromere-specific retrotransposons formed recently: they originated significantly more recently than non-centromeric retrotransposons and are significantly longer. This study provides a unique example for understanding the formation and mechanisms of plant centromere sequences.

(5) This study clarified the evolutionary pattern of the genome and duplicated genes of yellowhorn. Yellowhorn and grape share the ancient hexaploidization event (γ) common to core eudicots and have had no recent whole-genome duplication (WGD). Since the γ event, the ancestral yellowhorn genome has undergone 9 fission and 15 fusion events, giving rise to the 15 modern chromosomes. Tandem duplication (TD) and proximal duplication (PD) are the main modes of gene family expansion in yellowhorn. Genes generated by TD and PD are more recent, whereas WGD-derived genes date from the ancient γ event. TD and PD genes are subject to relatively relaxed purifying selection. Expression divergence between the two copies of a duplicated gene pair is common: 60–66% of PD and TD gene pairs are differentially expressed, compared with 73–78% of pairs from the other three classes of duplication, namely WGD, transposed duplication (TRD), and dispersed duplication (DSD). In the absence of recent whole-genome duplication, gene duplication is an important mechanism of stress tolerance in yellowhorn, because recent tandem and proximal duplicates are involved in stress resistance and in the biosynthesis and metabolism of a variety of secondary metabolites. WGD-derived genes are mainly involved in biological functions such as growth, development, and stress resistance.

(6) This study identified 38 and 94 candidate genes involved in the biosynthesis of VLCFAs and flavonoids, respectively, and the potential regulatory relationships between key genes and transcription factors in the two pathways. Classification of the 18 KCS genes into subfamilies showed that the β subfamily has been lost in yellowhorn relative to Arabidopsis thaliana.

This thesis provides important genomic resources for basic and applied research on yellowhorn and will guide genomic selection, trait improvement through gene editing, and more effective breeding. It will thereby improve the utilization and socio-economic benefits of yellowhorn.
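As an illustration of the core, non-core, and cultivar-specific categories reported in result (2), the sketch below classifies gene families by their presence across the three cultivars. The abstract does not state how the thesis defined these categories, so the rule used here is an assumption based on common pan-genome practice: core means present in all three cultivars, cultivar-specific means present in exactly one, and non-core means present in two. The family names and presence sets are made-up toy data, and the function names are hypothetical. The sketch also counts families, whereas the thesis reports proportions of genes within each genome.

```python
from collections import Counter

# Cultivar names are taken from the abstract; everything else is illustrative.
CULTIVARS = ("JGXP", "ZS4", "WF18")


def classify_gene_family(present_in, cultivars=CULTIVARS):
    """Classify one gene family by the cultivars in which it is present.

    Assumed rule (not stated in the abstract):
      all cultivars -> "core"
      exactly one   -> "cultivar-specific"
      otherwise     -> "non-core"
    """
    hits = set(present_in) & set(cultivars)
    if not hits:
        raise ValueError("family is not present in any known cultivar")
    if len(hits) == len(cultivars):
        return "core"
    if len(hits) == 1:
        return "cultivar-specific"
    return "non-core"


def category_fractions(families):
    """Return the fraction of gene families that falls into each category."""
    counts = Counter(classify_gene_family(p) for p in families.values())
    total = sum(counts.values())
    return {
        category: counts[category] / total
        for category in ("core", "non-core", "cultivar-specific")
    }


# Toy presence/absence data: hypothetical family IDs, not thesis results.
toy_families = {
    "fam1": {"JGXP", "ZS4", "WF18"},
    "fam2": {"JGXP", "ZS4"},
    "fam3": {"WF18"},
    "fam4": {"JGXP", "ZS4", "WF18"},
}

fractions = category_fractions(toy_families)
```

Running the same classification per cultivar, and counting genes instead of families, would give per-genome proportions comparable in form to the percentages quoted in result (2).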
Keywords/Search Tags: yellowhorn, centromere, structural variation, duplicated gene