Data Collection And Analysis For Construction And Application Of The Cotton Multi-omics Database

Posted on: 2021-03-28  Degree: Master  Type: Thesis
Country: China  Candidate: F Dai  Full Text: PDF
GTID: 2393330611457299  Subject: Master of Agriculture
Abstract/Summary:
As one of the most important economic crops in the world, cotton occupies an important position in basic national industries such as agriculture and textiles. Since 2012, the genomes of various cotton species have been assembled one after another, and cotton omics research has become a hot spot. As the volume of raw cotton sequencing data in public databases has steadily grown, this project collected and analyzed cotton omics data and established a comprehensive online multi-omics database that lets users query and visualize the analysis results, which is helpful for cotton breeding, cotton molecular biology, and bioinformatics. The main results of the project are as follows:

(1) This study collected and further annotated the latest genome versions of different cotton species. Genes were functionally annotated by comparison against multiple databases, and the KEGG, GO, protein annotation, and homologous gene information of the genes was re-predicted. We also collected SSR primer information developed by multiple cotton research organizations, removed redundant entries, and combined the result with the latest versions of the sea-island cotton and upland cotton genomes to develop a set of more than 60,000 pairs of genomic SSR primers covering the entire genome. By comparing the genomes of upland and sea-island cotton, 1,358,873 InDel sites were identified, and a set of high-density InDel primers covering the upland and sea-island cotton genomes was designed based on these sites.

(2) Based on cotton multi-omics data in public databases, combined with some of our research group's own data, a total of 1,180 resequencing datasets, 314 conventional transcriptome datasets, and 384 epigenetic datasets were collected and analyzed. In total, 23,244,964, 28,481,795, and 32,21,933 mutation sites were identified in upland cotton, sea-island cotton, and other cotton species, respectively. Based on the transcriptome expression information, an expression association network covering 47,361 genes was constructed.

(3) Based on data mining of the cotton multi-omics data, a cotton multi-omics database named Cottonomics (cotton.zju.edu.cn) was constructed. The database contains multi-omics information for four cotton species, including genomic information, genomic variation, transcriptome expression information, and epigenetic modification information. Cottonomics provides multiple analysis modules, such as online genome browsing, mutation queries across thousands of cotton samples, visualization of transcriptome expression, and online browsing of various epigenomic data. In addition, the database integrates functions such as online BLAST, gene sequence extraction, homologous gene conversion, and primer search.
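The expression association network mentioned in (2) can be illustrated with a small sketch. The abstract does not say which association measure or cutoff was used, so the Pearson correlation, the 0.9 threshold, the function names, and the toy gene names and values below are all assumptions made for illustration, not the thesis's actual pipeline.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def build_network(expression, threshold=0.9):
    """Return edges (gene_a, gene_b, r) for gene pairs with |r| >= threshold.

    `expression` maps a gene ID to its expression values across samples.
    The threshold is an assumed cutoff, not a value taken from the thesis.
    """
    edges = []
    for a, b in combinations(sorted(expression), 2):
        r = pearson(expression[a], expression[b])
        if abs(r) >= threshold:
            edges.append((a, b, round(r, 3)))
    return edges


# Hypothetical toy data: expression values of four genes across four samples.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 5.0, 1.0, 5.0],
}

edges = build_network(toy_expression, threshold=0.9)
for gene_a, gene_b, r in edges:
    print(gene_a, gene_b, r)
```

In this toy input, geneA, geneB, and geneC are linked to each other (positively or negatively), while geneD stays unconnected. An all-pairs loop like this is quadratic in the number of genes, so a network on the scale of tens of thousands of genes would in practice need a vectorized or dedicated tool.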
Keywords/Search Tags:cotton, multi-omics analysis, variation, functional annotation, regulatory network