Construction Of The Soybean Genome Database SoybeanGDB

Posted on:2023-08-24

Degree:Master

Type:Thesis

Country:China

Candidate:H R Li

Full Text:PDF

GTID:2543306809955049

Subject:Biological engineering

Abstract/Summary:

Soybean (Glycine max) is an important crop worldwide, providing a large amount of protein and edible oil for human consumption, in addition to a wide range of other uses. With the development of sequencing technology, more and more plant genomes have been decoded. These genome data are usually stored in specialized databases, which play important roles in the study of plant gene function and molecular-assisted breeding. In recent years, the gold reference genome of the Chinese cultivated soybean Zhonghuang 13 and other high-quality soybean genomes have been published. However, these data have not been properly stored in specialized databases. In this study, 40 high-quality soybean genomes and high-quality SNPs (single nucleotide polymorphisms) and InDels (insertions and deletions) among 2,898 soybean varieties were collected. Based on these datasets, we conducted transposon identification, structural variation analysis, homologous gene identification and gene function annotation in all 40 genomes. Finally, a comprehensive soybean genome database, SoybeanGDB, was constructed using the R/shiny software package, providing multiple functional modules for data retrieval, analysis and visualization.

The main results of this study are as follows:

1. Thirty-two high-quality genomes were collected, and structural variations, transposons, transcription factors and transcriptional regulatory factors were identified. Functional modules were built in the database for querying this information.
2. A genome browser was constructed for each of the 40 genomes using JBrowse2, so that users can view the information of different genomes.
3. A total of 31,870,983 SNPs and 6,127,057 InDels identified among 2,898 soybean samples were collected. After filtering low-quality data, 15,446,616 high-quality SNPs and 4,136,231 high-quality InDels were obtained. Based on these results, functional pages were constructed for querying SNP and InDel information, linkage disequilibrium analysis, nucleotide diversity analysis, single nucleotide polymorphism analysis, allele frequency analysis, etc.
4. The gene expression information of Zhonghuang 13 across multiple tissues and developmental stages was collected. Functional pages for gene expression query and gene coexpression analysis were constructed.
5. The chromosome sequences, protein sequences, gene sequences and CDS (coding sequence) sequences of the 40 high-quality genomes were extracted to build the BLAST page, which offers blastn, blastp and other sequence alignment functions.
6. Based on the high-quality SNPs and InDels and the chromosome sequence of Zhonghuang 13, the primer design interface was constructed. The homologous genes among the 40 genomes were identified, and functions for primer design and for querying homologous gene information were provided.
7. GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) annotations were performed for the protein-coding genes of the 40 genomes. The annotation information was integrated into the database, and function modules for annotation retrieval and gene functional enrichment analysis were provided.

Finally, after a period of design, development and testing, SoybeanGDB was constructed and deployed on a cloud server. Users can visit https://venyao.xyz/SoybeanGDB/ to use SoybeanGDB online.
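The population-genetic quantities named in result 3 (allele frequency, nucleotide diversity and linkage disequilibrium) can be illustrated with a minimal Python sketch. This is not the SoybeanGDB implementation, which the abstract says was built with R/shiny; the function names and the toy haplotype data below are hypothetical, and the sketch assumes phased, biallelic sites coded as 0 (reference allele) and 1 (alternate allele).

```python
from itertools import combinations


def allele_frequency(haplotypes, site):
    """Frequency of the alternate allele (coded 1) at one site."""
    alleles = [h[site] for h in haplotypes]
    return sum(alleles) / len(alleles)


def nucleotide_diversity(haplotypes):
    """Average number of pairwise differences per site (pi)."""
    n_sites = len(haplotypes[0])
    pairs = list(combinations(haplotypes, 2))
    total_diff = sum(
        sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs
    )
    return total_diff / (len(pairs) * n_sites)


def ld_r2(haplotypes, site_a, site_b):
    """Squared correlation r^2 between two biallelic sites."""
    p_a = allele_frequency(haplotypes, site_a)
    p_b = allele_frequency(haplotypes, site_b)
    # Frequency of haplotypes carrying the alternate allele at both sites.
    p_ab = sum(
        1 for h in haplotypes if h[site_a] == 1 and h[site_b] == 1
    ) / len(haplotypes)
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r^2 is undefined when either site is monomorphic.
    return d * d / denom if denom > 0 else float("nan")


# Hypothetical toy data: four haplotypes, three sites each.
haplotypes = [
    (0, 0, 1),
    (0, 0, 1),
    (1, 1, 0),
    (1, 1, 1),
]

print(allele_frequency(haplotypes, 0))
print(nucleotide_diversity(haplotypes))
print(ld_r2(haplotypes, 0, 1))
```

In the toy data, sites 0 and 1 always carry the same allele, so they are in complete linkage disequilibrium. A real analysis would read genotypes from a variant file and handle missing calls, which this sketch does not.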

Keywords/Search Tags:

Soybean, Genome database, SNP, InDel, Zhonghuang 13, GO enrichment analysis, Genome browser

Related items

1	Construction And Application Of Common Carp Genome Database
2	Construction Of Rice Pan-genome Browser
3	Database Construction Of Yak Genome
4	A Reference Genome Sequence Of Gossypium Hirsutum TM-1 And Its Usages In Cotton Comparative Genomics Analysis
5	Construction And Application Of Ruminant Genome Database
6	Studies On Genome Evolution And Membrane Transporters Recognition Of Forest
7	Using The Comparative Genome Method To Study The Pig's Economic Traits Related InDel
8	Exploration On The Construction Of Oat Genome Database And Its Application In Whole-genome Selection Breeding
9	Generation Of GmFT1a And GmFT4 Mutants Using A Widely-adapted Soybean Variety Of Zhonghuang 39
10	The Evolutionary Characteristics Of INDEL Variation During Dog Domestication Revealed By Genome-scaled Population Analysis