Discovery, Functional Analysis and Database Construction of Human and Invertebrate Structural Non-coding RNA

Posted on: 2021-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: L J Hou
GTID: 2370330611961966
Subject: Engineering
Abstract/Summary:
Non-coding sequence accounts for more than 98% of the human genome, and non-coding RNA plays an important role in gene expression and regulation. Non-coding sequences in eukaryotic genomes include 5'UTRs, 3'UTRs, and introns. Non-coding RNAs include tRNA, rRNA, small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), and telomerase RNA. These functional non-coding RNAs usually have well-defined secondary structures. The structure of an RNA is closely related to its function: non-coding RNAs with different functions have different secondary structural characteristics, and RNA secondary structure is highly conserved over long-term evolution. Structural non-coding RNAs therefore often have important biological functions.

By improving a previously established bioinformatics pipeline for structural non-coding RNA, we built pipelines for discovering structural non-coding RNAs in animal genomes and discovered a large number of new structural non-protein-coding RNAs (ncRNAs) in human and invertebrate genomes. We used an established scoring system to sift through the structural non-coding RNAs in the non-coding RNA libraries and improve the accuracy of the predictions. We used RNAcode to evaluate the coding potential of these structural non-coding RNAs and remove those with potential coding capability. At the same time, we used Infernal cmscan to remove known functional non-coding RNAs. For some non-coding RNAs with good structure, we used a Perl program to extract basic information that may be relevant to biological function, such as the location of the motif, its host gene, and its location within the gene. We used RBPmap to predict the binding proteins of the motifs and RegRNA 2.0 to analyze the regulatory elements they may contain. For further analysis, we performed functional clustering of the predicted binding proteins with the online tool Metascape, and for structural non-coding RNAs that may be associated with splicing, we analyzed potentially related diseases using TCGA-SeqDatabase.

In summary, using CMline, we acquired a large number of structural non-coding RNAs in the human and invertebrate genomes, screened them with a scoring program, performed functional analysis of some high-scoring motifs using established functional analysis procedures, and validated their expression in cells by designing specific primers. We found 17,329 structural non-coding RNAs in the human genome and 26,975 structural non-coding RNAs in invertebrate genomes. We then collected, for each motif, its host gene, chromosomal location, location within the gene, representative species, representative sequence, binding proteins, and regulatory elements. Using Linux, Apache, MySQL, and PHP scripts, we built the Str_ncRNA database. These studies will strongly promote the discovery of new structural non-coding RNAs and the study of their functions.
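The screening cascade described above (structure score, then coding potential, then known-family removal) can be illustrated with a minimal Python sketch. It is not the thesis's actual code: the `Candidate` record, its field names, and both thresholds are hypothetical, and it assumes the outputs of the scoring program, RNAcode, and Infernal cmscan have already been parsed into these fields.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical record for one predicted structural motif."""
    motif_id: str
    structure_score: float            # assumed output of the scoring program
    coding_p_value: Optional[float]   # assumed best p-value from a coding-potential test; None if nothing reported
    known_family: Optional[str]       # assumed name of a matching known ncRNA family; None if no hit


def filter_candidates(
    candidates: List[Candidate],
    min_score: float = 0.5,       # hypothetical cutoff, not taken from the thesis
    max_coding_p: float = 0.05,   # hypothetical cutoff, not taken from the thesis
) -> List[Candidate]:
    """Keep motifs that score well, show no coding signal, and match no known family."""
    kept = []
    for c in candidates:
        if c.structure_score < min_score:
            continue  # weak predicted structure
        if c.coding_p_value is not None and c.coding_p_value <= max_coding_p:
            continue  # significant coding potential
        if c.known_family is not None:
            continue  # already annotated as a known functional ncRNA
        kept.append(c)
    return kept


if __name__ == "__main__":
    demo = [
        Candidate("motif_A", 0.9, None, None),
        Candidate("motif_B", 0.9, 0.01, None),
        Candidate("motif_C", 0.9, None, "tRNA"),
    ]
    for c in filter_candidates(demo):
        print(c.motif_id)
```

Ordering the checks from cheapest to most specific is only one reasonable choice; because each test is independent, the set of surviving motifs does not depend on the order.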
Keywords/Search Tags:Humans and invertebrates, Structural ncRNA, Non-coding RNA function, Database