Study on the Characteristics of sORFs Contained in LncRNA

Posted on: 2019-04-13
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhao
GTID: 2370330545488828
Subject: Microbiology
Abstract/Summary:
Long noncoding RNA (lncRNA) was initially defined as a sequence that is transcribed but does not encode protein. Noncoding regions with no known biological function have often been regarded as genomic "dark matter". With the rapid development of genomics and informatics, further studies have shown that noncoding regions are transcribed into ncRNAs, and that these ncRNAs have specific functions and play important regulatory roles in organisms, so ncRNA research has become a hot spot. Recently, small ORFs (sORFs) contained in ncRNA have been found to encode short peptides, which play important physiological roles at certain stages of growth and development. Because of their short sequences and low abundance, sORFs were long neglected and considered untranslatable. As research technology has advanced, a large number of sORF-encoded short peptides have been discovered, and sORF research has entered a new stage. At present, research methods and database resources for peptide-encoding sORFs are lacking; existing methods are immature and research directions are unfocused. In particular, few studies address lncRNA sequences, which are considered noncoding, so revealing the features of sORFs in lncRNA is of great scientific significance for the study of noncoding RNA.

In this context, this thesis systematically studies the distribution and coding characteristics of sORFs in lncRNA using a variety of bioinformatics methods. The distribution analysis shows that sORFs are common in lncRNA and that their number is closely related to the length of the lncRNA and to the length of the sORFs themselves. Then, with coding sequences from the NCBI database as a reference, the coding features of the sORF sequences are analyzed with a variety of sequence analysis methods at three levels: nucleotide sequence composition, amino acid sequence composition, and function. To compare nucleotide composition between sORFs in lncRNA and coding sequences, trinucleotide frequency preference and 75 characteristic parameters based on the TN curve and Z curve of the nucleotide sequences were analyzed using principal component analysis. The results show that some sORFs in lncRNA have the same nucleotide composition characteristics as coding sequences. To compare the composition of ordered and disordered regions of the amino acid sequences between sORFs in lncRNA and coding sequences, trinucleotide frequency preference and codon frequency preference were analyzed using principal component analysis. The results show that some sORFs in lncRNA have the same ordered-region and disordered-region amino acid composition characteristics as coding sequences. In addition, functional analysis of the amino acid sequences was carried out using BLAST, and the results show that lncRNA contains sORF sequences with functional characteristics.

In summary, lncRNA contains some sORFs with coding sequence characteristics, and 91 sORF sequences with coding characteristics were obtained. These results are consistent with recent discoveries of peptide-encoding sORFs in lncRNA, and they provide a reference for experimental studies of the peptide-encoding ability of lncRNA. This study lays a theoretical foundation for finding new short peptides with biological functions and offers a new idea for future lncRNA and sORF research.
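
As a rough illustration of the first steps described in the abstract (locating sORFs in a transcript and tabulating trinucleotide frequencies), a minimal Python sketch might look like the following. The abstract does not state the thesis's actual sORF definition, so the ATG start codon, the forward-strand-only scan, and the 10 to 100 codon length window are assumptions made here for illustration; the function names `find_sorfs` and `codon_frequencies` are likewise hypothetical and are not taken from the thesis.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(seq, min_codons=10, max_codons=100):
    """Return (start, end, frame) tuples for candidate sORFs.

    Assumptions (not from the thesis): ORFs start at ATG, end at the
    first in-frame stop codon, are searched on the forward strand only,
    and count as "small" when they span min_codons..max_codons codons
    (start codon included, stop codon excluded).
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    hits = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a new ORF at the first ATG seen
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if min_codons <= n_codons <= max_codons:
                    hits.append((start, i + 3, frame))  # end is exclusive
                start = None  # resume searching after this stop codon
    return hits


def codon_frequencies(orf):
    """Relative frequency of each in-frame trinucleotide in an ORF."""
    orf = orf.upper().replace("U", "T")
    codons = [orf[i:i + 3] for i in range(0, len(orf) - 2, 3)]
    counts = Counter(codons)
    total = len(codons)
    return {codon: n / total for codon, n in counts.items()}
```

Frequency vectors of this kind, computed for both candidate sORFs and reference coding sequences, are the sort of input that a principal component analysis could then compare; the 75 TN-curve and Z-curve parameters mentioned in the abstract are not reproduced here.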
Keywords/Search Tags: LncRNA, sORFs, Sequence analysis, Peptide