
The Design and Implementation of an ncRNA Gene Finding Model

Posted on: 2007-11-30
Degree: Master
Type: Thesis
Country: China
Candidate: N Y Guan
Full Text: PDF
GTID: 2178360215469957
Subject: Computer Science and Technology
Abstract/Summary:
Bioinformatics combines computer science and life science. It processes, stores, searches, and analyzes data produced in the life sciences using the theories and algorithms of computer science. As the volume of biological sequence data increases, more and more attention is being paid to processing these data with efficient algorithms. Gene finding is one such focus: its aim is to predict, from DNA sequences, either the regions that code for proteins or the regions that regulate gene expression but do not code for any protein.

This thesis studies the problem of finding non-coding RNA (non-coding ribonucleic acid, ncRNA) genes. Its method uses the context-sensitive hidden Markov model (csHMM) together with the evolutionary relationships between species to set up a new computational framework that can distinguish non-coding RNA genes in a genome.

The thesis places its main emphasis on using csHMM and species evolutionary relationships to build a secondary structure model of non-coding RNA. First, a basic secondary structure model of non-coding RNA was built using csHMM. Second, the probabilities of emitting paired residues were computed from an amino acid mutation matrix representing the species evolutionary relationship, forming a new computational framework for non-coding RNA gene finding called pair-csHMM. Third, the Inside-Outside algorithm of csHMM was modified to optimize pair-csHMM, with the aim of extracting features of RNA secondary structure from known RNA sequences. Finally, a prototype system was implemented to find non-coding RNA genes.

The main difficulties encountered in the thesis were establishing the non-coding RNA model and optimizing its parameters. Using csHMM, the model integrates not only the secondary structure conservation of non-coding RNA but also its sequence conservation over the course of evolution. The Inside-Outside algorithm of csHMM was modified to train the non-coding RNA model and make it more accurate. Test results indicate that the new framework can be used to find non-coding RNA genes.

The new contributions are summarized as follows: (1) The csHMM was used to predict non-coding RNA genes. Test results indicate that the model improves the ability to discriminate non-coding RNA genes. (2) The species evolutionary relationship was introduced into the pair-csHMM model. Test results indicate that the smaller the evolutionary distance between the aligned genome and the non-coding RNA model, the better the model predicts non-coding RNA genes. (3) A prototype system called RNA-cs was implemented to predict non-coding RNA genes.
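
The idea of emitting paired residues can be illustrated with a small sketch. This is not the thesis's pair-csHMM: the function names, the `canonical_mass` parameter, and the probability values are all hypothetical, and the pair probabilities here are made up rather than derived from a mutation matrix. The sketch only shows the mechanism a context-sensitive HMM relies on: symbols emitted for one arm of a stem are remembered on a stack, and the symbols of the opposite arm are scored conditionally on them.

```python
import math

BASES = "ACGU"
# Watson-Crick pairs plus the G-U wobble pair.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}


def pair_emission_table(canonical_mass=0.9):
    """Toy joint distribution over (left, right) base pairs.

    Assumption: `canonical_mass` of the probability is spread evenly over
    canonical pairs, the rest evenly over all other pairs.
    """
    n_canonical = len(CANONICAL)
    n_other = len(BASES) ** 2 - n_canonical
    table = {}
    for x in BASES:
        for y in BASES:
            if (x, y) in CANONICAL:
                table[(x, y)] = canonical_mass / n_canonical
            else:
                table[(x, y)] = (1.0 - canonical_mass) / n_other
    return table


def stem_log_odds(left_arm, right_arm, table):
    """Log-odds of a stem under the paired model versus a uniform background.

    The left arm is pushed onto a stack; each right-arm symbol is scored
    against the symbol popped from the stack, so the last left base pairs
    with the first right base, as in a hairpin.
    """
    if len(left_arm) != len(right_arm):
        raise ValueError("both arms of the stem must have the same length")
    stack = list(left_arm)
    background = 1.0 / len(BASES) ** 2  # independent uniform emission
    score = 0.0
    for y in right_arm:
        x = stack.pop()
        score += math.log(table[(x, y)] / background)
    return score


table = pair_emission_table()
complementary = stem_log_odds("GGCA", "UGCC", table)
mismatched = stem_log_odds("GGCA", "AAAA", table)
```

With these toy values, a complementary stem scores above zero and a mismatched one below zero. In a framework like the one described above, the pair table would instead be estimated from data that reflect the evolutionary relationship between species.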
Keywords/Search Tags:Bioinformatics, non-coding RNA, gene finding, HMM