
Research Of Human Gene Splicing Donor Site Recognition

Posted on: 2005-11-23
Degree: Master
Type: Thesis
Country: China
Candidate: J Lei
Full Text: PDF
GTID: 2120360122991248
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Human gene splice site recognition is one of the most important subjects in bioinformatics. Gene splicing, especially the splicing of pre-mRNA, is an important mechanism of gene expression. It affects the composition of the gene, directly determines which proteins are selected and synthesized, and so indirectly determines the traits and functions of the organism. This thesis is an exploratory study of the sequence features of human gene donor splice sites and of donor site recognition. The main results are as follows:

1) We established a database of human gene donor sites and compiled dinucleotide statistics on the data, analyzing the relationship between the sequence features of the donor site and its feature dinucleotides. The results show that the presence of feature dinucleotides on either side of the donor site follows certain rules. When the feature dinucleotides on one side are absent, the probability that the feature dinucleotides on the other side are present increases greatly, and vice versa.

2) We surveyed bioinformatics methods for splice site recognition and then selected a BP neural network to model the relationship between exon, intron, and donor splicing. The results show that the information distinguishing true splice sites from false splice sites is located in the exon and intron on the two sides of the splice site. This information can be confined to a window of 50 dinucleotides, and the intron carries more feature information than the exon.

3) We propose a donor site recognition method that relies only on the feature dinucleotide motif. The true positive rate of this method reaches 83%, and its reported false negative rate reaches 90%. This confirms the role of feature dinucleotides in splice site recognition. However, the reference motif method we compared against reaches a true positive rate of 90%, which also demonstrates that non-feature dinucleotides influence gene splicing.

4) We built a donor splice site recognition model based on a learning vector quantization (LVQ) network. Using this model, we studied its effectiveness and feasibility and compared the two LVQ learning algorithms, LVQ1 and LVQ2.1. The experiments demonstrate that the LVQ network can solve the splice site recognition problem. The true positive rate of LVQ1 is better than that of the BP network, and the false negative rate of LVQ2.1 is better than that of the BP network.

This Beijing University of Technology Master of Engineering thesis was supported by the National Science Foundation of China under the project "Research of Some Bioinformatics Problems Under Complex System".
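
To illustrate the LVQ1 learning rule named in item 4, the following is a minimal sketch and not the thesis's actual network, data, or parameters. It assumes a one-hot encoding of short sequence windows, one prototype per class, a linearly decaying learning rate, and a handful of made-up 8-base windows. The labels "donor" and "decoy" and all function names are hypothetical. LVQ1 moves the nearest prototype toward a training sample when their labels agree and away from it when they disagree.

```python
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a flat list of 0/1 values, four per base."""
    vec = []
    for base in seq:
        vec.extend(1.0 if base == b else 0.0 for b in BASES)
    return vec

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(prototypes, x):
    """Index of the prototype whose weight vector is closest to x."""
    return min(range(len(prototypes)),
               key=lambda i: sq_dist(prototypes[i][0], x))

def train_lvq1(samples, prototypes, lr=0.1, epochs=20, seed=0):
    """LVQ1: attract the winning prototype if labels match, repel otherwise."""
    rng = random.Random(seed)
    protos = [(list(w), label) for w, label in prototypes]
    for epoch in range(epochs):
        rate = lr * (1.0 - epoch / epochs)  # assumed linear decay schedule
        order = list(samples)
        rng.shuffle(order)
        for x, label in order:
            i = nearest(protos, x)
            w, w_label = protos[i]
            sign = 1.0 if w_label == label else -1.0
            new_w = [wi + sign * rate * (xi - wi) for wi, xi in zip(w, x)]
            protos[i] = (new_w, w_label)
    return protos

def classify(protos, x):
    """Label of the nearest prototype."""
    return protos[nearest(protos, x)][1]

# Hypothetical toy windows with GT at positions 3-4; not data from the thesis.
DONOR_WINDOWS = ["AGGTAAGT", "AGGTGAGT", "CGGTAAGT", "AGGTAAGA"]
DECOY_WINDOWS = ["TCGTCCTC", "TCGTCCTT", "TCGTTCTC", "CCGTCCTC"]

samples = ([(one_hot(s), "donor") for s in DONOR_WINDOWS]
           + [(one_hot(s), "decoy") for s in DECOY_WINDOWS])

# One prototype per class, initialised from the first window of each class.
initial = [(one_hot(DONOR_WINDOWS[0]), "donor"),
           (one_hot(DECOY_WINDOWS[0]), "decoy")]

trained = train_lvq1(samples, initial)
print(classify(trained, one_hot("AGGTAAGG")))
```

LVQ2.1, the other algorithm compared in the thesis, differs in that it updates the two nearest prototypes together when the sample falls inside a window around the decision boundary. That variant is not shown here.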
Keywords/Search Tags: gene splicing, donor site, neural network, motif, learning vector quantization (LVQ)