Several Problems About The Bioinformatics

Posted on: 2009-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: J X Zhang
Full Text: PDF
GTID: 2178360272957409
Subject: Industry Technology and Engineering
Abstract/Summary:
In recent years, genome projects have produced an exponentially growing amount of genetic information. How to extract useful information from this huge volume of data is a problem that scientists focus on now and will continue to focus on in the future. One of the most important and basic problems is gene identification, namely the identification of protein-coding regions in DNA sequences by computational means. At present, a number of gene detection methods based on distinctive features of protein-coding sequences have been proposed, for example methods based on correlation functions, neural networks, Fourier analysis, and statistics. Methods for identifying protein-coding regions in DNA sequences fall into two classes: one is based on the differences between coding and non-coding regions, and the other is based on signals of protein-coding regions, for example the distribution of codons and stop codons. In this study, we first introduce the current status, basic concepts, research content, and methods of bioinformatics. We then use three different methods to locate CpG islands and determine the likely positions of genes. We present a new method for representing DNA sequences as a vector (R14) based on the distributions of stop codons and reverse-complement stop codons. Using the theory of Shannon entropy, we improve the Jensen-Shannon divergence and β-KL divergence measures. Comparison with previous experimental results shows that the recognition efficiency of the new information measures with the vector (R14) rises to 89%, exceeding the 70% reported for Bernaola's method, and that the computation time is reduced remarkably.
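For readers unfamiliar with the measure named above, the following is a minimal sketch of the standard Jensen-Shannon divergence built from Shannon entropy. It is not the improved measure proposed in the thesis, whose definition is not given in this abstract, and it uses plain nucleotide frequencies rather than the R14 vector. The function names and the equal default weights are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def nucleotide_distribution(seq, alphabet="ACGT"):
    """Relative frequency of each symbol of `alphabet` in `seq` (hypothetical helper)."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in alphabet)
    if total == 0:
        raise ValueError("sequence contains no symbols from the alphabet")
    return [counts[base] / total for base in alphabet]


def jensen_shannon_divergence(p, q, w_p=0.5, w_q=0.5):
    """Standard JSD: H(w_p*p + w_q*q) - w_p*H(p) - w_q*H(q).

    The weights are assumed to sum to 1; equal weights are an illustrative default.
    """
    mixture = [w_p * a + w_q * b for a, b in zip(p, q)]
    return shannon_entropy(mixture) - w_p * shannon_entropy(p) - w_q * shannon_entropy(q)


if __name__ == "__main__":
    # Compare the composition of two made-up sequence segments.
    left = nucleotide_distribution("ATGGCGGCCGCGTGA")
    right = nucleotide_distribution("ATTTTAAATATTAAT")
    print(jensen_shannon_divergence(left, right))
```

A larger value indicates that the two segments differ more in composition. Segmentation approaches of this kind typically place a boundary where the divergence between the left and right parts of a sequence is largest.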
Keywords/Search Tags: coding and non-coding regions, Bioinformatics, CpG island, R14, β-KL divergence