
Mathematical Representation And Analysis Of DNA Sequences

Posted on: 2008-10-21
Degree: Master
Type: Thesis
Country: China
Candidate: H F Guo
Full Text: PDF
GTID: 2120360218955373
Subject: Basic mathematics
Abstract/Summary:
With the progress of the genome projects for humans and several model organisms, the focus of biology has shifted from accumulating biological data to analyzing and interpreting them. As a result, bioinformatics, also called computational molecular biology, has emerged as a new and developing interdisciplinary field. The research area of bioinformatics is very wide, including sequence comparison, computational gene recognition, molecular evolution and comparative genomics, RNA and protein structure prediction, codon origin and evolution of the genetic code, assembly of contigs, structure-based drug design, and so on. Most of these topics share a common requirement: the biological data must be converted into some mathematical description. The mathematical description of biological macromolecules is therefore a basic but very important topic in bioinformatics.
This thesis has three chapters. The first chapter gives some elementary background in biology. In the second chapter, we propose a new 3D vector based on the nucleotide ratio to characterize and analyze DNA sequences. Calculations show that the new scheme is effective for this kind of study.
In the third chapter, we construct a 4-component vector based on information theory to characterize DNA sequences, and apply it to the comparison of protein-coding sequences, non-coding sequences, and random sequences. The calculations show remarkable differences between coding and non-coding sequences, and show that the vector obtained from random sequences is closely related to that obtained from coding sequences. These results deviate significantly from the accepted view that non-coding regions are more closely related to random sequences; the reasons for this discrepancy require further study.
Keywords/Search Tags: Bioinformatics, DNA sequences, Nucleotide ratio, Entropy, Coding region, Noncoding region