
Research On The DNA Sequences Analysis Based On Graphical Representations

Posted on: 2010-06-12
Degree: Master
Type: Thesis
Country: China
Candidate: G H Huang
Full Text: PDF
GTID: 2178360275981832
Subject: Computer Science and Technology
Abstract/Summary:
With the completion of the Human Genome Project (HGP) and the implementation of genome projects for several model organisms, a huge amount of molecular biological data is being generated continuously. The storage, management, analysis and study of these sequence data have promoted the integration of molecular biology, computer science and mathematics, and have led to the birth of bioinformatics, which is becoming one of the most valuable frontier fields of today's life sciences and natural sciences and, at the same time, one of the core areas of natural science in the 21st century. Bioinformatics research focuses mainly on genomics and proteomics; that is, it starts from nucleic acid and protein sequences and analyzes the structural and functional information they carry. Its scope is broad and includes sequence comparison, gene identification, molecular evolution and comparative genomics, RNA and protein structure prediction, computer-aided drug design, and so on.

Graphical representation, which has recently been developed and applied to DNA sequence analysis, is a powerful visual tool that can reveal a wide range of structural and functional information hidden in DNA sequences. After reviewing current 2-dimensional, 3-dimensional and higher-dimensional graphical representations and their applications, this thesis proposes a new 2D graphical representation of DNA primary sequences and applies it to the analysis of similarity and mutation among DNA sequences. The main results are as follows:

(1) Denoting the four nucleotides A, T, G and C by four different two-component vectors, we present the H-L graphical representation of DNA sequences. Its advantage is that distinct long-range patterns can be recognized visually. On the basis of this representation, we propose an approach to analyzing mutations between DNA sequences.

(2) On the basis of the H-L graphical representation, we present a new quantitative measure of similarity and dissimilarity among DNA sequences, and study the similarities among multiple nucleotide sequences by comparing their corresponding curves, using the beta-globin genes of 7 species as an example. Compared with previously published methods, our approach is easy to understand and simple to compute. It gives both computational scientists and molecular biologists a quick and efficient way to analyze similarity and dissimilarity among DNA sequences, and offers an additional method for analyzing their evolutionary relationships.

(3) Improving the "four lines" graphical representation of DNA sequences, we present an approach to searching for an optimal alignment and identifying mutations based on the improved representation.
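The general idea of a vector-based 2D graphical representation can be illustrated with a minimal Python sketch. It is not the thesis's H-L method: the abstract does not give the four two-component vectors or the similarity measure, so the `STEP` vectors, the mean point-to-point distance in `curve_distance`, and the helper `first_divergence` are all hypothetical choices made only for illustration.

```python
from math import hypot

# Hypothetical two-component vectors, one per nucleotide.
# The actual H-L vectors are not stated in the abstract.
STEP = {"A": (1, 1), "T": (1, -1), "G": (1, 2), "C": (1, -2)}


def curve(seq):
    """Return the 2D curve of a sequence: running sum of the
    per-nucleotide vectors, starting at the origin."""
    points = [(0, 0)]
    for base in seq.upper():
        dx, dy = STEP[base]
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points


def curve_distance(seq_a, seq_b):
    """Assumed dissimilarity measure: mean Euclidean distance between
    corresponding curve points over the shared length."""
    pa, pb = curve(seq_a), curve(seq_b)
    n = min(len(pa), len(pb))
    total = sum(
        hypot(xa - xb, ya - yb)
        for (xa, ya), (xb, yb) in zip(pa[:n], pb[:n])
    )
    return total / n


def first_divergence(seq_a, seq_b):
    """Return the 0-based position where the two curves first
    separate (a candidate mutation site), or None if they coincide."""
    pairs = zip(curve(seq_a)[1:], curve(seq_b)[1:])
    for i, (a, b) in enumerate(pairs):
        if a != b:
            return i
    return None
```

With these assumed vectors, two sequences that differ by a single substitution share the same curve up to the substituted position and run apart afterwards, which is the kind of visual cue a curve-based mutation analysis relies on.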
Keywords/Search Tags:Bioinformatics, DNA sequences, Graphical representations, Sequences comparison, Gene recognition, Sequence alignment, DNA mutation