DNA Sequences Classification And Computation Scheme Based On Symmetry

Posted on: 2011-12-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X G Song
Full Text: PDF
GTID: 1100360305465729
Subject: Physical chemistry
Abstract/Summary:
Advances in DNA sequencing techniques are accelerating the growth of DNA sequence data, and one important challenge is to identify the biological significance of this huge volume of sequences. To explore the complex relationships between structure and function at a fundamental level, it is necessary to approach the problem from mathematical and physical points of view. Simplicity and symmetry play central roles as guiding principles in nature, yet different kinds of symmetry breaking give the world its complex outward appearance. If the underlying principles can be understood in depth and put to use, then present-day patterns, however complex, may be derivable from relatively simple generative rules.
Only about 1.1% to 1.4% of the human genome encodes protein. By contrast, repeat sequences of many kinds, containing a variety of novel symmetrical structures, account for more than 50% of the genome; they play a key role in the formation, organization and regulation of genes and carry important information about biological evolution. For example, studies of direct repeats and palindromic sequences in the human sex chromosomes X and Y have helped characterize features of sequence evolution such as evolutionary stability. This work focuses on mathematical modeling of the various symmetrical sequences and on the theoretical study of their quantity and structure.
Here we present a new scheme for understanding the structural features and underlying mathematical rules of symmetrical DNA sequences, using a method that combines stepwise classification with recursive computation. By defining the symmetry of DNA sequences, we classify all sequences and derive a series of recursive equations for computing the number of sequences in every class that can exist theoretically; in addition, the symmetries of typical sequences at different levels are analyzed.
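As a hedged illustration of the classification and counting idea described above, the Python sketch below tests three of the symmetry types named in this abstract (direct repeat, mirror, and reverse-complement palindrome) and counts them for short lengths. The function names, the exact definition used for each symmetry, and the recursion M(n) = 4 * M(n - 2) for mirror sequences are assumptions chosen for illustration; they are not the dissertation's own classes or recursive equations.

```python
from itertools import product

# Watson-Crick base pairing.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_direct_repeat(seq):
    """True if seq is two identical halves in tandem, e.g. ACGACG."""
    half, remainder = divmod(len(seq), 2)
    return remainder == 0 and half > 0 and seq[:half] == seq[half:]


def is_mirror(seq):
    """True if seq reads the same in both directions on one strand, e.g. ACGGCA."""
    return len(seq) > 0 and seq == seq[::-1]


def is_palindrome(seq):
    """True if seq equals its own reverse complement, e.g. GAATTC."""
    return len(seq) > 0 and seq == reverse_complement(seq)


def count_by_enumeration(n, predicate):
    """Count length-n sequences satisfying predicate by brute force (4**n cases)."""
    return sum(predicate("".join(p)) for p in product("ACGT", repeat=n))


def mirror_count(n):
    """Number of mirror sequences of length n >= 1 via M(n) = 4 * M(n - 2).

    Base cases M(1) = M(2) = 4: wrapping a mirror sequence in the same
    base on both ends gives a mirror sequence two bases longer.
    """
    return 4 if n <= 2 else 4 * mirror_count(n - 2)
```

For small n the recursive count can be checked against brute-force enumeration, for instance by comparing `mirror_count(n)` with `count_by_enumeration(n, is_mirror)`. Under these definitions no odd-length sequence is a reverse-complement palindrome, because the middle base would have to be its own complement.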
The classification and the quantitative relations demonstrate that symmetrical DNA sequences have recursive and nested properties in both structure and quantity. A preliminary experiment shows that the number of the various perfect symmetrical patterns in prokaryotic genomes is closely related to replication origins, and that their content can serve as a measure for identifying prokaryotic DNA replication origins. The method differs from previous studies, whose search objects were limited to direct repeats, mirror repeats, reverse repeats and imperfect symmetrical sequences; the result shows that perfect symmetrical sequences have important biological significance.
The classification and recursive equations achieve, in some sense, our goal of understanding the "forest" by studying the "trees". Notably, the method yields information about the structure and number of longer sequences from that of shorter sequences, using recursive formulas similar to the one for the Fibonacci numbers. The study may provide a new approach to the origin of symmetry and the part-to-whole growth mechanism of symmetrical DNA sequences, and it also offers a new way of thinking about the theoretical characterization and structural analysis of DNA sequences.
A DNA sequence search algorithm was designed, based on matrix storage and the combination principle, to search for DNA sequences and their sites in a matrix. The storage matrix is hypothetical; that is, the sequences exist within the algorithm. In addition, two rules about the growth mechanism of sequences were obtained. According to these rules, one can search for the related sequence in a larger matrix using the shorter sequence in a smaller matrix, or vice versa.
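The abstract does not specify the matrix layout or the two growth rules, so the following Python sketch is only one possible reading of a hypothetical storage matrix. It assumes that all 4^n sequences of length n are ordered by their base-4 rank (A, C, G, T as digits 0 to 3) and laid out in a matrix with four columns, so that nothing is stored and each site is computed on demand. The names `rank`, `unrank`, `site`, `children` and `parent` are illustrative and do not come from the dissertation.

```python
BASES = "ACGT"  # assumed digit order: A=0, C=1, G=2, T=3


def rank(seq):
    """Base-4 rank of seq among all sequences of the same length."""
    r = 0
    for base in seq:
        r = 4 * r + BASES.index(base)
    return r


def unrank(r, n):
    """Inverse of rank: the length-n sequence whose base-4 rank is r."""
    digits = []
    for _ in range(n):
        r, d = divmod(r, 4)
        digits.append(BASES[d])
    return "".join(reversed(digits))


def site(seq):
    """(row, column) of seq in the assumed 4-column matrix for its length."""
    return divmod(rank(seq), 4)


def children(seq):
    """Shorter to longer: row rank(seq) of the next-larger matrix holds
    the four one-base extensions of seq."""
    row = rank(seq)
    return [unrank(4 * row + col, len(seq) + 1) for col in range(4)]


def parent(seq):
    """Longer to shorter: the row index of seq is the rank of its prefix
    in the next-smaller matrix."""
    row, _ = site(seq)
    return unrank(row, len(seq) - 1)
```

Under this assumed layout, the site of a short sequence in the smaller matrix directly gives the row that holds its extensions in the larger one, and the reverse lookup is a single integer division.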
Keywords/Search Tags:DNA sequence, Symmetry, Classification, Recursive equation, Matrix search