Font Size: a A A

The Implementation Of Duplicated Genome Rearrangement Algorithm

Posted on:2010-02-04Degree:MasterType:Thesis
Country:ChinaCandidate:A Y SunFull Text:PDF
GTID:2178360278973521Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Genome rearrangement is a biological process which changes the arrangement sequences of the gene in genome. It can be summed up to three main operations: translocation; reversal and transposition. Rearrangement distance refers to the minimum number of translocation operations transform one genome to another. In duplicated genome each chromosome appears in a pair. The reconstruction problem of duplicated genome requires finding out the minimum distance of translocation operation transforms one duplicated genome to a given one. As to this problem, Nadia El-Mabrouk has put forward one polynomial-time algorithm. With the help of Delphi integrated development environments , this paper has implemented this algorithm to a software named duplicated genome reconstruction: (1) has designed optimizated data structure; (2) has presented careful implementation method; (3) has experimentally verified the accuracy of algorithm.This thesis first presents the research background of the reconstruction problem of duplicated genome and the basic concept and theoretic knowledge the algorithm involved. Then it introduces in detail the attainment of the algorithm of the data structure through an exemplification of genome. Ultimately the paper emphasizes and analyses the design of this duplicated genome reconstruction software. When the user inputs an initial genome, the software will deal with it automatically and the user can get a resulting genome, that is, duplicated genome.The main management processes of duplicated genome rearrangement are as followings:The first step is to form natural graphs by initial genome. The initial genome is a reconstructed duplicated genome composed by even chromosomes. Each gene arises twice. Initial genome is described through breaking graph. In the breaking graph each gene mark is presented by two vertices. A black edge links two vertices of two neighborhood genes. Each end of the chromosome quotes a special vertex separately, which is demonstrated by the mark "O", that is, O point. In algorithm, we use the chain-structure to describe the breaking graph of the initial genome. Each black edge is corresponded to a knot in the chain-structure. Knots in the same chromosome are connected from left to right by the point. Chains of multi-chromosomes are connected to a chain-structure by a group of point data. When marks of initial genome are input, the process automatically produces the chain-structure of the initial genome .After dealing with the chain-structure, the process produces a chain-structure corresponding to the natural graph.The second step is to amalgamate natural graphs into supernatural graph. Natural graphs can be divided into two categories according to edges they contain: one is composed by even edges, and the other, odd edges. The natural graph composed by even edges can be developed into supernatural graph when its black edges are arranged from above to below. The natural graph composed by odd edges is divided into two types, natural graph including 0 point and natural graph not including 0 point. These two types of natural graphs should be amalgamated into pairs and sequences. Then they may be developed into supernatural graph. If either type remains one natural graph, these two natural graphs should also be amalgamated into pairs and sequences. Hence a supernatural graph can be obtained. Each supernatural graph is corresponded to a chain. Multi-chains of supernatural graph are connected to a developing chain-structure.The third step is to turn supernatural graphs into resulting genome. Handle the chain which supernatural graph corresponding to successively. Each time a gray edge is formed (one gray edge corresponds one knot). Along with the continuous formalization of gray edges, they are adjacent to each other and at last form chain-graph of the resulting genome. The resulting genome is a perfect duplicated genome, that is to say, In duplicated genome each chromosome appears in a pair.In algorithm, because initial genome, natural graph, supernatural graph and resulting genome are all described in developing chain-structure, we have elevated the flexibility and efficiency of the process. In the actual implementation of this software, each gene can only appear twice, the process has reinforced the accuracy of the test. Repeated running debugging of the process has verified the accuracy ofalgorithm of duplicated genome reconstruction.
Keywords/Search Tags:computational biology, duplicated genome arrangement, translocation
PDF Full Text Request
Related items