
The Application Of ACO And Coding Method In Sequence Analysis

Posted on: 2010-03-08
Degree: Master
Type: Thesis
Country: China
Candidate: W Y Chen
Full Text: PDF
GTID: 2178360275482446
Subject: Computer Science and Technology
Abstract/Summary:
With the progress of the Human Genome Project (HGP) and research on the gene sequences of different species, an ever-growing volume of molecular sequence data has been generated. Analyzing and processing these data has accelerated the development of bioinformatics. Sequence analysis is one of the most important operations in bioinformatics: it helps predict the functions of novel genes in any species and, on a broader scale, its algorithms are used to determine homologies between proteins in order to predict structural and functional relationships. This dissertation studies sequence alignment, numerical coding of DNA sequences, mutation analysis, graphical representation of DNA sequences, and similarity analysis of biological sequences.

First, we describe a new method for pairwise alignment. We map the alignment process onto a plane using modified dot plots, and we select the next position according to the amount of pheromone and the matching score of each candidate. The proposed algorithm can find the best alignment without requiring a scoring matrix. Experimental results indicate that the algorithm achieves better alignment results.

Second, we describe a new representation for multiple sequences and apply it to multiple sequence alignment, a central problem in computational biology. With this representation, every possible alignment can be taken into account. We also define how gap insertion is represented, the value of the heuristic information on each optional path, and the scoring rule. In the resulting multidimensional graph, we use the ant colony algorithm to search for a better path, which denotes a better alignment. In this work, we present three-dimensional and four-dimensional instances, giving both their multidimensional graphs and their ichnographic (plan-view) representations.
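The abstract says the next position is chosen from the amount of pheromone and the matching score of each candidate, but it does not give the exact rule. The sketch below therefore assumes the standard ant colony transition rule, in which a candidate is picked with probability proportional to pheromone^alpha times score^beta. The function names, the parameters `alpha` and `beta`, and the example scores are all hypothetical and are not taken from the thesis.

```python
import random


def transition_probabilities(candidates, pheromone, match_score, alpha=1.0, beta=2.0):
    """Return one selection probability per candidate cell.

    Assumed rule (standard ACO, not confirmed by the thesis):
    weight = pheromone**alpha * match_score**beta, normalised to sum to 1.
    """
    weights = [(pheromone[c] ** alpha) * (match_score[c] ** beta) for c in candidates]
    total = sum(weights)
    if total == 0:
        # No information at all: fall back to a uniform choice.
        return [1.0 / len(candidates)] * len(candidates)
    return [w / total for w in weights]


def choose_next(candidates, pheromone, match_score, rng, alpha=1.0, beta=2.0):
    """Sample the next cell of the alignment path for one ant."""
    probs = transition_probabilities(candidates, pheromone, match_score, alpha, beta)
    return rng.choices(candidates, weights=probs, k=1)[0]


# From cell (i, j) of a dot plot an ant may move diagonally (align two
# residues) or along one axis (insert a gap in one sequence).
i, j = 3, 5
candidates = [(i + 1, j + 1), (i + 1, j), (i, j + 1)]
pheromone = {c: 1.0 for c in candidates}  # uniform initial trail (assumption)
match_score = {(i + 1, j + 1): 2.0, (i + 1, j): 1.0, (i, j + 1): 1.0}  # hypothetical scores

probs = transition_probabilities(candidates, pheromone, match_score)
next_cell = choose_next(candidates, pheromone, match_score, random.Random(0))
```

With uniform pheromone and these scores, the diagonal move receives weight 4 against 1 for each gap move, so it is chosen about two thirds of the time. As ants deposit pheromone on good paths, the trail term gradually takes over from the heuristic term.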
We propose a special ichnographic (plan-view) representation for analyzing multiple sequence alignment, which we call the dispersion graph, and we explain what this kind of graph means in terms of alignment. Finally, we give an example of finding the best alignment using a three-dimensional graph and the ant colony algorithm. Experimental results show that the algorithm can improve solution quality on multiple sequence alignment benchmarks.

Third, we introduce a numerical coding method for DNA sequences. With this representation, a DNA sequence can be transformed into several binary sequences, which can then be used to characterize and compare DNA sequences. In our algorithm the alignment operation is exclusive-OR (XOR), and mutations can be identified from its result. Based on the coding method, we also present a 3D graphical representation of DNA sequences. Finally, we extend the binary coding method to RNA secondary structures: we explain how these structures are coded and analyze the mutation types that the coding reveals.
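The abstract does not state which binary coding is used, so the following is only an illustrative sketch under an assumed coding: each base is mapped to three bits according to the usual chemical classes (purine/pyrimidine, amino/keto, weak/strong hydrogen bonding), giving three binary sequences per DNA sequence. XOR of corresponding binary sequences then marks the positions that differ, and the purine/pyrimidine bit separates transitions from transversions. All names below are hypothetical, and the sketch handles equal-length substitutions only, not insertions or deletions.

```python
# Assumed coding tables (not taken from the thesis).
PURINE = {"A": 1, "G": 1, "C": 0, "T": 0}  # purine = 1, pyrimidine = 0
AMINO = {"A": 1, "C": 1, "G": 0, "T": 0}   # amino = 1, keto = 0
WEAK = {"A": 1, "T": 1, "C": 0, "G": 0}    # weak H-bond = 1, strong = 0


def encode(seq):
    """Turn a DNA string into three binary sequences, one per property."""
    seq = seq.upper()
    return tuple([table[base] for base in seq] for table in (PURINE, AMINO, WEAK))


def xor_bits(a, b):
    """Element-wise XOR of two equal-length binary sequences."""
    return [x ^ y for x, y in zip(a, b)]


def classify_substitutions(ref, alt):
    """List (position, ref_base, alt_base, kind) for every point substitution."""
    ref, alt = ref.upper(), alt.upper()
    if len(ref) != len(alt):
        raise ValueError("this sketch only compares sequences of equal length")
    pur, ami, weak = (xor_bits(r, a) for r, a in zip(encode(ref), encode(alt)))
    changes = []
    for i in range(len(ref)):
        if not (pur[i] or ami[i] or weak[i]):
            continue  # all three XOR bits are 0: the bases are identical
        # A transition keeps the purine/pyrimidine class; a transversion flips it.
        kind = "transversion" if pur[i] else "transition"
        changes.append((i, ref[i], alt[i], kind))
    return changes


mutations = classify_substitutions("ACGTAC", "ACATCC")
# → [(2, 'G', 'A', 'transition'), (4, 'A', 'C', 'transversion')]
```

Because any two distinct bases differ in exactly two of the three bits under this coding, a position with all three XOR bits equal to 0 is guaranteed to be unchanged.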
Keywords/Search Tags: Pairwise alignment, Multiple sequence alignment, Ant colony algorithm, Graphical representation of sequence