
On The Models Of DNA Classification Based On GA And Multiple Sequence Alignment

Posted on: 2010-06-13
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Hu
Full Text: PDF
GTID: 2178360278462240
Subject: Computer applications and technology
Abstract/Summary:
In this thesis, two models of DNA classification were designed and implemented. Accuracy is essential in classification, but the classification must not be overly subjective; otherwise, even if it is accurate for the moment, it may lose practical significance.

The first model used multiple sequence alignment to identify similarity and homology. Three methods were discussed under this model.

The first method worked from the evolutionary tree. Although its small errors kept it from being a viable option, it did reflect, to a certain extent, clear differences in homology between the different types of DNA sequences.

The second method approached the amino acids from three directions: ① the evolutionary tree; ② polarity; ③ the "strand propensity" scheme that Jalview uses to color amino acids. The third direction was the most feasible, with a classification accuracy of 100%. It was somewhat subjective; however, if the starting position of the DNA sequences is known, its advantages cannot be ignored: it is simple, intuitive, highly accurate, and biologically meaningful.

The third method worked from the Z curve, a representation of a DNA sequence that is in one-to-one correspondence with the sequence. The Z curve is abstract rather than subjective, and it preserves the characteristics of the DNA to the greatest extent. Based on the Z curve model, 20 artificial sequences and 182 natural sequences were classified; all of them could be classified, with an accuracy of 100%.

The second model was based on the genetic algorithm. The main tasks were as follows: ① The genetic algorithm was used for classification, and a new method, called the basic classification genetic algorithm, was constructed. On the test data, this method reached a classification accuracy of 97.80%. As the number of iterations increased, the accuracy of this general classification system stabilized at around 95%, which indicates local convergence. ② To improve the accuracy, the basic classification genetic algorithm was refined into an optimized classification genetic algorithm. With the new method, the classification accuracy was 99.45%. The dynamic performance of the algorithm was very good: as the number of generations increased, the accuracy rose and approached 100%.

Finally, the two models were compared and analyzed. The Z curve method and the genetic algorithms each have their own advantages, and combining the two is proposed for further study.
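The abstract does not state how the Z curve was computed, so the following is only a minimal sketch using the standard Z-curve definition, in which the three coordinates track purine/pyrimidine, amino/keto, and weak/strong hydrogen-bond base counts. The function names `z_curve` and `inverse_z_curve` are hypothetical, and the thesis may derive its classification features differently. The inverse function illustrates the one-to-one correspondence between a sequence and its curve.

```python
# Per-base steps of the standard Z curve (an assumption; not taken from the thesis):
#   x = (A + G) - (C + T)   purine vs. pyrimidine
#   y = (A + C) - (G + T)   amino vs. keto
#   z = (A + T) - (G + C)   weak vs. strong hydrogen bonding
STEPS = {
    "A": (1, 1, 1),
    "C": (-1, 1, -1),
    "G": (1, -1, -1),
    "T": (-1, -1, 1),
}
# Reverse lookup used to recover a base from a step between consecutive points.
BASES = {step: base for base, step in STEPS.items()}


def z_curve(seq):
    """Return the cumulative (x, y, z) points for a DNA sequence of A/C/G/T."""
    points = []
    x = y = z = 0
    for base in seq.upper():
        dx, dy, dz = STEPS[base]  # raises KeyError for symbols such as N
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


def inverse_z_curve(points):
    """Recover the DNA sequence from its Z-curve points."""
    seq = []
    prev = (0, 0, 0)
    for point in points:
        step = tuple(c - p for c, p in zip(point, prev))
        seq.append(BASES[step])
        prev = point
    return "".join(seq)


if __name__ == "__main__":
    curve = z_curve("ACGT")
    print(curve)
    print(inverse_z_curve(curve))
```

Because every base maps to a distinct step, the curve can always be decoded back to the original sequence, which is why no sequence information is lost in this representation.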
Keywords/Search Tags:DNA, Multiple Sequence Alignment, GA, Classification