Application On DNA Sequence Analysis Using Ant Colony Algorithm

Posted on:2009-08-12

Degree:Master

Type:Thesis

Country:China

Candidate:Y Li

Full Text:PDF

GTID:2178360272476368

Subject:Software engineering

Abstract/Summary:

With the completion of the Human Genome Project, the post-genome era has begun. Various post-genome projects are being planned and carried out, and they have generated vast amounts of molecular biological data. Exploiting these data and discovering the biological information they contain is of great scientific value. Researchers facing this challenge are devoting their efforts to uncovering biological knowledge from the data. Many useful methods and tools have been developed for handling genome data, but the patterns discovered and the predictions made so far have fallen short of expectations, so researchers still have a long way to go.
The first chapter of this thesis introduces data mining, the ant colony algorithm, and bioinformatics, together with their current state of research and future trends. We find that all three technologies are still at an early stage and will have a profound influence in the future. Bioinformatics is entering a dynamic new era: many scientists describe it as the harvest period of human genome research, one that will yield important results in basic research and bring enormous economic and social benefits. These results have theoretical value and can also be applied directly to industrial and agricultural production and to medical practice. Over the next few years DNA sequence data will grow at an unexpectedly fast rate, and putting these data to use as early as possible could place our country at the forefront of the international scientific community.
The second chapter describes the functions and the process of data mining and the commonly used data mining algorithms: classification rule mining, clustering, and association rule mining, so that readers gain a clear understanding of these three methods. It also examines the advantages and disadvantages of the corresponding algorithms. As the databases to be analyzed grow in scope and capacity, mining places ever higher demands on algorithm efficiency, which motivates introducing the ant colony algorithm into data mining to improve efficiency.
The third chapter presents in detail the basic idea behind the ant colony algorithm and its implementation, discusses the advantages and disadvantages of the basic algorithm and of its improvements, and gives the improved ant colony algorithm that is used. From this chapter's presentation it is clear that the ant colony algorithm is a relatively new simulated evolutionary algorithm. Because it has not been studied for long, it does not yet have the systematic analysis methods and solid mathematical foundation that other heuristics have. Its parameters are chosen mostly by experiment and experience rather than by theorem, its computation time is long, and many theoretical and practical issues still require deeper research. However, it offers positive feedback, parallel computation, strong robustness, and other advantages, so with further study it can be expected to become an excellent optimization algorithm for complex combinatorial optimization problems.
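As a minimal sketch of the positive-feedback mechanism mentioned above, the following Python code shows the textbook Ant System transition rule and pheromone update on a symmetric graph. It is not the improved variant used in the thesis, which the abstract does not specify. The function names, the parameters `alpha`, `beta`, `rho`, and `q`, and their default values are illustrative assumptions.

```python
import random

def transition_probabilities(current, unvisited, pheromone, heuristic,
                             alpha=1.0, beta=2.0):
    """Probability of moving from `current` to each unvisited node.

    Each candidate is weighted by pheromone**alpha * heuristic**beta,
    and the weights are normalized to sum to 1 (assumes a positive total).
    """
    weights = {
        j: (pheromone[current][j] ** alpha) * (heuristic[current][j] ** beta)
        for j in unvisited
    }
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

def choose_next(current, unvisited, pheromone, heuristic, rng=random):
    """Sample the next node according to the transition probabilities."""
    probs = transition_probabilities(current, unvisited, pheromone, heuristic)
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[j] for j in nodes], k=1)[0]

def update_pheromone(pheromone, tours, tour_lengths, rho=0.5, q=1.0):
    """Evaporate every trail, then let each ant deposit q / length on its tour.

    Shorter tours deposit more pheromone, which is the positive feedback
    that biases later ants toward good edges.
    """
    n = len(pheromone)
    for i in range(n):
        for j in range(n):
            pheromone[i][j] *= (1.0 - rho)
    for tour, length in zip(tours, tour_lengths):
        # Pair each node with its successor, closing the tour at the end.
        for a, b in zip(tour, tour[1:] + tour[:1]):
            pheromone[a][b] += q / length
            pheromone[b][a] += q / length
```

The evaporation rate `rho` trades off exploration against exploitation. As the abstract notes, such parameters are usually tuned by experiment rather than derived from theory.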
The first three chapters lay the theoretical foundation for the discussion that follows. Drawing on data mining technology, the ant colony algorithm, and the development trends of bioinformatics, this thesis applies the commonly used ant colony algorithms and data mining techniques to DNA sequence analysis. It first describes the composition of DNA data and its biological characteristics. Sequence similarity refers to two sequences having identical or similar sites, whereas sequence homology means that two sequences share a common ancestor. Biologists often assume that if the similarity between two sequences exceeds 30%, the sequences are likely to be homologous. Thus, if two sequences are similar enough, one can infer that they may have evolved through genetic variation such as the insertion of sequence fragments, the substitution or deletion of bases, and recombination, and that they may share a common evolutionary ancestor. Consequently, if a sequence with known features is found to be highly similar to an unknown sequence, the features of the unknown sequence can be predicted.
For this reason, a large part of this thesis explores the use of the ant colony algorithm for DNA sequence alignment and demonstrates the effectiveness of the algorithm through simulation experiments. It then describes in detail the association ant colony algorithm, the clustering ant colony algorithm, and the classification ant colony algorithm. The association ant colony algorithm first preprocesses continuous attribute values. Groups of ants then mine frequent item sets among the sub-vertices within each super-vertex of the graph we define. These sets form the antecedent and the consequent of a rule, respectively, and the rules are evaluated according to their quality.
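The 30% similarity heuristic described above can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned to the same length, with `-` marking gaps, and it uses percent identity as the similarity measure. The abstract does not say which measure the thesis uses, so both the measure and the function names are assumptions.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of aligned columns in which both sequences carry the same base.

    Columns where both sequences have a gap are ignored; a column with a
    gap in only one sequence counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)

def possibly_homologous(seq_a, seq_b, threshold=30.0):
    """Apply the rule of thumb from the text: similarity above 30% hints at homology."""
    return percent_identity(seq_a, seq_b) > threshold
```

For example, `percent_identity("AC-T", "ACGT")` counts three matching columns out of four. The threshold is only a heuristic: high similarity suggests, but does not prove, a common ancestor.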
In addition, a suitable probability formula is designed to guide the selections made by each type of ant so that rules are constructed and mined effectively. The clustering ant colony algorithm initially scatters the data to be clustered at random on a two-dimensional plane and then places a number of virtual ants on the plane. The ants behave much as in the basic model described above, except that they do not judge whether the object they are currently observing is identical to the surrounding objects, only whether it is similar to them. In this way similar data items end up clustered together.
The classification ant colony algorithm defines a path as a connection of attribute nodes and a class label node, in which each attribute node appears at most once and a class node must be present. Each path corresponds to one classification rule, so mining classification rules can be viewed as searching for paths. The principle by which ants find the shortest path to food can be used to mine classification rules, although here the search is for the optimal path rather than the shortest one, and the optimal path corresponds to the optimal classification rule. The quality of a path is measured by the classification ability (effectiveness) and the length (simplicity) of the corresponding rule.
Finally, experiments on a simple DNA data set verify that the rules mined by the ant colony algorithm are effective. The resulting rules can be used to predict the type and function of new genetic data accurately, to detect the genetic causes of diseases and disabilities, and to support the diagnosis, prevention, and treatment of disease and the discovery of new drugs and new methods.
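The similarity-driven clustering behaviour described above can be sketched as follows. This is a minimal illustration that combines the response functions of Deneubourg's basic model with a Lumer-Faieta style neighborhood density. The abstract does not give the thesis's own formulas, so the constants `k1`, `k2`, and `alpha`, the mismatch-fraction dissimilarity for DNA strings, and all function names are assumptions.

```python
def hamming_fraction(a, b):
    """Fraction of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def local_similarity(item, neighbors, dissimilarity, cells, alpha=0.5):
    """Average similarity of `item` to the objects in an ant's neighborhood.

    `cells` is the number of grid cells in the neighborhood and `alpha`
    scales the dissimilarity. The result is clipped at zero.
    """
    total = sum(1.0 - dissimilarity(item, other) / alpha for other in neighbors)
    return max(0.0, total / cells)

def pick_probability(f, k1=0.1):
    """An unladen ant is likely to pick up an item that fits its surroundings poorly."""
    return (k1 / (k1 + f)) ** 2

def drop_probability(f, k2=0.15):
    """A laden ant is likely to drop its item where the surroundings are similar."""
    return (f / (k2 + f)) ** 2
```

A typical use would compute `f = local_similarity(seq, nearby, hamming_fraction, cells=8)` for the item under an ant and then compare a random number against `pick_probability(f)` or `drop_probability(f)`. Because the decision depends on graded similarity rather than exact equality, similar sequences gradually accumulate in the same region of the plane.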

Keywords/Search Tags:

Ant Colony Algorithm, Data Mining, DNA Sequence Database, Association Rules, Cluster Analysis, Classification Analysis

Related items

1	The Applied Research Of Association Rules Mining Based On Colony Algorithm In Marketing
2	Research And Application Of Weighted Association Rules Algorithm Based On Cluster And Compression Matrix
3	Research Of Applying Ant Colony Optimization In Data Mining
4	Data Mining Of A Number Of Ways In Chinese Medicine Database
5	Mining Based On Association Rules And Cluster Analysis Of Abnormal Weather
6	Mining Association Rules In DNA Database
7	Some Key Problems In The KDD
8	The Research & Implement For Mining Association Rules Of Definite Semanteme
9	Research Of Intrusion Detection System Based On Cluster Analysis And Association Rules
10	Research And Application Of Time Series Association Rules Based On Fuzzy Set