Research on Data Mining Based on Ensemble Algorithms for K-Nearest Neighbor Classification

Posted on: 2011-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Sun
Full Text: PDF
GTID: 2208360305959817
Subject: Computer application technology
Abstract/Summary:
With the rapid development of database and Internet technology, data mining has advanced further and attracted widespread attention. Classification, an important research topic within data mining, is widely used in pattern recognition, artificial intelligence, and knowledge engineering. Research on the subject therefore has both theoretical significance and practical value. The thesis covers the following aspects:
1. An overview of classification technology and an analysis of the main classification algorithms, focusing on the principle of the k-nearest neighbor classification algorithm and its current state of development.
2. An improved combined (ensemble) k-nearest neighbor method based on simulated annealing is proposed. Simulated annealing is used to perform random feature subset selection, and a voting rule then decides the final output of the combined classifier. Simulation experiments show that its classification performance is better than that of the traditional k-nearest neighbor algorithm.
3. Because the search process of simulated annealing is random, the stopping criterion of the classic simulated annealing algorithm does not guarantee the quality of the solution, so an improved simulated annealing algorithm is introduced. On this basis, a combined k-nearest neighbor method based on improved simulated annealing is proposed. Simulation experiments show that it achieves better classification performance than the combined method based on traditional simulated annealing.
4. A new fast k-nearest neighbor algorithm based on fuzzy-rough sets is proposed. It takes into account the fuzzy and rough uncertainty caused by overlapping classes and insufficient attributes, and introduces the p-tree data structure to improve the traditional k-nearest neighbor method. Comparison with the traditional k-nearest neighbor method and the fuzzy k-nearest neighbor classifier shows that the method improves both classification performance and classification speed. Simulation experiments confirm the method's validity and feasibility.
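The idea in item 2 (simulated annealing picks feature subsets, and a vote combines the resulting k-nearest neighbor classifiers) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the leave-one-out accuracy used as the annealing objective, the geometric cooling schedule, the single-feature toggle move, and the toy data are all assumptions chosen for illustration. The improved stopping criterion of item 3 and the fuzzy-rough method of item 4 are not modeled here.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, features, k=3):
    """Majority vote among the k nearest training points, with squared
    Euclidean distance computed on the chosen feature indices only."""
    neighbours = sorted(
        (sum((row[f] - x[f]) ** 2 for f in features), label)
        for row, label in zip(train_X, train_y)
    )[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]


def loo_accuracy(X, y, features, k=3):
    """Leave-one-out accuracy of kNN restricted to `features`
    (assumed objective; the thesis does not state its objective here)."""
    hits = 0
    for i in range(len(X)):
        rest_X, rest_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], features, k) == y[i]
    return hits / len(X)


def anneal_subset(X, y, rng, k=3, t0=1.0, cooling=0.9, steps=60):
    """Simulated annealing over feature subsets: toggle one feature per step,
    always accept improvements, accept worse subsets with prob. exp(delta/T)."""
    n = len(X[0])
    current = [f for f in range(n) if rng.random() < 0.5] or [rng.randrange(n)]
    cur_score = loo_accuracy(X, y, current, k)
    best, best_score = current, cur_score
    t = t0
    for _ in range(steps):
        f = rng.randrange(n)
        if f in current:
            cand = [g for g in current if g != f]
        else:
            cand = sorted(current + [f])
        if cand:  # never evaluate an empty feature subset
            score = loo_accuracy(X, y, cand, k)
            delta = score - cur_score
            if delta >= 0 or rng.random() < math.exp(delta / t):
                current, cur_score = cand, score
                if cur_score > best_score:
                    best, best_score = current, cur_score
        t *= cooling  # geometric cooling (assumed schedule)
    return best


def ensemble_predict(X, y, x, subsets, k=3):
    """Each subset defines one kNN member; the ensemble output is the majority vote."""
    votes = Counter(knn_predict(X, y, x, s, k) for s in subsets)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): feature 0 separates the classes, features 1-2 are noise.
rng = random.Random(0)
X = [[rng.uniform(0, 1), rng.random(), rng.random()] for _ in range(10)] + \
    [[rng.uniform(9, 10), rng.random(), rng.random()] for _ in range(10)]
y = [0] * 10 + [1] * 10

subsets = [anneal_subset(X, y, rng) for _ in range(5)]  # five ensemble members
pred = ensemble_predict(X, y, [0.5, 0.5, 0.5], subsets)
```

Because each annealing run starts from a different random subset and follows a different random path, the members tend to use different features, which is the source of diversity that the vote exploits.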
Keywords/Search Tags: Classification, K Nearest Neighbor, Simulated Annealing, Fuzzy Set, Rough Set