Study On Classification Algorithm Based On Natural Nearest Neighbor

Posted on: 2016-10-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
GTID: 2308330479484812
Subject: Computer software and theory
Abstract/Summary:
Classification problems arise in many disciplines and their applications. With the rapid development of information technology, the amount of data is growing quickly, and researchers want to extract valuable information from it or use it to predict future outcomes. Classification algorithms have therefore received increasing attention. As an important research area of data mining, classification has been widely applied across many fields and has shown great practical value.

Within data mining, many researchers in China and abroad have proposed the K-nearest neighbor (KNN) classification algorithm and many effective improvements built on the K-nearest-neighbor concept. In practical applications of KNN, however, the value of the parameter K has a significant impact on the final classification results and performance. Moreover, when data sets have different characteristics, there is no reliable theoretical basis or reference information for choosing a specific K; the choice usually depends on a large number of experiments or on the user's experience. Selecting an appropriate K for the KNN algorithm has therefore remained a difficult and actively studied problem. To address it, this thesis proposes a classification algorithm based on the Natural Nearest Neighbor.
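The sensitivity of KNN to K can be illustrated with a minimal sketch. The toy data, the `knn_predict` helper, and the use of Euclidean distance with an unweighted majority vote are assumptions made for illustration; they are not taken from the thesis.

```python
from collections import Counter
import math

def knn_predict(train, query, k):
    """Return the majority label among the k training points closest to query."""
    # train is a list of (feature_tuple, label) pairs
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Made-up data: one "a" point close to the query, three "b" points a little farther away.
train = [
    ((0.0, 0.0), "a"),
    ((1.0, 1.0), "b"),
    ((1.2, 0.9), "b"),
    ((0.9, 1.3), "b"),
    ((-3.0, -3.0), "a"),
]
query = (0.3, 0.3)

for k in (1, 3):
    print(k, knn_predict(train, query, k))  # the predicted label changes with k
```

With this data the same query is labeled "a" for K = 1 and "b" for K = 3, which is the kind of dependence on K that motivates a parameter-free neighborhood.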
The specific work is as follows:

① Reviewing the research background and practical significance of classification technology, and surveying the current state of research in China and abroad.

② Introducing the definition and process of classification; examining the ideas, advantages, disadvantages, and typical algorithms of several commonly used classification methods; and describing the common evaluation criteria for classification algorithms.

③ Introducing the concept and key idea of the Natural Nearest Neighbor (3N). The advantage of the technique is that the neighbors of each sample are formed adaptively by the algorithm, with no parameters to set. The thesis then improves the Natural Nearest Neighbor search algorithm and verifies experimentally that the improved algorithm is no longer sensitive to noise. Finally, the density characteristics and stability of the Natural Nearest Neighbor are introduced and verified through experiments on random and real data sets.

④ Proposing the classification algorithm based on Natural Nearest Neighbor (CAb3N). Analysis reveals a defect of the Natural Nearest Neighbor when it is used to classify high-dimensional data. To improve classification accuracy, a new way of assigning weights to the training data is proposed, based on the 3N search algorithm and the related 3N definitions. The algorithm then classifies test samples using the 3N algorithm and the weighted training data.

⑤ Comparing the proposed CAb3N with the traditional KNN algorithm and the weighted KNN algorithm on real UCI data sets; the experiments verify the effectiveness of CAb3N. A further comparison between the weighted CAb3N and the unweighted 3N classification algorithm verifies that the weight assignment improves classification accuracy.
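The parameter-free neighbor search mentioned in ③ can be sketched roughly as follows. This is a minimal illustration of a basic natural-neighbor search as it is commonly described, not the improved search algorithm or the weighting scheme of the thesis, whose details the abstract does not give. The function name `natural_neighbors`, the stopping rule (stop once every point has a reverse neighbor, or once the number of points without one stops changing), and the sample points are all assumptions.

```python
import math

def natural_neighbors(points):
    """Grow the neighborhood size r until reverse-neighbor coverage stops improving."""
    n = len(points)
    # order[i] lists the other point indices sorted by distance from point i
    order = [
        sorted((j for j in range(n) if j != i),
               key=lambda j: math.dist(points[i], points[j]))
        for i in range(n)
    ]
    knn = [set() for _ in range(n)]      # the r nearest neighbors of each point so far
    reverse = [set() for _ in range(n)]  # points that count i among their r nearest
    previous_orphans = None
    r = 0
    while r < n - 1:
        r += 1
        for i in range(n):
            j = order[i][r - 1]          # the r-th nearest neighbor of i
            knn[i].add(j)
            reverse[j].add(i)
        orphans = sum(1 for i in range(n) if not reverse[i])
        # assumed stopping rule: no orphans left, or no change since the last round
        if orphans == 0 or orphans == previous_orphans:
            break
        previous_orphans = orphans
    # natural neighbors are mutual: j is near i and i is near j
    return r, [knn[i] & reverse[i] for i in range(n)]

# Made-up points forming two well-separated groups of three.
points = [(0, 0), (0, 1), (1.2, 0), (10, 10), (10, 11), (11.2, 10)]
r, neighbors = natural_neighbors(points)
print(r, neighbors)
```

No K is supplied by the caller: the search settles on r = 2 for these points, and each point's natural neighbors are the other two members of its own group.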
Keywords/Search Tags: classification, k-nearest neighbor, natural nearest neighbor, weight assignment