Research On Non-IID KNN Classification Algorithm

Posted on:2019-11-13

Degree:Master

Type:Thesis

Country:China

Candidate:H J Li

Full Text:PDF

GTID:2428330548986992

Subject:Computer application technology

Abstract/Summary:


Data mining is the process of extracting valuable information from data, and classification is one of its mainstream research topics. The task of a classification algorithm is to map data items of unknown category to the corresponding categories by means of a classifier. The KNN (K-Nearest Neighbor) algorithm is one of the most widely used classification algorithms in data mining. This thesis studies and analyzes the KNN classifier and, to address its shortcomings, improves both its decision rule and its similarity measure. The main work is as follows.

The decision rule of the traditional KNN classifier, once the k nearest neighbors have been selected, counts the classes among them in order to predict the class label of the test instance. This simple counting rule does not make effective use of the information carried by the neighboring samples. To overcome this shortcoming, the thesis introduces the concepts of Nearest-neighbor-support and Category-reliability to generate a new decision rule. First, Nearest-neighbor-support is introduced by measuring the similarity between the sample to be tested and its nearest-neighbor samples. Second, the Category-reliability of each category is calculated by considering the distribution of the samples. Experiments show that the ND_KNN algorithm improves the performance of the classifier and is an effective and stable classification algorithm.

When traditional KNN classification algorithms measure the relationships among objects in a data set, they usually assume that the objects are independent and identically distributed (IID), ignoring the interactions and effects between objects. The CS_KNN algorithm is based on the Non-IID idea and focuses on mining the interactions among the features, categories and attribute values of objects. First, by measuring the importance of each feature to the classification, we study the Non-IID relationship between features and categories to form the weight coefficient of the class feature. Second, we use this weight coefficient to form the intra-object Non-IID function between objects. Then, we analyze the effects of different features and generate Non-IID functions. Finally, the Non-IID relations among features, within features, and among the categories of objects are fused into a similarity measure to construct association similarity rules. Experiments show that the CS_KNN algorithm, based on the Non-IID idea, significantly improves classification performance compared with the traditional KNN algorithm.
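
The difference between counting neighbors and weighting them by similarity can be illustrated with a small sketch. This is not the ND_KNN or CS_KNN algorithm from the thesis, whose formulas are not given in the abstract. It assumes a similarity of 1 / (1 + distance) as a stand-in for a neighbor-support score, and an optional per-feature weight vector as a stand-in for feature weighting; all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def weighted_distance(a, b, weights=None):
    """Euclidean distance with optional per-feature weights (assumed form)."""
    if weights is None:
        weights = [1.0] * len(a)
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def k_nearest(train, query, k, weights=None):
    """Return the k (distance, label) pairs closest to the query."""
    scored = [(weighted_distance(x, query, weights), label) for x, label in train]
    return sorted(scored, key=lambda pair: pair[0])[:k]


def majority_vote(neighbors):
    """Traditional rule: every neighbor counts once, regardless of distance."""
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


def similarity_vote(neighbors):
    """Illustrative rule: each neighbor votes with similarity 1 / (1 + distance)."""
    support = defaultdict(float)
    for dist, label in neighbors:
        support[label] += 1.0 / (1.0 + dist)
    return max(support, key=support.get)


# Toy data: two "a" points near the query, three "b" points far from it.
train = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.0), "a"),
    ((3.0, 0.0), "b"),
    ((3.0, 0.1), "b"),
    ((3.1, 0.0), "b"),
]
query = (0.2, 0.0)
neighbors = k_nearest(train, query, k=5)

print(majority_vote(neighbors))    # counts only: the three distant "b" points win
print(similarity_vote(neighbors))  # similarity-weighted: the two close "a" points win
```

With k = 5 the plain count is dominated by the three distant points, while the similarity-weighted vote follows the two close ones, which is the kind of neighbor information the counting rule discards.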

Keywords/Search Tags:

KNN algorithm, classification, decision rule, Non-IID, similarity


Related items

1	Research Of Web Text Classification Based On Decision Tree Classification Algorithm
2	Research And Application Of Classification Algorithm Based On Decision Tree Rules
3	Decision Fusion Of Multi-band SAR Target Detection And Classification Results
4	Association Rule Based Personalized Recommendation Research
5	Classification Rule Mining In Financial Applications
6	Research On The Automatic Classification Algorithm Of Archive Text Based On Decision Tree
7	Research On Mining Classification Rule Based On Rough Set
8	Inductive Decision Tree Classification Model In The Military Transport Vehicle Management System
9	The Research And Application Of Attribute Reduction And Decision Rule Generation For Decision Tables Based On Rough Sets Theory
10	Research On Classification Rule Mining Based On Genetic Algorithms