Data Distribution Guided Fuzzy-rough Nearest Neighbour Algorithm

Posted on: 2016-10-21 | Degree: Master | Type: Thesis
Country: China | Candidate: Y N Fei | Full Text: PDF
GTID: 2308330470978503 | Subject: Software engineering
Abstract/Summary:
Rough set theory is a mathematical tool for dealing with incomplete and imperfect data. However, by its definition, a rough set can only operate on discrete data sets.

Fuzzy set theory is an important computational method. It can deal with both imprecise and uncertain information. Fuzzy set theory emphasizes the fuzziness of information, whereas rough set theory emphasizes its indiscernibility.

The fuzzy-rough set is obtained by combining fuzzy set theory with rough set theory, and it combines the advantages of both. It can deal with discrete data, real-valued data, or a mixture of both.

The k-nearest neighbour algorithm is a classical algorithm. Because it is intuitive and requires neither prior statistical knowledge nor an explicit training phase, it has become a commonly used classification algorithm. Introducing the concept of the fuzzy-rough set into the k-nearest neighbour algorithm yields the fuzzy-rough nearest neighbour algorithm (FRNN), which can handle the relationship between samples and classes in a more flexible way.

In practice, however, the training samples always contain some noise points, which reduce the accuracy of FRNN. How to reduce the interference of noise and improve the classification accuracy of FRNN is therefore a problem to be solved.

Based on the fuzzy similarity relation, this thesis presents two improved fuzzy-rough nearest neighbour algorithms that calculate the degree of similarity between the test sample and the training data. According to the density of the data distribution, representative data from densely populated regions are taken as the nearest neighbours.

We test the algorithms on UCI data sets and compare the results with those of five commonly used algorithms. The experimental results show that the improved fuzzy-rough nearest neighbour algorithms effectively reduce the interference of noise points and obtain better classification results.
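
The abstract does not give the thesis's formulas, so the following is only a rough sketch of the general idea, not the author's method. It assumes a baseline FRNN that scores each class by the average of its fuzzy-rough lower and upper approximations (Kleene-Dienes implicator, minimum t-norm), plus one hypothetical density filter that keeps the training samples with the highest mean within-class similarity. All function names, the `keep` parameter, and the toy data are illustrative assumptions.

```python
import numpy as np

def attribute_ranges(X):
    """Per-attribute value range, used to normalise distances."""
    X = np.asarray(X, dtype=float)
    r = X.max(axis=0) - X.min(axis=0)
    r[r == 0] = 1.0  # avoid division by zero for constant attributes
    return r

def fuzzy_similarity(X, x, ranges):
    """R(z, x) = min over attributes of 1 - |a(z) - a(x)| / range(a), clipped to [0, 1]."""
    per_attr = 1.0 - np.abs(np.asarray(X, dtype=float) - x) / ranges
    return np.clip(per_attr, 0.0, 1.0).min(axis=1)

def density_filter(X, y, ranges, keep=0.8):
    """Hypothetical density step: per class, keep the fraction `keep` of samples
    with the highest mean similarity to the other samples of that class."""
    kept = []
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        dens = np.array([fuzzy_similarity(X[idx], X[i], ranges).mean() for i in idx])
        n_keep = max(1, int(np.ceil(keep * len(idx))))
        kept.extend(idx[np.argsort(-dens)[:n_keep]])
    return np.sort(np.array(kept))

def frnn_predict(X, y, x, ranges, k=3):
    """Baseline FRNN: pick the class maximising (lower + upper) / 2 over the k
    most similar training samples."""
    sim = fuzzy_similarity(X, x, ranges)
    nn = np.argsort(-sim)[:k]
    best_class, best_score = None, -1.0
    for c in np.unique(y):
        member = (y[nn] == c).astype(float)
        lower = np.min(np.maximum(1.0 - sim[nn], member))  # Kleene-Dienes implicator
        upper = np.max(np.minimum(sim[nn], member))        # minimum t-norm
        score = (lower + upper) / 2.0
        if score > best_score:
            best_class, best_score = c, score
    return best_class

if __name__ == "__main__":
    # Toy data (made up): two clusters, plus one mislabelled point in the class-1 region.
    X = np.array([[0, 0], [0.1, 0.1], [0.2, 0], [1, 1], [0.9, 1], [1, 0.9], [0.95, 0.95]], dtype=float)
    y = np.array([0, 0, 0, 1, 1, 1, 0])
    ranges = attribute_ranges(X)
    query = np.array([0.97, 0.97])
    kept = density_filter(X, y, ranges, keep=0.75)
    print("unfiltered:", frnn_predict(X, y, query, ranges, k=1))
    print("filtered:  ", frnn_predict(X[kept], y[kept], query, ranges, k=1))
```

In this toy setting the filter removes the isolated mislabelled point before neighbours are chosen, which is one way noise could be kept out of the neighbour set. The thesis's two improved algorithms may select representative data quite differently.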
Keywords/Search Tags: Fuzzy-rough Sets, Nearest-neighbour Approach, Classification