Research And Application Of Classification Algorithm Based On Rough Set

Posted on: 2016-08-31
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Liu
Full Text: PDF
GTID: 2308330464954759
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Rough set theory was proposed by the Polish scientist Z. Pawlak in 1982. It is an effective mathematical tool for dealing with imprecise and uncertain data, and it requires no prior knowledge, relying only on the data set itself. Rough set theory has become an active research area in data mining, machine learning and related fields. This thesis studies and improves the basic rough set algorithms, combines rough sets with the traditional weighted K nearest neighbor (KNN) algorithm to handle uncertainty, and applies the improved algorithm to standard test data sets from the UCI repository and to abnormal crowd behavior recognition. The main work is as follows:

1. Rough set rule extraction. Rule extraction mainly involves attribute discretization, attribute reduction and attribute value reduction.

① Discretization of rough set condition attributes. Rough set theory is applicable only to information systems with discrete attributes, whereas many practical problems involve continuous attributes. To bring the advantages of rough set theory (effective handling of imprecise and uncertain data, with no prior knowledge required beyond the data set itself) to such problems, the continuous attributes must first be discretized. The thesis analyzes in detail continuous attribute discretization methods based on the genetic algorithm and on the particle swarm optimization algorithm. During iteration, the genetic algorithm's selection step preserves the superior individuals of the previous generation and its mutation step increases population diversity, but it is easily trapped in a local optimum. The particle swarm optimization algorithm strengthens the global component of the search and converges faster, but it does not protect superior individuals well.
Based on this analysis, the thesis proposes a continuous attribute discretization method that combines the particle swarm optimization algorithm with the genetic algorithm.

② Attribute reduction is one of the core problems in rough set theory. The thesis analyzes attribute reduction algorithms based on the discernibility matrix and on information entropy, discusses their limitations, and adopts an attribute reduction algorithm based on the binary discernibility matrix.

③ Attribute value reduction is another main problem in rough set theory. The thesis analyzes the heuristic value reduction method proposed by Chang Li-yun, Wang Guo-yin et al., which examines each condition attribute in the information table and marks attribute values differently according to their influence on the table. The thesis identifies some potential problems in this algorithm and makes corresponding improvements.

2. During rule extraction, rules are generally learned from a training data set, which is usually obtained by quantitative sampling. For samples that belong to the original data set but not to the training set, the rough set rules may fail to classify correctly. To enable rough sets to classify samples that were not seen during learning, the thesis proposes a method that combines rough sets with weighted K nearest neighbor and thereby improves the accuracy of the rough set classifier. Combining rough sets with other soft computing methods is currently a main topic in rough set research, and existing work falls into two directions. On the one hand, rough sets can deal with uncertain problems, but they only identify the region of uncertainty; handling the uncertain cases themselves relies on other soft computing methods. On the other hand, other soft computing methods do not perform well on multi-dimensional data.
However, rough set attribute reduction can remove unnecessary attributes, reduce the dimension of the original data and thus improve the efficiency of those methods. When regression forecast analysis is carried out after rough set rule extraction, some new samples are not recognized by the rules; to handle them, the thesis proposes a weighted K nearest neighbor (KNN) method in which attribute weights are computed from the importance of each attribute. The improved algorithm is then tested on UCI data sets and gives better results.

3. Application of the improved rough set algorithm to abnormal crowd behavior recognition. Features of abnormal crowd behavior are extracted and discretized to obtain a decision table. Attribute reduction and attribute value reduction are then performed to derive decision rules from the decision table. Finally, predictions are made and the results are analyzed.

To sum up, the thesis proposes an attribute discretization algorithm that combines the genetic algorithm with the particle swarm optimization algorithm; analyzes attribute reduction and attribute value reduction, identifies some potential problems and improves on them; and proposes a weighted K nearest neighbor (KNN) method to handle new samples that remain unrecognized when regression forecast analysis is carried out after rough set rule extraction. Finally, the improved rough set algorithm can be applied to abnormal crowd behavior recognition.
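
As an illustration of the idea of weighting KNN by attribute importance, the following is a minimal Python sketch and not the thesis's implementation. It assumes that "importance" means the standard rough set significance of an attribute, i.e. the drop in the dependency degree when that attribute is removed, and that distances are weighted Euclidean distances over already discretized values. The function names, the toy decision table and the choice k=3 are all hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def dependency(rows, labels, attrs):
    """Dependency degree: share of objects whose equivalence class
    (w.r.t. the attributes in attrs) carries a single decision value."""
    decisions = defaultdict(set)
    sizes = Counter()
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        decisions[key].add(label)
        sizes[key] += 1
    positive = sum(sizes[k] for k, d in decisions.items() if len(d) == 1)
    return positive / len(rows)


def significance_weights(rows, labels):
    """Assumed weighting: sig(a) = gamma(C) - gamma(C - {a}), normalised to sum to 1."""
    attrs = list(range(len(rows[0])))
    full = dependency(rows, labels, attrs)
    sig = [full - dependency(rows, labels, [b for b in attrs if b != a])
           for a in attrs]
    total = sum(sig)
    if total == 0:  # no attribute is individually significant: fall back to equal weights
        return [1 / len(attrs)] * len(attrs)
    return [s / total for s in sig]


def weighted_knn(rows, labels, weights, query, k=3):
    """Majority vote among the k nearest rows under weighted Euclidean distance."""
    def dist(row):
        return sqrt(sum(w * (x - q) ** 2 for w, x, q in zip(weights, row, query)))
    nearest = sorted(range(len(rows)), key=lambda i: dist(rows[i]))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


# Hypothetical discretized decision table: three condition attributes per row.
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 1), (0, 1, 0), (1, 0, 1)]
labels = ["normal", "normal", "abnormal", "abnormal", "normal", "abnormal"]

weights = significance_weights(rows, labels)
# (1, 1, 0) does not occur in the table, so no exact-match rule covers it.
prediction = weighted_knn(rows, labels, weights, query=(1, 1, 0), k=3)
print(weights, prediction)
```

In this toy table only the first attribute is needed to determine the decision, so it receives all of the weight and the unseen sample is classified by its neighbors along that attribute. A practical version would need to handle ties and redundant attributes, which individually have zero significance under this definition.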
Keywords/Search Tags: Rough set, attribute discretization, attribute reduction, attribute value reduction, weighted K nearest neighbor, abnormal crowd behavior