Research on Algorithms for Attribute Reduction and Classification in Data Mining

Posted on: 2010-12-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Li
Full Text: PDF
GTID: 2178360278497034
Subject: Computer application technology
Abstract/Summary:
Data mining is the process of extracting hidden and potentially useful information from large volumes of data. It is a relatively new data analysis technology that is widely used in fields such as banking and finance, insurance, government, education, transportation, and national defense. Rough set theory, introduced by the Polish mathematician Z. Pawlak, is a powerful mathematical tool for analyzing uncertain and vague knowledge. Building on rough sets, this thesis focuses on two core problems in data mining: attribute reduction and classification. It studies the theory and methods of attribute reduction algorithms for complete information systems, identifies their shortcomings, and proposes an improved attribute reduction algorithm based on rough sets. It then analyzes the traditional decision tree algorithm on a worked instance, points out its problems, and puts forward an improved decision tree construction algorithm based on the weighted mean attribute significance (WMAS). The main research results are as follows:

1. A concept of weighted mean attribute significance, which considers both the importance of an attribute and its contribution to classification, is proposed based on a study of the attribute significance measures used in various attribute reduction algorithms.
2. Achieving efficient attribute reduction has always been an important topic in rough set research. Because finding an optimal attribute reduction has been proved to be NP-hard, current research concentrates on finding sub-optimal reductions. The thesis first discusses the classical reduction algorithms and then presents an improved attribute reduction algorithm based on rough sets that considers not only the attribute significance but also the amount of information carried by each attribute. The algorithm obtains one reduction of the information system without computing the core, which reduces the amount of computation and increases speed.
3. A study of the classic decision tree based on information entropy shows that it suffers from two problems: some subtrees appear repeatedly in the tree, and some attributes are tested many times along a single path of the tree. To overcome these defects, an attribute selection criterion based on the weighted mean attribute significance is proposed, and the decision tree construction algorithm WMAS is built on it. The algorithm reduces complexity and improves classification accuracy, and its advantages are verified with an instance and with experiments.

Keywords/Search Tags: Data Mining, Rough Sets Theory, Weighted Mean Attribute Significance, Attribute Reduction, Decision Tree...
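
The two classical baselines the abstract builds on can be illustrated with a minimal Python sketch. It is not the thesis's improved reduction algorithm and not WMAS, whose definitions are not given here. It assumes the textbook rough-set dependency degree (size of the positive region divided by the size of the universe) as the attribute significance, a greedy forward search that starts from the empty set rather than from the core, and ID3-style information gain as the entropy-based selection criterion. The decision table `ROWS` and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
from math import log2

# Hypothetical toy decision table (not from the thesis):
# condition attributes a, b, c and decision attribute d.
ROWS = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
    {"a": 0, "b": 0, "c": 1, "d": "no"},
    {"a": 0, "b": 1, "c": 1, "d": "yes"},
]
CONDITIONS = ["a", "b", "c"]
DECISION = "d"


def blocks(rows, attrs):
    """Group rows into equivalence classes that agree on every attribute in attrs."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[a] for a in attrs)].append(row)
    return list(groups.values())


def dependency(rows, attrs, decision=DECISION):
    """Classical dependency degree: |POS_attrs(decision)| / |U|."""
    positive = sum(
        len(block)
        for block in blocks(rows, attrs)
        if len({row[decision] for row in block}) == 1  # block has one decision value
    )
    return positive / len(rows)


def greedy_reduct(rows, conditions, decision=DECISION):
    """Forward selection without computing the core: repeatedly add the attribute
    that raises the dependency degree the most, until the chosen subset reaches
    the dependency degree of the full condition attribute set."""
    target = dependency(rows, conditions, decision)
    chosen = []
    while dependency(rows, chosen, decision) < target:
        remaining = [a for a in conditions if a not in chosen]
        best = max(remaining, key=lambda a: dependency(rows, chosen + [a], decision))
        chosen.append(best)
    return chosen


def entropy(rows, decision=DECISION):
    """Shannon entropy (in bits) of the decision attribute over the given rows."""
    counts = Counter(row[decision] for row in rows)
    return -sum(n / len(rows) * log2(n / len(rows)) for n in counts.values())


def info_gain(rows, attr, decision=DECISION):
    """ID3-style information gain obtained by splitting on a single attribute."""
    remainder = sum(
        len(block) / len(rows) * entropy(block, decision)
        for block in blocks(rows, [attr])
    )
    return entropy(rows, decision) - remainder
```

On this toy table, `greedy_reduct(ROWS, CONDITIONS)` selects `b` and then `a`, and `info_gain` also ranks `b` highest. A greedy search of this kind returns a sub-optimal result in general: the selected subset preserves the dependency degree but may still contain a redundant attribute, so a backward pruning pass is commonly added.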