
Study Of Data Mining Based On Rough Set And Granular Computing

Posted on: 2014-02-26
Degree: Master
Type: Thesis
Country: China
Candidate: L Chen
GTID: 2248330395984262
Subject: Pattern Recognition and Intelligent Systems
Abstract: Data sets around the world are growing rapidly, and mining the information or knowledge hidden in data is becoming increasingly important. This has made data mining an active research topic. Data often contain indiscernible information, and many data mining algorithms cannot handle such data well. To deal with the indiscernibility problem, many data mining algorithms have been combined with rough set theory and granular computing theory. The research mainly covers the following aspects:

1. A single-dimension hierarchical granulated attribute reduction algorithm. In attribute reduction for continuous-valued information, the neighborhood granulation conditions differ from one dimension to another. When a distance metric is used to measure the approximation relationship and the distances in different dimensions are all compared against the same approximation threshold, errors in classification accuracy inevitably follow. The single-dimension hierarchical granulated attribute reduction algorithm constructs the neighborhood system under the same threshold condition and uses the hierarchical granulation relationship to calculate classification accuracy. Experiments show that the algorithm still achieves a good attribute reduction effect while maintaining high classification accuracy.

2. A rough K-means clustering algorithm based on the imbalance degree of a cluster. Previous rough K-means algorithms and their improved variants focus on the indiscernibility of boundary objects and on the differences between data points in different clusters, but pay no attention to differences in the data distribution within a cluster. The imbalance degree effectively reflects the importance of a data object in a cluster according to its distance to the mean center. Simulation analysis on UCI data sets shows that the clustering algorithm makes clusters more compact internally and better separated from one another.

3. An improved imbalance degree of a cluster. Not only distance but also dense regions influence the distribution of data. The importance of data that lie far from the mean center but in a high-density region should also be taken into account. The rough K-means clustering algorithm based on a density self-adaptive imbalance degree of a cluster makes the mean centers converge, with more accurate and more flexible moving steps. The simulation results show that the clustering algorithm achieves high accuracy and improves the convergence speed.

In summary, data mining algorithms based on rough set theory provide support for dealing with indiscernibility, and they have theoretical value and significance.
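For orientation, the following is a minimal sketch of the conventional rough K-means baseline that the abstract contrasts itself with. It is not the imbalance-degree method of the thesis, whose formulas the abstract does not give. Each cluster keeps a lower approximation (objects that clearly belong to it) and a boundary (objects that are indiscernible between two or more clusters), and the mean center is a weighted combination of the two. The function name `rough_kmeans`, the parameters `w_lower` and `epsilon`, and the difference-based threshold test are assumptions chosen for illustration.

```python
import math


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))


def rough_kmeans(points, centers, w_lower=0.7, epsilon=1.0,
                 max_iter=100, tol=1e-9):
    """Baseline rough K-means sketch (not the thesis's variant).

    points  : list of equal-length tuples of floats
    centers : initial mean centers, one tuple per cluster
    w_lower : weight of the lower approximation; the boundary
              receives 1 - w_lower (assumed parameter name)
    epsilon : a point is treated as indiscernible between its nearest
              cluster and any cluster whose distance exceeds the nearest
              distance by at most epsilon (assumed threshold rule)
    """
    k = len(centers)
    centers = [tuple(c) for c in centers]
    lower, boundary = [], []
    for _ in range(max_iter):
        lower = [[] for _ in range(k)]
        boundary = [[] for _ in range(k)]
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            nearest = min(range(k), key=dists.__getitem__)
            close = [j for j in range(k)
                     if j != nearest and dists[j] - dists[nearest] <= epsilon]
            if close:
                # Indiscernible: the point joins the boundary of every
                # candidate cluster and no lower approximation.
                for j in [nearest] + close:
                    boundary[j].append(p)
            else:
                lower[nearest].append(p)

        new_centers = []
        for j in range(k):
            if lower[j] and boundary[j]:
                lo, bd = mean(lower[j]), mean(boundary[j])
                c = tuple(w_lower * a + (1 - w_lower) * b
                          for a, b in zip(lo, bd))
            elif lower[j]:
                c = mean(lower[j])
            elif boundary[j]:
                c = mean(boundary[j])
            else:
                c = centers[j]  # empty cluster: keep the old center
            new_centers.append(c)

        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, lower, boundary
```

In this baseline, every object in a lower approximation contributes equally to the mean center, regardless of where it sits inside the cluster. That is the within-cluster uniformity that aspects 2 and 3 of the abstract aim to replace with distance-based and density-based importance.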
Keywords: Attribute Reduction, Clustering, K-means, Rough Set, Granular Computing