
Modified Fuzzy C-means Clustering And Continuous Attribute Discretization Algorithm Research

Posted on: 2012-05-22
Degree: Master
Type: Thesis
Country: China
Candidate: X Li
GTID: 2208330335480078
Subject: Computer application technology
Abstract/Summary:
Discretization of continuous attributes is an important research topic in data mining. Supervised discretization does not consider the compatibility between attributes, which introduces deviation, while unsupervised discretization is sensitive to unevenly distributed and noisy data sets. In the real world, the classification boundary between data objects is very vague, so it is difficult to determine which class a given data object belongs to. Without the guidance of prior knowledge, artificially partitioning a data set not only destroys the relationships between data objects but also makes the final result unconvincing. To address the defects of traditional fuzzy clustering algorithms, namely sensitivity to noisy data and neglect of the correlation between attributes, this thesis studies fuzzy c-means clustering and the discretization of continuous attributes. The main research tasks are as follows:

(1) To address the defects of the fuzzy c-means (FCM) algorithm, namely randomly chosen initial cluster centers and sensitivity to noisy data, a fuzzy clustering algorithm, DCFCM, is presented that makes use of high-density regions. First, the algorithm selects initial cluster centers and candidate initial cluster centers using the high-density regions and the changes in the samples' density values; the final initial cluster centers are then determined from the distances between the initial and candidate centers. This effectively overcomes the defect that randomly chosen initial cluster centers make FCM converge easily to a local minimum. Second, the algorithm uses the density function as sample weights and optimizes the membership function, which improves its resistance to noise. Finally, experimental results confirm that the algorithm performs well in selecting initial cluster centers, in clustering quality, and in noise resistance.

(2) Building on the above, a soft-partition discretization algorithm based on DCFCM is presented. The algorithm uses the compatibility between decision attributes and condition attributes in the decision table as the criterion for dynamically adjusting the parameters of DCFCM, so that an optimized discretization result is obtained. Experiments on UCI data sets and astronomical spectral data validate the effectiveness of the algorithm.
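
The two ideas in task (1), density-based initialization and density-weighted clustering, can be illustrated with a minimal Python sketch. This is not the thesis's DCFCM: the abstract does not give its formulas, so the Gaussian-kernel density, the greedy `min_dist` separation rule, and the way the weights enter the center update are all assumptions made for illustration, and the names `density`, `init_centers`, `memberships`, and `weighted_fcm` are hypothetical.

```python
import numpy as np

def density(X, sigma=1.0):
    # Assumed density function: sum of Gaussian kernels over all samples.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2)).sum(axis=1)

def init_centers(X, dens, c, min_dist):
    # Assumed rule: greedily take the densest samples that are
    # at least min_dist away from every center chosen so far.
    order = np.argsort(-dens)
    chosen = [order[0]]
    for i in order[1:]:
        if len(chosen) == c:
            break
        if np.linalg.norm(X[chosen] - X[i], axis=1).min() >= min_dist:
            chosen.append(i)
    if len(chosen) < c:
        raise ValueError("min_dist is too large to find c separated centers")
    return X[chosen].copy()

def memberships(X, V, m):
    # Standard FCM membership: u_ij proportional to d_ij^(-2/(m-1)).
    d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2)
    inv = np.maximum(d, 1e-12) ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def weighted_fcm(X, c, m=2.0, sigma=1.0, min_dist=2.0, tol=1e-6, max_iter=200):
    dens = density(X, sigma)
    w = dens / dens.sum()              # low-density (noisy) samples get small weights
    V = init_centers(X, dens, c, min_dist)
    for _ in range(max_iter):
        U = memberships(X, V, m)
        G = w[:, None] * U ** m        # density-weighted fuzzy contributions
        V_new = (G.T @ X) / G.sum(axis=0)[:, None]
        converged = np.linalg.norm(V_new - V) < tol
        V = V_new
        if converged:
            break
    return V, memberships(X, V, m)
```

Calling `weighted_fcm(X, c=2)` on an array `X` of shape `(n_samples, n_features)` returns the cluster centers and the membership matrix. Because isolated samples have low density, they pull the centers less than they would in unweighted FCM, which is the intuition behind using density values as sample weights.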
Keywords/Search Tags:Discretization, Fuzzy Clustering, Density Function, Compatibility, Dynamic parameter adjustment