The Study Of The Weighted Average Density Self-adaptive Clustering Algorithm Based On Grid And Its Application

Posted on: 2013-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: Z He
Full Text: PDF
GTID: 2268330425959808
Subject: Electronics and Communications Engineering
Abstract/Summary:
Cluster analysis is an important topic in data mining and an active research area. It can reveal the underlying distribution of data and can also serve as a data preprocessing technique, for instance for outlier detection. It is widely used in information retrieval, trend analysis, remote sensing imagery, and other fields.
Based on an in-depth study of the problems of grid meshing and boundary point extraction in cluster analysis, this thesis proposes two concepts: the weighted average density and the self-adaptive accommodating threshold. On the basis of these two concepts, the grid meshing method, the boundary point extraction method, and the basic idea of the algorithm are improved, yielding a grid-based weighted average density self-adaptive clustering algorithm.
The main research content of this thesis is as follows.
(1) The meaning of data mining is explained, and the knowledge discovered through data mining, the functions of data mining, and the components and process of data mining systems are discussed. On this basis, the basic meaning, application requirements, and commonly used algorithms of cluster analysis are discussed. The grid meshing methods, basic ideas, advantages, and disadvantages of traditional grid clustering algorithms and of several improved grid clustering algorithms are analyzed.
(2) According to how the scale changes during grid clustering, a new classification of grid meshing is put forward that divides it into uniform grid meshing, edge-length self-adaptive grid meshing, and area self-adaptive grid meshing; the characteristics of these 3 methods are analyzed and the differences among them are compared. According to the two different kinds of density calculation used in clustering, a new classification of boundary point extraction is put forward that divides it into the window extending method and the nearest neighbor extending method; the characteristics of these 2 methods are analyzed and the differences among them are compared.
(3) The concepts of weighted average density and self-adaptive accommodating threshold are presented. On the basis of these two concepts, the grid meshing method, the boundary point extraction method, and the basic idea of the algorithm are improved. As a result, a grid-based weighted average density self-adaptive clustering algorithm is formed.
(4) After the procedure of the weighted average density self-adaptive clustering algorithm is constructed, the algorithm is evaluated in simulation experiments, including validity verification, a check of the impact of parameter changes on the clustering results, and a time performance test. For data sets with linked clusters and data sets with non-linked clusters, the clustering results of the improved algorithm and the SCI algorithm are compared, and the improved algorithm's lower sensitivity to its parameters, its high clustering accuracy, and its improved clustering validity on data sets with linked clusters are verified. Finally, the improved algorithm is applied to intrusion detection, and it is shown to cluster a network intrusion data set with high accuracy.
Finally, the content of this thesis is summarized and feasible directions for further research are discussed.
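The abstract does not give formulas for the weighted average density or the self-adaptive accommodating threshold, so the following is only a minimal sketch of the general idea of grid-based density clustering under assumed definitions, not the thesis's algorithm. It assumes uniform grid meshing in 2-D, takes a cell's weighted average density to be a blend of its own point count and the mean count of its 8 neighbouring cells, and takes the threshold to be the mean density over non-empty cells. The names `grid_cluster`, `cell_size`, and `self_weight` are hypothetical.

```python
from collections import Counter, deque
from itertools import product


def grid_cluster(points, cell_size, self_weight=0.5):
    """Cluster 2-D points on a uniform grid (illustrative sketch only)."""
    if not points:
        return []

    def cell_of(point):
        # Uniform grid meshing: map a point to an integer cell index.
        return (int(point[0] // cell_size), int(point[1] // cell_size))

    counts = Counter(cell_of(p) for p in points)

    # Offsets of the 8 cells surrounding a cell.
    offsets = [d for d in product((-1, 0, 1), repeat=2) if d != (0, 0)]

    def neighbours(cell):
        return [(cell[0] + dx, cell[1] + dy) for dx, dy in offsets]

    # ASSUMED definition of "weighted average density": a blend of the cell's
    # own count and the mean count of its neighbours (empty cells count as 0).
    def weighted_density(cell):
        mean_nb = sum(counts.get(n, 0) for n in neighbours(cell)) / len(offsets)
        return self_weight * counts[cell] + (1 - self_weight) * mean_nb

    density = {cell: weighted_density(cell) for cell in counts}

    # ASSUMED data-driven threshold: mean density over the non-empty cells.
    threshold = sum(density.values()) / len(density)
    dense = {cell for cell, d in density.items() if d >= threshold}

    # Merge adjacent dense cells into clusters with a breadth-first search.
    labels = {}
    next_label = 0
    for start in sorted(dense):
        if start in labels:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for n in neighbours(cell):
                if n in dense and n not in labels:
                    labels[n] = next_label
                    queue.append(n)
        next_label += 1

    # Points in cells below the threshold get label -1 (treated as noise here).
    return [labels.get(cell_of(p), -1) for p in points]
```

Calling `grid_cluster(points, cell_size=1.0)` returns one integer label per input point. Deriving the threshold from the data rather than fixing it by hand is meant to echo the self-adaptive idea in the abstract; the boundary point extraction step that the thesis discusses is not modelled in this sketch.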
Keywords/Search Tags: data mining, grid clustering, weighted average density, self-adaptive accommodating threshold, grid meshing, boundary point extracting, intrusion detection