Research And Application On Density Clustering Algorithm Based On Grid

Posted on: 2010-02-12
Degree: Master
Type: Thesis
Country: China
Candidate: X Bai
Full Text: PDF
GTID: 2178360272480039
Subject: Computer application technology
Abstract/Summary:
With the rapid development of information technology and database technology, the amount of data that needs to be analyzed and managed is increasing rapidly. Faced with this huge volume of data, there is an urgent need for technology that can automatically convert data into useful information and knowledge, and data mining has emerged to meet this demand. Clustering is one of the main methods in data mining, so in-depth research on how to improve its performance is of great significance.

This thesis finds that, during execution, the DBSCAN algorithm must compute similarity information between every data point and the many data points in its neighborhood. As a result, the time complexity of the algorithm is high when the amount of data is large, which limits the application of DBSCAN to some extent.

To address this problem, this thesis presents a fast grid-based DBSCAN algorithm. First, a grid and the Eps-peripheral of each unit grid are introduced into the new algorithm, and data partitions are created. Second, DBSCAN is run in every data partition to obtain local clustering results. Third, all local clustering results are merged using a merge theorem. The improved algorithm is then applied to the preprocessing of software failure data. This reduces the abnormal data points in the failure data that are harmful to parameter estimation for software reliability, thus improving the precision of software reliability prediction. Finally, experiments compare the improved algorithm with DBSCAN, and the results show that the improved algorithm is better than DBSCAN in both speed and clustering precision.
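
The three steps described in the abstract (partition, local clustering, merge) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not state the merge theorem or define the Eps-peripheral precisely, so the sketch assumes that each partition is a grid cell expanded by Eps on every side, and that two local clusters are merged whenever they share a point that is a core point in its home cell. The names `dbscan`, `grid_dbscan`, and the `cell` parameter are hypothetical, and the sketch is written for clarity, not speed.

```python
import math
from collections import defaultdict

def dbscan(points, eps, min_pts):
    """Plain O(n^2) DBSCAN. Returns (labels, core_flags); noise is -1."""
    n = len(points)
    neigh = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
             for i in range(n)]
    core = [len(nb) >= min_pts for nb in neigh]  # a point counts itself
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if not core[i] or labels[i] != -1:
            continue
        labels[i] = cluster
        stack = [i]  # only core points are ever expanded
        while stack:
            p = stack.pop()
            for q in neigh[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    if core[q]:
                        stack.append(q)
        cluster += 1
    return labels, core

def grid_dbscan(points, eps, min_pts, cell):
    """Grid-partitioned DBSCAN sketch for 2-D points (assumed merge rule)."""
    # Step 1: assign every point to a home grid cell of side `cell`.
    cells = defaultdict(list)
    for i, (x, y) in enumerate(points):
        cells[(math.floor(x / cell), math.floor(y / cell))].append(i)

    parent = {}  # union-find over (cell key, local label) pairs

    def find(k):
        parent.setdefault(k, k)
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    member = defaultdict(list)  # point index -> local clusters containing it
    is_core = [False] * len(points)
    for key, inner in cells.items():
        # Partition = cell plus an Eps-wide border (assumed Eps-peripheral).
        x0, y0 = key[0] * cell - eps, key[1] * cell - eps
        x1, y1 = x0 + cell + 2 * eps, y0 + cell + 2 * eps
        idx = [i for i, (x, y) in enumerate(points)
               if x0 <= x <= x1 and y0 <= y <= y1]
        # Step 2: local clustering inside the partition.
        labels, core = dbscan([points[i] for i in idx], eps, min_pts)
        inner_set = set(inner)
        for i, lab, c in zip(idx, labels, core):
            if lab != -1:
                member[i].append((key, lab))
            if c and i in inner_set:
                # The full Eps-neighborhood of a home-cell point lies inside
                # the partition, so its core status here is exact.
                is_core[i] = True

    # Step 3: merge local clusters that share a core point.
    for i, keys in member.items():
        if is_core[i]:
            for k in keys[1:]:
                parent[find(k)] = find(keys[0])

    ids, out = {}, []
    for i in range(len(points)):
        keys = member.get(i)
        out.append(ids.setdefault(find(keys[0]), len(ids)) if keys else -1)
    return out
```

For example, `grid_dbscan(pts, eps=0.3, min_pts=3, cell=2.0)` returns one cluster label per point, with -1 for noise. Building each partition here scans all points, so the sketch only shows the partition-and-merge logic; any speed-up would depend on an efficient grid index, which is not shown.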
Keywords/Search Tags: density clustering, grid, reliability, failure data