Research On Density Clustering Algorithm Based On Data Field

Posted on: 2014-06-01
Degree: Master
Type: Thesis
Country: China
Candidate: J Yang
GTID: 2268330401462381
Subject: Computer application technology
Abstract/Summary:
Cluster analysis is an unsupervised learning method that aims to explore the nature of data and partition it into different classes. It has become one of the main research hotspots in machine learning and data mining. Density-based clustering is an important cluster analysis tool: its idea of using new metrics and density-connectivity to determine clusters offers a new approach to cluster analysis. In recent years, researchers have proposed many density-based clustering algorithms and applied them to large-scale spatial data mining, image segmentation, microblog text analysis and other practical fields.

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a representative density-based clustering algorithm. Without knowing the number of classes in advance, DBSCAN not only finds clusters of arbitrary shape but also identifies the noise points in a dataset. However, the algorithm depends heavily on its two input parameters, Eps and MinPts, and has difficulty handling multi-density datasets.

Data field theory takes the interaction between data objects into account and gives a reasonable description of the overall distribution of a dataset. This thesis therefore combines it with DBSCAN to study density-based clustering based on data field. The main work covers the following three aspects:

1. An improved DBSCAN clustering algorithm based on data field is proposed by combining data field ideas with DBSCAN. The new algorithm is applicable to datasets that contain several densities and clusters of multiple shapes. First, data field theory is used to describe the global information of the dataset; then an average potential difference is introduced to assist Eps and MinPts in the subsequent clustering computation. It is worth noting that the new algorithm only requires the user to input the parameter MinPts; the values of the average potential difference and Eps are determined in real time from the distribution of the cluster that contains the selected core point. Finally, the clustering result is obtained through density-reachability. Experimental comparison with the K-means algorithm, the DBSCAN algorithm and the data field clustering algorithm shows that the proposed algorithm achieves better clustering results.

2. To explore the practical applicability of the proposed algorithm, this part studies how to apply it to image segmentation and examines whether the value of mi, a parameter of the data field's potential function, influences the final clustering result. Considering the importance of pixels in image display, the thesis associates the value of mi with pixels and changes it through a series of nonlinear image processing operations. This part also proposes two simple display methods so that the displayed clustering result is more in line with human visual perception. Processing several sample images and comparing the segmentation results with those of other image segmentation algorithms shows that the proposed algorithm can be applied to image segmentation and that the value of mi does affect the final result.

3. To provide users with a good interface and an intuitive comparison of algorithm performance, this part designs and implements a cluster analysis system based on data field using C#.NET, MATLAB and a SQL Server 2008 database. The system covers cluster analysis and comparative display of the results of the K-means algorithm, the DBSCAN algorithm, the data field clustering algorithm and the improved DBSCAN clustering algorithm based on data field on UCI datasets, synthetic datasets and image data.

By incorporating data field theory, this thesis provides DBSCAN with a new solution for handling multi-density datasets. It not only applies the new algorithm to image segmentation but also explores the impact of mi on the clustering results.
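
The abstract gives neither the potential function nor the rule for choosing Eps, so the following Python sketch only illustrates the two building blocks it names, under stated assumptions. It assumes the Gaussian-like potential commonly used in data field work, phi(x) = sum_i m_i * exp(-(||x - x_i|| / sigma)^2), where m_i is taken to be the mi mentioned above and sigma is a hypothetical impact factor. It also shows the standard DBSCAN core-point test with Eps and MinPts. All function names are illustrative, and the thesis's average potential difference and adaptive Eps are not reproduced here.

```python
import math


def potential(points, masses, sigma):
    """Assumed data-field potential at every point.

    phi(x_j) = sum_i m_i * exp(-(||x_j - x_i|| / sigma) ** 2)
    The sum includes the point itself, which contributes m_j.
    """
    values = []
    for xj in points:
        total = 0.0
        for xi, mi in zip(points, masses):
            dist = math.dist(xj, xi)  # Euclidean distance
            total += mi * math.exp(-((dist / sigma) ** 2))
        values.append(total)
    return values


def region_query(points, index, eps):
    """Indices of all points within distance eps of points[index]."""
    center = points[index]
    return [j for j, p in enumerate(points) if math.dist(center, p) <= eps]


def is_core_point(points, index, eps, min_pts):
    """Standard DBSCAN test: at least min_pts points in the Eps-neighbourhood."""
    return len(region_query(points, index, eps)) >= min_pts


if __name__ == "__main__":
    # Two nearby points and one isolated point (made-up illustration data).
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
    equal_masses = [1.0 / len(data)] * len(data)  # assumption: equal m_i
    print(potential(data, equal_masses, sigma=1.0))
    print([is_core_point(data, i, eps=1.5, min_pts=2) for i in range(len(data))])
```

In this sketch, points in dense regions receive a higher potential than isolated ones, which is the kind of global distribution information the abstract says is used to guide Eps and MinPts.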
Keywords/Search Tags: Density-based clustering, Data field, DBSCAN, Potential function, Image segmentation