Research On Several New Clustering Algorithms

Posted on: 2019-10-31  Degree: Master  Type: Thesis
Country: China  Candidate: J K Zhong  Full Text: PDF
GTID: 2428330572952121  Subject: Computer software and theory
Abstract/Summary:
With the rapid development of computer and Internet technology, data of all kinds is growing explosively. Extracting knowledge accurately and effectively from massive data has become a key problem. Cluster analysis is an effective data mining technique whose purpose is to uncover the internal structure hidden in a data set. In recent years more and more scholars have studied cluster analysis, but class boundary information has received too little attention. Cluster analysis is also increasingly combined with other disciplines; computational models of the biological visual system are one example, and they provide a new biological viewpoint for clustering. This thesis studies the defects of existing clustering algorithms and proposes a clustering algorithm that uses boundary information. It also applies visual scale space theory to cluster analysis and proposes a grid clustering algorithm based on the visual system. The main work consists of the following two parts:

(1) Many clustering algorithms, such as K-means, are not suitable for non-convex data sets, and the Affinity Propagation (AP) algorithm may fail to distinguish class borders. To address these shortcomings, we propose a new clustering algorithm that uses boundary information. The algorithm uses transitive clustering to expand the current set until a complete class is formed. Because of this transitive expansion, it achieves good clustering results on both convex and non-convex data sets. Boundary points describe the hidden structure of the data and are therefore very important for cluster analysis. The algorithm takes the number of data points in the neighborhood of each data point as that point's density, regards points whose density is below the average density as boundary points, and counts the boundary points. If the number of boundary points is less than a given threshold, the boundary points are used to outline the class contours, and a non-boundary point is selected arbitrarily to start the transitive clustering process; when a boundary point is met during the transfer, outward transfer stops there, which effectively prevents data points of different classes from being clustered into one class. Otherwise, the data set is too sparse: the difference between boundary points and non-boundary points is not obvious, so the border cannot be sketched accurately by boundary points, and clustering is carried out by transitive expansion directly, without distinguishing boundary points from non-boundary points. Choosing the clustering scheme according to the number of boundary points lets the algorithm obtain good clustering results on both sparse and non-sparse data sets, which expands its scope of application. Experimental results on artificial data sets and standard data sets show that the proposed algorithm is efficient.

(2) Aiming at the problem that the grid width is difficult to determine in grid clustering algorithms, we design a method to calculate the grid width and apply the scale space theory of the visual system to cluster analysis, which yields a grid clustering algorithm based on the visual system. By analyzing the time complexity and accuracy of traditional grid clustering algorithms, a reasonable grid width is determined, so that the time complexity is reduced while the accuracy is preserved. The grid width is then enlarged according to Weber's law, which reproduces the effect of continuously increasing the observation scale in visual scale space theory. Each grid width yields a clustering result, and the clustering result that occurs most frequently is taken as the final clustering result. By applying the scale space theory of the visual system to cluster analysis, the algorithm can accurately find the class structure of a data set. Experimental results on artificial data sets and standard data sets show that the grid clustering algorithm based on the visual system is effective and efficient.
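
The boundary-aware transitive clustering described in part (1) might be sketched as follows. This is an illustrative reading of the abstract, not the thesis's implementation: the function name, the neighborhood radius `eps`, the use of an absolute count for `boundary_threshold`, and the choice to leave unreached points labeled `-1` are all assumptions.

```python
from collections import deque
from math import dist


def boundary_transitive_clustering(points, eps, boundary_threshold):
    """Sketch of density-based boundary detection plus transitive expansion.

    points: list of coordinate tuples.
    eps: neighborhood radius (assumed parameter, not named in the abstract).
    boundary_threshold: boundary-point count below which boundaries are used.
    Returns (labels, is_boundary); label -1 means the point was never reached.
    """
    n = len(points)
    # Density of a point = number of other points within distance eps.
    neighbors = [
        [j for j in range(n) if j != i and dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    density = [len(nb) for nb in neighbors]
    mean_density = sum(density) / n
    # Points below the average density are treated as boundary points.
    is_boundary = [d < mean_density for d in density]
    # Few boundary points: use them as class contours. Otherwise the data is
    # considered too sparse and plain transitive clustering is used.
    use_boundary = sum(is_boundary) < boundary_threshold

    labels = [-1] * n
    cluster = 0
    for seed in range(n):
        if labels[seed] != -1 or (use_boundary and is_boundary[seed]):
            continue  # seeds must be unlabeled (and non-boundary if in use)
        labels[seed] = cluster
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            if use_boundary and is_boundary[i]:
                continue  # a boundary point joins the class but stops transfer
            for j in neighbors[i]:
                if labels[j] == -1:
                    labels[j] = cluster
                    queue.append(j)
        cluster += 1
    return labels, is_boundary
```

For example, calling it on two well-separated 5x5 unit grids with `eps=1.5` marks the outer ring of each grid as boundary points and assigns each grid its own label.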
Keywords/Search Tags: Clustering Analysis, Density, Boundary Point, Vision System, Grid Clustering
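
The multi-scale voting idea from part (2) of the abstract can be illustrated with a small sketch. The abstract does not give the grid-width formula or the exact enlargement rule, so several things here are assumptions: `initial_width` is simply passed in, Weber's law is interpreted as growing the width by a constant fraction `weber_fraction` per step, `steps` is arbitrary, and non-empty cells are merged when they are adjacent (including diagonally). All names are hypothetical.

```python
from collections import Counter, deque
from itertools import product
from math import floor


def grid_cluster(points, width):
    """Cluster points by merging adjacent non-empty grid cells of one width."""
    cells = {}
    for idx, p in enumerate(points):
        key = tuple(floor(x / width) for x in p)
        cells.setdefault(key, []).append(idx)
    offsets = list(product((-1, 0, 1), repeat=len(points[0])))
    cell_label = {}
    cluster = 0
    # Visiting cells by their smallest point index numbers clusters in order
    # of first appearance, so equal partitions get equal label tuples.
    for start in sorted(cells, key=lambda c: min(cells[c])):
        if start in cell_label:
            continue
        cell_label[start] = cluster
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for off in offsets:
                nb = tuple(a + b for a, b in zip(cell, off))
                if nb in cells and nb not in cell_label:
                    cell_label[nb] = cluster
                    queue.append(nb)
        cluster += 1
    labels = [-1] * len(points)
    for cell, members in cells.items():
        for idx in members:
            labels[idx] = cell_label[cell]
    return tuple(labels)


def multiscale_grid_clustering(points, initial_width, weber_fraction=0.1, steps=20):
    """Run grid clustering at growing widths and keep the most frequent result."""
    results = []
    width = initial_width
    for _ in range(steps):
        results.append(grid_cluster(points, width))
        # Assumed Weber-style rule: the increment is proportional to the width.
        width *= 1 + weber_fraction
    most_common, _count = Counter(results).most_common(1)[0]
    return list(most_common)
```

The design choice worth noting is that each partition is stored as a canonical label tuple, which is what makes counting "the most frequent clustering result" a simple dictionary tally.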