Research On Density Based Spectral Clustering Of Incremental Data

Posted on: 2020-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: R N Wang
Full Text: PDF
GTID: 2428330575970810
Subject: Applied Mathematics
Abstract/Summary:
Clustering analysis is an important branch of data mining and machine learning, and an effective tool for exploring the internal structure of data. Spectral clustering is a clustering approach that maps the data objects in a data set to the vertices of a graph. Based on spectral graph partitioning theory, it transforms the clustering problem into an optimal graph partitioning problem, so as to maximize the similarity within subgraphs and minimize the similarity between subgraphs. Compared with traditional clustering algorithms, spectral clustering is simpler in concept and operation, and it overcomes the tendency of traditional algorithms to fall into local optima: it can converge to the global optimal solution and can be applied to data sets of arbitrary shape. The traditional spectral clustering algorithm defines a similarity measure and computes the similarity of each pair of data objects under that measure. These similarities form the similarity matrix, which is transformed into an appropriate Laplacian matrix. Based on the eigenvalues and corresponding eigenvectors of the Laplacian matrix, one or more eigenvectors are selected for clustering.

The specific research content is as follows. First, this paper introduces the average density into the classical cut criterion, proposes a Min-max cut criterion based on the average density (MDcut), and proves its relevant properties theoretically. Since the classical Gaussian kernel function cannot fully describe the similarity relationship, a density-based spectral clustering method (DSC) is proposed by constructing a new similarity measure. The algorithm is compared with three popular clustering methods on five UCI datasets. Experimental results show that the method not only describes statistical similarity effectively but also greatly improves the clustering results.

Second, in order to deal with incremental data, this paper applies the density-based spectral clustering method to incremental data. For a static data set, it is not necessary to consider the impact of newly added objects on the density of existing data objects; for dynamic data, however, a new object can influence the density of some objects already in the data set. Therefore, on the basis of the classical Gaussian kernel function, this paper constructs a similarity measure based on density change. By considering the influence of new data objects on the eigenvalues and similarities of the original data set, a spectral clustering method based on the change of eigenvalues is proposed. Compared with two other spectral clustering methods, the experimental results show that this method can be applied to incremental data and clusters it effectively.
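
For orientation, the traditional pipeline summarized in the abstract (similarity matrix, Laplacian matrix, eigenvectors, clustering) can be sketched roughly as follows. This is a minimal illustration of classical spectral clustering only, not the thesis's DSC or MDcut method: the Gaussian kernel width `sigma`, the choice of the symmetric normalized Laplacian, the farthest-point k-means initialization, and all function names are assumptions made for this example.

```python
import numpy as np


def gaussian_similarity(X, sigma=1.0):
    """Pairwise Gaussian-kernel similarity matrix (sigma is an assumed width)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops in the similarity graph
    return W


def spectral_embedding(W, k):
    """Rows of the k eigenvectors with smallest eigenvalues of the
    symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}."""
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    U = eigvecs[:, :k]
    norms = np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    return U / norms  # row-normalize the embedding


def kmeans(U, k, n_iter=100):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = np.sum((U[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
        labels = np.argmin(d2, axis=1)
        new_centers = np.array([
            U[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def spectral_clustering(X, k, sigma=1.0):
    """Similarity matrix -> Laplacian eigenvectors -> k-means on the embedding."""
    W = gaussian_similarity(X, sigma)
    U = spectral_embedding(W, k)
    return kmeans(U, k)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic, well-separated groups of points (illustrative data only).
    X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(5.0, 0.3, (20, 2))])
    print(spectral_clustering(X, k=2))
```

A density-based variant such as the one the abstract describes would presumably replace `gaussian_similarity` with a density-aware measure, while the remaining steps could stay structurally similar; the abstract does not give that measure, so it is not reproduced here.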
Keywords/Search Tags:Spectral clustering, Density, Incremental data, Similarity degree