Research On Clustering Algorithm Based On Graph Model

Posted on:2017-04-08

Degree:Master

Type:Thesis

Country:China

Candidate:P S Niu

Full Text:PDF

GTID:2180330482987185

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of applications built on social networks, communication networks, biological networks, and similar systems, graph-model data has grown explosively over the past few years. A graph, as a data structure, has its own representation and carries its own information, and a single graph model may contain hundreds to millions of vertices. The information attached to these vertices and their connecting edges has different meanings in different fields, and as graphs grow in size, efficiently analyzing that information and extracting what is useful has become a mainstream research direction. As an important tool in machine learning, cluster analysis has been widely applied to text mining, bioinformatics, pattern recognition, and other research fields. With the widespread use of graph-model data, graph clustering has become an important class of clustering methods and one of the most powerful techniques for graph data analysis.

The similarity matrix is usually constructed from the distance between nodes. However, two nodes may be connected by multiple paths of equal length, as well as by the K shortest paths, and the relationships among these paths influence the similarity of the nodes. Taking these path relationships into account therefore gives a better-balanced measure of node similarity than distance alone. To address this shortcoming, this paper proposes DRGC, a graph clustering algorithm based on top-k shortest paths. The algorithm adopts some ideas from spectral clustering and uses the top-k shortest paths between vertices to build the similarity matrix. Instead of eigendecomposition, it uses an auto-encoder to reconstruct the data, which greatly reduces the time cost. Finally, it uses a non-parametric Bayesian model to perform the clustering.
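The idea of building similarity from several short paths rather than from a single distance can be illustrated with a minimal sketch. The abstract does not state how DRGC combines the top-k paths, so the scoring rule used here (the sum of 1/length over the paths found) is purely an assumption, and the function names `k_shortest_paths` and `topk_similarity` are hypothetical. The path search is a brute-force enumeration suitable only for small unweighted graphs.

```python
from collections import deque


def k_shortest_paths(adj, source, target, k):
    """Return up to k shortest simple paths (vertex lists) in an unweighted graph.

    Partial paths are expanded breadth-first, so complete paths are found in
    non-decreasing order of length. Worst-case cost is exponential; this is
    for small illustrative graphs only.
    """
    found = []
    queue = deque([[source]])
    while queue and len(found) < k:
        path = queue.popleft()
        last = path[-1]
        if last == target:
            found.append(path)
            continue
        for nxt in adj[last]:
            if nxt not in path:  # keep the path simple (no repeated vertex)
                queue.append(path + [nxt])
    return found


def topk_similarity(adj, k=3):
    """Assumed scoring rule (not from the thesis): sum of 1/length over top-k paths."""
    nodes = sorted(adj)
    sim = {u: {v: 0.0 for v in nodes} for u in nodes}
    for i, u in enumerate(nodes):
        sim[u][u] = 1.0
        for v in nodes[i + 1:]:
            paths = k_shortest_paths(adj, u, v, k)
            score = sum(1.0 / (len(p) - 1) for p in paths)  # len(p) - 1 = edge count
            sim[u][v] = sim[v][u] = score
    return sim


# Toy graph: a 4-cycle a-b-c-d with a pendant vertex e attached to c.
adj = {
    "a": ["b", "d"],
    "b": ["a", "c"],
    "c": ["b", "d", "e"],
    "d": ["a", "c"],
    "e": ["c"],
}
sim = topk_similarity(adj, k=3)
```

In this toy graph, a and c are joined by two paths of length 2, so under the assumed rule their similarity is 0.5 + 0.5 = 1.0, whereas a score based on the single shortest distance would give 0.5. This is the kind of equal-length-path effect the abstract describes.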
Because the Dirichlet process has good clustering properties and partitions data automatically, this algorithm can produce a reasonable partition of a data set without a predefined number of clusters.

To overcome the sensitivity of any single clustering algorithm to the data set, this paper also proposes a clustering ensemble algorithm based on the Majority Voting Rule. It uses DRGC, k-means, and spectral clustering as base algorithms and takes the clustering result with the highest modularity as the base labeling. The cluster labels of the different base algorithms are unified by analyzing the relationships between them, and the final clustering result is given by the Majority Voting Rule. Finally, simulation experiments were carried out for the two proposed algorithms; the results show that both achieve good clustering performance and produce more accurate partitions.
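The ensemble step can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the base labelings are hand-written stand-ins for the outputs of DRGC, k-means, and spectral clustering, and the label-unification rule (map each cluster to the reference label it overlaps most) is an assumed stand-in for the relationship analysis the abstract mentions. The function names are hypothetical; the modularity formula is the standard one for undirected, unweighted graphs.

```python
from collections import Counter


def modularity(edges, labels):
    """Modularity Q = sum over clusters of (L_c / m) - (d_c / (2m))^2."""
    m = len(edges)
    intra = Counter()   # L_c: edges with both endpoints in cluster c
    degree = Counter()  # d_c: total degree of the vertices in cluster c
    for u, v in edges:
        degree[labels[u]] += 1
        degree[labels[v]] += 1
        if labels[u] == labels[v]:
            intra[labels[u]] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def align_labels(reference, labels):
    """Assumed rule: rename each cluster to the reference label it overlaps most."""
    overlap = Counter(zip(labels, reference))
    mapping = {}
    for (lab, ref), _count in overlap.most_common():
        mapping.setdefault(lab, ref)  # first hit per cluster is its largest overlap
    return [mapping[lab] for lab in labels]


def majority_vote(labelings):
    """Per-vertex majority label across the aligned labelings."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*labelings)]


def ensemble(edges, labelings):
    """Pick the highest-modularity labeling as reference, align the rest, then vote."""
    reference = max(labelings, key=lambda lab: modularity(edges, lab))
    aligned = [align_labels(reference, lab) for lab in labelings]
    return majority_vote(aligned)


# Toy graph: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
base_results = [
    [0, 0, 0, 1, 1, 1],  # stand-in for one base algorithm
    [1, 1, 1, 0, 0, 0],  # same partition, different label names
    [0, 0, 1, 1, 1, 1],  # vertex 2 placed in the wrong cluster
]
final = ensemble(edges, base_results)
```

Here the first labeling has the highest modularity and serves as the reference; after alignment the second labeling matches it exactly, and the vote outvotes the third labeling's placement of vertex 2, so `final` is `[0, 0, 0, 1, 1, 1]`.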

Keywords/Search Tags:

Graph clustering, Top-k shortest path, Clustering ensemble

Related items

1	A Fast Clustering Method For Large Single-cell RNA-seq Data Based On Spectral Clustering
2	Research On Ensemble Clustering Algorithm Based On Three-way Decisions
3	Research On Clustering Algorithm Based On Graph Structure
4	Quantum Clustering Algorithm Based On Manifold Distance And Its Applications
5	Research On Large Scale Graph Clustering Optimization Algorithm
6	Research On Attributed Graph Clustering Based On Graph Representation Learning
7	Research Of Graph Representation Clustering Based On Sparse Subspace
8	High-throughput Analysis Of Biomolecular Data Using Multiple Hierarchical Consensus Clustering
9	Research On Connectivity Enhancement Strategies For Affinity Graph In Sparse Subspace Clustering
10	Design And Implementation Of Streamline Selection System Based On Unsupervised Learning