Research On Fast Graph Clustering Algorithm On Large-Scale Data

Posted on:2022-02-04

Degree:Master

Type:Thesis

Country:China

Candidate:F Z You

Full Text:PDF

GTID:2518306509970179

Subject:Computer Science and Technology

Abstract/Summary:

As an important technique in data mining, cluster analysis has been applied in many fields. Spectral clustering, a representative graph clustering algorithm, is increasingly widely used. As data sets keep growing in scale, traditional methods become difficult to apply to large amounts of data. This thesis focuses on how to accelerate graph clustering on large-scale data sets, and systematically studies methods for selecting key nodes and algorithms for accelerating graph clustering on such data. The main research contents are as follows:

(1) To improve the usability of spectral clustering on large-scale data sets, a fast graph clustering algorithm based on the selection of key nodes is proposed. The algorithm consists of three steps: a fast node weight evaluation method is established based on the compactness and separation of clusters; the key nodes are selected to replace the original data set in constructing a bipartite graph, and approximate eigenvectors of the data are obtained by singular value decomposition; multiple approximate eigenvectors are integrated to improve the robustness of the approximate spectral clustering results. The new algorithm is compared experimentally with other representative spectral clustering algorithms on benchmark data sets. The results show that it identifies complex class structures in data more efficiently than the other clustering algorithms.

(2) To further improve both the accuracy and the running speed of graph clustering on large-scale data sets, a fast graph clustering algorithm based on an improved bipartite graph is proposed. Building on the original algorithm, it selects key nodes again while constructing the bipartite graph, which reduces the size of the matrix required for singular value decomposition. When the size of the data set is n × n, the size of the matrix passed to singular value decomposition is reduced from d × n to d × m. Experimental comparison with the bipartite graph-based clustering algorithm shows that the new algorithm improves the computation speed while maintaining the clustering accuracy.

(3) To demonstrate graph clustering algorithms on large-scale data, a fast graph clustering system for large-scale data is designed. The system includes modules for data import, algorithm parameter setting, result display, and others. It integrates several of the algorithms discussed in this thesis and clearly shows the effectiveness of different algorithms on different data sets.

The research results of this thesis enrich clustering research on large-scale data sets and open up further possibilities for clustering algorithm research in the era of big data.
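The bipartite-graph step described in item (1) can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: key nodes are drawn uniformly at random here instead of by the weight evaluation the abstract mentions, the Gaussian kernel and its `sigma` parameter are assumptions, and the ensemble step over multiple approximate eigenvectors is omitted. The function names (`approx_spectral_clustering`, `kmeans`) are hypothetical.

```python
import numpy as np

def kmeans(Z, k, iters=50):
    """Tiny Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [Z[0]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen center.
        d = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

def approx_spectral_clustering(X, m, k, sigma=1.0, seed=0):
    """Cluster the n rows of X through a bipartite graph to m key nodes."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Assumption: uniform random key nodes stand in for weighted selection.
    key_idx = rng.choice(n, size=m, replace=False)
    keys = X[key_idx]
    # n x m Gaussian affinities between all points and the key nodes.
    sq_dist = ((X[:, None, :] - keys[None, :, :]) ** 2).sum(axis=2)
    B = np.exp(-sq_dist / (2.0 * sigma ** 2))
    # Degree-normalise both sides of the bipartite graph.
    row_deg = B.sum(axis=1)
    col_deg = B.sum(axis=0)
    B_norm = B / np.sqrt(row_deg)[:, None] / np.sqrt(col_deg)[None, :]
    # Left singular vectors approximate the spectral embedding of the points.
    U, _, _ = np.linalg.svd(B_norm, full_matrices=False)
    emb = U[:, :k]
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return kmeans(emb, k)
```

The point of the construction is cost: the singular value decomposition runs on an n × m matrix with m much smaller than n, so no n × n affinity matrix is ever built.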

Keywords/Search Tags:

Cluster analysis, Graph clustering, Bipartite graph, Cluster ensemble, Spectral clustering

Related items

1	Research On Key Technologies Of Co-Cluster And Co-Clustering Ensemble
2	Categorical Relation Graph Construction And Clustering Analysis For Categorical Data
3	Research On The Effectiveness Element Theory And Method Of Clustering Ensemble
4	Study Of Cluster Ensemble Methods Based On Hierarchical Clustering
5	The Research Of Clustering And Ensemble Clustering Based On Cluster-Mode
6	Research On Weighted Cluster Ensemble Algorithm Based On Validity Evaluation
7	Research On Ensemble Clustering
8	Research And Application Of Bidding Data Mining Based On Graph Clustering
9	Research On Dynamic Graph Clustering Algorithm
10	Construction Of User Clustering Movie Recommendation System Based On Bipartite Graph Networks