Study Of Label Propagation Clustering Algorithm Based On Data Features

Posted on:2020-01-19

Degree:Master

Type:Thesis

Country:China

Candidate:X Zhang

Full Text:PDF

GTID:2428330596487359

Subject:Engineering/Computer Technology

Abstract/Summary:

Driven by the global wave of informatization, structured and semi-structured data of many kinds have accumulated over time. Data mining is a tool for extracting the valuable patterns contained in these massive and complex data, and cluster analysis, being unsupervised, has become an important research direction within it. Taking the definition of clustering, the clustering process, and clustering evaluation indices as its starting point, this thesis analyzes the advantages and disadvantages of different types of classical algorithms and proposes a new clustering approach based on the idea of label propagation.

The label propagation algorithm is an efficient and simple graph-based semi-supervised learning method, but it requires some label information to be supplied as an initial parameter, which reduces its adaptability. Therefore, building on the idea of label propagation, this thesis proposes a label propagation algorithm based on data point density (NDLP) and a label propagation algorithm based on data point importance (NILP). The NDLP algorithm determines the initial label information by measuring the density of the data points, then aggregates and iteratively updates labels starting from those initial labels, thereby completing the clustering. The NILP algorithm first determines the initial label points according to data point density and then adds labels according to the importance of the data points. During label transfer, label update rules are formulated according to data point importance, and the clustering task is then completed.

The NDLP algorithm was evaluated on four synthetic data sets and two real data sets, with Normalized Mutual Information and Adjusted Rand Index selected as the clustering quality criteria. Compared with four classical clustering algorithms, NDLP shows clear advantages on these evaluation indices. The NILP algorithm was first validated on the same experimental data sets and against the same comparison algorithms as NDLP, and was then tested on four synthetic data sets containing circular clusters, using the same clustering evaluation indices as the original algorithm. The results show that NILP is more accurate and more efficient than the original algorithm in these experiments.
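The general idea of seeding label propagation from dense points, rather than from user-supplied labels, can be illustrated with a minimal Python sketch. This is not the NDLP algorithm from the thesis, whose exact density measure and update rules are not given in the abstract. The function name, the radius `eps`, the seed count `n_seeds`, the neighbour-counting density, and the majority-vote update are all assumptions chosen for illustration.

```python
from collections import Counter
from math import dist


def density_seeded_label_propagation(points, eps, n_seeds, max_iter=100):
    """Cluster points by propagating labels from high-density seeds.

    Illustrative only: eps, n_seeds and the majority-vote update are
    assumptions, not the rules defined in the thesis.
    """
    n = len(points)
    # eps-neighbourhood graph: neighbours[i] lists the points within eps of i.
    neighbours = [
        [j for j in range(n) if j != i and dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # Assumed density measure: the number of eps-neighbours.
    density = [len(nb) for nb in neighbours]

    # Greedy seed selection: visit the densest points first and skip any
    # candidate that is a neighbour of an already chosen seed.
    labels = [None] * n
    seeds = []
    for i in sorted(range(n), key=lambda idx: -density[idx]):
        if len(seeds) == n_seeds:
            break
        if all(i not in neighbours[s] for s in seeds):
            labels[i] = len(seeds)
            seeds.append(i)

    # Synchronous propagation: every non-seed point adopts the most common
    # label among its currently labelled neighbours; seeds keep their label.
    for _ in range(max_iter):
        new_labels = list(labels)
        for i in range(n):
            if i in seeds:
                continue
            votes = Counter(
                labels[j] for j in neighbours[i] if labels[j] is not None
            )
            if votes:
                new_labels[i] = votes.most_common(1)[0][0]
        if new_labels == labels:  # no label changed, so stop
            break
        labels = new_labels
    return labels


# Two well-separated square blobs as a toy data set.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels = density_seeded_label_propagation(points, eps=1.5, n_seeds=2)
```

On this toy input each blob receives a single label. Points that no labelled point can reach through the neighbourhood graph keep the label `None`, so a practical version would need a rule for such outliers.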

Keywords/Search Tags:

cluster analysis, label propagation, data point density, data point importance

Related items

1	Research And Application Of Fuzzy Clustering Segmentation Of 3D Point Cloud Data
2	Research On Simplification Method Of 3D Point Cloud Based On The Importance Of Point
3	The Research About Partition-based And Density-based Clustering Algorithm
4	Change Point Detection Method Study Based On Clustering Analysis
5	Lidar Point Cloud Data Point Mapping Method
6	Lidar Point Cloud Data Processing And 3D Terrain Registration Method Research
7	Research On Point Cloud Data Processing Technology In 3D Reconstruction
8	Analysis And Design Of High-performance Floating-Point Unit
9	For The Non-equilibrium Hybrid Data Classification And Its Application
10	Research On The Modeling And Analysis Of The Three-dimensional Point Cloud Data Of Trees