Research And Implementation Of Clustering Algorithms

Posted on: 2020-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Cai
GTID: 2428330620955839
Subject: Electronic and communication engineering
Abstract/Summary:
Clustering is one of the central research topics in data mining and pattern recognition, and it is widely used for data analysis and processing across many industries. Over the years, researchers have proposed a variety of clustering algorithms for different application needs, each with its own advantages and limitations. The main task of this thesis is to propose improvement schemes for the K-Means, PAM and DBSCAN algorithms. Multiple data sets are used to test and evaluate the effect of each improvement. The main work of this thesis is as follows:

Research and implementation of an improved K-Means clustering algorithm. To address the unstable clustering results of K-Means and its strong dependence on the initial centers, an improved K-Means algorithm based on the minimum spanning tree (MST) is designed and implemented. First, the MST of the data is constructed. After the k-1 longest edges are cut, the center of each of the k resulting clusters is calculated and used as an initial center. Experimental results show that the stability of the algorithm is greatly improved.

Research and implementation of an improved PAM clustering algorithm. To reduce the high time complexity of the PAM algorithm, an improved PAM algorithm that stores nearest distances is designed and implemented. Storing the nearest distance of each non-central point simplifies the process of updating the clusters. Experiments show that the improved algorithm is faster than the original PAM algorithm on the same data sets.

Research and implementation of an improved DBSCAN clustering algorithm. To address the poor clustering ability of DBSCAN on data sets with multi-density distributions, an improved algorithm that uses multiple parameters and merges adjacent clusters is designed and implemented. Starting from the VDBSCAN clustering results, clusters separated by small distances are merged. Experiments show that the improved DBSCAN algorithm not only retains the ability to recognize clusters with multi-density distributions, but also improves accuracy.

Research and implementation of the K-Means clustering algorithm on hardware. First, the K-Means algorithm is implemented through hardware-software cooperation. Experimental results show that this implementation does improve speed, but at the same time the occupancy of hardware resources increases greatly. Then, because K-Means spends excessive time on distance calculations, a clustering algorithm based on the k-d tree is adopted to reduce the number of distance calculations. The parts of the algorithm that the hardware does not support are rewritten and adjusted, and the algorithm is further optimized according to the characteristics of the hardware. Experimental results show that the improved method effectively increases the speed of the algorithm while avoiding high resource occupation.
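
The MST-based initialization described for the improved K-Means algorithm can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes Euclidean distance, builds the tree with Prim's algorithm on the complete graph, and takes the arithmetic mean of each resulting component as its center. The function name `mst_initial_centers` is hypothetical.

```python
import math


def mst_initial_centers(points, k):
    """Return k initial centers by cutting the k-1 longest MST edges.

    Assumptions (not taken from the thesis): Euclidean distance,
    Prim's algorithm, and the mean of each component as its center.
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")

    # Prim's algorithm on the complete graph, O(n^2) time.
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge from the tree to each point
    parent = [-1] * n       # tree endpoint of that cheapest edge
    best[0] = 0.0
    edges = []              # MST edges as (weight, u, v)
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u

    # Drop the k-1 longest edges; the remaining edges form k components.
    edges.sort()
    kept = edges[: len(edges) - (k - 1)]

    # Union-find to label the connected components.
    root = list(range(n))

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]  # path halving
            i = root[i]
        return i

    for _, u, v in kept:
        root[find(u)] = find(v)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(points[i])

    # The center of each component is the coordinate-wise mean of its points.
    return [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups.values()]
```

The returned centers would then replace the random initial centers of a standard K-Means run. Because the MST and the edge cut are deterministic for a given data set, repeated runs start from the same centers, which is consistent with the stability improvement the abstract reports.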
Keywords/Search Tags:Clustering Algorithm, K-Means, PAM, DBSCAN, FPGA