
Research And Application Of An Improved BIRCH Algorithm Based On Link

Posted on: 2020-06-22
Degree: Master
Type: Thesis
Country: China
Candidate: J W Chen
GTID: 2428330575978897
Subject: Computer application technology
Abstract/Summary:
Over the past decades, with the rapid development of information and Internet technology, the number of network users has kept expanding. From 2G and 3G to the 4G and 5G era, network transmission has become faster and faster. These advances have changed the way we learn, work and live, and they have been accompanied by explosive growth in data volume. How to process and analyze such massive data is a valuable research problem today.

As the saying goes, "birds of a feather flock together". Clustering is an important method of data analysis and a key technique in data mining: by analyzing the similarity between data points, it finds groups of similar items and partitions the data accordingly. Clustering is an unsupervised learning method, meaning that it requires input data but no labeled results, and it attempts to explore and discover patterns in the data. Clustering is widely used in many fields, such as user purchase pattern analysis, image color segmentation and pattern recognition. It is also often used for data preprocessing in other machine learning and data mining algorithms.

This thesis introduces an improved hierarchical clustering model, a link-based BIRCH clustering algorithm. By borrowing the link concept from the ROCK clustering algorithm, it effectively addresses a weakness of BIRCH, namely that it cannot cluster non-spherical clusters. First, BIRCH is run with a small threshold to construct micro-clusters and identify outliers. Second, the neighbors of each micro-cluster are determined according to specified rules, producing a neighbor table. Finally, micro-clusters are merged: for each cluster in turn, the cluster with the largest rationality measure with respect to it is found, and the two are merged into one cluster, provided that the measure exceeds the rationality threshold. If no such cluster is found, the cluster is left unchanged.

Based on this model, I first validate the correctness and effectiveness of the improved algorithm on classical data sets with class labels and on manually generated three-dimensional data sets, and I show part of the clustering results. After confirming the validity of the model, the improved BIRCH algorithm is applied to an image classification model to cluster image features, and the construction process and experimental results of the classification model are analyzed in detail. As an unsupervised learning method, clustering has no so-called optimal solution, but a reasonable clustering can help us analyze the characteristics of the data and find meaningful patterns. It is therefore worthwhile to choose an appropriate clustering algorithm for each data set and to analyze the results in the context of the application.
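The merging stage described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: it assumes the micro-cluster centroids have already been produced by BIRCH with a small threshold, it uses a hypothetical neighbor rule (centroid distance at most `eps`), and it substitutes a simple average-link score for the rationality measure, whose exact definition the abstract does not give. The names `neighbor_table`, `link`, `measure`, `merge_micro_clusters`, `eps` and `min_measure` are all illustrative.

```python
from itertools import combinations
from math import dist  # Python 3.8+


def neighbor_table(centroids, eps):
    """Map each micro-cluster index to the indices within distance eps.

    Assumed neighbor rule; each micro-cluster counts as its own neighbor.
    """
    table = {i: {i} for i in range(len(centroids))}
    for i, j in combinations(range(len(centroids)), 2):
        if dist(centroids[i], centroids[j]) <= eps:
            table[i].add(j)
            table[j].add(i)
    return table


def link(table, i, j):
    """ROCK-style link: number of neighbors shared by micro-clusters i and j."""
    return len(table[i] & table[j])


def measure(table, a, b):
    """Assumed merge score: mean link count over all cross pairs of a and b."""
    total = sum(link(table, i, j) for i in a for j in b)
    return total / (len(a) * len(b))


def merge_micro_clusters(centroids, eps, min_measure):
    """Merge each cluster with its best-scoring partner while the score
    exceeds min_measure; clusters with no such partner are left alone."""
    table = neighbor_table(centroids, eps)
    clusters = [frozenset([i]) for i in range(len(centroids))]
    merged = True
    while merged:
        merged = False
        for a in clusters:
            candidates = [b for b in clusters if b != a]
            if not candidates:
                break
            best = max(candidates, key=lambda b: measure(table, a, b))
            if measure(table, a, best) > min_measure:
                clusters = [c for c in clusters if c != a and c != best]
                clusters.append(a | best)
                merged = True
                break  # restart the scan over the updated cluster list
    return sorted(sorted(c) for c in clusters)


# Four centroids along a line (an elongated, non-spherical shape)
# plus a distant pair.
centroids = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 10), (11, 10)]
print(merge_micro_clusters(centroids, eps=1.5, min_measure=0.5))
```

Because links count shared neighbors rather than distance to a single center, the chain of four micro-clusters can be joined step by step even though its two ends are far apart, which is the behavior that lets a link-based merge follow non-spherical shapes. Raising `min_measure` makes merging more conservative.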
Keywords/Search Tags: BIRCH algorithm, Link, Clustering algorithm