
Research And Application Of Improved CHAMELEON Algorithm Based On Condensed Hierarchical Clustering Method

Posted on: 2021-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: J J Zhao
Full Text: PDF
GTID: 2428330629952705
Subject: Computer application technology
Abstract/Summary:
In recent years, Internet technology and the information industry have developed continuously, and the term "big data" has attracted growing attention and is now widely used to describe the massive data of the information age. Over time, people have come to recognize the importance of massive data. In the era of big data, the first challenge is how to process massive amounts of data and how to discover the potential information hidden in them. This has become an urgent problem that is well worth studying.

Cluster analysis is a commonly used method for processing large amounts of data, and it makes it possible to extract meaningful information from massive data. Clustering calculates the similarity between data objects and aggregates data points with high similarity into the same group, which is called a cluster. Data within the same cluster are highly similar, while data in different clusters have low similarity. Clustering groups data objects in an unsupervised manner. Cluster analysis has been widely and deeply applied in machine learning, data mining, pattern classification, and image segmentation.

This thesis introduces in detail a clustering model that improves the CHAMELEON algorithm with an agglomerative hierarchical clustering method. The CHAMELEON clustering algorithm clusters by dynamic modeling, and the clusters it generates can be high-quality clusters of different sizes, shapes, and densities. Despite these advantages, the CHAMELEON algorithm still has shortcomings. When the k-nearest neighbor graph G_k is constructed, the value of k is difficult to choose and has a great impact on the results. In addition, hMetis, the multi-level partitioning algorithm for large hypergraphs that is used to divide G_k, produces only a rough partition, which can easily lead to uncertainty in the results. To address these shortcomings, this thesis draws on the basic idea of the AGNES algorithm, proposes a new agglomerative hierarchical clustering method, and uses it in place of the traditional hMetis algorithm to divide G_k and generate clusters. The experiments show that the clustering effect is good.

To evaluate the proposed model, classical labeled artificial data sets and public data sets from UCI were first used to verify the effectiveness of the improved algorithm. To make the effectiveness of the improved model visible from the clustering effect, the clustering results on the low-dimensional data sets are displayed as plots. After the validity of the model had been verified, the improved CHAMELEON algorithm was applied to the analysis of students' extracurricular practice data. It was used to discover the potential of students in extracurricular practice, to analyze the experimental results, and to identify and study the factors affecting students' extracurricular practice performance.
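The two building blocks named in the abstract, a k-nearest neighbor graph G_k and AGNES-style bottom-up merging, can be illustrated with a minimal Python sketch. This is not the thesis's method: the function names `knn_graph` and `agglomerate`, the symmetrised graph, and the single-link merge criterion are all assumptions chosen for brevity, whereas CHAMELEON itself merges on relative interconnectivity and relative closeness.

```python
import math

def knn_graph(points, k):
    """Build a symmetrised k-nearest neighbor graph (hypothetical helper).

    Returns a dict mapping each point index to the set of indices it
    shares an edge with.
    """
    n = len(points)
    graph = {i: set() for i in range(n)}
    for i in range(n):
        # Sort all other points by Euclidean distance to point i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        for j in others[:k]:
            graph[i].add(j)
            graph[j].add(i)
    return graph

def agglomerate(points, graph, n_clusters):
    """AGNES-style merging restricted to clusters joined by a graph edge.

    Assumption: the merge criterion is the shortest connecting edge
    (single link), a simplification used only for illustration.
    """
    clusters = [{i} for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dists = [
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                    if j in graph[i]
                ]
                if dists and (best is None or min(dists) < best[0]):
                    best = (min(dists), a, b)
        if best is None:
            break  # remaining clusters are not connected in the graph
        _, a, b = best
        clusters[a] |= clusters[b]
        del clusters[b]
    return clusters

# Two well-separated groups of three points each.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
result = agglomerate(pts, knn_graph(pts, k=2), n_clusters=2)
print(sorted(sorted(c) for c in result))  # [[0, 1, 2], [3, 4, 5]]
```

Changing `k` changes which edges exist and therefore which merges are possible, which is one way to see why the abstract describes the choice of k as having a large impact on the result.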
Keywords/Search Tags: CHAMELEON algorithm, AGNES algorithm, cluster analysis, extracurricular practice