
Research On Clustering Algorithm Based On Filtering Mechanism

Posted on: 2021-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: W X Wang
Full Text: PDF
GTID: 2428330602973928
Subject: Software engineering
Abstract/Summary:
Cluster analysis is widely used in fields such as biological engineering, business drainage, financial investment, medical image research, and user analysis. It is an unsupervised classification method: a machine learning method with no training set and no training process. A clustering analysis should ensure, as far as possible, that data objects in the same category have high similarity while data objects in different categories have low similarity. This thesis addresses three problems of existing clustering algorithms: low clustering accuracy, sensitivity of the clustering results to parameters, and poor effectiveness on high-dimensional data. The main innovations of this thesis are as follows:

(1) A clustering algorithm for mixed-attribute data (MC-FM), based on mean shift theory and a filtering mechanism, is proposed. The algorithm uses an improved mixed-attribute similarity to measure the similarity between objects, then uses the k nearest neighbors and the local mean shift of each object, together with the filtering mechanism, to distinguish core objects from non-core objects. Finally, the non-core objects are assigned to the corresponding clusters to form the final clustering result. Experiments were run on synthetic data sets and UCI data sets, and the results verify the effectiveness of the algorithm. Compared with similar algorithms, MC-FM achieves higher clustering accuracy.

(2) A clustering algorithm based on density and mutual k-nearest neighbors (MkNN), called MO-MkNN, is proposed. The algorithm uses the number of mutual neighbors among objects to obtain a filtering factor, and distinguishes core objects from non-core objects according to that factor. The core objects are searched breadth-first along their neighbor relationships to obtain the clustering prototype, and the remaining objects are then assigned according to their k nearest neighbors to form the final clustering result. Experiments were run on synthetic data sets and UCI data sets, and the results verify the effectiveness of the algorithm. Compared with similar algorithms, MO-MkNN achieves higher clustering accuracy.
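
The three stages described for the second algorithm (filtering factor, breadth-first search over core objects, assignment of the remaining objects) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The function name `mutual_knn_clusters`, the Euclidean distance, the threshold parameter `min_mutual`, and the rule that assigns each non-core object to the cluster of its nearest core object are all assumptions made for illustration; the abstract does not specify how the filtering factor is thresholded or exactly how the remaining objects are assigned.

```python
from collections import deque

import numpy as np


def mutual_knn_clusters(X, k=5, min_mutual=3):
    """Illustrative mutual-kNN clustering sketch (assumed details, see lead-in).

    Returns (labels, core): labels[i] is the cluster index of object i
    (-1 if no core object exists at all), core[i] marks core objects.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)

    # Pairwise Euclidean distances; an object is never its own neighbor.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)
    order = np.argsort(dist, axis=1)

    # is_nn[i, j] is True when j is among the k nearest neighbors of i.
    is_nn = np.zeros((n, n), dtype=bool)
    is_nn[np.arange(n)[:, None], order[:, :k]] = True

    # Mutual neighbors: i and j each list the other among their k nearest.
    mutual = is_nn & is_nn.T

    # Assumed filtering factor: the number of mutual neighbors of each object.
    factor = mutual.sum(axis=1)
    core = factor >= min_mutual  # hypothetical threshold rule

    # Breadth-first search over core objects linked by mutual-neighbor edges.
    labels = np.full(n, -1)
    cluster = 0
    for start in range(n):
        if not core[start] or labels[start] != -1:
            continue
        labels[start] = cluster
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in np.flatnonzero(mutual[i] & core):
                if labels[j] == -1:
                    labels[j] = cluster
                    queue.append(j)
        cluster += 1

    # Simplified assignment: each non-core object joins the cluster of its
    # nearest core object (a stand-in for the k-nearest-neighbor step).
    for i in np.flatnonzero(~core):
        for j in order[i]:
            if core[j]:
                labels[i] = labels[j]
                break

    return labels, core
```

Calling `mutual_knn_clusters(X, k, min_mutual)` on an array of shape `(n_objects, n_features)` returns one label per object together with the core-object mask. The sketch computes the full distance matrix, so it is suitable only for small data sets.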
Keywords/Search Tags: clustering algorithm, density, high-dimensional data, mixed attributes, filter