Research On Fuzzy Kernel Clustering Algorithm Driven By Viewpoint

Posted on:2021-01-15

Degree:Master

Type:Thesis

Country:China

Candidate:X C Song

GTID:2428330614460393

Subject:Computer application technology

Abstract/Summary:

In the era of big data, large volumes of data are stored and analyzed to extract useful information. Raw data is often disorganized, however, so researchers need to partition it into groups of similar items. This process is called clustering: data is grouped according to some criterion, and each resulting group is called a cluster or class. The goal is to make data in different clusters as dissimilar as possible and data in the same cluster as similar as possible. In general, the similarity between data points is computed with a chosen measure, such as Euclidean distance, Mahalanobis distance, or cosine distance.

Clustering problems arise in many fields. As informatization deepens across society and intelligent technologies are applied at scale, cluster analysis has become a reliable and widely used technique in data mining, machine learning, and related areas. Many clustering methods have been proposed, each with its own advantages and disadvantages and each suited to different types of data. Comparative analysis of clustering methods is currently an important research topic, because it makes the clustering performance of each method clear. Within this area, fuzzy clustering is an important branch.

Existing fuzzy clustering algorithms still produce unsatisfactory results for several reasons. First, cluster centers are usually initialized randomly, which makes the clustering results unstable and uncertain. Second, the iteration is not effectively controlled, so the computation is sensitive to the initial cluster centers, which further reduces the stability of the results. More importantly, noise points and outliers in the data greatly reduce the effectiveness of clustering. In addition, as information collection capabilities improve, a single data source can often be observed through different means and from different perspectives, producing so-called multi-view data. Traditional clustering algorithms cannot distinguish the differences between views, so they struggle to obtain good results on this type of data.

This dissertation mainly carried out the following work:

1) The development of fuzzy clustering algorithms is reviewed, from the original FCM, to KFCM, which introduces a kernel function, to the viewpoint-based fuzzy clustering algorithm V-FCM and the feature-weighted fuzzy clustering algorithm EWFCM, and finally to DVPFCM, which is based on a density viewpoint. The central idea and operating mechanism of each algorithm are described, the objective function of each is analyzed in detail, and the characteristics and deficiencies of each are pointed out.

2) To address the problems in the previous algorithms, the concept of a "viewpoint" is introduced so that the cluster centers can be controlled while the algorithm runs, avoiding interference from noise points and thereby improving the effectiveness of clustering. Building on improvements to existing algorithms, a viewpoint-driven, kernel-based weighted fuzzy clustering algorithm (DVWEKFCM) is proposed. The algorithm determines the initial cluster centers by searching for density peaks, and it introduces a kernel function into the computation of data point density so that the initial centers are determined more accurately. To handle high-dimensional data, feature weights adjust how much the feature attributes of each dimension contribute to the clustering process, and the weights of irrelevant attributes are reduced as much as possible. Extensive experiments show that the algorithm determines the initial cluster centers faster and reduces the interference caused by noise points, with an obvious advantage on higher-dimensional data.

3) To improve the clustering of multi-view data, a weighted kernel fuzzy clustering algorithm for visible and hidden views (MV-Co-KFCM) is proposed. The algorithm first uses non-negative matrix factorization to extract the hidden information shared by several different visible views; this shared information is called the shared hidden view. The visible views and the hidden view are then substituted into the objective function. To coordinate the different views, each view is assigned a weight. Experiments show that on multi-view data sets the proposed MV-Co-KFCM algorithm achieves better results than traditional single-view algorithms.
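As a rough illustration of the kernel fuzzy clustering family that the abstract builds on, the sketch below implements a commonly used form of KFCM with a Gaussian kernel. It is not the thesis's DVWEKFCM or MV-Co-KFCM: it has no viewpoint, no feature weights, and no density-peak initialization, and the initial centers are simply passed in by the caller. The function names, the kernel width `sigma`, the fuzzifier `m`, and the toy data are all assumptions made for this example.

```python
import math


def gaussian_kernel(x, v, sigma):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / sigma ** 2)


def memberships(points, centers, m, sigma, eps=1e-12):
    """Fuzzy memberships u[k][i] of point k in cluster i; each row sums to 1."""
    u = []
    for x in points:
        # Kernel-induced distance 1 - K(x, v), clamped to avoid division by zero.
        d = [max(1.0 - gaussian_kernel(x, v, sigma), eps) for v in centers]
        w = [di ** (-1.0 / (m - 1.0)) for di in d]
        total = sum(w)
        u.append([wi / total for wi in w])
    return u


def kfcm(points, init_centers, m=2.0, sigma=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = [list(c) for c in init_centers]
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m, sigma)
        for i, v in enumerate(centers):
            # New center: mean of the points weighted by u^m * K(x, old center).
            num = [0.0] * dim
            den = 0.0
            for x, row in zip(points, u):
                coef = (row[i] ** m) * gaussian_kernel(x, v, sigma)
                den += coef
                for j in range(dim):
                    num[j] += coef * x[j]
            if den > 0.0:
                centers[i] = [n / den for n in num]
    return memberships(points, centers, m, sigma), centers


# Toy data (hypothetical): two well-separated groups in the plane.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
u, centers = kfcm(points, init_centers=[(1.0, 1.0), (4.0, 4.0)])
```

Because each point's contribution to a center is scaled by its kernel value, points far from a center have very little influence on it, which is one reason kernel variants are less affected by outliers than plain FCM. The result still depends on the initial centers supplied, which is the sensitivity that the density-peak initialization described in the abstract is meant to address.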

Keywords/Search Tags:

fuzzy clustering, viewpoint, feature weighting, initial clustering center, multi-view clustering

Related items

1	Research On Partition Based Multi-view Clustering Algorithm
2	Research And Implementation Of Fuzzy Clustering Algorithm
3	Research On Fuzzy Clustering Based On Weighting with Cluster Center Separation
4	Research And Application Of Multi-view Subspace Clustering Algorithm
5	Research On Problems Related To The Initial Center Selection In K-means Clustering Algorithm
6	Weighted Multi-view Clustering Method
7	Research On Multi-view K-means Clustering
8	Research On New Multi-view Clustering Algorithm
9	Research On Clustering Algorithm Of K-medoids And Its Application In Text Clustering
10	Precise Clustering Algorithm For Chinese Text Based On K-means