
Research On Risk Degree-Based Safe Semi-Supervised Fuzzy Clustering Algorithm

Posted on: 2022-08-25
Degree: Master
Type: Thesis
Country: China
Candidate: L Guo
Full Text: PDF
GTID: 2518306338990239
Subject: Control Engineering
Abstract/Summary:
Clustering is an important learning method in machine learning, and it has been studied extensively with many research results. Semi-supervised clustering is a promising direction for the field because it is better suited to practical applications. Further research has shown, however, that a semi-supervised clustering algorithm does not always outperform the corresponding unsupervised algorithm, because wrong prior information may mislead the clustering process. How to use prior information safely during clustering is therefore an important open question. This thesis addresses that question and proposes two semi-supervised clustering algorithms. The main research work is as follows:

(1) A safe semi-supervised clustering algorithm based on sample confidence estimation (CES3FCM) is proposed. First, local density and relative distance are used jointly to estimate the confidence of labeled and unlabeled samples. Then, a local graph of the labeled samples is constructed to model the relationship between each labeled sample and its nearest unlabeled samples. Finally, the confidence weights and a graph regularization term are introduced into the semi-supervised fuzzy c-means clustering algorithm. The algorithm reduces the weight of noisy and outlying samples, which weakens their impact on clustering, and constrains the output of risky labeled samples to that of their nearest unlabeled neighbors. The algorithm is validated on ten UCI datasets, and the results show that it is effective.

(2) To address the unstable confidence estimation of labeled samples in current safe semi-supervised clustering algorithms, a safe semi-supervised clustering algorithm based on D-S evidence theory (DS-S3L) is proposed. Following the idea of ensemble learning, the algorithm constructs multiple base clusterings, evaluates and selects them with a cluster validity function, and then fuses the clustering results through D-S evidence theory to calculate the confidence of labeled samples. Finally, a k-nearest neighbor graph is used to constrain the output of high-risk labeled samples to the output of adjacent unlabeled samples. The algorithm is compared with nine different clustering algorithms, and the experimental results show that it achieves higher clustering accuracy and more stable performance across multiple datasets.
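As an illustration of the evidence-fusion step in (2), the following is a minimal sketch of Dempster's rule of combination, the core operation of D-S evidence theory. It is not the DS-S3L implementation from the thesis: the dictionary representation of mass functions, the function names `dempster_combine` and `label_confidence`, and the example masses are all assumptions made for illustration. Each base clustering is treated as one source of evidence about which cluster a labeled sample belongs to, and the fused mass on the sample's given label is read off as a confidence value.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of hypotheses
    (here: cluster names) to a mass; the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the common hypotheses.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal elements contribute to the conflict mass K.
            conflict += mass_a * mass_b
    if conflict >= 1.0 - 1e-12:
        raise ValueError("sources are in total conflict and cannot be combined")
    # Normalize by 1 - K so the fused masses sum to 1 again.
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


def label_confidence(fused, label):
    """Hypothetical helper: fused mass assigned exactly to the given label."""
    return fused.get(frozenset({label}), 0.0)


# Assumed example: two base clusterings give evidence about one labeled sample.
# The full frame {"c1", "c2"} carries the mass left as "don't know".
frame = frozenset({"c1", "c2"})
evidence_1 = {frozenset({"c1"}): 0.7, frozenset({"c2"}): 0.1, frame: 0.2}
evidence_2 = {frozenset({"c1"}): 0.6, frozenset({"c2"}): 0.3, frame: 0.1}

fused = dempster_combine(evidence_1, evidence_2)
confidence = label_confidence(fused, "c1")
print(fused)
print(confidence)
```

Because the rule is associative, more than two base clusterings can be fused by applying `dempster_combine` repeatedly. How the thesis derives masses from each base clustering and maps the fused result to a confidence is not stated in the abstract, so those parts are left as assumptions here.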
Keywords/Search Tags: machine learning, semi-supervised fuzzy c-means clustering, local density, relative distance, D-S evidence theory, clustering ensemble, k-nearest neighbor