With the development of big data, cluster analysis algorithms have gradually become an important tool in data mining. Improving the accuracy of clustering algorithms and their robustness to noise when processing complex data has long been a research hotspot. Fuzzy theory is widely used in cluster analysis because of its advantages in handling uncertainty. Applying fuzzy theory to clustering enhances the effectiveness and adaptability of the algorithms on various kinds of complex data, which provides a new research direction for the further development of cluster analysis theory. The fuzzy C-means (FCM) clustering algorithm has been widely used in cluster analysis and clusters noise-free data well. However, FCM only considers the clustering of the current sample, so it is very sensitive to noise and outliers and its anti-noise robustness is poor. The possibilistic fuzzy C-means (PFCM) clustering algorithm remedies both the weak anti-noise ability of FCM and the overlapping cluster centers of the possibilistic C-means (PCM) algorithm. The objective function of PFCM contains two important quantities, the membership and the typicality value, weighted by the coefficients a and b, respectively. However, appropriate values of a and b can only be obtained through a large number of repeated experiments, which increases the uncertainty and time cost of the algorithm. Therefore, this paper improves and optimizes the weighting coefficients of the possibilistic fuzzy clustering algorithm to enhance its clustering accuracy, noise robustness, and adaptability. The main work of this paper is summarized as follows:

(1) Because weighted possibilistic fuzzy clustering (WPFCM) does not achieve a significant performance gain over possibilistic fuzzy clustering, this paper proposes an enhanced self-adaptive weighted possibilistic
fuzzy clustering (AWPFCM) algorithm. First, the principle of maximum entropy is introduced into weighted possibilistic fuzzy clustering: the weighting coefficients of fuzzy clustering and possibilistic clustering are subjected to entropy regularization, which yields a novel self-learning iterative weighted possibilistic fuzzy clustering algorithm whose convergence is rigorously proved using Zangwill's theorem and the bordered Hessian matrix. Second, a series of clustering validity functions is constructed for the proposed algorithm to determine the optimal number of clusters in a data set. Finally, to enhance the anti-noise robustness of the proposed algorithm, a robust loss function is applied to the adaptive weighted possibilistic fuzzy clustering, yielding the RAWPFCM algorithm for clustering noisy data. Experimental results show that the proposed algorithm outperforms existing possibilistic fuzzy clustering-related algorithms, that the validity functions accurately determine the optimal number of clusters in the data set, and that the robust RAWPFCM algorithm effectively improves performance in the presence of noise.

(2) To further improve the clustering of high-dimensional complex data, this paper generalizes the algorithm to kernel space: a Gaussian kernel function is used to optimize the original squared Euclidean distance, and the resulting KAWPFCM algorithm offers enhanced clustering performance on high-dimensional data. To determine the optimal number of clusters accurately, three validity functions are constructed for the proposed algorithm. Finally, the Gaussian-kernel-induced distance in the KAWPFCM algorithm is optimized with a robust loss function to strengthen the robustness of the algorithm, and the novel RKAWPFCM algorithm is
proposed. Experimental results show that the proposed KAWPFCM algorithm performs better than existing possibilistic fuzzy clustering algorithms, that the three validity functions accurately determine the optimal number of clusters for the proposed algorithm, and that the RKAWPFCM algorithm has strong anti-noise ability under heavy noise.

(3) Enhanced possibilistic fuzzy clustering (EPFCM) driven by the Lambert W-function is an important additive partition clustering method, but effectively choosing the weighting coefficients of fuzzy clustering and possibilistic clustering remains a challenging task. This paper first introduces the principle of maximum entropy into enhanced possibilistic fuzzy clustering and proposes an adaptive entropy-weighted possibilistic fuzzy clustering (AEPFCM) algorithm driven by the Lambert W-function. Then, three validity functions are constructed for the proposed algorithm so that it can automatically find the optimal number of clusters. Finally, a robust loss function is used to modify the distance metric of the proposed algorithm, yielding a robust enhanced possibilistic fuzzy clustering algorithm with weight entropy regularization for clustering text data contaminated by noise. Extensive experimental results indicate that the proposed algorithms are significantly superior to existing possibilistic fuzzy clustering-related algorithms; this work also advances the theory of possibilistic fuzzy clustering and is significant for practical applications.

(4) To further improve the clustering performance of the enhanced possibilistic fuzzy C-means (EPFCM) clustering algorithm and extend it to kernel space, an adaptive weighted enhanced possibilistic fuzzy clustering (KAEPFCM) algorithm based on the kernel metric and the Lambert W-function is proposed, which effectively strengthens the adaptability and clustering performance of the algorithm. At the same
time, a robust loss function model is applied to optimize the original distance of the presented possibilistic fuzzy clustering, and a robust enhanced kernelized possibilistic fuzzy clustering algorithm with entropy regularization is constructed to cluster complex data contaminated by noise. Finally, so that the presented algorithm can be widely applied, multiple validity functions are designed that allow it to determine the optimal number of clusters automatically. Numerical results indicate that the presented algorithm outperforms many possibilistic fuzzy clustering-related algorithms.
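
For orientation, the PFCM objective function and the Gaussian-kernel-induced distance mentioned in this abstract can be written as follows. This is a sketch in the notation commonly used in the PFCM literature, not a formula quoted from this paper: the symbols u_ik (membership), t_ik (typicality), v_i (cluster centers), the exponents m and η, the penalty weights γ_i, and the kernel bandwidth σ are assumptions introduced here, and the adaptive variants proposed in this paper may define them differently. Only the coefficients a and b are named in the text above.

```latex
% Standard PFCM objective (notation assumed; a, b are the weighting
% coefficients for membership and typicality discussed in the abstract).
% Requires amsmath for \text, \lVert, \rVert.
\begin{equation}
J_{\mathrm{PFCM}}(U,T,V)
  = \sum_{i=1}^{c}\sum_{k=1}^{n}
      \left(a\,u_{ik}^{m} + b\,t_{ik}^{\eta}\right)\lVert x_k - v_i\rVert^{2}
  + \sum_{i=1}^{c}\gamma_i\sum_{k=1}^{n}\left(1 - t_{ik}\right)^{\eta},
\qquad \text{s.t. } \sum_{i=1}^{c} u_{ik} = 1 \ \ \forall k,
\end{equation}
% with a > 0, b > 0, m > 1, \eta > 1, \gamma_i > 0, and 0 \le u_{ik}, t_{ik} \le 1.

% Gaussian-kernel-induced distance that stands in for the squared
% Euclidean distance in a kernelized variant (bandwidth \sigma assumed).
\begin{equation}
\lVert \Phi(x_k) - \Phi(v_i) \rVert^{2}
  = K(x_k,x_k) - 2K(x_k,v_i) + K(v_i,v_i)
  = 2\bigl(1 - K(x_k,v_i)\bigr),
\qquad
K(x,v) = \exp\!\left(-\frac{\lVert x - v\rVert^{2}}{\sigma^{2}}\right).
\end{equation}
```

In this form the dependence on a and b is explicit: both must be fixed before the objective can be minimized, which is the tuning burden that the adaptive weighting described above is intended to remove.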