Font Size: a A A

Probabilistic K-means Models Via Nonlinear Programming

Posted on:2022-11-20Degree:DoctorType:Dissertation
Country:ChinaCandidate:B W LiuFull Text:PDF
GTID:1488306764493224Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Clustering is an unsupervised machine learning method.Because it does not need a posteriori knowledge(that is,it does not need manual marking),it has been occupying an important position in the field of machine learning.K-means clustering is the most classical model of clustering algorithm,but it also has many important variants,such as fuzzy C-means model.The principle of fuzzy C-means model is the same as that of K-means clustering model.The fuzzy parameter m of the fuzzy C-means model ranges from 1 to infinity.When m approaches 1,the performance of the model is improved.When the fuzzy parameter m is 1 the fuzzy C-means model is theoretically equivalent to the k-means clustering model(that is,the fuzzy C-means model and the k-means model should be unified in the same framework in theory).but the problem of solving this model when the parameter m is 1 has not been solved since 1981,and the performance of fuzzy C-means with m = 1 is unknown.In view of this problem,this paper conducts the following research:1.Proposes probabilistic k-mean model and fast maximum step gradient projection method.To solve the problem of fuzzy C-means model with the parameter m = 1,a new probabilistic K-means model is proposed.Solving probabilistic K-means model is a nonlinear programming problem with linear equality and linear inequality constraints,theoretically it can be solved by(active)gradient projection method,but it lacks efficiency,so in order to efficiently solve the problem in this paper,and proposes maximum step active gradient projection method and its improved versions the fast maximum step active gradient projection method.The experimental results verify the performance of probabilistic K-means model solved by the proposed method from five aspects,namely,initialization robustness,clustering performance,descent stability,iteration number and convergence speed.2.Based on probabilistic K-mean,proposes Lp-norm probabilistic K-mean model and inverse recursive maximum step gradient projection method.A novel Lp-norm probabilistic K-means model is proposed to solve the problem of P-norm fuzzy C-means algorithm when the parameter m is 1.The probabilistic K-mean model is a special case of the Lp-norm probabilistic k-mean model.P-norm probability K-mean model is also a nonlinear programming problem with linear equality and linear inequality constraints.The gradient projection method has the disadvantage of low efficiency in solving this model.Therefore,in order to solve this problem efficiently,this paper proposes the inverse recursive maximum step gradient projection method.The experimental results demonstrate the performance of Lp-norm probabilistic K-mean model solved by the proposed inverse recursive maximum step gradient projection method to solve from four aspects: initialization robustness,parameter p-influence,clustering performance,and convergence speed.3.On the basis of probabilistic K-mean,propose kernel probabilistic K-mean model.The probabilistic K-means model has a common problem of partition-based clustering,i.e.,it is difficult to identify clusters in nonlinear datasets.In partition-based clustering,kernel method is usually used to solve this problem.Therefore,kernel method is introduced into probabilistic K-means model.Kernel clustering model is an extension of the original model,and the original model can also be regarded as a special case of the kernel clustering model.The kernel probabilistic K-means clustering model introducing the kernel method can also deal with the clustering problem of non-spherical datasets.Experiments show the effectiveness of the algorithm.4.The block maximum step gradient projection method is proposed,on the basis of the two gradient projection methods.For large datasets,the fast maximum step gradient projection method may take too long time,and the inverse recursive maximum step gradient projection method needs to preserve a large-scale inverse matrix,which may require more memory than the maximum memory.To solve these problems,the block maximum step gradient projection method is proposed to solve the clustering problem on large datasets.5.Iterative min cut clustering algorithm is proposed.Clustering linearly inseparable datasets has always been an important problem in the field of clustering.Clustering algorithms based on graph theory can effectively solve this kind of problem,but the solving graph cut models is an NP hard problem.To solve this problem,an iterative min cut algorithm is proposed,which is solved by using gradient descent method.The iterative min cut algorithm can map the linearly inseparable dataset into a linearly separable space by using only one formula,and finally the final clustering results are obtained.The performance of the algorithm is also proved on artificially synthetic and real datasets.
Keywords/Search Tags:K-means clustering, fuzzy C-means clustering, graph-based clustering, gradient projection method
PDF Full Text Request
Related items