Research On Support Vector Clustering Method Based On Minimum Margin

Posted on: 2022-11-11
Degree: Master
Type: Thesis
Country: China
Candidate: J L Chen
GTID: 2518306758492004
Subject: Automation Technology
Abstract/Summary:
Today, statistical analysis techniques are widely used in increasingly complex data tasks. Data mining, one such technique, extracts useful information from the structure of data to support subsequent model and algorithm design and engineering iteration. Within data mining, clustering methods uncover hidden relationships among data points. Many mature clustering methods, such as k-means and DBSCAN, have emerged and are widely used in engineering.

Support Vector Clustering (SVC), a boundary-based clustering algorithm, has several advantages over other clustering methods. First, it is an unsupervised algorithm that does not require the number of clusters to be set in advance. Second, it can identify clusters of arbitrary shape and number, so it is broadly applicable. Third, it processes structured data through the kernel method while avoiding explicit computation in the high-dimensional feature space. Finally, it uses slack variables to soften the boundary, which makes the algorithm more robust. However, the cluster contour formed by SVC is sensitive to the kernel width coefficient q and the relaxation coefficient C, and it is difficult to obtain a contour with a reasonable set of boundary points. Moreover, the method considers only the slack variables when constructing the hyperplane and ignores further information about the statistical distribution of the data.

To obtain better cluster contours more easily, this paper draws on margin theory and the large margin distribution machine to propose a support vector clustering method based on the minimum margin, called minimum distribution support vector clustering (MDSVC). The method is intended to make boundary point recognition more robust, form a better cluster outline, and improve the generalization performance of the model. MDSVC reconstructs the hyperplane by minimizing the margin mean and the margin variance, thereby addressing underfitting and overfitting. The model is solved with a customized dual coordinate descent method to obtain higher computational performance. We further analyze the properties of the proposed model and theoretically derive an upper bound on the expected error. We also investigate the relationship between the two newly introduced parameters, which are associated with the margin mean and the margin variance, and this leads to useful insights into adjusting the number of support vectors. Finally, this paper visualizes the clusters through their boundaries. Experimental results on classical data sets show that, compared with SVC, MDSVC can adjust the number and formation of support vectors, and that it significantly improves both the cluster boundaries and the generalization performance.
Keywords/Search Tags: support vector clustering, margin theory, dual coordinate descent algorithm, cluster partition, convex optimization
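
For orientation, the baseline SVC that the abstract compares against can be sketched in a few lines. This is not the thesis's MDSVC model or its customized dual coordinate descent solver: it is a minimal sketch of the standard SVC dual with a Gaussian kernel, in which q is the kernel width coefficient and C bounds each dual weight. The function names (`gaussian_kernel`, `fit_svc_dual`, `radius_squared`), the pairwise update scheme, and the sample points are assumptions chosen for illustration. The cluster labeling step of SVC is omitted.

```python
import math


def gaussian_kernel(a, b, q):
    """K(a, b) = exp(-q * ||a - b||^2); q is the kernel width coefficient."""
    return math.exp(-q * sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_svc_dual(points, q, C, sweeps=200):
    """Approximately solve the baseline SVC dual by pairwise coordinate updates.

    With a Gaussian kernel K(x, x) = 1, so maximising the dual is the same as
    minimising beta^T K beta subject to sum(beta) = 1 and 0 <= beta_i <= C.
    """
    n = len(points)
    if C < 1.0 / n:
        raise ValueError("C must be at least 1/n, otherwise no feasible beta exists")
    K = [[gaussian_kernel(a, b, q) for b in points] for a in points]
    beta = [1.0 / n] * n  # feasible starting point
    Kb = [sum(K[i][k] * beta[k] for k in range(n)) for i in range(n)]  # K @ beta
    for _ in range(sweeps):
        for i in range(n):
            for j in range(i + 1, n):
                # Move weight t from point j to point i; sum(beta) is preserved.
                curv = K[i][i] + K[j][j] - 2.0 * K[i][j]
                if curv <= 1e-12:  # duplicate points: direction is flat
                    continue
                t = -(Kb[i] - Kb[j]) / curv  # exact line search along e_i - e_j
                lo = max(-beta[i], beta[j] - C)  # keep both weights in [0, C]
                hi = min(C - beta[i], beta[j])
                t = max(lo, min(hi, t))
                if t == 0.0:
                    continue
                beta[i] += t
                beta[j] -= t
                for k in range(n):
                    Kb[k] += t * (K[k][i] - K[k][j])
    return beta


def radius_squared(x, points, beta, q):
    """Squared feature-space distance from x to the sphere centre."""
    cross = sum(b * gaussian_kernel(p, x, q) for p, b in zip(points, beta))
    quad = sum(
        bi * bj * gaussian_kernel(pi, pj, q)
        for pi, bi in zip(points, beta)
        for pj, bj in zip(points, beta)
    )
    return 1.0 - 2.0 * cross + quad  # K(x, x) = 1 for the Gaussian kernel


# Hypothetical toy data: two small groups of 2-D points.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0), (3.0, 3.0), (3.1, 2.9), (2.9, 3.2)]
beta = fit_svc_dual(points, q=1.0, C=1.0)
print([round(b, 3) for b in beta])
print(radius_squared((10.0, 10.0), points, beta, q=1.0))
```

Points with a nonzero weight act as support vectors on the contour, and points whose weight reaches C are treated as outliers. Changing q or C in this sketch changes which points carry weight, which is the sensitivity the abstract describes.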