Support Vector Clustering Based On Shared Nearest Neighbor

Posted on: 2021-01-21

Degree: Master

Type: Thesis

Country: China

Candidate: D Q Zhu

GTID: 2428330629452707

Subject: Computer application technology

Abstract/Summary:

Clustering is an unsupervised learning task: it groups data without teacher (supervisory) signals. Support vector clustering has several advantages. In theory, clusters of any shape and number can be identified by adjusting the parameters of the kernel function; the algorithm is not sensitive to noisy data points, which limits their effect on the result; and the kernel function maps the data into the feature space, which solves the problem of linear inseparability. Its shortcomings are high computational cost and low performance.

This thesis improves the original support vector clustering algorithm in two parts:

1. In the training stage, where the objective function is solved to obtain the support function, the constrained problem is transformed into an unconstrained optimization problem, and an algorithm framework with GPU computing support is used to shorten training time.

2. In the cluster division stage, the original algorithm connects two data points with a straight line, randomly samples points along that line, and computes each sample's distance from the center of the sphere in the feature space. This calculation increases the complexity of the algorithm and slows it down considerably. This thesis introduces the idea of shared nearest neighbors: the support vectors are partitioned in advance, and the remaining data points are then assigned.

The proposed Support Vector Clustering Based on Shared Nearest Neighbor retains the advantages of the original support vector clustering algorithm: it partitions data sets of arbitrary shape well by mapping them through the kernel function, and noisy data points have little effect on it. However, because the shared-neighbor idea used in the division stage introduces additional parameters, tuning the algorithm becomes somewhat more difficult.

The goal of the improvement is to speed up the algorithm and to improve the accuracy of cluster division. To verify the improved algorithm, several artificial data sets with distinct distribution characteristics are first used to check its feasibility and effectiveness. Real data sets are then used for validation, and the results are compared with those of several classic clustering algorithms, analyzed, and summarized.
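
The shared nearest neighbor idea used in the division stage can be illustrated with a small sketch. It is not the thesis's implementation: it applies the idea to generic points rather than to support vectors, and the function names and the parameters `k` (neighborhood size) and `min_shared` (link threshold) are assumptions introduced here. They stand in for the kind of additional parameters the abstract says must be tuned. Two points are linked when each appears in the other's k-nearest-neighbor list and they share at least `min_shared` neighbors; linked points form one group.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbors of each point, excluding itself."""
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]

def snn_similarity(points, k):
    """Matrix whose (i, j) entry counts neighbors shared by points i and j."""
    n = len(points)
    neighbors = knn_indices(points, k)
    member = np.zeros((n, n), dtype=int)
    rows = np.repeat(np.arange(n), k)
    member[rows, neighbors.ravel()] = 1  # member[i, j] = 1 if j is in kNN(i)
    return member @ member.T

def snn_groups(points, k, min_shared):
    """Group points linked by mutual kNN membership and enough shared neighbors."""
    n = len(points)
    neighbors = knn_indices(points, k)
    shared = snn_similarity(points, k)
    parent = list(range(n))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in neighbors[i]:
            j = int(j)
            if i in neighbors[j] and shared[i, j] >= min_shared:
                parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]

# Hypothetical toy data: two well-separated squares of four points each.
pts = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
groups = snn_groups(pts, k=3, min_shared=2)
```

On this toy data the two squares end up in separate groups, because no point in one square has a neighbor in the other. Only nearest-neighbor lists are compared, so no points have to be sampled along line segments and checked in feature space.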

Keywords/Search Tags:

Clustering, Support Vector Clustering, Shared Nearest Neighbor, The Division Of Clustering, Kernel Function

Related items

1	Spectral Clustering Algorithm Based On Nearest Neighbor Relation And Ensemble Clustering
2	Research On Affinity Propagation Clustering Algorithm
3	Research On Speech Emotion Recognition Based On Kernel Function
4	Support Vector Clustering Analysis Of Radar Emitter Signals
5	The Research And Application Of Clustering Algorithm Based On Density
6	Research Of Kernel Methods For Support Vector Machine And Multiple Kernel Clustering Algorithm
7	Research On Clustering Algorithm Based On Shared Neighbor Affinity
8	Customer Transaction Data Clustering Analysis And Parallelism Based On Shared Nearest Neighbor
9	Clustering Technique With Shared Nearest Neighbors
10	Research On Noisy Data Clustering Algorithm Based On Natural Nearest Neighbor