
An Improved Algorithm For Support Vector Clustering

Posted on: 2013-10-26
Degree: Master
Type: Thesis
Country: China
Candidate: L Cheng
Full Text: PDF
GTID: 2248330371966635
Subject: Cryptography
Abstract/Summary:
With the widespread use of the Internet, electronic information has accumulated to the point of overload. How to extract the information users care about from massive data in a timely and effective way, and so give users the knowledge they are looking for, has become an important research topic in data mining. Clustering algorithms are an important tool for grouping massive data and finding useful information in it, and have therefore been a hot topic among domestic and foreign researchers in recent years. Compared with other clustering algorithms, the support vector clustering (SVC) algorithm has two advantages. First, because it is based on support vector theory, it can identify clusters of arbitrary shape while keeping the algorithm stable. Second, it introduces a penalty factor that helps identify noise data, so it can deal effectively with overlapping clusters. However, the SVC algorithm also has the following defects: its ability to identify noise data is limited; in the training phase, support vector points that lie inside clusters drive the clustering process into a local optimum, which degrades the clustering results; and in the cluster assignment phase, the time complexity of computing the adjacency matrix is quadratic in the size of the data set, which slows the algorithm down.
To solve the above problems, this thesis designs an improved support vector clustering algorithm. First, the data set is preprocessed before the algorithm runs, which improves the algorithm's ability to identify noise data. Second, in the training phase, the influence of in-cluster support vector points on clustering quality is eliminated, which increases the stability of the algorithm. Third, in the cluster assignment phase, the strategy of making the determination over the whole data set is changed: using the support vector points alone has the same effect. Finally, cluster labels are computed with a sampling method instead of a linear traversal labeling strategy. The improved SVC algorithm is built on these strategies. It discards the shortcomings of SVC while preserving its advantages, and addresses SVC's local optimum problem, its limited handling of noise data, and its high time complexity. In addition, experiments on classical simulated data sets compare the improved algorithm with the SEP-CG and E-SVC algorithms. The results show that the improved algorithm resolves the local optimum problem, improves accuracy, reduces the time and space complexity of the algorithm, and achieves the desired results.
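To make the quadratic cost of the cluster assignment step concrete, the following is a minimal Python sketch of the standard SVC labeling procedure, not the improved algorithm of this thesis. Two points are treated as adjacent when every sampled point on the segment between them stays inside the minimal enclosing sphere in feature space, and clusters are the connected components of the resulting adjacency matrix. The function names, the Gaussian kernel width `q`, the segment sample count and the tolerance are all illustrative assumptions. The coefficients `beta` would normally come from solving the SVC dual problem; here they are simply passed in, and the radius is taken as the largest distance over all data points, which assumes no penalty factor and no outliers.

```python
import numpy as np


def gaussian_kernel(a, b, q):
    """Gaussian kernel matrix K[i, j] = exp(-q * ||a_i - b_j||^2)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-q * d2)


def sphere_dist2(points, data, beta, q):
    """Squared feature-space distance of each point from the sphere center."""
    k_cross = gaussian_kernel(points, data, q)            # shape (m, n)
    const = beta @ gaussian_kernel(data, data, q) @ beta  # center norm term
    # K(x, x) = 1 for the Gaussian kernel.
    return 1.0 - 2.0 * (k_cross @ beta) + const


def svc_labels(data, beta, q, n_samples=10, tol=1e-9):
    """Label points by connected components of the SVC adjacency matrix."""
    n = len(data)
    # Assumption: no outliers, so the sphere must contain every data point.
    radius2 = sphere_dist2(data, data, beta, q).max()
    ts = np.linspace(0.0, 1.0, n_samples)[:, None]

    # O(n^2) pairwise check: this is the cost the abstract refers to.
    adjacent = np.eye(n, dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            segment = data[i] + ts * (data[j] - data[i])
            inside = sphere_dist2(segment, data, beta, q) <= radius2 + tol
            adjacent[i, j] = adjacent[j, i] = inside.all()

    # Depth-first search over the adjacency matrix to find the components.
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in np.flatnonzero(adjacent[node]):
                if labels[nb] == -1:
                    labels[nb] = current
                    stack.append(nb)
        current += 1
    return labels


# Two well-separated toy groups; uniform beta is a placeholder, not a QP solution.
data = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                 [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
beta = np.full(len(data), 1.0 / len(data))
print(svc_labels(data, beta, q=1.0))
```

The double loop visits every pair of points, so the number of segment checks grows with the square of the data set size. Restricting the pairwise check to a smaller subset of points, such as the support vectors, is one way to reduce that cost.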
Keywords/Search Tags: Support Vector Clustering, Noise Data Preprocessing, Sampling, Radial Parameters, Penalty Factor, Local Optimum