| With the development of Internet technology, problems such as information overload and redundancy are becoming increasingly common. The need to help users find and extract the potential value of information has driven research on the classification of massive data. Clustering is a statistical analysis method for exploring and classifying the distribution of data sets, and it is an important tool for data mining. At present, cluster analysis is widely used across disciplines and industries. According to their underlying ideas, conventional clustering methods can be grouped into partition-based, model-based, density-based, grid-based, and hierarchical methods. As the study of clustering has deepened and the family of clustering methods has continued to improve, kernel clustering algorithms have gradually attracted attention. Support Vector Clustering (SVC) is a kernel-based cluster analysis method. Compared with other clustering algorithms, SVC has several particular advantages. Firstly, SVC places no special requirements on the shape or number of clusters in the data and can identify clusters of any distribution. Secondly, SVC can identify some of the noise points and separate overlapping clusters. Thirdly, using the kernel approach, SVC can map the data space to the feature space through nonlinear or linear transformations, which allows it to handle data with complex structure. However, SVC still has drawbacks: its high cost and low performance limit its wide application. A point-sorting segmentation clustering algorithm based on similarity can compensate for the performance disadvantages of SVC. This algorithm processes data quickly, and its clustering quality is better than that of general clustering algorithms. However, in its ordering phase, all sample points are ordered directly by the distance measure without any preprocessing. 
As a result, sample points from the same cluster can be split up and placed among elements of other clusters, which mixes elements of different clusters in the ordering and degrades clustering quality to some extent. Considering the advantages and disadvantages of both support vector clustering and the point-sorting-based segmentation clustering method, we propose a method called Partitioning Clustering Based on Support Vector Ranking (PC-SVR). In theory, this algorithm inherits the advantages of the two algorithms and effectively avoids some of their shortcomings, which both ensures clustering quality and improves clustering speed. To verify the feasibility and performance of the PC-SVR algorithm, experiments were carried out on two artificial simulated data sets and four real data sets, and the clustering results were compared with those of other classical algorithms. The results show that the PC-SVR algorithm is feasible and that its efficiency and clustering quality are better than those of general clustering algorithms. |