Font Size: a A A

Improved KFCM Clustering Algorithms And Its Application On Divide-and-Conquer SVM

Posted on:2016-04-17Degree:MasterType:Thesis
Country:ChinaCandidate:Z L BianFull Text:PDF
GTID:2348330488974076Subject:Applied Mathematics
Abstract/Summary:PDF Full Text Request
The traditional clustering analysis is mainly based on Euclidean space as a feature space. Clustering algorithm based on partition such as crisp clustering and fuzzy clustering can handle data which is linear separable, but has a bad clustering effect when dealt with undivided linear data. To solve this kind of problem, kernel function can implicitly map samples from input space to a high-dimensional feature space, where samples become linearly separable. Clustering analysis based on kernel function has always been the hot spot of the research at domestic and overseas. It has a good application value and has been widely used in the image processing, text clustering, identifying suspect, customer segmentation and so on.In this paper, we give a brief introduction to some typically algorithm from clustering analysis which is based on partition and introduce some key points such the computation of iterative formula, kernel matrix and membership matrix in detail. We do experiments with the data from the UCI database to compare the effect of clustering. The result shows that algorithm based on kernel function has better clustering effect than other algorithms, and fuzzy clustering has better clustering effect than hard clustering. Then we propose DCA-KFCM and DCA-KFCM2 algorithms, which are both based on DCA and Kernel Fuzzy k-means clustering(KFCM). These algorithms are two new kinds of clustering algorithms, and will have a very good research prospect.With the development of the society, the data in real life has the tendency from less to more, from simple to complex, from low dimension to high dimension. It causes time consuming and the insufficient storage space when computing the kernel matrix. This paper introduces the approximate KFCM algorithm to deal with this problem. This algorithm does not need to calculate the whole kernel matrix. All we need to compute is the diagonal elements and the corresponding row of the chosen data. It reduces the computational complexity greatly. Experiment results shows that the approximate KFCM algorithm can deal with large scale data effectively.In order to proof the effectiveness of the approximate KFCM algorithm, we apply it into the divide-and-conquer SVM, celled approximate divide-and-conquer SVM. The divide-and-conquer SVM has faster convergence speed compared with other algorithm and achieve more accurate solution in less time than other algorithms. The approximate divided-and-conquer SVM firstly divides the full problem into smaller subproblems which can be solved independently and efficiently. We theoretically show that the approximate KFCM algorithm is able to minimize the difference between the solution of subproblems and that of the whole problem, and support vectors identified by subproblems are very likely to be support vectors of the whole problem. However, running the approximate KFCM algorithm is time consuming on the whole data set. So we use the two step approximate KFCM algorithm to find an effective partition. In the conquer step, the local solutions from the subproblems are ‘glued' together to yield an initial point for the global problem. The experiment results show that the approximate divide-and-conquer SVM can decrease the objective function value faster than other SVM methods.
Keywords/Search Tags:Clustering analysis, kernel function, KFCM algorithm, divide-and-conquer SVM
PDF Full Text Request
Related items