Multi-Label Classification Based On Fuzzy Kernel Clustering And Fuzzy Support Vector Machine

Posted on: 2012-08-28
Degree: Master
Type: Thesis
Country: China
Candidate: W B Zheng
GTID: 2218330338466278
Subject: Signal and Information Processing
Abstract/Summary:
With the advent of the information age, data has accumulated far beyond what people can process, giving rise to a phenomenon described as "information explosion with a lack of knowledge." Data mining emerged in response and has shown great vitality.
Classification is the most common data mining task: it derives rules from known data and uses them to predict unknown data. In single-instance multi-label classification, a special case of classification, a sample may belong to several classes at once. Unlike in single-label classification, the multiple labels blur a sample's class attribution, which makes accurate classification difficult. The problem nevertheless arises widely in everyday applications, and many researchers have proposed algorithms and improvements for it.
In this thesis, a multi-label classification algorithm based on the fuzzy support vector machine is designed. The Support Vector Machine (SVM) is a classifier proposed by Vapnik at AT&T Bell Laboratories in the 1990s. Based on statistical learning theory and structural risk minimization, the method combines an optimal decision hyperplane, kernel functions, and convex quadratic programming. It generalizes well, is accurate, and is an effective way to address overfitting ("over-learning"), the curse of dimensionality, and local minima. However, the SVM is designed for two-class, single-label problems, so it cannot be applied directly to multi-class, multi-label problems. For this reason, a fuzzy support vector machine that can handle two-class, double-label data sets is designed. In this classifier, fuzzy theory is used to assign each sample a membership value through a membership function, so that the information in the data is used fully. The classifier achieves better classification accuracy and leaves no unclassifiable region.
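As a rough illustration of how a membership function can weight samples, the sketch below uses a common distance-to-centroid form, s_i = 1 - d_i / (r + delta). This is an assumption for illustration only and not the thesis's own function, which is not given in the abstract; the names `centroid`, `distance_membership`, and `delta` are hypothetical.

```python
import math

def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    dim = len(points[0])
    return [sum(p[k] for p in points) / len(points) for k in range(dim)]

def distance_membership(points, delta=1e-6):
    """Assumed distance-based membership: s_i = 1 - d_i / (r + delta).

    d_i is the Euclidean distance from sample i to its class centroid,
    r is the largest such distance, and delta keeps every value above 0.
    Samples far from the centroid (possible noise) get a small weight.
    """
    c = centroid(points)
    dists = [math.dist(p, c) for p in points]
    radius = max(dists)
    return [1.0 - d / (radius + delta) for d in dists]

# Samples of one class; the last point lies far from the others.
samples = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
memberships = distance_membership(samples)
```

In a fuzzy SVM these values would typically scale each sample's penalty for misclassification, so that the distant point has little influence on the decision hyperplane.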
To describe accurately the degree to which each sample belongs to each category, the thesis designs a membership function based on distance and density. Given the particular nature of multi-label classification, a one-versus-one decomposition strategy is used to split the multi-label problem into several two-class, double-label classification sub-problems, and a voting technique combines their results into the final multi-label prediction. To improve training speed and reduce the effect of noise points on the optimal decision hyperplane, a semi-fuzzy kernel clustering algorithm is employed.
The experimental part summarizes several widely used evaluation criteria for multi-label classification algorithms and compares the proposed algorithm with several existing multi-label classification algorithms on UCI datasets.
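The one-versus-one decomposition with voting can be sketched as follows. The abstract does not state the exact voting rule, so this is a minimal sketch under assumptions: each pairwise classifier returns the subset of its two labels that it assigns to the sample (one or both), and a label is kept when it wins at least a share `min_share` of the classifiers it takes part in. The function name, the `pairwise` mapping, and `min_share` are hypothetical.

```python
from itertools import combinations

def ovo_multilabel_predict(x, labels, pairwise, min_share=0.5):
    """Combine pairwise double-label classifiers by voting.

    pairwise[(a, b)] is a callable that takes a sample and returns a set
    containing a, b, or both. Each label appears in len(labels) - 1 pairs.
    """
    votes = {label: 0 for label in labels}
    for a, b in combinations(labels, 2):
        for winner in pairwise[(a, b)](x):
            votes[winner] += 1
    needed = min_share * (len(labels) - 1)
    predicted = {label for label, v in votes.items() if v >= needed}
    return predicted, votes

# Toy stand-ins for trained classifiers (they ignore the sample).
labels = ["A", "B", "C"]
pairwise = {
    ("A", "B"): lambda x: {"A", "B"},  # sample carries both labels
    ("A", "C"): lambda x: {"A"},
    ("B", "C"): lambda x: {"B"},
}
predicted, votes = ovo_multilabel_predict([0.3, 0.7], labels, pairwise)
```

With these stand-ins, labels A and B each collect two votes and C collects none, so the sample is assigned both A and B.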
Keywords/Search Tags:Data Mining, Pattern Recognition, Multi-label Classification, Support Vector Machines (SVM), Fuzzy Kernel Clustering