A Fast Multi-label Classification Algorithm Based On Binary And Triple Class Support Vector Machines

Posted on: 2009-04-16
Degree: Master
Type: Thesis
Country: China
Candidate: S P Wan
GTID: 2178360245976391
Subject: Computer applications
Abstract/Summary:
A classification problem consists of learning a model from samples with known labels and then using that model to predict the labels of samples whose labels are unknown. Classification problems can be divided into binary-class and multi-class problems according to the number of distinct classes in the sample set, or into single-label and multi-label problems according to the number of labels attached to each sample. Multi-label classification is one of the most complex problems in the classification field: a sample may belong to several classes at once, and the binary-class and multi-class problems are contained in it as special cases.

For the multi-label classification problem, a decomposition policy called "one versus one" is used to split the problem into several binary-class single-label or binary-class binary-label subproblems, each of which can be solved on its own. A binary-class single-label subproblem is handled by a general binary-class support vector machine. To handle a binary-class binary-label subproblem, a triple-class support vector machine is proposed in this thesis. Samples that carry both labels simultaneously are called mixed-class samples and are located between the positive class and the negative class. Two parallel hyperplanes are used to separate the three classes. To improve training speed, a fast algorithm is designed for the triple-class support vector machine that decomposes a large-scale quadratic programming problem into a series of small problems. Finally, a fast triple-class training algorithm is implemented by modifying the popular SVMlight algorithm.

In the experimental part, several widely used evaluation criteria for multi-label classification algorithms are summarized and applied in experiments on three benchmark datasets: Yeast, Scene and RCV1-10C.
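The "one versus one" decomposition described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the label encoding (+1 positive, -1 negative, 0 mixed) and the `lower`/`upper` thresholds are assumptions made for illustration, and samples carrying neither label of a pair are assumed to be left out of that pair's subproblem.

```python
from itertools import combinations


def one_vs_one_subproblems(samples, label_sets, classes):
    """Yield ((a, b), X, y) for every unordered pair of classes.

    y uses a hypothetical encoding: +1 for samples labelled only a,
    -1 for samples labelled only b, and 0 for mixed samples carrying both.
    """
    for a, b in combinations(classes, 2):
        X, y = [], []
        for x, labels in zip(samples, label_sets):
            has_a, has_b = a in labels, b in labels
            if not (has_a or has_b):
                continue  # assumption: sample is irrelevant to this pair
            X.append(x)
            if has_a and has_b:
                y.append(0)   # mixed class: would need a triple-class SVM
            else:
                y.append(1 if has_a else -1)
        yield (a, b), X, y


def triple_class_decision(score, lower=-0.5, upper=0.5):
    """Map a decision value w.x + b to a class using two parallel
    hyperplanes; the thresholds here are illustrative, not from the thesis."""
    if score >= upper:
        return 1
    if score <= lower:
        return -1
    return 0
```

A subproblem whose `y` contains no 0 entries is an ordinary binary-class single-label problem; one that does contain them corresponds to the binary-class binary-label case that the triple-class support vector machine is meant to handle.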
The experimental results are compared with those of several existing multi-label classification algorithms to validate the performance and speed of the proposed algorithm. In the comparison of classification performance, no single multi-label classification algorithm is best on all evaluation criteria, so each algorithm is given a score on every evaluation criterion and the algorithms are compared by their total scores. The final results show that the total score of the algorithm presented in this thesis always ranks among the top four in all experiments when compared with other multi-label classification algorithms, such as Rank-SVM, BoosTexter, AdtBoost.MH, ML-kNN, BP-MLL, BasicBP, OVR-SVM, OVO2BN-SVM, OVOC4.5, OVO-kNN and OVO-NB. In the comparison of computational time, on the RCV1-10C dataset, which has 23149 training samples, the algorithm of this thesis runs three times as fast as two other decomposition algorithms that are also based on support vector machines. The number of support vectors, which determines the testing time, is also the smallest for the algorithm of this thesis.
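The idea of scoring each algorithm on every criterion and comparing totals can be sketched as follows. The abstract does not state the exact scoring rule, so this sketch assumes a simple rank-based scheme (the best of n algorithms on a criterion receives n points, the worst receives 1, ties broken by input order); `total_scores` and its arguments are hypothetical names.

```python
def total_scores(results, higher_is_better):
    """Sum rank-based points per algorithm across evaluation criteria.

    results: {criterion: {algorithm: value}}
    higher_is_better: {criterion: bool}, since some criteria are losses.
    Assumed scheme: best algorithm on a criterion gets n points, worst gets 1.
    """
    totals = {}
    for criterion, values in results.items():
        ordered = sorted(values, key=values.get,
                         reverse=higher_is_better[criterion])
        n = len(ordered)
        for rank, algorithm in enumerate(ordered):
            totals[algorithm] = totals.get(algorithm, 0) + (n - rank)
    return totals
```

Because criteria such as losses are better when smaller while others are better when larger, the sketch takes the direction of each criterion explicitly instead of assuming one.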
Keywords/Search Tags: Multi-label classification, Support vector machines, "One versus one" decomposition policy, Fast algorithm