Research On Multi-label Classification Algorithm Based On One-versus-one Decomposition Policy

Posted on: 2014-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: X D Feng
GTID: 2268330401969476
Subject: Computer application technology
Abstract/Summary:
Classification learns a model from known samples and then uses it to predict the labels of new, unlabeled samples. Multi-label classification is a special learning problem in which a single instance may be associated with several labels at the same time, and the labels are not mutually exclusive. Because of its many real-world applications, it has attracted increasing attention. Currently, there are two main strategies for designing discriminative multi-label classification methods: problem transformation and algorithm extension.

Combining a decomposition strategy with the SVM is an effective means of solving multi-label classification problems. The multi-label classification method based on the one-versus-one decomposition strategy divides a q-class problem into q(q-1)/2 sub-problems and then assembles all sub-classifiers into a complete multi-label classifier. This reduces the scale of each problem and relieves sample imbalance, but it introduces new problems, such as the triple class problem and the threshold problem. To tackle the triple class problem after decomposition, we adopt ordinal regression, which treats the mixed class (samples carrying both labels of a pair) as a new class and separates the samples of the three classes with two parallel discriminant hyperplanes. We implement the resulting triple class SVM, called OR-SVM. In the proposed method, we use the Bayesian rule to distinguish the relevant labels from the irrelevant labels.

In our experiments, we validate the performance of our algorithm on 10 benchmark datasets, such as Yeast, and choose nine evaluation measures to evaluate the performance of the algorithms: Hamming loss, accuracy, precision, recall, F1, ranking loss, one error, coverage and average precision.
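The one-versus-one decomposition described above can be illustrated with a minimal sketch. It assumes labels are given as binary indicator rows, and it encodes the three classes of each sub-problem as +1 (first label only), 0 (mixed class, both labels) and -1 (second label only). This encoding and the function name `ovo_triple_class_subproblems` are assumptions chosen for illustration, not taken from the thesis. Samples carrying neither label of a pair are left out of that sub-problem.

```python
from itertools import combinations


def ovo_triple_class_subproblems(Y):
    """Split a multi-label problem into q(q-1)/2 pairwise sub-problems.

    Y is a list of binary indicator rows, one row per sample and one
    column per label. For each label pair (j, k) the function returns
    a list of (sample_index, target) tuples, where target is
      +1 if the sample has label j only,
       0 if the sample has both labels (the mixed class),
      -1 if the sample has label k only.
    Samples with neither label are skipped for that pair.
    """
    q = len(Y[0])
    subproblems = {}
    for j, k in combinations(range(q), 2):
        targets = []
        for i, row in enumerate(Y):
            if row[j] and row[k]:
                targets.append((i, 0))    # mixed class
            elif row[j]:
                targets.append((i, 1))    # label j only
            elif row[k]:
                targets.append((i, -1))   # label k only
        subproblems[(j, k)] = targets
    return subproblems
```

With q = 3 labels this yields 3 sub-problems, each of which could then be handed to a three-class ordinal learner with two parallel hyperplanes.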
When tuning the key parameters of all the algorithms, we use 3-fold cross validation on the training sets and take the average of ranking loss and Hamming loss as the criterion for selecting the optimal parameters. We then calculate the values of the nine evaluation measures on the testing sets. We compare our algorithm with existing multi-label algorithms, such as OVR-SVM, by ranking the methods on the statistical results and analyzing the rankings with the Friedman test. The comparison shows that OR-SVM obtains comparable predictive performance and outperforms the existing methods on several evaluation measures, such as ranking loss and one error.
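The parameter selection criterion described above, the average of ranking loss and Hamming loss, can be sketched as follows. The definitions used here are the common textbook ones and may differ in detail from those in the thesis. In particular, counting ties as ranking errors and skipping samples that have no relevant or no irrelevant labels are assumptions, as are the function names.

```python
def hamming_loss(Y_true, Y_pred):
    """Fraction of (sample, label) entries that are predicted wrongly."""
    total = sum(len(row) for row in Y_true)
    wrong = sum(
        t != p
        for row_t, row_p in zip(Y_true, Y_pred)
        for t, p in zip(row_t, row_p)
    )
    return wrong / total


def ranking_loss(Y_true, scores):
    """Average fraction of (relevant, irrelevant) label pairs that are
    ordered wrongly, i.e. the relevant label does not score higher."""
    losses = []
    for row, s in zip(Y_true, scores):
        relevant = [i for i, t in enumerate(row) if t]
        irrelevant = [i for i, t in enumerate(row) if not t]
        if not relevant or not irrelevant:
            continue  # assumption: such samples are skipped
        bad = sum(s[r] <= s[i] for r in relevant for i in irrelevant)
        losses.append(bad / (len(relevant) * len(irrelevant)))
    return sum(losses) / len(losses) if losses else 0.0


def selection_criterion(Y_true, Y_pred, scores):
    """Average of Hamming loss and ranking loss; lower is better."""
    return 0.5 * (hamming_loss(Y_true, Y_pred) + ranking_loss(Y_true, scores))
```

In a 3-fold cross validation loop, this criterion would be computed on each held-out fold and averaged, and the parameter setting with the lowest average would be kept.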
Keywords/Search Tags: Multi-label classification, Support vector machines, "One versus one" decomposition policy, Ordinal regression, Bayesian rule