
Research On Multi-label Classification Related Technology

Posted on: 2014-01-21  Degree: Master  Type: Thesis
Country: China  Candidate: S Zhang  Full Text: PDF
GTID: 2248330398458034  Subject: Computer software and theory
Abstract/Summary:
Multi-label classification problems are common in the real world and are a current research focus in machine learning and data mining. The introduction of multi-label learning has enriched and extended existing classification problems, since multi-label classification can tackle problems that two-class learning methods cannot identify or solve. Current research on multi-label classification focuses mainly on three aspects: finding better classification algorithms that classify instances accurately, ranking the labels of classified instances, and handling high-dimensional data in multi-label classification.

First, this paper introduces the background and significance of research on multi-label classification, as well as the current state of research and its open problems. Second, it introduces the framework, evaluation methods, and data sets of multi-label classification. Finally, it focuses on four issues. The first is to find weighting and data sampling methods that reduce time complexity and improve classification accuracy. The second is how to assign a complete preference order over the labels to every example. The third is to address the inability of traditional multi-label classification algorithms to handle high-dimensional data. The last is to verify, on the basis of the above studies and through a large number of experimental comparisons, the effectiveness of the algorithms proposed in this paper.

The work carried out in this paper is as follows.

1. ML-KNN is an approach that employs KNN to solve multi-label problems, but it suffers from high time complexity and low classification accuracy. This paper proposes a modified algorithm, WML-KNN, to address these problems. It combines data sampling and weighting in a single approach, which reduces time complexity and improves classification accuracy on minority-class data. Experimental results show that WML-KNN works better than other commonly used multi-label algorithms.

2. Label ranking is a complex prediction task whose goal is to map instances to a total order over a finite set of predefined labels. To solve this problem, this paper proposes a novel approach named Apriori-based label ranking (APR-LR). APR-LR classifies the data using traditional multi-label classification methods, and then ranks the labels of each instance based on the Apriori algorithm while also taking the effect of the instance's neighbors into account. Experiments show that APR-LR outperforms the other label ranking algorithms on two evaluation metrics.

3. High-dimensional data often appear in multi-label classification, and traditional multi-label classification algorithms cannot handle this kind of data well. To solve this problem, this paper proposes a novel approach named High-Dimension Multi-Label (HDML). HDML employs locally linear embedding (LLE) to reduce the data dimension, and classifies an instance with K-means and KNN while taking into account the label vectors of its nearest neighbors. Experiments on real multi-label data show that HDML outperforms the other multi-label classification algorithms on several evaluation metrics.
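The neighbor-based ideas in tasks 1 and 2 above can be illustrated with a minimal sketch. The abstract does not give the details of WML-KNN or APR-LR, so this is neither algorithm. It is a generic distance-weighted k-nearest-neighbor label vote, assuming Euclidean distance, an inverse-distance weight of 1 / (1 + d), and a 0.5 decision threshold. The function names and the toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def weighted_knn_label_scores(train, query, k=3):
    """Score each label by the distance-weighted votes of the k nearest neighbors.

    train is a list of (feature_vector, set_of_labels) pairs.
    The weight 1 / (1 + distance) is an illustrative assumption.
    """
    # Keep the k training examples closest to the query (Euclidean distance).
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    scores = defaultdict(float)
    total_weight = 0.0
    for features, labels in neighbors:
        weight = 1.0 / (1.0 + math.dist(features, query))
        total_weight += weight
        for label in labels:
            scores[label] += weight
    # Normalize so each score is the weighted fraction of neighbors with that label.
    return {label: score / total_weight for label, score in scores.items()}


def predict_labels(scores, threshold=0.5):
    """Multi-label prediction: keep every label whose score reaches the threshold."""
    return {label for label, score in scores.items() if score >= threshold}


def rank_labels(scores, all_labels):
    """Label ranking: a total order over all labels, highest score first.

    Ties are broken alphabetically so the order is deterministic.
    """
    return sorted(all_labels, key=lambda label: (-scores.get(label, 0.0), label))


# Hypothetical toy data: 2-D feature vectors, each with a set of labels.
train = [
    ((0.0, 0.0), {"sports"}),
    ((0.1, 0.2), {"sports", "news"}),
    ((1.0, 1.0), {"music"}),
    ((0.8, 1.2), {"music", "news"}),
]
scores = weighted_knn_label_scores(train, query=(0.0, 0.1), k=3)
print(predict_labels(scores))
print(rank_labels(scores, ["music", "news", "sports"]))
```

The same per-label scores serve both tasks: thresholding them gives a label set, and sorting them gives a complete preference order over the labels.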
Keywords/Search Tags: Multi-label learning, label ranking, high-dimensional data, weighting, k nearest neighbor, dimensionality reduction