
Research on Multi-label KNN by Exploiting Label Correlation

Posted on: 2016-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: H F Tan
GTID: 2308330461991804
Subject: Computer application technology
Abstract/Summary:
With the development of information technology and the progress of society, multi-label classification has become an important part of the classification problem, and it is applied more and more widely in real life. However, multi-label classification differs from traditional single-label classification: because of the ambiguity of multi-label data, traditional classification algorithms are no longer suited to it. As a result, researchers have proposed many multi-label classification algorithms. These methods can be divided into three categories: problem transformation methods, algorithm adaptation methods, and ensemble methods. Problem transformation methods convert a multi-label data set into single-label data sets and then classify the transformed data with traditional classification methods. Algorithm adaptation methods modify traditional classification methods so that they can classify multi-label data directly. Ensemble methods combine problem transformation with algorithm adaptation in order to achieve better classification performance.

Multi-label k nearest neighbor (multi-label KNN) is an algorithm adaptation method and one of the most widely used multi-label classification methods, but it ignores the correlation among labels, which limits its performance. To address this problem, this thesis studies multi-label KNN that exploits label correlation. The main work is as follows:

(1) The concept of multi-label classification and several classic multi-label classification algorithms are reviewed, and the main ideas, characteristics, and shortcomings of these algorithms are summarized.

(2) The selection of parameters in multi-label KNN is analyzed. First, the number of neighbors k is studied to determine which value gives the best classification results. Then the classification performance of multi-label KNN under different similarity measures is analyzed. Finally, experiments that vary the number of neighbors k and the order of the similarity measure at the same time are conducted in order to choose the most suitable number of neighbors and order.

(3) A multi-label KNN algorithm that exploits label correlation (CML-KNN) is proposed. The existing multi-label KNN algorithm performs well, but it does not consider the correlation between labels. To address this shortcoming, CML-KNN adds the basic idea of Classifier Chains (CC) to multi-label KNN, improving performance by combining label information with the original algorithm.

(4) The parameters of the proposed algorithm are tuned to obtain the best CML-KNN configuration, and CML-KNN is compared with the original algorithm and with several existing algorithms that perform well on multi-label data sets. The experimental results show that CML-KNN outperforms the original method and the other algorithms compared in this thesis.
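The idea of adding a Classifier Chains step to a k nearest neighbor classifier can be illustrated with a minimal sketch. This is not the CML-KNN algorithm from the thesis, whose details the abstract does not give. It assumes a plain majority vote among the k neighbors (the standard multi-label KNN instead uses a posterior estimate from neighbor label counts), a fixed label order, and a Minkowski distance whose order p stands in for the "order of the similarity measure". All function names are hypothetical.

```python
from typing import List, Sequence


def minkowski(a: Sequence[float], b: Sequence[float], p: float = 2.0) -> float:
    """Minkowski distance of order p (p=1 is Manhattan, p=2 is Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_vote(train_x: List[List[float]], train_y: List[int],
             query: Sequence[float], k: int, p: float) -> int:
    """Predict one binary label by majority vote among the k nearest points."""
    order = sorted(range(len(train_x)),
                   key=lambda i: minkowski(train_x[i], query, p))
    votes = sum(train_y[i] for i in order[:k])
    return 1 if 2 * votes > k else 0


def chained_knn_predict(train_x: List[List[float]], train_y: List[List[int]],
                        query: Sequence[float], k: int = 3,
                        p: float = 2.0) -> List[int]:
    """Chain-style prediction (assumed simplification, not the thesis method).

    Label j is predicted from the original features plus labels 0..j-1:
    true labels for training points, already predicted labels for the query.
    This is how earlier labels can influence later ones.
    """
    n_labels = len(train_y[0])
    predicted: List[int] = []
    for j in range(n_labels):
        aug_train = [list(x) + list(y[:j]) for x, y in zip(train_x, train_y)]
        aug_query = list(query) + predicted
        column = [y[j] for y in train_y]
        predicted.append(knn_vote(aug_train, column, aug_query, k, p))
    return predicted


# Toy data: two clusters with complementary labels.
toy_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
toy_y = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
print(chained_knn_predict(toy_x, toy_y, [0.2, 0.1], k=3, p=2.0))
```

Varying k and p together in a loop over a validation set would mirror the joint parameter study described in item (2), under the same assumptions.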
Keywords/Search Tags: multi-label, k nearest neighbor, similarity measure method, label correlation