Research On The Multi-label Feature Selection And Classification Methods With The Label Correlations

Posted on: 2017-09-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Cai
GTID: 2348330488496685
Subject: Computer software and theory
Abstract/Summary:
Unlike the traditional supervised learning framework, in which each object is assigned exactly one label, multi-label learning allows an object to be associated with multiple labels simultaneously, which models many real-world problems more faithfully. For instance, an image may be tagged with "desert", "cactus" and "sun", and an article may be labeled "Nobel", "Tu Youyou" and "medical science". Over the past decade, successful applications in many fields have made multi-label learning an active research area. How to discover and exploit label correlations effectively has become a core research issue and has attracted a great deal of attention, and researchers have proposed a series of multi-label learning algorithms that exploit label correlations. However, high-dimensional data contain many redundant and irrelevant features, which degrade classifier performance, so feature selection plays an essential role in multi-label classification of high-dimensional data. Yet few multi-label feature selection algorithms take label correlations into account. This dissertation therefore focuses on how to discover and exploit label correlations to improve multi-label feature selection and multi-label classification, and proposes three multi-label learning algorithms that incorporate label correlations. The main contributions of this dissertation are as follows:

1. We propose a multi-label ReliefF feature selection algorithm with label correlations (ML-ReliefF). ML-ReliefF is based on the classical ReliefF algorithm, but it uses the correlations between label sets, rather than a simple distance metric, to determine the Hits and Misses, which effectively exploits the correlations among multiple labels. In addition, the improved update formula for the feature weights is better suited to the multi-label learning framework.
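The abstract does not give the ML-ReliefF formulas, so the following is only a minimal Relief-style sketch of the general idea of deciding Hits and Misses from label-set similarity. Everything specific in it is an assumption made for illustration: the Jaccard similarity, the `threshold` parameter, the L1 neighbor distance, the classical Relief update, and the names `jaccard`, `l1` and `relief_weights`. It should not be read as the thesis's actual algorithm.

```python
def jaccard(a, b):
    """Similarity of two label sets; defined as 1.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def l1(u, v):
    """L1 distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(u, v))


def relief_weights(X, Y, k=1, threshold=0.5):
    """Relief-style feature weights for multi-label data (illustrative only).

    X: list of feature vectors, assumed scaled to [0, 1]
    Y: list of label sets, one per row of X
    """
    n, d = len(X), len(X[0])
    weights = [0.0] * d
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Assumption: a sample is a Hit when its label set is similar enough
        # to that of sample i, and a Miss otherwise.
        hits = [j for j in others if jaccard(Y[i], Y[j]) >= threshold]
        misses = [j for j in others if jaccard(Y[i], Y[j]) < threshold]
        for sign, group in ((-1.0, hits), (1.0, misses)):
            # Keep only the k nearest samples of the group in feature space.
            nearest = sorted(group, key=lambda j: l1(X[i], X[j]))[:k]
            for j in nearest:
                for f in range(d):
                    # Classical Relief update: penalise features that differ
                    # on Hits, reward features that differ on Misses.
                    diff = abs(X[i][f] - X[j][f])
                    weights[f] += sign * diff / (n * len(nearest))
    return weights
```

With this scheme a feature that separates samples with dissimilar label sets ends up with a positive weight, while a feature that varies among samples with similar label sets is pushed toward a negative weight.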
Experimental results on multi-label datasets demonstrate that ML-ReliefF can effectively select the best feature subset and significantly improve the performance of multi-label classification.

2. We propose a convex semi-supervised multi-label feature selection and classification algorithm with label correlations (LCCSFS). Motivated by the difficulty of obtaining labels, the algorithm improves on CSFS, a semi-supervised multi-label feature selection and classification algorithm. LCCSFS automatically learns pairwise, symmetric label correlations by constructing a covariance matrix and makes effective use of the information in unlabeled data, so that label correlation learning, multi-label feature selection and multi-label classification are unified in a single model. Experimental results show that incorporating label correlations greatly improves the performance of semi-supervised multi-label feature selection and classification.

3. We introduce a novel multi-label feature selection algorithm that exploits label correlations locally (Loc-MLFS). The algorithm takes advantage of local label correlations, that is, correlations that are not shared by all instances. To do so, Loc-MLFS divides the samples into groups by category clustering and applies multi-label feature selection to each group. The algorithm can also be extended to a unified framework. Experimental results on the datasets demonstrate that Loc-MLFS achieves superior performance.
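The group-then-select idea behind Loc-MLFS can be sketched as follows. This is a hedged illustration, not the thesis's method: it reads "category clustering" as k-means on binary label vectors, and the per-group scorer `mean_gap_score` is a hypothetical placeholder for whichever multi-label feature selector is actually used. The function names `kmeans` and `local_feature_selection` are likewise invented here.

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means, seeded with the first k distinct points."""
    centers = []
    for p in points:
        if list(p) not in centers:
            centers.append(list(p))
        if len(centers) == k:
            break
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its closest center (squared Euclidean).
        assign = [
            min(range(len(centers)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members.
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def mean_gap_score(X, Y):
    """Placeholder scorer: per feature, sum over labels of |mean(pos) - mean(neg)|."""
    d, q = len(X[0]), len(Y[0])
    scores = [0.0] * d
    for l in range(q):
        pos = [x for x, y in zip(X, Y) if y[l] == 1]
        neg = [x for x, y in zip(X, Y) if y[l] == 0]
        if not pos or not neg:
            continue  # this label is constant inside the group
        for f in range(d):
            mean_pos = sum(x[f] for x in pos) / len(pos)
            mean_neg = sum(x[f] for x in neg) / len(neg)
            scores[f] += abs(mean_pos - mean_neg)
    return scores


def local_feature_selection(X, Y, k, score_features, top=1):
    """Cluster samples on their label vectors, then rank features per group."""
    assign = kmeans(Y, k)
    selected = {}
    for c in sorted(set(assign)):
        idx = [i for i, a in enumerate(assign) if a == c]
        scores = score_features([X[i] for i in idx], [Y[i] for i in idx])
        ranked = sorted(range(len(scores)), key=lambda f: -scores[f])
        selected[c] = ranked[:top]
    return assign, selected
```

Because each group is scored separately, different groups can end up with different selected features, which is the point of using correlations that hold only for part of the data.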
Keywords/Search Tags: multi-label classification, multi-label feature selection, label correlation, ReliefF algorithm, semi-supervised learning, local label correlation