
Research On Multi-label Data Classification Technology

Posted on: 2019-08-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Liu
GTID: 1368330575980693
Subject: Communication and Information System
Abstract/Summary:
Multi-label learning has recently attracted considerable attention because it improves performance in applications such as image classification, multimedia image annotation, and social network data mining. Driven by these practical needs, a growing number of researchers have studied it in depth, and it has become one of the active research topics in artificial intelligence. Unlike traditional single-label learning, in which each instance is associated with exactly one class, multi-label learning requires multiple outputs: each instance can be associated with a set of labels. Because the classes are interrelated, multi-label learning problems are more complicated than single-label ones.

Although multi-label learning has made great progress, many problems remain open. First, extracting effective features is key to multi-label classification. Second, labels in multi-label learning are usually correlated, so measuring and capturing the correlation in the label space is crucial for effective prediction. Third, because of the high cost of manual labeling, rapid data updates, and noise, only part of the label information can usually be obtained, so multi-label learning with missing labels must be addressed. Finally, learning a nonlinear mapping that effectively extracts the discriminative information between samples is also a challenge. This dissertation studies these four issues and proposes new models and solutions. Its contributions are as follows:

1. Motivated by the robustness of the L1 norm to noise and outliers, this dissertation proposes multi-label linear discriminant analysis based on the L1 norm for feature extraction. Most multi-label dimensionality reduction algorithms use the squared L2 norm to measure the similarity between data points or between labels, which is very sensitive to noise and outliers and reduces the flexibility of these algorithms. Building on the strengths of linear discriminant analysis, the proposed method measures data samples or labels with the L1 norm while keeping the within-class scatter as small as possible and the between-class scatter as large as possible, which improves the robustness of the model. A non-greedy iterative algorithm is proposed for solving the model; unlike the traditional greedy strategy, it obtains a locally optimal solution of the objective function. Detailed theoretical analysis and experimental results establish the convergence of the algorithm. The dissertation also shows that the method can be extended to two-dimensional multi-label linear discriminant analysis. Finally, extensive experiments on commonly used single-label and multi-label datasets show that the proposed model performs better than some supervised dimensionality reduction algorithms in single-label and multi-label classification tasks.

2. Based on the nuclear norm, this dissertation proposes a semi-supervised learning framework for capturing the interrelationships between labels. Most existing semi-supervised methods look for correlations between labels through a manually built graph model, and such graph-embedding approaches reduce the flexibility of the algorithm. Moreover, because data distributions in practical applications are complex, it is difficult to describe the relationships between data with a manually constructed graph, especially when labeled samples are few. The proposed model constructs the category graph adaptively through a nuclear-norm regularization term, which reduces the influence of manually set parameters. The graph it constructs is more accurate, particularly when the number of labeled samples is small. Based on this framework, two algorithms are proposed: NML-GRF (Nuclear-norm based Multi-Label Gaussian Random Field) and NML-LGC (Nuclear-norm based Multi-Label Local and Global Consistency), together with a non-greedy iterative algorithm for solving both models. To handle the out-of-sample problem and classify new samples quickly, the framework is combined with a linear classifier, yielding two further algorithms, NML-GRF2 and NML-LGC2. Extensive experiments show that the proposed methods achieve better performance than some state-of-the-art multi-label learning algorithms.

3. This dissertation proposes a multi-label learning method named SVMMN (SVM-based Minimum Number of samples which live in the margin area) for multi-label learning with missing labels. Unlike traditional multi-label learning algorithms, which require complete label information, SVMMN accepts labels with missing entries. While preserving sample smoothness and label smoothness, SVMMN applies the support vector machine principle to minimize the number of samples that fall in the margin region while ensuring accurate classification. An efficient iterative algorithm is proposed for solving the SVMMN objective function, and extensive experiments show that it converges well. Compared with some multi-label learning algorithms, SVMMN further improves the accuracy and effectiveness of image annotation with missing labels and is well suited to practical applications.

4. This dissertation proposes a method named multi-label discriminative deep metric learning for effectively capturing nonlinear relationships and discriminative information in multi-label data. Traditional metric learning methods either learn only a linear mapping, whose performance is often affected by nonlinear relationships among data points, or use a kernel function, which is likely to cause problems such as poor scalability and reduced flexibility. In contrast, the proposed model combines a deep learning model with discriminative metric learning: it learns a nonlinear mapping through convolutional neural networks and uses discriminative metrics to learn the discriminative information of the samples. Experimental results show that, compared with traditional linear metric learning methods, the proposed algorithm further improves the accuracy of multi-label image classification.
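
The robustness argument behind contribution 1 (L1 norm versus squared L2 norm) can be illustrated numerically. The following is a minimal sketch, not the dissertation's algorithm: it only compares how much a single outlier inflates a scatter-like quantity under each measure. The function names `scatter_sq_l2` and `scatter_l1`, the synthetic data, and the fixed center are assumptions made for illustration.

```python
import numpy as np


def scatter_sq_l2(X, center):
    """Sum of squared Euclidean distances from each row of X to the center."""
    return float(np.sum((X - center) ** 2))


def scatter_l1(X, center):
    """Sum of L1 distances (absolute deviations) from each row of X to the center."""
    return float(np.sum(np.abs(X - center)))


# Hypothetical data: 50 well-behaved 2-D samples around the origin.
rng = np.random.default_rng(0)
clean = rng.normal(size=(50, 2))
center = np.zeros(2)  # fixed reference point, assumed known for this sketch

# Append one gross outlier.
outlier = np.array([[100.0, 100.0]])
noisy = np.vstack([clean, outlier])

# How much does the single outlier add to each measure?
growth_sq_l2 = scatter_sq_l2(noisy, center) - scatter_sq_l2(clean, center)
growth_l1 = scatter_l1(noisy, center) - scatter_l1(clean, center)

print(f"added by outlier (squared L2): {growth_sq_l2:.1f}")
print(f"added by outlier (L1):         {growth_l1:.1f}")
```

The outlier's contribution grows quadratically with its distance under the squared L2 norm but only linearly under the L1 norm, which is why a criterion built on squared L2 scatter can be dominated by a few corrupted samples.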
Keywords/Search Tags: Multi-label learning, L1-norm, Nuclear norm, Missing label, Support vector machine, Deep learning