Research On Metric Learning Algorithm Based On Multi-label Data

Posted on: 2021-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: D K Yang
Full Text: PDF
GTID: 2428330611462405
Subject: Software engineering
Abstract/Summary:
In recent years, with the development of information technology, multi-label data has become widespread in the real world, and multi-label learning has become a major research focus in artificial intelligence. It is widely used in image classification, multimedia image tagging and text classification. Unlike traditional single-label learning, in which each sample is associated with exactly one label, multi-label learning allows each instance to correspond to a set of labels. Multi-label data usually contains a large number of features, some of which are redundant or noisy, and this leads to the curse of dimensionality during learning. As a result, multi-label learning is harder than single-label learning, and extracting effective features from multi-label data to improve classification performance is an important research problem. At the same time, as data sets grow, obtaining label information becomes very expensive, so how to exploit the geometric structure of the data together with partial label information to improve classification performance also needs to be studied. This thesis analyzes these two problems in depth and proposes several new models and solutions. The research is carried out in the following three aspects:

(1) The traditional metric learning algorithm LMNN (Large Margin Nearest Neighbor) can only learn a metric matrix for single-label data. To address this, a weighted LMNN algorithm is proposed and applied to metric matrix learning on multi-label data. In addition, because LMNN is a linear metric learning method, the metric matrix it learns cannot reflect the local geometric structure of the data. In this thesis, a regularization term is constructed based on the idea of manifold learning and introduced into the weighted LMNN model. The resulting weighted LMNN algorithm for multi-label data inherits the advantages of LMNN: it keeps within-class distances small and between-class distances as large as possible. At the same time, it preserves the local geometric structure of the data as far as possible and improves the robustness of the algorithm when data is insufficient.

(2) In the real world, the relationship between samples is usually nonlinear, and manifold learning is widely used for nonlinear feature extraction. Traditional semi-supervised manifold learning methods are mostly designed for single-label data and are not suitable for multi-label classification. In this thesis, a semi-supervised manifold learning algorithm for multi-label data is proposed. Building on Local Tangent Space Alignment (LTSA), the local distance matrix is reconstructed according to the correlation of the label information of the labeled data, so that local distances between heterogeneous (differently labeled) samples are increased. At the same time, following the idea of semi-supervised manifold learning, the high-dimensional data is projected directly into the label space, so the labels of the unlabeled data can be obtained without a separate classifier.

(3) Traditional manifold learning algorithms such as Sparse Manifold Clustering and Embedding (SMCE) use a single set of local reconstruction weights to describe the local geometric relationships of a sample. For multi-label data, a single set of weights can hardly reflect the true local geometric structure. To address this, a semi-supervised multi-weight preserving embedding algorithm for multi-label data is proposed. For data with c labels, c groups of weights are constructed in the local neighborhood of each sample point, and each group reflects the local geometric structure of that point with respect to one label. The global optimization model of the algorithm is built by preserving these multiple local weights in the low-dimensional space. Finally, following the idea of semi-supervised manifold learning, the label information of the unlabeled data is obtained directly.

Experimental results on multiple data sets verify the effectiveness of the proposed algorithms.
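
The abstract does not give the weighted LMNN objective, so the following is only an illustrative sketch of the general idea in aspect (1): using label-set overlap as a soft "same class" weight under a learned metric matrix M. Everything here is an assumption made for illustration, not the thesis's formulation: the Jaccard overlap as the weight, the function names `jaccard`, `sq_mahalanobis` and `weighted_pull_push`, and the simplified pairwise loss, which stands in for LMNN's triplet-based objective and omits the manifold regularization term.

```python
from typing import List, Sequence


def jaccard(a: Sequence[int], b: Sequence[int]) -> float:
    """Overlap in [0, 1] between two binary label vectors (assumed weight)."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    # Two samples with no labels at all are treated as fully similar.
    return inter / union if union else 1.0


def sq_mahalanobis(x: Sequence[float], y: Sequence[float],
                   M: List[List[float]]) -> float:
    """Squared distance (x - y)^T M (x - y) under metric matrix M."""
    d = [xi - yi for xi, yi in zip(x, y)]
    Md = [sum(M[i][j] * d[j] for j in range(len(d))) for i in range(len(d))]
    return sum(di * mi for di, mi in zip(d, Md))


def weighted_pull_push(X: List[List[float]], Y: List[List[int]],
                       M: List[List[float]], margin: float = 1.0) -> float:
    """Hypothetical pairwise surrogate loss (not the LMNN triplet loss).

    Pairs with overlapping labels are pulled together in proportion to the
    overlap; pairs with little overlap are pushed apart up to `margin`.
    """
    loss = 0.0
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            w = jaccard(Y[i], Y[j])
            d = sq_mahalanobis(X[i], X[j], M)
            loss += w * d + (1.0 - w) * max(0.0, margin - d)
    return loss


if __name__ == "__main__":
    X = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
    Y = [[1, 0], [1, 0], [0, 1]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(weighted_pull_push(X, Y, identity))
```

In single-label data the weight reduces to 0 or 1, which recovers a hard same-class/different-class split; with multiple labels the weight varies continuously, which is one plausible reading of why a weighted variant is needed.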
Keywords/Search Tags:multi-label learning, metric learning, semi-supervised manifold learning, multiple weights