| In multi-label learning, an instance can be associated with multiple labels at once. This one-to-many representation describes the diversity of real-world objects more comprehensively than a single label can. The main task of multi-label learning is to assign a suitable label set to each unseen instance. Previous work on multi-label learning usually assumes that the features and labels in the training data set are complete. In practical applications, however, this assumption cannot be guaranteed: the features or labels of some examples may be missing. Highly incomplete data poses a major challenge for multi-label learning. A classification model built on such incomplete data performs considerably worse, which lowers the accuracy of the labels predicted for unseen instances and makes it harder to distinguish ambiguous objects. Most existing multi-label learning algorithms for missing data address only missing features or only missing labels, and rarely consider both at the same time. This thesis therefore proposes a novel multi-label learning algorithm named MMFL (Multi-label learning with Missing Features and Labels), which handles missing features and missing labels simultaneously. MMFL integrates the recovery of missing features and labels with the construction of the classification model in a unified framework, so as to reduce the performance degradation caused by the missing values. First, to address simultaneously missing features and labels, we recover the missing values of both by matrix factorization and then learn a classification model from the latent feature space to the latent label space. Second, to overcome the problem of tail labels in matrix factorization, we decompose the original label matrix into a low-rank matrix and a tail-label matrix and build an extra classifier for the sparse tail labels. In addition, manifold regularization is used to preserve the manifold structures of instance similarity and label correlation, which further improves the performance of the model. The effectiveness of the proposed method is verified by comparison with state-of-the-art approaches on nine multi-label data sets. |
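
The recovery step mentioned in the abstract, filling in missing entries by matrix factorization, can be illustrated with a generic sketch. This is not the MMFL algorithm: it omits the latent-space classifier, the tail-label decomposition, and the manifold regularization, and it uses plain regularized alternating least squares as a stand-in technique. The function name `masked_als` and the parameters `rank`, `lam`, and `n_iters` are assumptions made for this illustration only.

```python
import numpy as np


def masked_als(X, mask, rank=2, lam=1e-3, n_iters=50, seed=0):
    """Fit X ~ U @ V.T using only the entries where mask is True.

    Illustrative stand-in (hypothetical helper, not from the thesis):
    regularized alternating least squares on the observed entries.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    U = rng.standard_normal((n, rank))
    V = rng.standard_normal((d, rank))
    reg = lam * np.eye(rank)
    for _ in range(n_iters):
        # Update each row factor from the columns observed in that row.
        for i in range(n):
            obs = mask[i]
            V_obs = V[obs]
            U[i] = np.linalg.solve(V_obs.T @ V_obs + reg, V_obs.T @ X[i, obs])
        # Update each column factor from the rows observed in that column.
        for j in range(d):
            obs = mask[:, j]
            U_obs = U[obs]
            V[j] = np.linalg.solve(U_obs.T @ U_obs + reg, U_obs.T @ X[obs, j])
    return U, V


# Synthetic example: a rank-2 matrix with roughly 30% of entries hidden.
rng = np.random.default_rng(1)
X_true = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 10))
mask = rng.random(X_true.shape) < 0.7   # True where an entry is observed
U, V = masked_als(X_true, mask, rank=2)
X_hat = U @ V.T                         # estimates for all entries, observed or not
X_filled = np.where(mask, X_true, X_hat)  # keep observed values, fill the rest
```

The same routine could in principle be applied to a label matrix with unobserved entries; how MMFL couples the two factorizations with the classifier is described in the thesis itself and is not reproduced here.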