Research On Multi-Label Learning Under The Limitation Of Labeling

Posted on: 2021-05-15  Degree: Master  Type: Thesis
Country: China  Candidate: N C Sun  Full Text: PDF
GTID: 2518306548490824  Subject: Master of Applied Statistics
Abstract/Summary:
With the rapid growth of data collection and label acquisition, it is increasingly common for an instance to be associated with more than one label. This differs from the traditional single-label setting, in which each sample has exactly one label. To handle this setting, a learning paradigm known as multi-label learning has been widely investigated. However, as both the amount of data to be labeled and the number of labels to be acquired grow, it becomes harder to provide a complete label set for every instance. At the same time, experts can be asked to make targeted annotations, and prior knowledge from historical observations, such as label proportion information, is often available. For the settings of limited labels and of known label proportions, this thesis proposes corresponding multi-label learning algorithms. The main work is as follows:

(1) For the case where labeled data is limited, we combine active learning with the ECOC (Error Correcting Output Codes) mechanism and put forward the MAOC (Multi-label Active Learning with Error Correcting Output Codes) algorithm. MAOC uses an ECOC classification model to predict labels and combines two strategies, prediction uncertainty and label base inconsistency, to select the most valuable unlabeled samples. Experts then annotate these selected samples, and the classification model is retrained on the enlarged labeled data set, which improves classification performance and efficiency. Experiments verify the effectiveness of the algorithm.

(2) For the case where the labels of the annotated data are partly missing and a label proportion constraint is available, and considering that label proportions are effective in limiting model flexibility, we propose the IMLLP (Incomplete Multi-label Learning with Label Proportion) algorithm, which is based on prior information about the data labels. IMLLP labels the unlabeled data and trains the classifier simultaneously. The labels of the unlabeled data are reconstructed through label consistency and the label proportion constraint. When the reconstructed complete labels are used to train the classifier, a low-rank constraint and a regularizer are added to improve robustness. The approach handles the problem of labeling unlabeled data more effectively than traditional methods. Experimental results show that the algorithm outperforms the related methods it is compared with.
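The sample-selection idea in (1) can be illustrated with a minimal Python sketch of uncertainty-based querying for multi-label data. This is not the MAOC algorithm: the ECOC model and the label base inconsistency strategy are not reproduced, and the function name `select_most_uncertain`, the closeness-to-0.5 score, and the toy probabilities are all assumptions made for illustration.

```python
import numpy as np

def select_most_uncertain(probs, k):
    """Return indices of the k samples with the highest uncertainty.

    probs: array of shape (n_samples, n_labels) holding per-label
    probabilities from any multi-label classifier (hypothetical input).
    The score is the mean closeness of each probability to 0.5, an
    illustrative stand-in and not the score used by MAOC.
    """
    probs = np.asarray(probs, dtype=float)
    # 1 when p == 0.5 (most uncertain), 0 when p is 0 or 1 (most confident).
    per_label = 1.0 - 2.0 * np.abs(probs - 0.5)
    scores = per_label.mean(axis=1)
    # Stable sort on negated scores so that ties keep their original order.
    order = np.argsort(-scores, kind="stable")
    return order[:k].tolist()

# Toy predictions for three unlabeled samples and three labels.
probs = np.array([
    [0.95, 0.05, 0.90],  # confident on every label
    [0.50, 0.45, 0.60],  # uncertain on every label
    [0.10, 0.80, 0.55],  # mixed
])
queried = select_most_uncertain(probs, k=2)
print(queried)
```

In an active learning loop, the returned indices would be sent to an expert for annotation, the newly labeled samples added to the training set, and the classifier retrained before the next round of selection.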
Keywords/Search Tags:Multi-label, Classifications, Active Learning, Missing Labels