The Research On Noisy Label Problems Based On Label Distribution

Posted on: 2022-07-03 | Degree: Master | Type: Thesis
Country: China | Candidate: Y P Liu | Full Text: PDF
GTID: 2518306740982819 | Subject: Software engineering
Abstract/Summary:
Supervised learning is a widespread machine learning paradigm, and supervised learning algorithms can effectively learn the mapping between the feature space and the label space from training data. However, when the training data contain wrong labels, supervised models learn wrong mappings and their generalization performance degrades. Such problems are called noisy label problems, and the wrong labels are termed noisy labels. Label ambiguity is an important cause of noisy label problems: instances with ambiguous features are more likely to be assigned wrong labels. Label distribution (LD) is a new labeling format. Because LD assigns a continuous description degree to each label, it is naturally suited to handling label ambiguity. In noisy label problems, LD records the description degrees of all classes, which is valuable for locating noisy labels and mining the ground-truth labels. Inspired by these advantages, this thesis proposes LD-based solutions to noisy label problems. The core strategy is to disambiguate the label set with LD.

Labels can be represented in binary format: 1 denotes a relevant label and 0 denotes an irrelevant label. Under this representation, label noise can be divided into two types. In the first type, irrelevant labels are converted to relevant labels. In the second type, relevant labels are converted to irrelevant labels. Combinations of the two types give rise to the diversity of noisy label problems.

This thesis first investigates a noisy label problem with only one type of label noise, namely the Partial Multi-Label Learning (PML) problem. In the PML setting, each training instance is assigned a candidate label set that contains noisy labels in addition to the ground-truth labels. The thesis uses the LD-based disambiguation strategy to distinguish ground-truth labels from noisy labels in the candidate label set. Experiments verify the effectiveness of LD for solving PML problems. Next, the thesis investigates the problem of Learning with Noisy Labels (LNL), which involves both types of label noise. In the LNL setting, irrelevant labels are converted to relevant labels while the original relevant labels become irrelevant. This corruption process yields a more complicated noisy label problem. To deal with it, an LD-based label confidence generation algorithm is proposed to measure the quality of each label, and the resulting confidence values are used to improve model training. Together, these two parts of the research verify the feasibility and scalability of applying LD to noisy label problems.

This thesis consists of five chapters. The first chapter introduces the research background and the content of the thesis. The second chapter gives a detailed introduction to the definition of label distributions and to related work. The third chapter presents the research on partial multi-label learning based on label distributions. The fourth chapter presents the application of label distributions to the LNL problem. The last chapter summarizes the thesis and outlines future work.
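The binary label format, the two noise types, and the idea of disambiguating a candidate label set with description degrees can be illustrated with a small Python sketch. This is not the algorithm proposed in the thesis: the helper names (`flip_labels`, `normalize`, `disambiguate`), the example scores, and the simple threshold rule are all assumptions chosen for illustration, and the LNL case is shown on a multi-label vector only for convenience.

```python
# Illustrative sketch only. All names, scores, and the threshold rule are
# hypothetical; this is not the method proposed in the thesis.

def flip_labels(labels, to_relevant=(), to_irrelevant=()):
    """Return a copy of a binary label vector with the given indices corrupted."""
    noisy = list(labels)
    for i in to_relevant:      # type 1 noise: irrelevant (0) -> relevant (1)
        noisy[i] = 1
    for i in to_irrelevant:    # type 2 noise: relevant (1) -> irrelevant (0)
        noisy[i] = 0
    return noisy


def normalize(scores):
    """Turn non-negative scores into description degrees that sum to 1."""
    total = sum(scores)
    return [s / total for s in scores]


def disambiguate(candidates, distribution, threshold):
    """Keep candidate labels whose description degree reaches the threshold."""
    return [int(c == 1 and d >= threshold)
            for c, d in zip(candidates, distribution)]


truth = [1, 0, 1, 0]  # ground-truth labels in binary format

# PML-style candidate set: only type 1 noise, so it still covers the truth.
candidates = flip_labels(truth, to_relevant=[1])

# LNL-style corruption: both noise types occur together.
lnl_noisy = flip_labels(truth, to_relevant=[3], to_irrelevant=[0])

# Hypothetical description degrees, e.g. the output of a label enhancement step.
distribution = normalize([4.0, 1.0, 4.0, 1.0])

recovered = disambiguate(candidates, distribution, threshold=0.25)
print(candidates, lnl_noisy, distribution, recovered)
```

In this toy case the continuous description degrees carry more information than the binary candidate set: the noisy candidate label receives a low degree and is filtered out, which is the intuition behind using LD for disambiguation.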
Keywords/Search Tags: noisy label, label distribution, label enhancement, partial multi-label learning, learning with noisy labels