
Research And Implementation For Image Recognition Technology In Noisy Label Scenarios

Posted on: 2024-05-31
Degree: Master
Type: Thesis
Country: China
Candidate: W K Chen
GTID: 2568306914458314
Subject: Electronic information
Abstract/Summary:
With the development of deep learning, image recognition based on deep learning has been widely used in real life. The success of deep learning often relies on high-quality, large-scale datasets. In the real world, however, collecting such datasets is often very expensive, and noisy labels inevitably appear during collection. How to perform image recognition on datasets with noisy labels has therefore become a hot research topic in recent years. Among existing methods, a relatively successful approach uses a small-loss strategy to separate noisy samples from clean samples, then treats the noisy samples as unlabeled data and trains the model with semi-supervised learning. However, because the loss distribution of hard samples is similar to that of noisy samples, small-loss sample division often mistakenly discards information-rich hard samples. When semi-supervised learning algorithms generate pseudo-labels, the presence of noisy samples often makes the pseudo-labels unreliable, which further degrades model performance. In addition, conventional noisy-label processing algorithms often cannot handle long-tailed datasets, because they tend to suppress the minority classes. In response to these challenges, the research content and main contributions of this thesis are as follows:
(1) A sample division algorithm incorporating hard sample recognition is proposed. The algorithm divides samples based on their training history, which addresses the heavy damage that traditional sample division algorithms cause to hard samples.
(2) A noisy-label processing algorithm based on sample priors, PGDF (Prior Guided Denoising Framework), is proposed. PGDF optimizes pseudo-label generation by estimating the distribution transition matrix between pseudo-labels and real labels, which addresses the low quality of pseudo-labels in noisy scenarios. Information-rich hard samples are also enhanced to further improve the classification performance of the final model.
(3) A noisy-label processing algorithm for long-tail scenarios, PGDF-LT (PGDF-Long Tail), is proposed. It addresses the failure of noisy-label processing algorithms in long-tail scenarios. On the basis of PGDF, an unsupervised pre-training algorithm is incorporated, and category information is introduced into the sample division module and the final sample weighting to further improve the final classification performance of the model.
The three proposed algorithms are tested on multiple datasets, with multiple noise types and multiple noise ratios. The experimental results show the effectiveness of the proposed methods.
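The small-loss strategy mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: the function name `small_loss_split`, the `keep_ratio` parameter, and the toy loss values are all assumptions made for illustration. The sketch only shows the generic idea of treating the lowest-loss fraction of samples as clean and the rest as noisy.

```python
def small_loss_split(losses, keep_ratio):
    """Generic small-loss split (illustrative, not the thesis's method).

    losses: per-sample loss values from the current model.
    keep_ratio: assumed fraction of samples to treat as clean.
    Returns (clean_indices, noisy_indices), each sorted ascending.
    """
    # Rank sample indices from smallest to largest loss.
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    n_keep = int(round(keep_ratio * len(losses)))
    clean = sorted(order[:n_keep])
    noisy = sorted(order[n_keep:])
    return clean, noisy


# Toy losses (made up): samples 0 and 3 are easy, sample 1 is a hard but
# correctly labeled sample, sample 2 has a wrong label.
toy_losses = [0.1, 1.2, 2.5, 0.2]
clean_idx, noisy_idx = small_loss_split(toy_losses, keep_ratio=0.5)
print(clean_idx, noisy_idx)
```

In this toy case the hard sample (index 1) lands in the noisy set together with the mislabeled one, which is the kind of damage to hard samples that the abstract attributes to division by small loss alone.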
Keywords/Search Tags:noisy label, image recognition, hard sample, semi-supervised learning, long tail scenario