
Research On Self-taught Learning Methods For Low-quality Images

Posted on: 2024-01-11 | Degree: Master | Type: Thesis
Country: China | Candidate: J C Ye | Full Text: PDF
GTID: 2568307136997319 | Subject: Electronic information
Abstract/Summary:
Self-taught learning is an unsupervised approach that uses unlabelled data to learn abstract features that are common across tasks and generalise well, enabling unsupervised model pre-training based on feature learning. Applying the pre-trained feature extractor to a supervised task can significantly reduce the demand for labelled samples in downstream models such as classifiers. Taking image classification as an example, self-taught learning mainly uses image reconstruction to train the classification backbone network in order to improve its generalisation ability. The advantages of the self-taught learning approach are its simple implementation and the fast convergence of model pre-training.

Low-quality images with severe distortion are a challenging problem in computer vision: they lack key information and are difficult for computers to analyse and understand accurately. Mixing low-quality images into self-taught learning causes serious interference in the auto-encoding representation phase, so deep models fail to learn generalised, abstract representations of the target, which in turn severely degrades the accuracy of downstream vision tasks. Improving the robustness of the auto-encoding representation in the absence of key information is therefore the key to overcoming this interference. Although current cutting-edge mask-based self-taught learning methods can significantly improve model generalisation, they still suffer from low-quality images, because the fixed masks they use inevitably cover invalid regions that contain no information, increasing the uncertainty of image reconstruction.

This thesis provides an in-depth theoretical study of these challenges; its innovations comprise the following two aspects:

(1) An interleaved auto-encoder is proposed to address the deficiency that low-quality images seriously interfere with the robustness and generalisation of auto-encoding representations in self-taught learning. The proposed encoder combines a convolutional neural network and a Transformer into an organic whole through a Residual Connection Unit (RCU) design. This combination effectively mitigates gradient vanishing and explosion while aggregating residual saliency information. In addition, to avoid feature collapse, short (skip) connections are added to each Transformer representation layer to achieve complementary transfer of residual information across layers, which helps to explore the latent association between local attention and global self-attention. Extensive experiments show that the proposed interleaved auto-encoder maintains high generalisation performance even when critical information is masked.
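To make the interleaved design concrete, the following is a minimal, hypothetical PyTorch sketch of one encoder block: a convolutional (local) branch and a self-attention (global) branch fused by an RCU, with a short connection around the whole block. The gated additive fusion, layer sizes, and use of a depthwise convolution as the local branch are assumptions for illustration; the thesis does not specify its exact architecture.

```python
# Hypothetical sketch of an "interleaved" encoder block in the spirit of the
# thesis: a CNN branch and a Transformer branch fused through a Residual
# Connection Unit (RCU), plus a short (skip) connection around the block.
import torch
import torch.nn as nn

class ResidualConnectionUnit(nn.Module):
    """Fuses local CNN features with global self-attention features.

    The learnable-gate fusion is an assumption; the thesis only states that
    the RCU aggregates residual saliency information.
    """
    def __init__(self, dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.gate = nn.Parameter(torch.zeros(1))  # start close to identity

    def forward(self, local_feat, global_feat):
        # Residual fusion: keep the local path intact, add gated global context.
        return self.norm(local_feat + self.gate * global_feat)

class InterleavedBlock(nn.Module):
    def __init__(self, dim, num_heads=4):
        super().__init__()
        # Local branch: depthwise conv over the token sequence (local attention proxy).
        self.local = nn.Conv1d(dim, dim, kernel_size=3, padding=1, groups=dim)
        # Global branch: standard multi-head self-attention.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.rcu = ResidualConnectionUnit(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):                      # x: (batch, tokens, dim)
        local = self.local(x.transpose(1, 2)).transpose(1, 2)
        global_, _ = self.attn(x, x, x, need_weights=False)
        fused = self.rcu(local, global_)
        # Short connection around the whole block mitigates vanishing/exploding
        # gradients and lets residual information cross layers.
        return x + self.mlp(self.norm(fused))

x = torch.randn(2, 196, 256)                   # e.g. 14x14 patch tokens
print(InterleavedBlock(256)(x).shape)          # torch.Size([2, 196, 256])
```

The short connection in the last line is what lets each layer pass residual information forward unchanged, which is the stated mechanism for coupling local attention with global self-attention.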
(2) An in-depth analysis of why low-quality images degrade the performance of masked self-taught learning is presented, and a self-taught learning method based on progressive masks is proposed. The method implements progressive attention masking through a Progressive Attention Masked Module: during encoder training, the still-unmasked, salient features of the image are continuously masked according to the training state at each stage, forcing the encoder to mine deeper latent semantic representations that can still be exploited. In addition, a Decoding Knowledge Distillation Module is designed in the auto-encoder to build a self-distillation model across the encoding and decoding sides, treating the decoding side as the teacher and guiding the encoding side to make better use of its information; hedged sketches of both ideas follow below. Extensive experiments show that the method outperforms traditional methods, with a 5% accuracy gain in linear evaluation on the STL-10 dataset and a 3% accuracy gain in domain transfer on CIFAR-10 and CIFAR-100.
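A minimal sketch of the progressive-masking idea, assuming an attention-derived saliency score per patch token and a linear mask-ratio schedule (both assumptions; the thesis only states that salient, still-unmasked features are masked according to the training state):

```python
# Hedged sketch of progressive attention masking: at each training stage the
# most salient patches that are still visible get masked, so the mask ratio
# grows with training progress.
import torch

def progressive_mask(saliency, step, total_steps, start_ratio=0.3, end_ratio=0.75):
    """saliency: (batch, tokens) attention-derived importance scores.
    Returns a boolean mask, True = masked (hidden from the encoder)."""
    # Ratio schedule: mask more as training progresses (assumed linear).
    ratio = start_ratio + (end_ratio - start_ratio) * step / total_steps
    num_masked = int(saliency.size(1) * ratio)
    # Mask the top-k most salient tokens: the encoder loses exactly the
    # "easy" evidence and must mine deeper semantic structure.
    idx = saliency.topk(num_masked, dim=1).indices
    mask = torch.zeros_like(saliency, dtype=torch.bool)
    mask.scatter_(1, idx, True)
    return mask

saliency = torch.rand(2, 196)                        # 14x14 patch tokens
early = progressive_mask(saliency, step=0, total_steps=100)
late = progressive_mask(saliency, step=100, total_steps=100)
print(early.sum(dim=1), late.sum(dim=1))             # ~58 masked early, 147 late
```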
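For the decoding-side self-distillation, one plausible reading is a feature-matching loss in which the decoder's intermediate features act as a detached teacher for the encoder's features, added to the usual reconstruction loss. The cosine-similarity objective, the stop-gradient on the teacher, and the loss weight alpha are all assumptions, not the thesis's stated formulation:

```python
# Hedged sketch of decoding-side self-distillation: decoder features (teacher)
# guide encoder features (student) alongside the reconstruction objective.
import torch
import torch.nn.functional as F

def self_distillation_loss(enc_feat, dec_feat, recon, target, alpha=0.5):
    """enc_feat, dec_feat: (batch, tokens, dim); recon, target: images."""
    # Teacher signal comes from the decoder and is not back-propagated into.
    teacher = dec_feat.detach()
    distill = 1 - F.cosine_similarity(enc_feat, teacher, dim=-1).mean()
    recon_loss = F.mse_loss(recon, target)
    return recon_loss + alpha * distill

enc = torch.randn(2, 196, 256, requires_grad=True)
dec = torch.randn(2, 196, 256)
img = torch.randn(2, 3, 224, 224)
loss = self_distillation_loss(enc, dec, img + 0.1 * torch.randn_like(img), img)
loss.backward()                                      # gradients flow to the encoder only
print(loss.item())
```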
Keywords/Search Tags: Self-taught Learning, Residual Connection Units, Skip Connections, Knowledge Distillation, Attention Masks