Research On Image Feature Representation Based On Self-supervised Learning

Posted on: 2022-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Peng
Full Text: PDF
GTID: 2518306542463224
Subject: Computer Science and Technology
Abstract/Summary:
Training deep neural networks to learn strong visual features for computer vision tasks requires large-scale labeled data, and collecting and labeling such datasets is costly. To avoid this cost, researchers have proposed self-supervised learning methods that learn visual features from large-scale unlabeled images. A popular approach at present is to design pretext tasks and automatically generate pseudo labels from the properties of those tasks, so that the pseudo labels replace manual labeling in supervising network learning. In recent years, research on pretext tasks in self-supervised learning has focused mainly on three directions: 1) image generation, in which a generated variant of the original image serves as the pseudo label that constrains network learning; 2) image segmentation, in which the image is partitioned into blocks and their order is shuffled, taking the structural information of the object in the image into account, so that a deep convolutional neural network can be trained to predict the shuffled order; 3) geometric transformation, in which basic operations such as rotation, cropping, scaling and translation are applied to the image to form a transformation set, and the network learns high-level image features by predicting the transformation. Unfortunately, these methods cannot consider the color, structure, global information and local information in the image at the same time, so they are limited when the objects in images are similar in appearance or color. This thesis studies the challenge of designing pretext tasks for self-supervised learning, as follows:

(1) To address the incomplete learning of local features in image transformation, this thesis proposes a method based on local and global information. First, the image is rotated by four angles to generate the first pseudo label. Then, to learn more fine-grained color information, the RGB channels are rearranged into the RBG, BGR, BRG, GRB and GBR orders to generate the second pseudo label. To account for the influence of rotation and color together, the two are combined into a unified pseudo label. In addition, traditional feature extraction is applied to the image to form a second kind of pseudo label for learning local information, which helps the network learn the overall structural information effectively. Experimental results show that the proposed method learns both the local and global information of the image well and improves the performance of transfer learning tasks to a certain extent.

(2) Current self-supervised learning methods involve only a single image and lack interactive information among multiple images. To address this, this thesis proposes a self-supervised feature representation learning method based on the collaborative prediction of rotation angles. First, two images are randomly selected from the dataset and rotated counterclockwise by four angles. Then the color channels of the images are reordered. To increase variety, each transformed image is also multiplied by a constant to change its brightness, which yields two transformed images. Finally, the pixels of the two transformed images are added channel by channel and averaged to fuse the two images. The fusion produces a total of 16 transformations, which serve as the final self-supervised signal for a 16-class prediction task during self-supervised training. Experimental results show that the proposed method has better feature learning ability and can improve the recognition accuracy of other tasks through transfer learning.
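The two pretext tasks described in the abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the thesis implementation. Several details are assumptions that the abstract does not state: the four angles are taken to be 0, 90, 180 and 270 degrees; the unchanged RGB order is counted as a sixth channel order, giving 4 x 6 = 24 joint classes for method (1); the 16 classes of method (2) are taken to be the 4 x 4 combinations of the two rotation indices; and images are assumed to be square so that rotated images can be averaged. The names `transform`, `joint_label` and `fuse_pair` are hypothetical.

```python
import numpy as np

# Channel orders named in the abstract, plus the unchanged RGB order
# (assumption: the identity order counts as one of the classes).
CHANNEL_ORDERS = {
    "RGB": (0, 1, 2), "RBG": (0, 2, 1), "BGR": (2, 1, 0),
    "BRG": (2, 0, 1), "GRB": (1, 0, 2), "GBR": (1, 2, 0),
}
ORDER_NAMES = list(CHANNEL_ORDERS)


def transform(img, k, order):
    """Rotate an HxWx3 image counterclockwise by k*90 degrees, then reorder channels."""
    rotated = np.rot90(img, k=k, axes=(0, 1))
    return rotated[..., list(CHANNEL_ORDERS[order])]


def joint_label(k, order):
    """Unified pseudo label for method (1): one class per (rotation, channel order) pair."""
    return k * len(ORDER_NAMES) + ORDER_NAMES.index(order)


def fuse_pair(img_a, img_b, k_a, k_b, order="RGB", scale=1.0):
    """Method (2) sketch: transform two square images, scale brightness, average them.

    The label encodes the pair of rotation indices (assumed 4 x 4 = 16 classes).
    """
    a = transform(img_a, k_a, order).astype(np.float32) * scale
    b = transform(img_b, k_b, order).astype(np.float32) * scale
    fused = (a + b) / 2.0          # channel-wise sum followed by the mean
    label = k_a * 4 + k_b
    return fused, label
```

In a training loop, `transform` and `joint_label` would supply (input, target) pairs for a 24-way classifier under the assumptions above, and `fuse_pair` would do the same for a 16-way classifier.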
Keywords/Search Tags: Self-supervised learning, Transfer learning, Local and global information, Multiple images collaboration