Research On Image Text Recognition Based On Attention Mechanism

Posted on: 2022-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: M H Diao
GTID: 2518306485462334
Subject: Computer Science and Technology
Abstract/Summary:
With the development of the Internet, the world is steadily becoming informationized and digitized. The construction of network platforms of all kinds means that more data and information is transmitted every day, and this information is expressed not only as text but also as images. Recognizing the characters in images is needed to keep the network healthy and safe and to prevent pornographic, violent, and reactionary language from spreading in image form, so it has both theoretical and practical significance for promoting network development and maintaining network security.

In the field of image character recognition, traditional methods are suited to printed text with few characters and standard fonts, and their recognition speed is slow. With the rapid development of deep learning, many researchers have applied deep learning to image character recognition and achieved breakthrough progress. At present, the mainstream deep learning models for this task include the Convolutional Neural Network (CNN), the Recurrent Neural Network (RNN), and CRNN. A CNN can effectively extract feature information from the image. An RNN can learn the correlation information between image pixels. CRNN combines the advantages of CNN and LSTM networks to recognize text as a whole. Although these three models each have their own advantages, they also have their own defects. For instance, a CNN lacks the ability to learn associations between pixels that are far apart in the image. To address the problems of these models, this thesis proposes an image character recognition model based on the attention mechanism.

Based on an analysis of image character recognition methods and deep learning network models, this thesis studies in depth how to recognize image text accurately and effectively and how to build image character recognition models reasonably. The main research work is as follows:

1. Improve the image feature extraction method

At present, image character recognition models built on a CNN+Seq2Seq (Sequence-to-Sequence) framework suffer from too many parameters and long training times. The Multi-Head Attention mechanism can capture long-distance dependencies and is fast, which can ease the problem of slow training. On this basis, this thesis proposes a method that improves the CNN+Seq2Seq image character recognition model with the Multi-Head Attention mechanism. The improved model uses Multi-Head Attention to extract features for image text generation. It shortens training time and improves training efficiency without degrading image feature extraction.

2. Introduce the semantic information of image text

Encoder-decoder (codec) based image character recognition models can handle image characters with perspective distortion and curved shapes, but they perform poorly on blurred images, unevenly illuminated images, and incomplete characters. The reason is that low-quality images cannot provide high-quality visual information to the recognition model. Image text differs from an ordinary image: it contains not only visual information but also abundant semantic information. This thesis constructs a semantic extraction module, consisting of semantic extraction and semantic correction, to extract semantic information from images. An interactive attention mechanism then makes the visual features and the semantic features interact, in order to extract the correlation information between them and to strengthen the role of the semantic features in the recognition model.

3. Introduce the timing sampling mechanism

In a sequence-to-sequence model, the decoder input during training is the true sequence token, whereas during testing it is the value predicted at the previous time step. This thesis applies timing sampling to the encoder-decoder. To reduce this mismatch between training and testing inputs, the improved model selects, with a certain probability during training, either the true sequence token or the value computed at the previous time step as the decoder input.
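
The input selection described in the third item can be illustrated with a minimal sketch. It assumes that "timing sampling" refers to the technique commonly called scheduled sampling. The abstract does not give the selection probability or how it changes over training, so `p_truth` is a free parameter here. All names (`decode_with_sampling`, `predict_next`, `toy_predict`, `p_truth`) are hypothetical and are not taken from the thesis; `toy_predict` stands in for a real decoder step.

```python
import random
from typing import Callable, List, Sequence


def toy_predict(token: str) -> str:
    """Hypothetical stand-in for one decoder step; always predicts '?'."""
    return "?"


def decode_with_sampling(
    targets: Sequence[str],
    predict_next: Callable[[str], str],
    p_truth: float,
    rng: random.Random,
    start_token: str = "<s>",
) -> List[str]:
    """Return the sequence of inputs fed to the decoder in one training pass.

    After the first step, the input is the true previous token with
    probability p_truth, otherwise the token predicted at the previous step.
    """
    inputs: List[str] = []
    prev_truth = start_token
    prev_pred = start_token
    for t, target in enumerate(targets):
        if t == 0 or rng.random() < p_truth:
            step_input = prev_truth  # feed the true previous token
        else:
            step_input = prev_pred   # feed the model's own previous output
        inputs.append(step_input)
        prev_pred = predict_next(step_input)  # assumed model call
        prev_truth = target
    return inputs


if __name__ == "__main__":
    rng = random.Random(0)
    # With p_truth=1.0 this is plain teacher forcing; with 0.0 the decoder
    # only ever sees its own predictions, as it would at test time.
    print(decode_with_sampling(["c", "a", "t"], toy_predict, 0.5, rng))
```

Setting `p_truth` to 1.0 reproduces the usual training setup, and lowering it exposes the decoder to its own predictions during training. That exposure is the mismatch the abstract says the mechanism is meant to reduce.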
Keywords/Search Tags: text recognition, codec, attention mechanism, timing sampling