Text Detection And Recognition Of Natural Scene Based On Deep Learning

Posted on: 2021-06-20
Degree: Master
Type: Thesis
Country: China
Candidate: J C Wu
Full Text: PDF
GTID: 2518306311970899
Subject: Circuits and Systems
Abstract/Summary:
With the development of computer vision and artificial intelligence, the demand for rapid extraction of text from natural scenes has increased sharply. Capturing and recognizing this text helps in understanding and analyzing images, whereas reading text in images manually is time-consuming, laborious, and error-prone. Text detection and recognition in natural scene images has therefore become an active research topic, and it is already widely used in practice. For example, photo translation software on mobile devices can photograph text on foreign streets or street signs and translate it into another language in real time, serving as a guide. High-speed monitoring equipment operated by public security organs can photograph and identify the license plate numbers of cars on the expressway. Other applications, such as recognizing business cards, menus, express waybills, certificates, road signs, examination papers, and documents, also have great practical value.

However, most successful existing scene text detection algorithms are based on region proposals. They require a large number of suitable anchor boxes to be set manually in advance, which is tedious and leads to additional post-processing and slow inference. This thesis therefore proposes an anchor-free text detection algorithm based on keypoint detection, which improves training efficiency and inference speed compared with existing text detection algorithms. In addition, existing text recognition networks do not account for the fact that different channels of the feature map contribute differently to character recognition accuracy, recognition accuracy for Chinese text still needs improvement, and the vanishing gradients of RNNs make text recognition training too slow. This thesis therefore also proposes a text recognition algorithm that combines a densely connected convolutional network with a channel attention mechanism and a residual LSTM, which improves recognition accuracy and speeds up the convergence of training. The specific work is as follows:

1. An anchor-free text detection algorithm based on keypoints is proposed. Text detection is converted directly into keypoint detection: each text target is modeled as a point, keypoint estimation is used to find the center of the text by predicting whether each pixel is a text center, and the width and height of the text box are regressed at that center. Compared with text detectors based on candidate boxes, the proposed detection model is simpler and faster. Compared with other complex multi-stage text detection algorithms, it achieves a trade-off between precision and speed.

2. Encoder and decoder structures with a residual parallel dilated convolution (RPDC) module are designed for keypoint estimation. Dilated convolution is introduced to enlarge the receptive field while maintaining the resolution of the feature map. The higher resolution improves localization accuracy for small text, while the enlarged receptive field strengthens global semantic information, which helps to locate large-scale text. In addition, dilated convolutions with different dilation rates are applied in parallel so that each branch has a different receptive field. The larger the dilation rate, the larger the receptive field, so the context of text objects and images can be captured at multiple scales, which improves multi-scale text localization. A residual connection is introduced as an identity mapping, so that the effect of the RPDC module can be eliminated where it is not needed.

3. A text recognition algorithm is proposed that combines a densely connected convolutional network with a channel attention mechanism and a residual LSTM. By attending to the differing importance of channel features and learning the relationships between channels, the model pays more attention to channels that carry more information and suppresses unimportant channel features, which makes the character features more distinctive. At the same time, the dense connections feed the output of each convolutional layer into all subsequent layers, which encourages feature reuse, combines features from multiple layers, and reduces unnecessary computation. An LSTM layer with a residual mechanism is then added: the text image features are converted into a sequence, the contextual relations within the character sequence are exploited, and the residual mechanism speeds up the convergence of network training. Experimental results show that the densely connected convolutional network with channel attention, combined with the residual LSTM, effectively improves text recognition accuracy and speeds up the convergence of network training.
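
The keypoint formulation in the first contribution can be illustrated with a small decoding sketch. The abstract does not give the network, the loss, or the post-processing used in the thesis, so everything here is an assumption made for illustration: the function name `decode_centers`, the 0.5 score threshold, and the 3x3 local-maximum test (a step commonly used in center-point detectors) are hypothetical. The sketch only shows how a per-pixel center score map and two size maps could be turned into text boxes without any anchor boxes.

```python
from typing import List, Tuple

# (x_min, y_min, x_max, y_max, score); all names here are illustrative.
Box = Tuple[float, float, float, float, float]


def decode_centers(
    heatmap: List[List[float]],
    widths: List[List[float]],
    heights: List[List[float]],
    threshold: float = 0.5,  # assumed value, not taken from the thesis
) -> List[Box]:
    """Turn a center score map and per-pixel size maps into boxes."""
    rows, cols = len(heatmap), len(heatmap[0])
    boxes: List[Box] = []
    for y in range(rows):
        for x in range(cols):
            score = heatmap[y][x]
            if score < threshold:
                continue
            # Keep the pixel only if no 3x3 neighbour scores higher
            # (ties are kept, so equal adjacent peaks both survive).
            neighbours = [
                heatmap[ny][nx]
                for ny in range(max(0, y - 1), min(rows, y + 2))
                for nx in range(max(0, x - 1), min(cols, x + 2))
                if (ny, nx) != (y, x)
            ]
            if any(n > score for n in neighbours):
                continue
            # Width and height are read at the center pixel itself.
            w, h = widths[y][x], heights[y][x]
            boxes.append((x - w / 2, y - h / 2, x + w / 2, y + h / 2, score))
    # Highest-scoring centers first.
    return sorted(boxes, key=lambda b: b[4], reverse=True)


# Toy 4x5 score map with two text centers, at (x=1, y=1) and (x=4, y=2).
heatmap = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.2, 0.0, 0.0],
    [0.0, 0.2, 0.1, 0.3, 0.7],
    [0.0, 0.0, 0.0, 0.2, 0.4],
]
widths = [[0.0] * 5 for _ in range(4)]
heights = [[0.0] * 5 for _ in range(4)]
widths[1][1], heights[1][1] = 2.0, 1.0
widths[2][4], heights[2][4] = 1.0, 1.0

boxes = decode_centers(heatmap, widths, heights)
```

Because each box comes from a single peak and its regressed size, this style of decoding needs no anchor configuration; whether the thesis applies further filtering after this step is not stated in the abstract.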
Keywords/Search Tags: keypoint detection, dilated convolution, text recognition, channel attention