Research And Application Of Scene Text Recognition Based On Deep Learning

Posted on: 2022-05-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Huang
GTID: 2518306332467624
Subject: Computer Science and Technology
Abstract/Summary:
Text recognition is one of the most active directions in computer vision. Researchers have achieved many notable results, and the technology is widely used in real scenes to make daily life more convenient. Although traditional OCR technology achieves high accuracy on document text, text recognition in natural scenes remains challenging because of complex backgrounds and diverse text, and it still requires continued exploration and improvement. This thesis therefore takes scene text recognition as its main research subject and proposes an improved method aimed at two major existing difficulties. For the application, the thesis takes menus as the target scenario, designs a complete recognition pipeline, and implements a system based on it that provides users with text recognition and translation services for Chinese and English. The main work is as follows:

(1) To address irregular text and attention drift, this thesis proposes a deep-learning-based scene text recognition algorithm named DMDAN. First, deformable convolution is applied to improve the model's adaptability to irregular text. Then, a hybrid domain attention mechanism is adopted in the encoder and a self-attention mechanism in the decoder, which effectively alleviates attention drift. Finally, center loss is used to reduce the intra-class distance so that the features of each character are easier to recognize. Comparative experiments confirm that the model's performance is clearly improved.

(2) A domain-oriented scene text recognition framework is built. The framework first uses a VGGNET-16 model for text direction detection and rotates the image to the horizontal orientation. Next, a CTPN model detects the text and locates the text regions in the image. The DMDAN model then performs text recognition to extract the text from the image. Finally, a bidirectional decoder and a dropout mechanism are introduced into a seq2seq model for text post-processing, which detects and corrects word errors in the recognized text. In particular, a dataset for the post-processing stage was constructed manually to meet the needs of the application scenario.

(3) Based on the recognition pipeline above, this thesis designs and implements a text recognition and translation applet for Chinese and English menus, which verifies the usefulness and correctness of the research.
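
The center loss mentioned in item (1) can be illustrated with a minimal sketch. It uses the standard formulation, half the squared distance between each feature vector and the center of its class, averaged over the batch. The abstract does not give the exact form used in DMDAN, so the function name `center_loss`, the batch averaging, and the plain-list data layout are all assumptions made for illustration.

```python
def center_loss(features, labels, centers):
    """Standard center loss: 0.5 * mean squared distance to the class center.

    features: list of feature vectors (lists of floats), one per character
    labels:   list of class ids, one per feature vector
    centers:  dict mapping class id -> center vector (assumed to be given here;
              in training the centers would be updated alongside the model)
    """
    total = 0.0
    for x, y in zip(features, labels):
        c = centers[y]
        # Squared Euclidean distance between the feature and its class center.
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    # Averaging over the batch is an illustrative choice, not taken from the thesis.
    return 0.5 * total / len(features)


# Hypothetical two-character batch: the first feature sits on its center,
# the second is 2 units away from its center along one axis.
features = [[1.0, 2.0], [3.0, 4.0]]
labels = [0, 1]
centers = {0: [1.0, 2.0], 1: [3.0, 2.0]}
loss = center_loss(features, labels, centers)
```

Minimizing this term pulls the features of the same character class toward a shared center, which is the "reduce the intra-class distance" effect described above. In practice it would typically be added to the recognition loss with a weighting factor.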
Keywords/Search Tags: scene text recognition, scene text detection, text post-processing, WeChat applet