
Recognition Of Scene Text Based On Deep Learning

Posted on: 2020-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: W W Gao
Full Text: PDF
GTID: 2428330575465127
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Text in natural scene images often carries rich, precise high-level semantic information. With the rapid development of the mobile Internet and computer vision, this information has been widely used in geolocation, license plate recognition, and driverless vehicles. Compared with traditional document text, text in natural scenes varies much more in font, size, layout, background, color, brightness, and so on. Because of its superior performance, deep learning has become the main approach in this field.

Text recognition in natural scene images is usually divided into two parts: detection and recognition. The detection part finds the regions of the image that contain text and frames them with bounding boxes. The recognition part then reads the text in the regions found during detection. This thesis combines deep learning techniques to propose a set of methods for text recognition in natural scenes. The main work is as follows:

1) A data set is generated by artificially synthesizing Chinese characters, and the Faster R-CNN model is used to locate the text. Making the synthetic samples closer to samples from real natural scenes, and training the model with a lower learning rate and more iterations, improves the recall and accuracy of text detection.
2) The EAST method is used for text detection in natural scenes, with a data set of pictures crawled from Taobao. The EAST model is based mainly on the fully convolutional idea and performs pixel-by-pixel classification. By taking the angle of the text into account, it can detect slanted and curved text. The EAST model is further improved to achieve higher recall and accuracy.
3) A model based on the attention mechanism is used to recognize text in one Chinese data set and several English data sets. The attention mechanism lets the model focus on the important parts of the input, which improves recognition accuracy.
4) A CNN+LSTM+CTC model is used to recognize text. A larger training set (more than 3.6 million pictures), the long short-term memory of the LSTM, and joint training with the CTC loss together yield a higher recognition rate.
5) An end-to-end Chinese recognition framework based on convolutional neural networks is designed. The framework consists of two parts:
(1) Text localization, based on the improved EAST network. EAST is a detection network designed specifically for text. Compared with other text detection algorithms, EAST not only has sufficiently high localization accuracy but can also detect curved text lines. Its network structure is simpler than that of other text detection methods, and the trained model occupies less memory.
(2) Recognition, based on a convolutional neural network. The base layers are conventional convolutional, pooling, and ReLU layers, followed by a bidirectional LSTM and the CTC loss.
With these two parts, the end-to-end framework achieves state-of-the-art performance.
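
As a rough illustration of the CTC step in the CNN+LSTM+CTC recognizer, the sketch below shows greedy (best-path) CTC decoding: take the highest-scoring label in each time step, merge consecutive repeats, and drop the blank label. It is not taken from the thesis. The function name `ctc_greedy_decode`, the choice of index 0 for the blank, and the toy charset are all assumptions made for this example, and a real system would decode the per-frame outputs of the trained LSTM rather than hand-built scores.

```python
from typing import List, Sequence

BLANK = 0  # assumption: label 0 is the CTC blank; label k maps to charset[k - 1]


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], charset: str) -> str:
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Highest-scoring label index for every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    decoded: List[str] = []
    previous = BLANK
    for label in best_path:
        # Emit a character only when the label is not blank and differs
        # from the label of the previous frame.
        if label != BLANK and label != previous:
            decoded.append(charset[label - 1])
        previous = label
    return "".join(decoded)


# Hypothetical 7-frame output over the charset "cat" (4 labels including blank).
path = [1, 1, 0, 2, 2, 0, 3]
frames = [[1.0 if k == label else 0.0 for k in range(4)] for label in path]
print(ctc_greedy_decode(frames, "cat"))  # prints "cat"
```

The blank label is what lets CTC represent doubled characters: a path such as `t, o, blank, o` decodes to "too", whereas `t, o, o` decodes to "to".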
Keywords/Search Tags:text recognition, LSTM, CTC, convolutional neural network, deep learning