Research On Deep Learning Based Text Detection And Recognition

Posted on: 2021-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: C Chen
Full Text: PDF
GTID: 2428330611467293
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development and popularization of the Internet and the Internet of Things in China, huge amounts of image data are generated every day. An image contains not only low-level information such as shape and color, but also high-level semantic information such as text, which plays an indispensable role in analyzing and using the image. However, against the complex backgrounds of natural scenes, detecting and recognizing multi-scale, multi-directional text remains difficult. This thesis studies text detection and recognition methods based on deep learning. The specific work and research results are as follows:

(1) A novel text detection model combining a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) is proposed. The model exploits the feature pyramid of the Feature Pyramid Network (FPN) to extract features at different scales, so that it can adapt to multi-scale text. To make use of the contextual features of the text, a Bi-directional Long Short-Term Memory network (Bi-LSTM) is used to generate a series of text proposals. Finally, these text proposals are connected by a redesigned text connector. The model adapts to multi-directional and multi-scale scene text detection and achieves satisfactory results on multiple public datasets.

(2) To address the shortcomings of text recognition in natural scenes, a highly adaptable text recognition model combining a Convolutional Recurrent Neural Network (CRNN) and Connectionist Temporal Classification (CTC) is designed. The model uses a CNN for feature extraction, then uses a Bi-LSTM for encoding and decoding to generate feature sequences, and finally uses CTC for mapping, so that it can output text lines of any length. The model has been verified on multiple public datasets.

(3) To meet the document recognition needs of the logistics industry, the text detection and recognition models are used to build a document recognition system. The system comprises three stages: pre-processing, recognition, and post-processing. Pre-processing includes steps such as document correction and text detection, which extract the text regions to be recognized. After the text recognition model has processed these regions, post-processing checks and corrects the recognition results. Tests on real documents verify the effectiveness of the document recognition system.
Keywords/Search Tags: Text detection, Text recognition, Scene text, Document recognition system
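
The abstract says that CTC maps the per-frame Bi-LSTM output to a text line of any length, but it does not say how that output is decoded. The sketch below illustrates one common choice, greedy (best-path) decoding: take the best label for each frame, collapse consecutive repeats, then drop the blank label. Everything in it is an assumption made for illustration and is not taken from the thesis: the function names, the toy alphabet, the use of index 0 as the blank, and the hand-written frame labels.

```python
from itertools import groupby

# Hypothetical alphabet for illustration: index 0 is the CTC blank symbol.
ALPHABET = "-abcdefghijklmnopqrstuvwxyz"
BLANK = 0


def best_path(frame_scores):
    """Pick the highest-scoring label index for each frame (argmax per row)."""
    return [max(range(len(row)), key=row.__getitem__) for row in frame_scores]


def ctc_greedy_decode(frame_labels, blank=BLANK):
    """Collapse consecutive repeated labels, then remove blanks."""
    collapsed = [label for label, _ in groupby(frame_labels)]
    return [label for label in collapsed if label != blank]


def labels_to_text(labels):
    """Map label indices back to characters of the toy alphabet."""
    return "".join(ALPHABET[i] for i in labels)


# Nine frames spelling "cat": c c - a a - - t t
cat_frames = [3, 3, 0, 1, 1, 0, 0, 20, 20]
cat_text = labels_to_text(ctc_greedy_decode(cat_frames))

# A blank between the two l's keeps the double letter in "hello".
hello_frames = [8, 5, 12, 0, 12, 15]
hello_text = labels_to_text(ctc_greedy_decode(hello_frames))

# Without the blank, the repeated l collapses into one.
helo_text = labels_to_text(ctc_greedy_decode([8, 5, 12, 12, 15]))

# Per-frame scores over a 3-symbol toy alphabet (blank, a, b).
toy_scores = [[0.1, 0.8, 0.1], [0.2, 0.7, 0.1], [0.9, 0.05, 0.05], [0.1, 0.1, 0.8]]
toy_labels = best_path(toy_scores)
```

Because repeats are collapsed before blanks are removed, the number of output characters is independent of the number of input frames, which is what allows a model of this kind to emit text lines of varying length. A beam search over the frame scores is a frequent alternative to the greedy path shown here.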