Scene Text Detection And Recognition Based On Deep Learning

Posted on: 2021-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: J L Wen
Full Text: PDF
GTID: 2428330620464270
Subject: Engineering
Abstract/Summary:
Text detection and recognition is an important research direction in computer vision and image processing. Text that appears in pictures of natural scenes is called scene text, and its detection and recognition is a current research hotspot. The task has great research and application value because text carries much of the high-level semantic information contained in an image. In real life, scene text detection and recognition has a wide range of applications, such as autonomous driving, license plate recognition, intelligent navigation, and unmanned supermarkets. Research in this area has been carried out for many years and has produced many results. However, text in natural scenes is affected by its background, scale, font, text box shape, orientation, and picture quality, all of which make detection and recognition difficult. To date, the detection and recognition of irregularly shaped scene text and of text instances that lie close to one another remain active research topics. In recent years, deep learning and neural networks have achieved great success in computer vision and image processing. This thesis studies scene text detection and scene text recognition separately, using deep learning for both. The main work and innovations of the thesis are as follows.

1. The scene text detection network first preprocesses the scene text pictures by enhancing the images and reducing their noise. Preprocessing allows the network to extract better features from the picture, which helps the subsequent detection.

2. The detection network uses a residual network with a feature pyramid network (FPN) structure to extract and fuse the image features. The feature map is then classified by multiple convolutional layers to obtain several predictions, at different scales, for the same text instance. The smallest prediction is gradually expanded to the largest prediction, which separates adjacent text instances well and yields the final detection result for the text in the picture.

3. The recognition network rectifies irregular scene text. First, the text instance is located. Then the thin-plate spline (TPS) transformation parameters are computed, and for each pixel of the rectified picture the corresponding position in the original picture is found. Finally, the rectified image is generated.

4. The recognition network is built on a sequence-to-sequence model. A deep residual network extracts image features and converts them into a sequence. The sequence is then encoded by a BiLSTM. Finally, a bidirectional LSTM decoder based on an attention mechanism produces the recognition result.

This thesis constructs two networks, one for scene text detection and one for scene text recognition, and demonstrates their performance through experimental analysis.
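The gradual expansion from the smallest prediction to the largest, described in item 2, can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes the predictions are binary masks ordered from smallest to largest, uses 4-connectivity, and resolves contested pixels on a first-come, first-served basis. The function names `label_components`, `expand`, and `progressive_expansion` are hypothetical.

```python
from collections import deque

# 4-connected neighbourhood offsets (an assumption of this sketch).
NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def label_components(mask):
    """Label 4-connected components of a binary mask; 0 means background."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and labels[y][x] == 0:
                next_label += 1
                labels[y][x] = next_label
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels


def expand(labels, mask):
    """Grow labelled regions into a larger mask, breadth-first.

    A pixel reachable from two regions keeps the label that reaches it first,
    so neighbouring text instances stay separate.
    """
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in labels]
    queue = deque((y, x) for y in range(h) for x in range(w) if out[y][x])
    while queue:
        cy, cx = queue.popleft()
        for dy, dx in NEIGHBOURS:
            ny, nx = cy + dy, cx + dx
            if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and out[ny][nx] == 0:
                out[ny][nx] = out[cy][cx]
                queue.append((ny, nx))
    return out


def progressive_expansion(kernels):
    """kernels: list of binary masks ordered from smallest to largest."""
    labels = label_components(kernels[0])
    for mask in kernels[1:]:
        labels = expand(labels, mask)
    return labels


if __name__ == "__main__":
    smallest = [[1, 0, 0, 0, 1]]   # two well-separated text cores
    largest = [[1, 1, 1, 1, 1]]    # the full regions touch each other
    print(progressive_expansion([smallest, largest]))
```

In the usage example the largest mask alone would form a single connected region, whereas seeding the labels from the smallest mask keeps the two instances apart, which is the splitting effect the abstract describes.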
Keywords/Search Tags:Text detection, Text recognition, Recurrent neural network, Convolutional neural network