Research On Text Detection And Recognition In Natural Scenes

Posted on: 2021-05-05  Degree: Master  Type: Thesis
Country: China  Candidate: B Y Zhang  Full Text: PDF
GTID: 2428330602971285  Subject: Computer technology
Abstract/Summary:
As an important branch of computer vision, text reading in natural scenes has been one of the most active research areas in deep-learning-based computer vision and has been studied extensively over the past decade. Driven by many real-world applications, it has practical significance for assistive systems for the blind, intelligent transportation systems, driverless navigation systems, and similar uses. Because scene text is diverse and backgrounds are complex, scene text detection and recognition still face many challenges. Reading text in natural scenes is divided into two stages: text detection and text recognition. The task is not equivalent to optical character recognition (OCR); the two differ considerably in detection difficulty and in the recognition accuracy required. In particular, extracting road sign information for driverless navigation demands extremely high recognition accuracy, which conventional OCR technology struggles to achieve. Current scene text detection and recognition methods are mostly based on deep learning, which generalizes to complex scenes better than traditional methods. This thesis studies text detection and recognition in natural scenes in depth and proposes a multi-directional text detection algorithm based on YOLOv3 and a variable-length character recognition method based on CRNN. The specific research contents are as follows:

(1) This thesis proposes a scene text image preprocessing method, a long-text sub-dataset for long text detection, and a large-scale synthetic Chinese character dataset for Chinese character recognition. PCA is used to reduce the dimensionality of the images, and an improved median-filter denoising method is proposed. The long-text sub-dataset is built from the RCTW-17 dataset, and the synthetic Chinese character dataset, which imitates natural scene backgrounds, is built for text recognition training.

(2) To handle the multi-directionality of scene text, this thesis proposes an improved scene text detection algorithm based on the YOLOv3 object detection algorithm. The network structure is redesigned from YOLOv3, and feature fusion after convolution with elongated kernels is added so that feature extraction fits the shape of text regions. A rotating filter then replaces the ordinary filter to extract rotation information from the feature map. The prior (anchor) box sizes are also redesigned to suit long text detection. Finally, text box regression with coordinate compensation yields multi-directional text detection.

(3) This thesis proposes a variable-length character recognition method for Chinese and English based on a convolutional recurrent neural network. Building on CRNN, the feature extraction network is improved for Chinese characters, and downsampling is performed by replacing the pooling layers with convolutional layers with a stride of 2. A two-layer LSTM model is used to predict the text sequence and to extract more detailed Chinese character feature maps. An attention mechanism is introduced to carry the extraction from local features through to global features. Finally, the CTC method is used to transcribe the predicted characters, giving mixed prediction of Chinese and English characters.

The algorithms were tested on public datasets and on the synthetic dataset proposed in this thesis, and the experimental results and the advantages and disadvantages of the algorithms were analyzed.
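The CTC transcription step mentioned in (3) can be illustrated with a minimal sketch. It assumes greedy (best-path) decoding, in which repeated per-frame labels are collapsed and blank symbols are then dropped; the abstract does not say which decoding strategy the thesis uses. The function name `ctc_greedy_decode`, the blank index `0`, and the small mixed Chinese/English `CHARSET` are all hypothetical choices made for illustration.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_label_ids, charset):
    """Best-path CTC decoding: collapse repeated labels, then drop blanks."""
    # groupby merges each run of identical consecutive labels into one entry.
    collapsed = [label for label, _ in groupby(frame_label_ids)]
    return "".join(charset[label] for label in collapsed if label != BLANK)


# Hypothetical mixed Chinese/English charset; index 0 is reserved for the blank.
CHARSET = ["<blank>", "出", "口", "E", "X", "I", "T"]

# Hypothetical per-frame argmax labels from the recurrent layers.
frames = [1, 1, 0, 2, 2, 0, 3, 4, 4, 0, 5, 6, 6]
print(ctc_greedy_decode(frames, CHARSET))
```

Because blanks are removed only after repeats are collapsed, a blank between two identical labels keeps both characters (`[3, 0, 3]` decodes to two characters), while an unbroken run (`[3, 3]`) decodes to one. This is what lets a fixed number of frames map to text of varying length.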
Keywords/Search Tags: Text detection, Text recognition, ARFs, LSTM, CNN