Text Recognition In Natural Scenes Based On Deep Learning

Posted on: 2022-10-11 | Degree: Master | Type: Thesis
Country: China | Candidate: M Li
GTID: 2518306752454474 | Subject: Master of Engineering
Abstract/Summary:
Text recognition in natural scenes is an important area of computer vision that has advanced steadily in recent years alongside deep learning and other artificial intelligence technologies. It is widely used in robot guidance, invoice inspection, license plate recognition, and industrial inspection, and has broad prospects and application value. However, text in natural scenes differs from text in standard documents: the background is often more complex, and the text varies in orientation, length, and angle. Based on these characteristics, this thesis studies text recognition in natural scenes using deep learning methods. The main contents comprise the following three parts:

1. Text detection in natural scenes. We improve on the SSD object detection algorithm. To address the imbalance between positive and negative samples in one-stage object detection, Focal Loss replaces the traditional loss function. To detect oriented text correctly, a rotation angle parameter is added to the original horizontal anchors of SSD. Finally, refining the anchors effectively improves the quality of the selected candidate boxes.

2. Text recognition. Building on CRNN, a soft attention mechanism is embedded between the convolutional neural network and the recurrent network. Because CRNN uses a single-layer bidirectional LSTM, the model often takes a long time to train. To speed up training without affecting the recognition rate, a two-layer bidirectional GRU replaces the original single-layer bidirectional LSTM.

3. Transformer-based text recognition. Text recognition is a typical sequence recognition problem. We use a seq2seq Transformer to replace the RNN layer and the transcription layer of the recognition algorithm; that is, the features extracted by the convolutional layers are sent directly to the Transformer network for recognition.

We ran experiments on the ICDAR datasets, and the results demonstrate the effectiveness of the proposed methods. We also collected a new container test dataset, annotated the text on the containers, and tested the detection and recognition algorithms on it.
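
As an illustration of the Focal Loss replacement described in part 1, the following is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), next to plain cross-entropy. It is not the thesis implementation: the function names and the values alpha=0.25 and gamma=2.0 are assumptions chosen for illustration, since the abstract does not state the hyperparameters used.

```python
import math


def cross_entropy(p, y):
    """Binary cross-entropy for one prediction.

    p: predicted probability of the positive (text) class, 0 < p < 1
    y: ground-truth label, 1 for positive and 0 for negative
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)      # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p       # probability given to the true class
    return -math.log(p_t)


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one prediction.

    alpha and gamma are illustrative defaults (assumed, not from the thesis).
    The factor (1 - p_t) ** gamma shrinks the loss of well-classified samples,
    so the many easy negatives contribute less to the total loss.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # An easy negative (background scored 0.01) versus a hard negative (0.9).
    for p in (0.01, 0.9):
        print(p, cross_entropy(p, 0), focal_loss(p, 0))
```

With gamma set to 0 the modulating factor disappears and the loss reduces to alpha-weighted cross-entropy, which makes the effect of gamma easy to check in isolation.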
Keywords/Search Tags:deep learning, text detection, text recognition, attention mechanism