Research On Character Level Chinese Scene Text Detection And Recognition Based On Deep Learning

Posted on: 2021-04-02  Degree: Master  Type: Thesis
Country: China  Candidate: Y F Tao
GTID: 2428330605454259  Subject: Computer system architecture
Abstract/Summary:
Text records the culture and civilization of every country. With the spread of multimedia, the text that appears in images often carries valuable information. Chinese characters have a long history and many font forms, which provides important material for literature enthusiasts and rich research topics for the field of computer vision. In natural scene images, text carries strong semantic information. By reading and understanding it, researchers can extract the information needed for industrial applications such as document retrieval, urban monitoring, autonomous driving, and medical treatment.

Traditional Optical Character Recognition (OCR) technology has been widely used in document analysis. Unlike text in scanned documents, text in natural scenes has variable font styles and very complex backgrounds, so detecting and recognizing it is relatively difficult. Many algorithms exist for English text, but the detection and recognition of Chinese scene characters remains a challenging task. The main work and contributions of this dissertation are as follows:

1. This dissertation proposes a natural scene text detection algorithm based on multi-scale features and multi-objective functions. It is suitable for both text line detection and character detection, and has several advantages over existing methods. Details are as follows:

(1) Inspired by semantic segmentation, the output stride of the backbone network is first changed to 16, and a Dense ASPP (Dense Atrous Spatial Pyramid Pooling) module is then added to the network. The module connects a set of atrous convolutional layers densely, so that it generates multi-scale features that cover a larger scale range, and cover that range densely, without significantly increasing the model size.

(2) This dissertation proposes a confidence rectification mechanism based on multi-objective functions. The network uses two different classification loss functions: binary cross-entropy loss and dice coefficient loss. The final confidence of a text box is scaled according to the two scores. Through supervised learning with the two loss functions, this method significantly reduces the deviation between the quality of a text box and its confidence.

(3) This dissertation uses the Online Hard Example Mining algorithm to balance the proportion of positive and negative samples. The scores are sorted pixel by pixel, and the negative samples that are easier to train are removed. The final ratio of retained positive to negative samples is about 1:3, which makes the network easier to train and increases detection accuracy.

2. To improve the accuracy of Chinese scene character recognition in real scenes, this dissertation designs a Chinese scene character recognition network based on deep metric learning. Two character images are concatenated along the channel dimension, the backbone network extracts features, and the convolutional features from different levels are then fused. Finally, a single neuron constrains the output value to between 0 and 1. By learning the similarity between the two images, the model achieves excellent character recognition accuracy. It is good at predicting whether a scene character image and a template typeset image share the same Chinese character, even if the characters and/or the template typeset images have never appeared in the training set of the model. To speed up recognition, coarse classification is applied first, which effectively reduces the number of templates to match and the time consumed.

3. This dissertation presents a character-based Chinese scene text recognition algorithm. A simple combination of the character detection network and the character recognition network achieves excellent recognition results without additional pre-processing operations. The scene text detection network based on multi-scale features and multi-objective functions is first adjusted to obtain character coordinates from scene text line images. The Chinese scene character recognition network based on deep metric learning then recognizes each character, and the character recognition results are finally combined into a line according to the character coordinates. The framework does not require data augmentation during training, and has shown excellent recognition accuracy on several Chinese text datasets.

In summary, inspired by semantic segmentation and medical image segmentation, this dissertation proposes a natural scene text detection network based on multi-scale features and multi-objective functions, and conducts experiments on a variety of English and Chinese datasets. Experimental results show that the algorithm achieves better performance on both text line detection and character detection. This dissertation also proposes a Chinese scene character recognition network based on deep metric learning, and uses coarse classification to speed up the test stage. Strong recognition results have been achieved on different Chinese scene character databases. Finally, building on the detection and recognition algorithms proposed in this dissertation, a character-based Chinese scene text recognition algorithm is proposed, and this method achieves significantly higher recognition accuracy than ASTER.
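
The hard example mining step described in the abstract (keep all positive pixels, then keep only the hardest negative pixels so that the ratio is about 1:3) can be illustrated with a minimal sketch. The function name `ohem_mask`, the flat-list input format, and the choice to rank negatives by predicted text score are assumptions made for illustration, not details taken from the dissertation.

```python
def ohem_mask(scores, labels, neg_ratio=3):
    """Return a 0/1 mask of the pixels kept for the loss (illustrative sketch).

    scores:    predicted text probability per pixel (flat list of floats)
    labels:    ground truth per pixel, 1 = text, 0 = background
    neg_ratio: negatives kept per positive (the abstract reports about 1:3)
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Assumption: a background pixel with a high text score is a "hard" negative.
    neg.sort(key=lambda i: scores[i], reverse=True)
    n_keep = min(len(neg), neg_ratio * len(pos))
    kept = set(pos) | set(neg[:n_keep])
    return [1 if i in kept else 0 for i in range(len(labels))]


# One positive pixel and seven negatives: the three hardest negatives are kept.
scores = [0.9, 0.8, 0.1, 0.7, 0.2, 0.6, 0.05, 0.3]
labels = [1, 0, 0, 0, 0, 0, 0, 0]
mask = ohem_mask(scores, labels)
```

The easy negatives (low text score on background) receive a mask value of 0 and would simply be excluded from the loss in a training loop.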
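
The two classification losses named in the abstract, binary cross-entropy and dice coefficient loss, have standard definitions that the following sketch writes out in plain Python. The abstract does not state how the two resulting scores are combined into the final box confidence, so `rectified_confidence` below uses a geometric mean purely as a hypothetical placeholder; all function names are likewise illustrative.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def dice_loss(probs, targets, eps=1e-7):
    """1 minus the (soft) dice coefficient between prediction and target."""
    inter = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)


def rectified_confidence(score_bce, score_dice):
    """Hypothetical scaling rule (geometric mean); not from the dissertation."""
    return math.sqrt(score_bce * score_dice)


probs = [0.5, 0.5]
targets = [1, 0]
loss_bce = bce_loss(probs, targets)    # equals ln(2) for an uninformative prediction
loss_dice = dice_loss(probs, targets)  # close to 0.5 for this input
conf = rectified_confidence(0.9, 0.4)
```

With any rule of this kind, a box keeps a high confidence only when both branches agree, which matches the stated goal of reducing the gap between box quality and confidence.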
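
The final step of the character-based pipeline, combining per-character recognition results into a line according to the character coordinates, can be sketched as follows. The sketch assumes a single horizontal, left-to-right text line and boxes given as `(x_min, y_min, x_max, y_max)`; the function name `assemble_line` and the sample detections are hypothetical.

```python
def assemble_line(chars):
    """Order recognized characters by horizontal box center and join them.

    chars: list of (character, (x_min, y_min, x_max, y_max)) tuples.
    Assumption: one horizontal line read left to right.
    """
    ordered = sorted(chars, key=lambda item: (item[1][0] + item[1][2]) / 2.0)
    return "".join(ch for ch, _ in ordered)


# Detections arrive in arbitrary order; sorting by x-center restores the line.
detections = [
    ("国", (62, 5, 90, 33)),
    ("中", (30, 4, 58, 32)),
    ("人", (94, 6, 120, 34)),
]
line = assemble_line(detections)
```

Vertical or curved text would need a different ordering key, which this sketch does not attempt.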
Keywords/Search Tags: Chinese text in natural scenes, text detection, character detection, character recognition, text recognition