Research On Text Recognition Method Of Natural Scene Image Based On YOLO

Posted on: 2021-04-29; Degree: Master; Type: Thesis
Country: China; Candidate: X Zhang
GTID: 2518306047482154; Subject: Software engineering
Abstract/Summary:
With the development of science and technology, people send and receive a large number of images in their daily work and life, and most of these images contain a large amount of text. People can quickly take in a great deal of content from images, and the information carried by scene images matters most to them. However, scene images often have poor image quality, so recognizing the text in them accurately and quickly has become an urgent problem. In text recognition for natural scene images, multi-text classification and detection are carried out first, and a text feature sequence is then extracted from the located text for recognition. Deep learning models in computer vision usually add many parameters and network layers to improve prediction accuracy. To address network models that are too deep and scene text recognition results that are unsatisfactory, this thesis proposes a natural scene image text recognition model combined with model compression.

The model consists of four parts:
(1) Text area detection based on a compressed YOLO model. The YOLOv3 Darknet-53 network, which has too many parameters, is pruned. During pruning, a regularized scale factor is used to delete parameters that have little effect on accuracy. The pruned Darknet-53 network is then used for text location detection.
(2) Text region filtering and extraction. The text regions detected in the previous step are sorted by score to extract the optimal text region feature map.
(3) Character region extraction. Text information is extracted from the text area, and the resulting text features are fed as a sequence into the recognition model.
(4) Text recognition. A bidirectional LSTM predicts a label distribution for each feature vector in the feature sequence. The LSTM predictions are then processed, the results for the feature sequence are integrated, and CTC is used to solve the alignment problem between input and output, giving the final output.

To verify the effectiveness of the proposed algorithm, precision, recall and F-measure are used as evaluation criteria to compare the pruned Darknet-53 network with the unpruned model and with the SegLink and EAST models on text detection. A natural scene text image recognition experiment with feature fusion is also carried out, and the result is compared with the WordSup, CTPN and EAST models.
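Three of the steps named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, which the abstract does not give. It assumes that pruning ranks channels by the magnitude of a per-channel scale factor, that CTC output is decoded greedily (collapse repeats, then drop blanks), and that detection is scored from counts of true positives, false positives and false negatives. The function names, the `prune_ratio` parameter and the blank index `0` are hypothetical.

```python
from typing import List, Sequence, Tuple


def channels_to_keep(scale_factors: Sequence[float], prune_ratio: float) -> List[int]:
    """Return indices of channels that survive pruning.

    Assumption: channels with the smallest |scale factor| contribute least
    to accuracy, so the lowest `prune_ratio` fraction is removed.
    """
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    n_prune = int(len(scale_factors) * prune_ratio)
    # Rank channel indices by magnitude, smallest first.
    order = sorted(range(len(scale_factors)), key=lambda i: abs(scale_factors[i]))
    # Drop the n_prune smallest and return the rest in original order.
    return sorted(order[n_prune:])


def ctc_greedy_decode(frame_labels: Sequence[int], blank: int = 0) -> List[int]:
    """Greedy CTC decoding: collapse consecutive repeats, then drop blanks.

    `frame_labels` holds the most likely label per time step (the argmax of
    each label distribution); `blank` is the assumed index of the CTC blank.
    """
    decoded: List[int] = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


def f_measure(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score
```

For example, `ctc_greedy_decode([0, 3, 3, 0, 3, 5, 5, 0])` returns `[3, 3, 5]`: the blank between the two runs of `3` is what allows the same character to appear twice in a row in the output.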
Keywords/Search Tags: Deep learning, Text recognition, Pruning algorithm, Feature map