Research On Text Detection And Recognition Technology Based On Deep Learning Methods

Posted on: 2020-01-09
Degree: Master
Type: Thesis
Country: China
Candidate: C J Yang
GTID: 2428330590473250
Subject: Software engineering
Abstract/Summary:
In recent years, with the development of social networks, handling the growing variety of visual information has become an unavoidable requirement. Much of this visual information consists of pictures of complex scenes, such as signature pictures and shop pictures in natural scenes, as well as pictures of printed documents such as test papers. Pictures of complex scenes are affected by factors such as complex backgrounds, unknown languages, and inconsistent layouts, which greatly increases the difficulty of processing them. Nevertheless, understanding the text they contain has considerable practical value for applications such as human-computer interaction and automatic driving. This thesis studies two families of text detection and recognition technology, the traditional multi-stage OCR approach and the end-to-end approach, and applies them to two different complex scenes: multi-disciplinary test papers and natural scenes containing text in unknown languages.

The multi-stage text detection and recognition technology of traditional OCR is applied to multi-disciplinary test papers, where the key issue is multi-granular layout analysis. The system follows the traditional OCR pipeline of three steps: text detection, text segmentation, and character recognition. First, the image is preprocessed with an average filter and the Hough transform, and the Faster RCNN algorithm is used for coarse-grained classification. The Mask RCNN algorithm is then used for fine-grained classification of the individual questions. In the end, only formulas and Chinese characters need to be recognized, and existing recognition APIs are called according to the content type. The system generalizes to the complex layouts of multi-disciplinary papers. Moreover, the question number and question type information is analyzed, so that the structure of the test paper can be generated directly afterwards. The recognition rate for printed Chinese is 99%, which makes the system practical.

End-to-end text detection and recognition is applied to natural scenes with unknown languages, where the key issue is multilingual text. The system places text detection and text recognition in a unified framework and then identifies the script. It uses FPN as the backbone of the entire end-to-end system and builds the detector on FPN. Locality-aware NMS is then used to remove redundant proposals, and the selected proposals are used to estimate the parameters of the spatial transformation layer. Spatial transformers normalize the image regions for rotation. The normalized regions are then fed into a fully convolutional recognition module, which outputs the final recognition result. The system's text detection AP is 52.67%, its text recognition N.E.D is 0.3190, and its script identification AP is 25.41%.
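The recognition score quoted in the abstract is reported as N.E.D, which is commonly read as normalized edit distance. The abstract does not define the metric, so the following is only a minimal sketch under an assumed definition: the Levenshtein distance between the predicted and ground-truth strings, divided by the length of the longer string. The function names are hypothetical and are not taken from the thesis.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(pred: str, truth: str) -> float:
    """Assumed definition: edit distance divided by the longer length.
    Returns 0.0 for two empty strings."""
    longest = max(len(pred), len(truth))
    if longest == 0:
        return 0.0
    return edit_distance(pred, truth) / longest
```

Under this assumed definition the value lies between 0 and 1, with 0 meaning an exact match. A benchmark may instead average the per-word values or report one minus that average, so the thesis itself should be consulted before comparing numbers.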
Keywords/Search Tags: text detection and recognition, multi-stage method, end-to-end, multi-granular layout analysis, multilingual