
Research on Algorithms for Scene Text Detection and Recognition Based on Deep Learning

Posted on: 2021-05-06
Degree: Master
Type: Thesis
Country: China
Candidate: F Chen
GTID: 2428330611464279
Subject: Computer application technology
Abstract/Summary:
Detecting and recognizing text in images of natural scenes is a challenging task and an active research topic in computer vision. Unlike traditional optical character recognition (OCR), text in natural images often has an irregular layout (e.g., arbitrary orientation, curvature, perspective distortion), a complex background, and heavy noise (e.g., occlusion, low resolution, blur, illumination variation), all of which make it difficult to recognize. Research on scene text is of both academic and practical significance: the technology is widely used in driverless vehicles, information security auditing, and many other areas, and it has attracted a large number of researchers.

Scene text detection and recognition consists of two parts: text is first detected in an image, and the detected text is then recognized. The detection algorithm extracts image features and then applies an object detection algorithm to locate the text. The recognition algorithm takes an image that contains only text, extracts features from it, and recognizes the characters.

As a branch of object detection, mainstream scene text detection algorithms fall into two categories: one-stage methods and two-stage methods. One-stage methods directly regress the text class scores and position coordinates; they are fast but less accurate. Two-stage methods first generate candidate boxes and then classify them more finely; the two steps make them slower but more accurate. The scene text detection model proposed in this thesis is based on Faster R-CNN, a representative two-stage algorithm. In the feature extraction stage, ResNet is used to extract deep features. We also modify the Inception structure and add it to the model so that the extracted features are better suited to detecting long text. In the detection module, the traditional RPN based on region prediction is replaced with an anchor-free RPN based on point prediction. This removes the limitation that the RPN can only detect horizontal objects, so the model can handle multi-oriented scene text. To address sample imbalance, we use focal loss instead of the traditional softmax loss function, which further improves the accuracy of the model.

For scene text recognition, we develop a novel method consisting of a text recognition network and a text correction component, which is more robust to irregular text. The text correction component rectifies the text in an input image into a more “readable” form, which effectively reduces the impact of layout on accuracy. The text recognition network is a more “location aware” attention-based sequence learning model that takes the rectified image as input and recognizes the text. We extract deep features that capture long-term dependencies and then use an attention-based LSTM network to predict the character sequence. During training, the standard softmax loss function considers only the separability between classes and does not constrain the compactness within classes. We therefore adopt a new loss function based on the softmax loss so that the model learns more discriminative features, makes fewer misjudgments, and achieves higher accuracy.

The scene text detection and recognition algorithms are trained and tested on 10 challenging scene text image data sets: SynthText, Synthetic Text, ICDAR 2003, ICDAR 2013, ICDAR 2015, ICDAR 2017 MLT, IIIT 5K-Words, Street View Text, SVT-Perspective, and CUTE80. Extensive experiments show that both algorithms perform well and that the proposed methods are comparable to the state of the art.
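To illustrate why focal loss helps with the sample imbalance mentioned in the detection part of the abstract, the following is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), next to plain cross-entropy. It is not the thesis implementation: the binary text/background formulation, the function names, and the values gamma = 2 and alpha = 0.25 (common defaults in the focal loss literature) are assumptions made only for illustration.

```python
import math


def cross_entropy(p, y, eps=1e-12):
    """Binary cross-entropy for one prediction.

    p: predicted probability of the positive (text) class.
    y: ground-truth label, 1 for text and 0 for background.
    """
    p_t = p if y == 1 else 1.0 - p          # probability assigned to the true class
    p_t = min(max(p_t, eps), 1.0)           # clamp to avoid log(0)
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one prediction (hypothetical helper).

    gamma: focusing parameter; larger values down-weight easy examples more.
    alpha: class-balance weight applied to the positive class.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)
    # The factor (1 - p_t) ** gamma shrinks the loss of well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # One easy positive (confidently correct) and one hard positive (mostly wrong).
    for name, p in (("easy", 0.9), ("hard", 0.1)):
        print(name, "CE:", cross_entropy(p, 1), "FL:", focal_loss(p, 1))
```

With gamma = 0 and alpha = 0.5 the focal loss reduces to half the cross-entropy, so the modulating factor is the only thing that changes the relative weight of easy and hard samples.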
Keywords/Search Tags: Scene Text Detection, Scene Text Recognition, Deep Learning, Neural Network