
Detection and Recognition of Scene Text Based on Deep Learning

Posted on: 2018-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: J F Ma
Full Text: PDF
GTID: 2348330533466688
Subject: Circuits and Systems
Abstract/Summary:
Text spotting in natural images is usually divided into two tasks: text detection and word recognition. The text detection stage generates bounding boxes around words in an image, while the word recognition stage takes the content of these bounding boxes and recognizes the text within. Text in images provides rich and precise high-level semantic information, which is important for numerous potential applications such as scene understanding, image and video retrieval, and content-based recommendation systems. Consequently, text spotting in natural scenes has attracted considerable attention in the computer vision and image understanding community.

In this thesis, we develop a novel unified framework for text region proposal generation and text detection in natural images via a fully convolutional neural network (CNN). First, we propose the inception region proposal network (Inception-RPN) and design a set of text-characteristic prior bounding boxes to achieve high word recall with only a few hundred candidate proposals. Next, we present a powerful text detection network that embeds ambiguous text category (ATC) information and multi-level region-of-interest pooling (MLRP) for text/non-text classification and accurate localization. Finally, we apply an iterative bounding box voting scheme to pursue high recall in a complementary manner, and we introduce a filtering algorithm that retains the most suitable bounding box for each text instance while removing redundant inner and outer boxes. Our approach achieves F-measures of 0.83 and 0.85 on the ICDAR 2011 and ICDAR 2013 robust text detection benchmarks, respectively.

To address the shortage of training samples for deep-learning-based word recognition methods, this thesis presents a new method for synthesizing scene word images based on Poisson editing fusion. The synthesized images look realistic and are close to real scene data. To verify that the synthetic samples can be used to train deep-learning-based scene word recognition models, this thesis selects two classical deep-learning-based scene word recognition models: Encoding Words and the Convolutional Recurrent Neural Network.
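
The abstract does not spell out the filtering algorithm used in the detection stage, so the following is only a minimal sketch of one plausible reading of "removing redundant inner and outer boxes". It assumes boxes are given as (x1, y1, x2, y2, score) tuples and treats two boxes as nested when their intersection covers most of the smaller box. The function name `filter_nested_boxes` and the threshold `contain_thresh` are hypothetical and do not come from the thesis.

```python
from typing import List, Tuple

# (x1, y1, x2, y2, score); this layout is an assumption for illustration.
Box = Tuple[float, float, float, float, float]


def area(b: Box) -> float:
    """Area of a box; degenerate boxes have zero area."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def intersection(a: Box, b: Box) -> float:
    """Area of the overlap between two axis-aligned boxes."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def filter_nested_boxes(boxes: List[Box], contain_thresh: float = 0.9) -> List[Box]:
    """Keep the highest-scoring box of each nested group.

    A candidate is dropped when it is (almost) contained in an already
    kept box, or (almost) contains one; both cases show up as the
    intersection covering at least `contain_thresh` of the smaller box.
    """
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        redundant = False
        for k in kept:
            smaller = min(area(box), area(k))
            if smaller > 0 and intersection(box, k) / smaller >= contain_thresh:
                redundant = True
                break
        if not redundant:
            kept.append(box)
    return kept


if __name__ == "__main__":
    candidates = [
        (10, 10, 100, 40, 0.9),   # best box for the first word
        (20, 15, 60, 35, 0.6),    # inner box covering part of the same word
        (0, 0, 120, 50, 0.5),     # outer box enclosing the same word
        (200, 10, 300, 40, 0.8),  # a separate word
    ]
    for b in filter_nested_boxes(candidates):
        print(b)
```

Sorting by score first means the surviving box in each nested group is the one the detector is most confident about. Whether the thesis ranks boxes by score, by voting count, or by some other criterion is not stated in the abstract.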
Keywords/Search Tags: Deep Learning, Text Recognition, Text Detection, Poisson