Research On Image Enhancement Techniques For Natural Scene Text Recognition

Posted on: 2021-04-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Wang
GTID: 2428330647450754
Subject: Computer technology
Abstract/Summary:
With the rapid development of technology and the popularity of smartphones and the Internet, the amount of image data people produce in daily life is growing quickly. Text in natural scene images usually carries rich semantic information and is of great value for image-related applications such as image analysis, image classification, and image content understanding. How to recognize text in natural scenes accurately has therefore attracted considerable research interest. However, because of the irregular shape of text, complicated backgrounds, and the various kinds of image quality degradation introduced during shooting, recognizing text in natural scenes is much more challenging than recognizing text with a regular layout in documents. This thesis therefore proposes two image enhancement techniques for scene text images, covering text image super-resolution, rectification, and quality improvement, in order to increase text recognition accuracy in natural scene images.

This thesis first explores the benefit of super-resolution enhancement for text recognition and proposes a novel text image super-resolution model that focuses on text regions. Building on the Conditional Generative Adversarial Network (cGAN), it introduces a spatial attention mechanism that uses the text/non-text binary segmentation map to compute a mask applied to the feature map, and designs a corresponding loss function, so as to guide the network to pay more attention to the features of text regions and thus improve both the super-resolution quality and the learning efficiency of the model. In addition, it integrates channel attention blocks into the network to emphasize the feature map channels that are more helpful for this task and suppress the irrelevant ones, enabling the network to learn more effective feature representations. Together, these two attention mechanisms help the network focus on extracting the most useful features and improve the super-resolution quality of text regions.

As an extension of the text image super-resolution work, this thesis further proposes an end-to-end enhancement-based scene text recognition model that places an adaptive enhancement module ahead of the recognition network. First, a spatial transformer network (STN) rectifies the input image to a relatively regular linear layout, which is easier to recognize, so as to alleviate the influence of text shape. Then, hierarchical U-shaped networks improve the quality of the rectified image pixel by pixel and emphasize the text regions, so as to alleviate the influence of ambiguity, complex backgrounds, low contrast, and similar problems. The whole model can be trained end-to-end with only word-level annotations and requires no additional supervision. This thesis evaluates the proposed methods through experiments on several public scene text recognition datasets. The results show that the proposed text-attentional image super-resolution method and the enhancement-based scene text recognition method achieve better results than existing methods, which demonstrates their effectiveness.
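
The abstract does not give the exact form of the two attention mechanisms or of the loss, so the following is only a minimal sketch of the general idea, written in plain Python on nested lists. The function names, the `(1 + boost * mask)` scaling, the per-channel sigmoid gate, and the `text_weight` parameter are all illustrative assumptions, not the thesis's actual design. A real channel attention block would typically mix information across channels with small learned layers rather than use one scalar gate per channel.

```python
import math


def apply_text_mask(features, mask, boost=1.0):
    """Spatial attention sketch (assumed form): scale every position of each
    channel by (1 + boost * mask), so positions marked as text are emphasized.

    features: list of C channels, each an H x W nested list of floats
    mask:     H x W nested list with values in [0, 1] (1 = text, 0 = non-text)
    """
    return [
        [
            [v * (1.0 + boost * m) for v, m in zip(feat_row, mask_row)]
            for feat_row, mask_row in zip(channel, mask)
        ]
        for channel in features
    ]


def channel_attention(features, weights, biases):
    """Channel attention sketch (simplified assumption): pool each channel to
    its mean, turn it into a gate with sigmoid(w * mean + b), and rescale the
    whole channel by that gate. weights and biases stand in for learned values.
    """
    gated = []
    for channel, w, b in zip(features, weights, biases):
        count = sum(len(row) for row in channel)
        mean = sum(sum(row) for row in channel) / count
        gate = 1.0 / (1.0 + math.exp(-(w * mean + b)))
        gated.append([[v * gate for v in row] for row in channel])
    return gated


def text_weighted_l1(pred, target, mask, text_weight=2.0):
    """Loss sketch (assumed form): an L1 loss in which a text pixel counts
    text_weight times as much as a background pixel, normalized by the total
    weight so the value stays comparable across masks.
    """
    total = 0.0
    weight_sum = 0.0
    for pred_row, target_row, mask_row in zip(pred, target, mask):
        for p, t, m in zip(pred_row, target_row, mask_row):
            w = 1.0 + (text_weight - 1.0) * m
            total += w * abs(p - t)
            weight_sum += w
    return total / weight_sum


if __name__ == "__main__":
    # Hypothetical 2-channel, 2x2 feature map and a text mask on the diagonal.
    feats = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]]]
    text_mask = [[1.0, 0.0], [0.0, 1.0]]
    emphasized = apply_text_mask(feats, text_mask)
    print(channel_attention(emphasized, weights=[0.0, 0.0], biases=[0.0, 0.0]))
```

With this weighting, the same reconstruction error costs more when it falls on a text pixel than on a background pixel, which is one simple way a loss can push a network toward text regions.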
Keywords/Search Tags: image enhancement, text image super-resolution, scene text recognition, natural scene image, deep neural network