Irregular Scene Text Recognition

Posted on: 2021-06-07 | Degree: Master | Type: Thesis
Country: China | Candidate: J J You
GTID: 2518306452457584 | Subject: Computer technology
Abstract/Summary:
Irregular scene text recognition converts an image of irregular scene text into a text sequence that a computer can process. In recent years, the task has attracted considerable attention from computer vision researchers. Compared with regular scene text recognition, existing methods for irregular scene text still fall short of satisfactory performance in some applications because of complex backgrounds and the diversity of character arrangements. In this thesis, we propose different models for different scenarios to improve the accuracy of irregular scene text recognition. The contributions are as follows.

First, for English irregular scene text recognition where the training set includes character-level annotations, we propose a novel method called saliency map based rectification (SMART). SMART contains a rectification network and a recognition network. Unlike existing rectification-based methods, SMART is the first method to adopt a two-dimensional character-level saliency map for rectifying irregular text images. In addition, SMART uses strip region transformation (SRT), and thin-plate-spline data augmentation is introduced to improve the training of the network. Extensive experiments show that SMART outperforms existing methods on irregular text datasets by a large margin and also achieves state-of-the-art performance on regular text datasets.

Second, for English irregular scene text recognition where the training set has no character-level annotations, we propose a novel method called two-dimensional irregular text recognizer (TDIR). TDIR consists of two-dimensional feature fusion (TDFF) and a two-dimensional attention mechanism (TDAM). In the encoding stage, TDFF fuses the encoded features in the horizontal and vertical directions. In the decoding stage, TDAM predicts character semantic information in the horizontal and vertical directions at each decoding step. Extensive experiments show that TDIR outperforms existing methods on irregular text datasets by a large margin and also achieves state-of-the-art performance on regular text datasets.

Finally, for Chinese irregular scene text recognition, we propose a novel method called oriented response irregular text recognizer (ORIR). To the best of our knowledge, ORIR is the first method to apply an oriented response network to scene text recognition, which takes into account the direction information of characters in irregular text. In addition, ORIR adopts an oriented response attention mechanism (ORAM) to learn attention over the oriented features and the channel features, which strengthens the learning of the oriented response convolution and suppresses invalid features across channels. Extensive experiments show that ORIR outperforms existing methods on Chinese text datasets, especially on a Chinese vertical scene text dataset.
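The abstract does not give the formulation of TDAM, so the following is only a generic illustration of what attending over a two-dimensional feature map at one decoding step can look like. It is not the thesis's method. The function name `attend_2d`, the dot-product scoring, and the toy feature map are all assumptions made for this sketch.

```python
import math

def attend_2d(features, query):
    """Generic dot-product attention over an H x W grid of C-dimensional vectors.

    Hypothetical helper for illustration; not the TDAM described in the thesis.
    """
    # Score every spatial position against the decoder query.
    scores = [[sum(f * q for f, q in zip(cell, query)) for cell in row]
              for row in features]
    # Softmax jointly over both axes (subtract the max for numerical stability).
    peak = max(max(row) for row in scores)
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    weights = [[e / total for e in row] for row in exps]
    # Context vector: attention-weighted sum of the feature vectors.
    channels = len(query)
    context = [0.0] * channels
    for feat_row, w_row in zip(features, weights):
        for cell, w in zip(feat_row, w_row):
            for c in range(channels):
                context[c] += w * cell[c]
    return weights, context

# Toy 2 x 3 feature map with 2 channels (made-up values).
# The query aligns most strongly with the cell at row 1, column 2.
features = [
    [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]],
    [[0.0, 0.2], [0.1, 0.1], [3.0, 3.0]],
]
query = [1.0, 1.0]
weights, context = attend_2d(features, query)
```

Because the weights are normalized over rows and columns together, the decoder can focus on a character wherever it sits in the image, instead of assuming that the text runs along a single horizontal line.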
Keywords/Search Tags:Irregular Scene Text Recognition, Text Rectification, Two-dimensional Feature Fusion, Two-dimensional Attention Mechanism, Oriented Response Attention Mechanism