Auto STR: Efficient Backbone Search For Scene Text Recognition

Posted on:2021-10-26

Degree:Master

Type:Thesis

Country:China

Candidate:H Zhang

GTID:2518306107968009

Subject:Electronics and Communications Engineering

Abstract/Summary:

Scene text recognition (STR) is challenging because the appearance and layout of text in scene images are diverse and are often accompanied by heavy background noise. Current state-of-the-art STR methods usually consist of three modules: (1) a pre-processing module that rectifies irregular scene text images; (2) a feature extraction module that extracts a feature sequence from the rectified text image; and (3) a feature translation module that maps the image feature sequence to a character sequence. The community has paid increasing attention to boosting performance by improving the pre-processing module, for example through rectification and deblurring, or the sequence translator. However, another basic and critical module of STR algorithms, the feature sequence extractor, has not been explored or discussed in depth. The main reason is that manually designing a feature extraction network (a deep convolutional neural network) requires strong domain knowledge and a large amount of experimentation and computing resources. As a result, most current STR methods directly reuse architectures designed for object classification as their feature extraction modules. However, object classification differs from other vision tasks, so such architectures may be sub-optimal for STR. Neural architecture search (NAS) has succeeded in many visual tasks, such as large-scale object classification, image segmentation, and object detection, and can identify architectures comparable to or even better than manually designed ones. Inspired by this success, in this work we propose automated STR (Auto STR), which searches for data-dependent backbones to boost text recognition performance. We first analyze the feature sequence extractor for the STR task and design a general domain-specific search space that contains both choices of operations and constraints on the downsampling path. We then propose a novel two-step search algorithm that decouples the search over convolution operations from the search over feature downsampling paths, enabling an efficient search in the given space. Experiments demonstrate that, by searching for data-dependent backbones, Auto STR outperforms state-of-the-art approaches on standard benchmarks with far fewer FLOPs and model parameters.
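The idea of a search space with a constrained downsampling path, searched in two decoupled steps, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the stride choices in `STRIDES`, the operation names, and the scoring callables `path_score` and `op_score`, which stand in for whatever training and evaluation procedure the actual method uses.

```python
from itertools import product

# Hypothetical per-layer stride choices as (height, width); not from the thesis.
STRIDES = [(1, 1), (2, 1), (2, 2)]


def valid_paths(num_layers, total_h, total_w):
    """Enumerate stride sequences whose per-axis products match the
    required overall downsampling ratios (the path constraint)."""
    paths = []
    for path in product(STRIDES, repeat=num_layers):
        h = w = 1
        for sh, sw in path:
            h *= sh
            w *= sw
        if h == total_h and w == total_w:
            paths.append(path)
    return paths


def two_step_search(num_layers, total_h, total_w, ops, path_score, op_score):
    """Decoupled search: choose the downsampling path first, then choose
    one operation per layer with that path held fixed.

    path_score and op_score are hypothetical stand-ins for an evaluation
    procedure (for example, validation accuracy of a trained candidate).
    """
    # Step 1: search only over downsampling paths that satisfy the constraint.
    best_path = max(valid_paths(num_layers, total_h, total_w), key=path_score)

    # Step 2: with the path fixed, search the operation for each layer.
    best_ops = [
        max(ops, key=lambda op: op_score(layer, best_path[layer], op))
        for layer in range(num_layers)
    ]
    return best_path, best_ops


if __name__ == "__main__":
    # Toy scores: prefer downsampling early, and prefer longer op names.
    def toy_path_score(path):
        return -sum(i for i, s in enumerate(path) if s != (1, 1))

    def toy_op_score(layer, stride, op):
        return len(op)

    path, ops = two_step_search(
        4, 4, 2, ["conv3x3", "mbconv5x5"], toy_path_score, toy_op_score
    )
    print(path, ops)
```

For instance, `valid_paths(4, 4, 2)` enumerates the 12 four-layer stride sequences that reduce height by 4 and width by 2. Searching paths and operations separately means the number of candidates grows roughly as the sum, rather than the product, of the two sub-space sizes, which is the motivation for decoupling in this sketch.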

Keywords/Search Tags:

Scene Text Recognition, Neural Architecture Search, Convolutional Neural Network, Automated Machine Learning

Related items

1	Text Recognition In Natural Scenes Based On Convolutional Neural Network
2	A Research On Deep Learning Based Text Recognition And Generation In Natural Scene Images
3	Research On Convolutional Neural Network-based Scene Text Detection And Multi-orientational Character Recognition
4	Efficient Neural Architecture Search: Algorithms And Applications
5	Research On Text Detection And Recognition In Natural Scenes Based On Deep Learning
6	Research And Application Of Automatic Augmentation Of Text Data Based On Neural Network Architecture Search Ideology
7	Design And Implementation Of Chinese Character Recognition Model Based On Deep Neural Network
8	A Study On Fast Neural Architecture Search For Affinity Chip Microarchitecture
9	Research Of Optimizing Neural Architecture Search In Automated Machine Learning Systems
10	Studies Of Scene Text Detection And Recognition Based On Deep Learning