
Scene Chinese Text Recognition Based On Dual Attention Mechanism

Posted on: 2021-04-28
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Chen
Full Text: PDF
GTID: 2428330611465318
Subject: Electronic and communication engineering
Abstract/Summary:
As an important carrier of human communication, text contains rich semantic information, so recognizing and understanding text in images is of great significance. With the rapid development of artificial intelligence, scene text recognition technology based on deep learning is advancing rapidly. However, current methods still suffer from insufficient recognition accuracy and a limited ability to recognize deformed characters, and therefore remain far from practical application.

Existing scene text recognition algorithms have the following problems: (1) the feature extraction network cannot adapt well to scene text input images; (2) existing techniques cannot directly use explicit language models to fully mine semantic information; (3) recognition algorithms based on one-dimensional encoder-decoder networks cannot directly process two-dimensional images. This thesis therefore proposes a scene Chinese text recognition method based on a dual attention mechanism and binary associated semantic information, comprising a feature extraction module, an encoder-decoder module, a binary associated semantic information module, and a dual attention network module. The specific contributions are as follows:

1. Current feature extraction networks cannot handle small-sized text input images well. A multi-scale fusion residual network is proposed to effectively improve feature extraction. Building on ResNet's skip connections ("jumpers"), which add the input of a convolutional layer to its output, the input and output feature maps are also channel-spliced, fusing feature-map information at different scales. Since the number of skip connections does not increase, overfitting is unlikely.

2. To use a language model effectively, this thesis draws on the factorization machine algorithm and proposes a binary associated semantic model that explicitly learns sequential semantic information and step-by-step semantic information. When predicting a character in the sequence, the previously predicted character vectors are multiplied pairwise, yielding the binary associated semantic information, which is then used to guide the generation of the current character. Compared with an LSTM, which can only learn sequential information implicitly, this model mines semantic information more effectively.

3. For the recognition of irregular text, a scene text recognition model based on a dual attention mechanism is proposed. It processes two-dimensional image features and one-dimensional sequence features simultaneously, so as to handle the recognition of distorted text. The dual attention mechanism weights the sequence features with sequence attention weights and obtains one-dimensional sequence information through the encoder; at the same time, it directly weights the two-dimensional image features with image attention weights to obtain two-dimensional image information. Finally, it combines the sequence and image information for recognition, supplementing the original sequence-based single attention mechanism with complementary information.

This thesis evaluates the proposed model on the Chinese scene text datasets MTWI and Baidu OCR. The experimental results show improvements of 2% and 6%, respectively, over the baseline network on the two datasets. Compared with the industry-leading method SAR on the same datasets, the improvements are 0.7% and 2.9%, respectively, which verifies the effectiveness of the proposed method.
Keywords/Search Tags:Scene Chinese text recognition, Multi-scale fusion residual network, Binary associated semantic information, Dual attention mechanism