Research And Implementation Of Natural Scene Text Recognition Based On Deep Learning

Posted on:2024-08-25

Degree:Master

Type:Thesis

Country:China

Candidate:R M Jiang

Full Text:PDF

GTID:2568307079966049

Subject:Electronic information

Abstract/Summary:

Text is one of the most important carriers of information in modern society, and extracting text from large numbers of images is an active research topic. As multilingual scenes become more common, single-language text recognition shows clear limitations in practical applications, which makes scene text recognition in multilingual scenarios an important research problem. Traditional methods cascade language identification and text recognition, so errors made in the first stage carry over into the second and accumulate. In addition, the character set is usually larger in multilingual scenarios, and similar-looking characters in different languages make language identification difficult. To address these challenges, this thesis proposes a multilingual scene text recognition algorithm that achieves higher overall recognition accuracy by combining optical text recognition with semantic correction.

For optical text recognition, the thesis embeds language information into the text recognition model to improve accuracy in multilingual scenarios. Compared with cascading language identification and text recognition, embedding language information into the recognition model is less likely to introduce accumulated errors. By jointly analyzing global and local features in the language identification model, the approach better distinguishes languages that have similar characters. The proposed algorithm is evaluated on the ICDAR2019-MLT dataset to validate the effectiveness of these improvements.

For text correction, the thesis takes into account the rich semantics of text and proposes a CNN-based Seq2Seq model. Because errors in scene text recognition are often unrelated to semantics, the semantics of the text can be used to correct recognition results. Text correction is treated as a machine translation task, and the CNN-based architecture trains faster than RNN-based models. To adapt the model to correcting recognition results, language information is embedded into it, the feature embedding mode is changed, and global residual connections are introduced. The thesis also proposes a method for generating a multilingual scene text correction dataset and verifies the effectiveness of the improved model on this dataset.

Finally, to apply the proposed algorithm in practice, the thesis designs a web application for natural scene text recognition with a complete user experience.
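
The abstract does not describe how the correction dataset is generated, so the following is only an illustrative sketch of one plausible approach, not the thesis's method. It assumes that recognizer errors can be imitated by swapping visually similar characters, which matches the observation that such errors are unrelated to semantics. The confusion table, the function names `corrupt` and `make_pairs`, and the sample strings are all hypothetical.

```python
import random

# Hypothetical confusion table (an assumption, not taken from the thesis):
# characters a recognizer might plausibly mix up because they look alike.
CONFUSIONS = {
    "0": ["O"],
    "O": ["0"],
    "1": ["l", "I"],
    "l": ["1", "I"],
    "I": ["l", "1"],
    "5": ["S"],
    "S": ["5"],
}


def corrupt(text, rate, rng):
    """Swap each confusable character for a look-alike with probability `rate`."""
    out = []
    for ch in text:
        if ch in CONFUSIONS and rng.random() < rate:
            out.append(rng.choice(CONFUSIONS[ch]))
        else:
            out.append(ch)
    return "".join(out)


def make_pairs(samples, rate=0.3, seed=0):
    """Turn (language, text) samples into (language, noisy_source, clean_target) triples.

    The noisy string plays the role of the source sentence and the clean
    string the target, mirroring the machine-translation framing.
    """
    rng = random.Random(seed)
    return [(lang, corrupt(text, rate, rng), text) for lang, text in samples]


if __name__ == "__main__":
    samples = [("Latin", "SOLD 150"), ("Latin", "Illinois")]
    for lang, noisy, clean in make_pairs(samples):
        print(lang, repr(noisy), "->", repr(clean))
```

Keeping the language tag next to each pair reflects the abstract's point that language information is embedded into the correction model. A real dataset would more likely draw its confusions from the recognizer's actual error statistics than from a hand-written table.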

Keywords/Search Tags:

Scene Text Recognition, Multi-Language, Language Classification, Text Correction

Related items

1	Text Correction For ASR Result On The Platform Of Intelligent Mobile Phone
2	Research On Language Model Corpus Expansion And Text Error Correction Algorithm For Speech Transcription
3	Research On Cross-language Text Classification Technology
4	A Study Of Error Correction In Domain Oriented Dialogue Text After ASR Conversion
5	Design And Implementation Of Scene Text Recognition System
6	Research And Application Of Text Error Detection And Correction After Speech Recognition
7	Research On Detection And Recognition Algorithms Of Vietnamese Text In Natural Scenes
8	Interactive Scene Research For Text To Scene
9	Research On Cross-language Text Classification Based On Multilingual Segment Representation
10	Study On Key Technologies Of Scene Text Recognition