
Detection And Recognition Of Vietnamese Texts In Real Scenes

Posted on: 2022-10-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Feng
GTID: 2518306554471064
Subject: Master of Engineering
Abstract/Summary:
As a carrier of national culture, text is of obvious importance, and recognizing text in real scenes by computer has become one of the most important research topics in computer vision. However, existing scene text detection and recognition algorithms mostly target English, Chinese, and other widely used languages. Vietnamese, a tonal language written in a phonetic script, is less widely used but is representative: achieving detection and recognition of Vietnamese scene text also benefits the recognition of phonetic scene text in the tonal languages widely used in Southeast Asia. Unlike common Latin-script text, Vietnamese writing uses six kinds of tone markers, and different tone markers express different semantic information, so current algorithms still have various defects in Vietnamese scene text detection and recognition. This thesis therefore studies detection and recognition algorithms for Vietnamese scene text based on deep neural networks. The proposed algorithms are as follows.

First, to address the problem that detection algorithms often miss the tone marker regions of Vietnamese text, this thesis presents a shape expansion algorithm based on the instance segmentation model Mask R-CNN. Through model reuse combined with a two-way attention mechanism, the algorithm detects the tone markers in Vietnamese text step by step. To compensate for the lack of training data for Vietnamese scene text detection, a joint model training method is proposed, which improves the model's ability to generalize when extracting text in different scenarios. Because the conventional non-maximum suppression algorithm cannot eliminate duplicate text detection boxes, a text-area filter module is designed to remove them effectively. The validity of the algorithm is verified by comparison and cross-validation experiments.

Second, because tone markers are widespread in Vietnamese text, combining the same letter with different tone markers yields letter classes that differ only slightly from one another, and the recognition dictionary is larger than that of English, which makes text recognition models harder to design. To address these problems, this thesis adds a spatial attention mechanism to the text recognition network CRNN, strengthening the model's ability to recognize Vietnamese text whose letter classes differ only slightly. Because text sequences in scene images may be arranged either horizontally or vertically, a network structure is designed that extracts two-way alignment features and accurately recognizes scene text without first judging the text direction. Compared with CRNN, the proposed Vietnamese scene text recognition algorithm has advantages in training speed, inference speed, and recognition accuracy.
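
The point about tone markers enlarging the recognition dictionary can be illustrated with a small sketch. It is not taken from the thesis; it only uses Python's standard `unicodedata` module to show how one base letter fans out into several visually similar classes. The names `TONE_MARKS`, `split_tone`, `tone_variants`, and `VOWELS` are hypothetical helpers introduced here. The sketch assumes the usual Unicode encoding of Vietnamese, in which five tones are written with combining marks and the level tone is left unmarked, giving six tone categories per vowel.

```python
import unicodedata
from typing import List, Optional, Tuple

# Combining marks that write five of the Vietnamese tones; the level tone is unmarked.
TONE_MARKS = {
    "\u0300": "grave",
    "\u0301": "acute",
    "\u0309": "hook above",
    "\u0303": "tilde",
    "\u0323": "dot below",
}

# The twelve Vietnamese vowel letters (lowercase), without tone marks.
VOWELS = "aăâeêioôơuưy"


def split_tone(ch: str) -> Tuple[str, Optional[str]]:
    """Split one character into (letter without tone mark, tone name or None)."""
    tone = None
    kept = []
    # NFD separates the base letter from its combining marks.
    for c in unicodedata.normalize("NFD", ch):
        if c in TONE_MARKS:
            tone = TONE_MARKS[c]
        else:
            kept.append(c)  # keeps non-tone marks such as the circumflex or horn
    return unicodedata.normalize("NFC", "".join(kept)), tone


def tone_variants(base: str) -> List[str]:
    """Return the unmarked letter plus its five tone-marked forms."""
    marked = [unicodedata.normalize("NFC", base + m) for m in TONE_MARKS]
    return [base] + marked


if __name__ == "__main__":
    print(split_tone("ấ"))     # base letter "â" with the acute tone
    print(tone_variants("a"))  # six classes that share one base shape
    total = sum(len(tone_variants(v)) for v in VOWELS)
    print(total)               # 12 vowels x 6 tone categories
```

Under these assumptions the twelve lowercase vowels alone expand to 72 classes, many of which differ only by a small mark above or below the letter, which is the kind of low inter-class diversity the abstract describes.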
Keywords/Search Tags: Vietnamese text, scene text detection, scene text recognition, attention mechanism, model joint training