
Research On Uyghur Text Recognition In The Scene Image

Posted on: 2022-02-02
Degree: Master
Type: Thesis
Country: China
Candidate: Z L Fu
Full Text: PDF
GTID: 2518306323979789
Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of the mobile Internet, image information is flooding all aspects of people's lives. Using computers to extract text from images automatically and efficiently is of great significance to the development of the country's informatization and digital economy. At present, research on text detection and recognition in scene images focuses mostly on Chinese and English, and there is little related research on Uyghur text. Because text recognition technology is widely applied in language translation, information retrieval, information security and other fields, research on Uyghur text recognition in scene images plays an important role in promoting the development of intelligent industry and the economy in Xinjiang.

With the rise of deep learning, many scene text detection and recognition methods based on deep neural networks have been proposed, which has greatly advanced the field. The detection and recognition of Uyghur text in scene images is one of the most challenging tasks in this research area. The difficulties can be summarized as follows:
(1) Word-level Uyghur text detection. Spacing occurs both between Uyghur words and within them, which makes word boundaries in scene images ambiguous and lowers word-level detection accuracy.
(2) Robust feature extraction of text regions. On the one hand, the texture features of Uyghur text are relatively simple, so background noise in the scene image is easily confused with Uyghur text and causes false positive detections. On the other hand, the length of Uyghur words varies greatly, which can easily lead to missed detections of small text.
(3) Adhesion in Uyghur writing. Adjacent Uyghur characters are often joined together, which makes it difficult to apply mainstream text recognition methods to Uyghur text.
(4) Similar characters. Uyghur has many visually similar characters, and this similarity greatly degrades the performance of mainstream text recognizers.

To cope with these difficulties, this thesis studies Uyghur text detection and recognition in scene images. The main work is as follows:
(1) A segmentation-based Uyghur text detection algorithm is proposed. The method introduces a text representation suited to Uyghur text, the pixel affinity pyramid, which encodes the instance affiliation between each pixel in the image and its multi-scale neighborhoods. This representation provides a fine-grained local description for reconstructing Uyghur text instances in post-processing, which benefits word-level Uyghur text detection. Experimental results show that the method outperforms existing text detection methods and achieves the best performance (F-measure 96.7%). In addition, to strengthen the network's ability to extract features of Uyghur text regions, this thesis proposes a region enhancement module (REM) and an attentional fusion module (AFM). The REM models the semantic correlation of regional features to capture global context information effectively, which helps suppress false positive detections. The AFM uses an attention mechanism to adaptively aggregate multi-level text semantic features, which benefits multi-scale text region detection. Comparative experiments demonstrate the effectiveness of the two modules: the REM improves performance by 1.2% and the AFM by 0.8%.
(2) An attention-based Uyghur text recognition method is proposed. The method introduces a parallel contextual attention mechanism into the parallel encoder-decoder framework to improve its visual feature alignment in Uyghur text recognition. The parallel contextual attention mechanism consists of two parts: a bidirectional language model and an attention operation. The bidirectional language model can be pre-trained on large-scale Uyghur text by self-supervised learning, providing the overall framework with strong Uyghur linguistic semantic information and reducing mispredictions of similar characters. Experiments show that the parallel contextual attention mechanism makes the model more robust to hard samples, such as blurred words, low-quality images and long Uyghur text, and greatly improves the model's performance.

Based on the above research, this thesis designs and implements a Uyghur keyword recognition system. The system consists of a Uyghur text detection model, a Uyghur text recognition model and a post-processing algorithm. The dataset proposed in this thesis is used to test the system. The test results show that the Uyghur keyword recognition system reaches practical performance (recall 94.8%, inference speed 4 FPS).
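The pixel affinity pyramid in contribution (1) is described only at a high level here. As a rough illustration of the idea, the sketch below builds ground-truth affinity maps from an instance label map: a pixel has affinity 1 with a neighbour when both belong to the same text instance, and the neighbourhood is repeated at several strides to give multiple scales. The 4-neighbourhood, the stride values, and the names `affinity_map` and `affinity_pyramid` are assumptions made for this example, not details taken from the thesis.

```python
from typing import Dict, List, Tuple

Grid = List[List[int]]

# Hypothetical 4-neighbourhood (up, down, left, right); the abstract does not
# specify the neighbourhood shape used in the thesis.
BASE_OFFSETS: List[Tuple[int, int]] = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def affinity_map(labels: Grid, dy: int, dx: int) -> Grid:
    """Return 1 where a pixel and its (dy, dx) neighbour share a text instance."""
    h, w = len(labels), len(labels[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                # Label 0 is background, so it never forms an affinity.
                same = labels[y][x] != 0 and labels[y][x] == labels[ny][nx]
                out[y][x] = int(same)
    return out


def affinity_pyramid(labels: Grid, strides=(1, 2, 4)) -> Dict[int, List[Grid]]:
    """Affinity maps for every base offset, scaled by each (assumed) stride."""
    return {
        s: [affinity_map(labels, dy * s, dx * s) for dy, dx in BASE_OFFSETS]
        for s in strides
    }


# Toy label map: 0 is background, 1 and 2 are two neighbouring words.
labels = [
    [1, 1, 0, 2, 2],
    [1, 1, 0, 2, 2],
]
pyramid = affinity_pyramid(labels, strides=(1, 2))
right_s1 = pyramid[1][3]  # affinity with the right neighbour at distance 1
right_s2 = pyramid[2][3]  # affinity with the right neighbour at distance 2
print(right_s1)
print(right_s2)
```

In this toy case the distance-1 map links pixels inside each word, while the distance-2 map is empty because every 2-pixel step crosses the gap or leaves the image. A post-processing step could group pixels with high predicted affinity into word instances, which is the kind of local cue the abstract says helps separate adjacent Uyghur words.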
Keywords/Search Tags: neural network, Uyghur text detection, Uyghur text recognition, instance segmentation, language model