
Urdu Natural Scene Text Recognition And Detection

Posted on: 2022-07-07
Degree: Master
Type: Thesis
Country: China
Candidate: Muhammad Salman
GTID: 2518306722493714
Subject: Computer application technology
Abstract/Summary:
Text detection and recognition in natural scenes is extremely challenging: it is affected by complex backgrounds, varied writing styles, text sizes and orientations, low-resolution images, and multiple languages. As a cursive script, Urdu is similar to other non-Latin scripts (such as Arabic and Hindi). Machine learning and deep learning algorithms have achieved excellent results in practical text recognition, but recognizing characters of non-Latin cursive scripts such as Urdu in natural images remains a relatively open research problem. Compared with non-cursive scripts, the characters of cursive scripts are intricate; in Urdu in particular, the way a character joins its neighbors changes with its position in the word. This greatly restricts the use of optical character recognition (OCR) for Arabic and Urdu in natural scene images.

Urdu natural scene datasets are still at the development stage, and training resources are limited. This thesis proposes a more comprehensive Urdu text recognition dataset to provide better training for neural network models. It comprises up to 2,500 natural scene images captured with a digital camera, organized into Urdu character images, Urdu word images, and end-to-end text recognition images, that is, at the character, word, and sentence levels. The scene images contain different text styles, sizes, colors, and backgrounds, as well as uneven lighting and text in arbitrary orientations. Compared with existing Arabic natural scene text datasets, our dataset has more samples.

Detecting Urdu text in natural scenes is made more difficult by lighting conditions, arbitrary text orientation, and variable text styles. To address poor Urdu text detection in natural scenes and inaccurate detection of text region boundaries, this thesis proposes a multi-level fusion (MLF) based algorithm for text recognition and detection in arbitrary orientations. First, a deep convolutional neural network extracts features at different scales, and features from different levels are merged through up-sampling to produce a feature map rich in semantic information. Then, the head and tail regions of the text are obtained by semantic segmentation, and the pixels in these regions are used to predict the boundary vertices, which overcomes the network's difficulty in detecting long text with blurred boundaries. Finally, online hard example mining (OHEM) is introduced to address the imbalance between positive and negative samples during training.

Urdu characters change shape depending on the characters before and after them, which alters the connections between adjacent characters and makes the features and forms of the detected characters variable and complex. To handle this diversity of connections and forms, this thesis proposes a multi-scale feature aggregation and multi-level fusion network architecture (MSA-MLF) to recognize Urdu characters in natural images. The MSA network first extracts features at different scales using a variety of convolution filters and then aggregates the multi-scale features through operations such as up-sampling. In the MLF network, the low-level features aggregated by MSA are combined with high-level features, so that the resulting features contain more spatial structure information.

The contributions of this thesis are as follows. First, a richer dataset of Urdu text in natural scene images is proposed and analyzed. The Urdu character dataset contains character images in different forms and styles, which helps the model learn richer detail features. The Urdu word dataset has more images than the existing Arabic word dataset and can serve as a benchmark for word recognition tasks. The Urdu text recognition dataset can also be applied to multilingual text detection and recognition problems.

Second, an arbitrary-orientation text detection algorithm based on multi-scale feature fusion is proposed to handle the variable, random orientation of Urdu text in natural scenes and the blurred boundaries of text regions. In the network, features at different scales are extracted and fused to strengthen the extraction of semantic and contextual information; the pixels in the head and tail regions of the text are used to predict the boundary vertices, which strengthens the representation of boundary features and improves segmentation accuracy.

Finally, in view of the variability of Urdu characters and the complexity of the connections between adjacent characters, a network architecture with multi-scale feature aggregation and multi-level fusion is proposed for Urdu character recognition. The architecture captures feature information at different scales and integrates the extracted character features at multiple levels. Fusing deep and shallow layers effectively enhances the semantic and spatial information of the features, so that the model fully captures the diverse, changing details of the characters themselves and the features of their joins.
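
The abstract describes merging features from different levels through up-sampling. As a rough illustration only, the sketch below up-samples a coarse ("deep") single-channel feature map with nearest-neighbour repetition and adds it element-wise to a finer ("shallow") map. The function names, the factor of 2, and the use of addition rather than concatenation are all assumptions made for this example; the abstract does not specify how the thesis's network performs the merge.

```python
# Hypothetical sketch: fuse a coarse feature map into a fine one by
# nearest-neighbour up-sampling followed by element-wise addition.
# Names and the additive merge are assumptions, not the thesis's design.

def upsample_nearest(fmap, factor=2):
    """Repeat every value `factor` times along both axes of a 2D list."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def fuse(shallow, deep, factor=2):
    """Up-sample `deep` to the size of `shallow` and add element-wise."""
    up = upsample_nearest(deep, factor)
    if len(up) != len(shallow) or len(up[0]) != len(shallow[0]):
        raise ValueError("shapes do not match after up-sampling")
    return [
        [s + d for s, d in zip(s_row, d_row)]
        for s_row, d_row in zip(shallow, up)
    ]


# A 4x4 fine map (local detail) and a 2x2 coarse map (broader context).
shallow = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
]
deep = [
    [10, 20],
    [30, 40],
]

fused = fuse(shallow, deep)
for row in fused:
    print(row)
```

Each fused cell carries both the fine-grained value from the shallow map and the coarse context from the deep map, which is the general idea behind combining low-level spatial detail with high-level semantic information.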
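
The abstract also mentions online hard example mining (OHEM) as a remedy for the imbalance between positive and negative samples. The sketch below shows one common variant, under assumptions: it keeps every positive sample and only the highest-loss negatives, capped at a fixed negative-to-positive ratio. The function names, the ratio of 3, and the toy loss values are hypothetical; the abstract does not state which OHEM variant or ratio the thesis uses.

```python
# Hypothetical sketch of ratio-based online hard example mining (OHEM).
# Keeps all positives plus the hardest (highest-loss) negatives.
# The 3:1 ratio and all names are assumptions for illustration.

def ohem_select(losses, labels, neg_pos_ratio=3):
    """Return the sorted indices of the samples kept for the loss."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Cap the number of negatives; treat "no positives" as one positive
    # so that a few hard negatives are still kept.
    n_keep = min(len(neg), neg_pos_ratio * max(len(pos), 1))
    hardest = sorted(neg, key=lambda i: losses[i], reverse=True)[:n_keep]
    return sorted(pos + hardest)


def ohem_loss(losses, labels, neg_pos_ratio=3):
    """Average the per-sample losses over the kept samples only."""
    kept = ohem_select(losses, labels, neg_pos_ratio)
    if not kept:
        return 0.0
    return sum(losses[i] for i in kept) / len(kept)


# One text pixel (label 1) among seven background pixels (label 0).
losses = [0.9, 0.05, 0.7, 0.01, 0.3, 0.02, 0.6, 0.04]
labels = [1, 0, 0, 0, 0, 0, 0, 0]

print(ohem_select(losses, labels))  # [0, 2, 4, 6]
print(ohem_loss(losses, labels))
```

In this toy batch the easy background samples (losses of 0.05 or less) are dropped, so the averaged loss is driven by the single positive and the three hardest negatives instead of being diluted by the many easy negatives.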
Keywords/Search Tags: Urdu Chinese text segmentation, Convolutional neural network and deep learning network