With the growth of the Internet, images have become a pervasive source of information in everyday life. Automatically and reliably recognizing the text in images is important for improving multimedia retrieval, scene understanding, and the control efficiency of network information platforms. Many methods exist for detecting and recognizing Chinese and English text in natural scenes, but the detection and recognition of Uyghur text is still at an exploratory stage and lacks a mature method. Using deep learning, this thesis studies the detection and recognition of Uyghur text in natural scene images, with the aim of building an intelligent and efficient Uyghur text recognition system. The main research contents are as follows.

1. The Synth Text text-image synthesis method is studied and used to generate an artificial Uyghur image data set, which effectively addresses the lack of training samples for Uyghur text detection. Using the segmentation and depth information of the original background image, the method selects a suitable embedding region and inserts the text into the background image so that it looks natural. However, the images synthesized by this method lose the concatenation (letter-joining) features of Uyghur script. To compensate, this thesis uses the PIL image processing library to draw a batch of Uyghur images that retain the concatenation features, as a supplement to the synthesized data set.

2. Two text detection algorithms, CTPN and EAST, are studied and each is applied in Uyghur text detection experiments. The results show that CTPN detects long Uyghur text more completely but performs poorly on tilted text; EAST detects text at any angle and runs faster, but its detections are incomplete or missed altogether when the Uyghur text is long. In addition, CTPN achieves a higher F-measure than EAST on the test set and better detects the dots around Uyghur characters. CTPN is therefore selected as the base model for Uyghur text detection.

3. Two mainstream text recognition frameworks, one based on the attention mechanism and one on the CTC loss function, are studied, and the CRNN+CTC scheme is selected for Uyghur text recognition. Because Uyghur is written from right to left while CRNN extracts features from left to right, the character sequence output by CTC is in the reverse order of the label sequence, which causes the model to be trained incorrectly. To address this, this thesis modifies the output part of CTC and designs a character collation rule that conforms to Uyghur writing order, so that the network outputs Uyghur sequences in the correct writing order and the model can be trained correctly. The data set consists of 8 million Uyghur recognition images of size 280 × 32, generated with the PIL image processing library. The experimental results show that the CRNN_CTC model trained in this thesis reaches a Uyghur character recognition accuracy of 64% on the test set.

Building on this research, this thesis designs and implements a Uyghur text recognition system based on deep learning. The system consists of four modules: image preprocessing, Uyghur text detection, text image tilt correction, and Uyghur text recognition. The detection and recognition modules are implemented with the CTPN model and the CRNN_CTC model, respectively. Finally, the system is tested: with four processes running simultaneously on a single NVIDIA GTX 1080 Ti GPU, the average recognition time per image for each process is 0.663 s.
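
The writing-order mismatch described for the recognition model can be illustrated with a minimal Python sketch. It is not the collation rule designed in this thesis, whose details are not given here; it only assumes the simplest possible fix, namely reversing the left-to-right sequence produced by greedy CTC decoding so that it matches right-to-left writing order. The blank index, the function names, and the placeholder alphabet (Latin letters standing in for Uyghur characters) are all hypothetical.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical choice)


def ctc_greedy_collapse(frame_ids, blank=BLANK):
    """Standard greedy CTC collapse: merge repeated frame labels, then drop blanks.

    The result follows the order in which the network scans the image,
    i.e. left to right.
    """
    return [label for label, _ in groupby(frame_ids) if label != blank]


def to_writing_order(visual_ids):
    """Reverse a left-to-right (visual) sequence into right-to-left writing order.

    Assumption: a plain reversal stands in for the thesis's collation rule.
    """
    return visual_ids[::-1]


def decode_rtl(frame_ids, id_to_char, blank=BLANK):
    """Decode per-frame label indices into a string in writing order."""
    visual = ctc_greedy_collapse(frame_ids, blank)
    logical = to_writing_order(visual)
    return "".join(id_to_char[i] for i in logical)


# Placeholder alphabet: Latin letters stand in for Uyghur characters.
id_to_char = {1: "A", 2: "B", 3: "C"}

# The word is written A, B, C from right to left, so a left-to-right scan
# of the image meets C first, then B, then A.
frames = [0, 3, 3, 0, 2, 2, 2, 0, 1, 1]

print(decode_rtl(frames, id_to_char))  # prints "ABC"
```

The same reversal could instead be applied to the label sequences before training, so that the labels match the scan order of the network; which side is adjusted is a design choice that this sketch does not settle.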