| Script identification is a classical problem in text image processing, and it directly affects the overall recognition performance of a multi-script text image recognition system. Natural scene text images often have complex backgrounds and diverse fonts, which makes their script difficult to identify. In addition, different scripts may share similar or even identical characters, which further increases the difficulty of script identification. To improve script identification on natural scene text images, this paper proposes two methods: a script identification method based on an attention mechanism and multi-scale feature fusion, and a script identification method based on context semantic enhancement. The effectiveness of the proposed methods is further verified in a multi-script text image recognition system. The specific work is as follows: (1) To address the complex backgrounds of scene text images, a multi-scale feature fusion script identification method based on an attention mechanism is proposed. First, after local features of the text image are extracted, a multi-scale feature fusion network extracts global features and fuses the two kinds of features. A soft channel attention module then weights the multi-scale features channel-wise, strengthening the features that matter most for script identification and thereby improving performance. This method achieves accuracies of 99.03%, 96.18%, 89.64% and 94.05% on the three public datasets CVSI-2015, SIW-13 and RRC-MLT2017 and the private dataset Keda2030, respectively. (2) To address the case in which different scripts use similar or even identical characters, a script identification method based on context semantic enhancement is proposed. This method combines the rich image representation generated by a
convolutional neural network with the sequence semantic features extracted by a recurrent neural network, so that the network captures image-level features as well as context information. The method then applies an attention mechanism that lets the network adaptively weight the image features and context information, so as to better identify scripts. This method achieves accuracies of 99.10%, 96.49%, 89.93% and 94.65% on the CVSI-2015, SIW-13, RRC-MLT2017 and Keda2030 datasets, respectively. (3) To verify that script identification can effectively improve a unified multi-script text image recognition system, the two script identification methods above are each integrated into the text recognition system. The script of the input text image is first identified; in parallel, the unified multi-script text recognition model is trained per script to obtain script-specific recognition models; the recognition model matching the script predicted by the script identification method is then selected. Experiments on eight scripts of the Keda2030 dataset show that the recognition performance of the text image recognition system based on unified multi-script modeling improves after integrating either of the two proposed script identification methods. With the methods proposed in Chapter 2 and Chapter 3, the average string recognition accuracy over the eight scripts increases by 4.39% and 4.88%, and the average character-level accuracy increases by 2.28% and 2.51%, respectively. |
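
The integration described in item (3), identifying the script first and then dispatching to a script-specific recognizer, can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the paper: the function `recognize_with_script_routing`, the toy identifier and recognizers, and the dictionary used as a stand-in for an image. The fallback to the unified model when no script-specific model exists is also an assumption made for the sketch, not a detail the abstract states.

```python
from typing import Callable, Dict, Tuple

Image = dict  # hypothetical stand-in for a real image tensor


def recognize_with_script_routing(
    image: Image,
    identify_script: Callable[[Image], str],
    recognizers: Dict[str, Callable[[Image], str]],
    unified_recognizer: Callable[[Image], str],
) -> Tuple[str, str]:
    """Identify the script, then decode with the matching recognizer."""
    script = identify_script(image)
    # Assumption: fall back to the unified model when no
    # script-specific recognizer is registered for the prediction.
    recognizer = recognizers.get(script, unified_recognizer)
    return script, recognizer(image)


# Toy stand-ins so the sketch runs end to end (these are not real models).
def toy_identify_script(image: Image) -> str:
    return image["script_hint"]


def make_toy_recognizer(tag: str) -> Callable[[Image], str]:
    # Returns a fake recognizer that labels its output with the model tag.
    return lambda image: f"{tag}:{image['text']}"


toy_recognizers = {
    "Latin": make_toy_recognizer("latin-model"),
    "Arabic": make_toy_recognizer("arabic-model"),
}
toy_unified = make_toy_recognizer("unified-model")

sample = {"script_hint": "Latin", "text": "hello"}
result = recognize_with_script_routing(
    sample, toy_identify_script, toy_recognizers, toy_unified
)
print(result)
```

In this arrangement a wrong script prediction sends the image to the wrong recognizer, which is one plausible reason the accuracy of the script identification stage bears directly on the end-to-end recognition results reported above.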