With the development of artificial intelligence, text recognition, an important branch of pattern recognition, has also advanced rapidly. Tibetan text recognition is an important research topic in pattern recognition and Tibetan information processing, and it is significant for the protection and utilization of Tibetan ancient books. Current text recognition research focuses on Chinese and English, and there is relatively little research on recognizing Tibetan ancient texts. Combining the characteristics of Tibetan text with the layout of Tibetan ancient books, this paper studies multi-line text recognition algorithms for Tibetan ancient books based on implicit segmentation and on segmentation-free recognition. The main work is as follows:
(1) To address the lack of data for Tibetan ancient books, we take into account the backgrounds, fonts, and font sizes of these books and propose data augmentation methods including resolution modification, perspective transformation, elastic deformation, random projection transformation, dilation and erosion, brightness and contrast adjustment, and text skew rotation. Combined with real data, these methods are used to construct two Tibetan datasets of size 21,912 and 15,777. Training results improve substantially compared with training on the real samples alone.
(2) An attention-based multi-line text recognition method for Tibetan ancient texts is proposed. It uses an attention-based encoder-decoder structure in which the encoder is a fully convolutional network (FCN) with an attention module, which both reduces computation time and improves recognition results. In experiments on synthetic and real data of Tibetan ancient texts, the recognition error rate is below 15%.
(3) A dimensional-folding-based method for recognizing multiple lines of text in Tibetan ancient books is proposed. It folds multiple text lines into a single line using bilinear interpolation combined with a ResNet encoder, and trains the model with the standard CTC loss in the transformation network. In experiments on synthetic and real data of Tibetan ancient books, it achieves a recognition error rate below 10%.
In summary, both methods are evaluated on real and synthetic datasets that contain multiple fonts, crowded lines, noise, and curved text lines, and the results validate the effectiveness and robustness of the proposed multi-line text recognition methods for Tibetan ancient texts.
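As an illustration of two of the augmentation steps listed in (1), the following is a minimal sketch of dilation/erosion and brightness/contrast adjustment on a grayscale image stored as a list of rows. It is not the implementation used in this work: the function names, the 3x3 neighbourhood, and the parameter ranges in `augment` are all assumptions chosen for illustration.

```python
import random

def adjust_brightness_contrast(img, alpha=1.0, beta=0.0):
    """Map every pixel p to alpha * p + beta, clamped to the range 0..255."""
    return [[max(0, min(255, int(round(alpha * p + beta)))) for p in row]
            for row in img]

def _morph(img, pick):
    """Replace each pixel by pick() over its 3x3 neighbourhood (clipped at borders)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(pick(window))
        out.append(row)
    return out

def dilate(img):
    """Grey-level dilation: neighbourhood maximum."""
    return _morph(img, max)

def erode(img):
    """Grey-level erosion: neighbourhood minimum."""
    return _morph(img, min)

def augment(img, rng):
    """Apply one random morphological step (or none), then a random
    brightness/contrast change. The ranges below are illustrative assumptions."""
    step = rng.choice([dilate, erode, lambda x: x])
    alpha = rng.uniform(0.8, 1.2)   # contrast factor (assumed range)
    beta = rng.uniform(-20, 20)     # brightness shift (assumed range)
    return adjust_brightness_contrast(step(img), alpha, beta)
```

A call such as `augment(page, random.Random(0))` returns a new image of the same size. For dark ink on a light background, erosion (the minimum filter) thickens strokes and dilation thins them, which is one plausible way to imitate variation in ink spread.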
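The standard CTC loss mentioned in (3) can be sketched with the usual forward (alpha) recursion. The code below is a plain-Python illustration rather than the training code of this work; the function name, the use of raw probabilities instead of log-probabilities, and the choice of index 0 for the blank symbol are assumptions.

```python
import math

def ctc_neg_log_likelihood(probs, target, blank=0):
    """CTC loss for one sequence.

    probs  : list of T frames, each a list of C class probabilities.
    target : list of label ids without blanks.
    Returns -log P(target | probs), or math.inf if no alignment exists.
    """
    # Extended label sequence: blank, l1, blank, l2, ..., blank.
    ext = [blank]
    for label in target:
        ext += [label, blank]
    size = len(ext)

    # Initialisation: a path may start with a blank or with the first label.
    alpha = [0.0] * size
    alpha[0] = probs[0][blank]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            # Skip a blank only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # Valid paths end on the last label or on the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return -math.log(likelihood) if likelihood > 0 else math.inf
```

With two frames, two classes, and uniform probabilities, the target `[1]` has three valid alignments (label-blank, blank-label, label-label), so its likelihood is 0.75. A real implementation would work in log space for numerical stability on long sequences.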