| Currently, Tibetan regular script recognition technology is quite mature, but Tibetan cursive script recognition is still in the early stages of research. Tibetan script mainly consists of regular script and cursive script, each of which can be further divided into different fonts. While formal documents and scriptures mostly use neat Tibetan regular script, cursive script is commonly used for drafting documents and recording events, especially in informal writing among the Tibetan people. Therefore, detecting and recognizing cursive script is equally important. Compared with the neat Tibetan regular script, cursive script emphasizes speed and natural flow, with fluent and casual strokes, concise lines, diverse stroke variations, and glyph variants. Consequently, stroke adhesion and overlap make detecting and recognizing printed Tibetan cursive script challenging. This paper explores the detection and recognition of six types of cursive script in the Everest font series using deep learning technology, including: (1) Dataset Construction: To address the lack of datasets for detecting and recognizing printed Tibetan cursive script, six datasets have been constructed, including scene Tibetan text detection, printed cursive script detection, brush-written Tibetan cursive script line text recognition, multi-style cursive script line text recognition, and whole-page brush-written Tibetan text recognition. (2) Tibetan Text Detection: To address the problem of detecting Tibetan text in complex backgrounds, a multi-scene Tibetan text detection method based on a multi-scale hybrid attention network (MHAN) is proposed. This method integrates low-level and high-level semantic features through multi-scale residual connections (MSRC), effectively aggregating multi-scale features and preserving more foreground text features. Additionally, by enhancing the information of different feature maps through hybrid attention (HA), it improves the ability of 
the detection head to accurately infer text. The accuracy on the two test datasets, natural scene Tibetan text detection and printed Tibetan cursive text detection, reached 85.2% and 99.04%, respectively. This was the best text detection accuracy in benchmark tests from the same period. (3) Line-Level Recognition of Printed Tibetan Cursive Script: To address stroke adhesion and intersection in Tibetan cursive script, two line-level recognition methods for printed cursive script are proposed. First, focusing on individual cursive characters, a recognition method for Drutsa cursive script based on the RCNN+Char_Seg Net model is proposed. This method achieves an average accuracy of 91.43% on the Drutsa cursive script test dataset. Furthermore, to recognize cursive scripts in multiple styles, a recognition method for multi-style Tibetan cursive scripts based on feature-refined patch embedding is proposed. The average recognition accuracy on the test dataset containing six styles of Tibetan cursive script reaches 92.5%. (4) Full-Page Recognition of Printed Tibetan Cursive Script: For recognizing entire pages of cursive script, an end-to-end Transformer-based method for printed Tibetan cursive script is proposed, which requires no character or text-line segmentation. First, to effectively address model underfitting, a Chinese handwriting recognition model was pre-trained. Additionally, to enable the model to distinguish between lines of text, a newline token “” was added to the Tokenizer. This method achieved an average character error rate of 12.48% on three different types of Drutsa cursive script test datasets: normal, compact-line, and distorted. A Tibetan syllable correction algorithm was proposed in the post-processing stage of entire-page recognition, with an average correction accuracy of 27.34% on three different 
datasets, further improving the accuracy of the recognition results. In summary, this paper first constructs six datasets for detection and recognition. Second, a Tibetan text detection method based on a multi-scale hybrid attention network is proposed for the detection stage. Then, two line-level recognition methods and one whole-page recognition method for Tibetan cursive script are proposed. Finally, a Tibetan syllable correction algorithm for post-processing whole-page recognition is proposed to further improve whole-page recognition results. |
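
The whole-page result above is reported as a character error rate (CER). The abstract does not say how CER is computed, so the following is only a minimal sketch of the conventional definition: the Levenshtein edit distance between the reference and the recognized text, divided by the reference length. The function names `levenshtein` and `character_error_rate` are illustrative, and treating each Unicode code point as one "character" is an assumption; for Tibetan, a stacked glyph spans several code points, and the paper may count units differently.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # substitute r with h (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance / reference length (assumed conventional definition)."""
    if not ref:
        raise ValueError("reference text must not be empty")
    # Assumption: one Unicode code point counts as one character.
    return levenshtein(ref, hyp) / len(ref)
```

Under this definition, a page-level CER would be obtained by comparing each recognized page with its ground-truth transcription; whether the paper averages per page or pools all characters across the three test sets is not stated.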