Research On Multi-shape Scene Tibetan Text Detection Technology

Posted on: 2024-04-21 | Degree: Master | Type: Thesis
Country: China | Candidate: Z Z Xu | Full Text: PDF
GTID: 2555307085970779 | Subject: Computer application technology
Abstract/Summary:
Tibetan culture is one of the treasures among the diverse cultures of the Chinese nation. Digitising the large body of Tibetan literature is necessary for preserving and developing this heritage, and Tibetan text detection is an indispensable part of both that digitisation and the development of Tibetan informatization. One difficulty in Tibetan text detection research is detecting multi-shaped text regions, such as curved and skewed text, in different scenes. Adapting and extending the ideas behind Chinese and English text detection techniques is an effective way to advance Tibetan text detection. The main contributions of this paper are as follows.

First, to address the complex relationships between the text regions in different parts of a Tibetan text image, and the fact that Tibetan is a two-dimensional script, this paper proposes a text detection model based on the idea of component connectivity: a convolutional neural network predicts the text components in the image, and a graph convolutional network then infers the deeper relationships between those components. Experiments on Tibetan image datasets show that the graph convolutional network significantly improves Tibetan text detection, and that the model achieves excellent results compared with other models.

Second, because an attention mechanism can capture important information and ignore redundant information, this paper proposes an improved method that introduces an attention mechanism into an object detection model with an encoder-decoder structure, so that the attention module focuses on the region surrounding the target rather than attending globally. Experiments on the Tibetan image dataset show that the text detection efficiency of the model is improved compared with some popular models.

A comparative analysis of the two sets of experiments shows that the model based on a graph neural network outperforms the model based on the attention mechanism on irregularly shaped text, such as strongly curved text, while the attention-based model has higher overall detection efficiency than the graph-based model on more regular text datasets.
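
The component-connectivity idea can be illustrated with a small sketch. This is not the thesis's model: it assumes each text component is reduced to a 2-D centre point, links components with a plain k-nearest-neighbour rule, and applies one standard graph convolution step (symmetric normalisation with self-loops, followed by ReLU). The names `knn_adjacency`, `normalized_adjacency`, and `gcn_layer`, and all values below, are hypothetical and chosen only for illustration.

```python
import math


def knn_adjacency(centers, k):
    """Link each component to its k nearest neighbours (symmetric 0/1 matrix)."""
    n = len(centers)
    adj = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(centers):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.hypot(centers[j][0] - xi, centers[j][1] - yi),
        )
        for j in others[:k]:
            adj[i][j] = adj[j][i] = 1.0
    return adj


def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, the usual GCN propagation matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(features, adj, weight):
    """One graph convolution step: ReLU(A_norm @ X @ W)."""
    propagated = matmul(normalized_adjacency(adj), features)
    return [[max(v, 0.0) for v in row] for row in matmul(propagated, weight)]


# Hypothetical component centres: three along a curved line, one far away.
centers = [(0.0, 0.0), (1.0, 0.2), (2.0, 0.8), (10.0, 10.0)]
features = [[x, y] for x, y in centers]        # assumed per-component features
weight = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.5]]   # assumed (untrained) weights

adj = knn_adjacency(centers, k=1)
hidden = gcn_layer(features, adj, weight)
print(len(hidden), len(hidden[0]))
```

In a trained detector the weights would be learned and the output would typically feed a link classifier that decides which neighbouring components belong to the same text instance; here the layer only shows how each component's features are mixed with those of its neighbours.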
Keywords/Search Tags:object detection, text detection, graph convolutional network, attention mechanism