
Research On Text Detection Method Based On Graph Neural Network

Posted on: 2023-02-26
Degree: Master
Type: Thesis
Country: China
Candidate: P S Zhou
Full Text: PDF
GTID: 2568307025992639
Subject: Computer system architecture
Abstract/Summary:
Natural scene text is logically structured and highly expressive, and it provides rich high-level semantic information. With the rapid development of Internet technology, the analysis and processing of natural scene text has become a research hotspot in computer vision. Mainstream text detection methods are currently based on deep learning. DRRG (Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection) uses a local graph structure to connect a text proposal network with a deep relational reasoning network, which allows end-to-end training and prediction. However, it has several problems. Learning graph node representations with a GCN leads to over-smoothing, which reduces inference accuracy. In addition, the heavy computation and large parameter count of the feature extraction stage, together with a redundant network structure, make the model difficult to train and slow at detection. To address these problems, this thesis carries out the following work.

To address over-smoothing in the GCN-based inference of DRRG, this thesis proposes a scene text detection algorithm that considers the importance of text components (CITC), based on the graph attention mechanism. It uses graph attention to assign adaptive weights to the edges between the graph nodes that represent text components, which effectively alleviates the over-smoothing caused by GCN. An improved density clustering method is used to cluster the inferred text components, which reduces the time complexity relative to the breadth-first search used in the original algorithm.

To further reduce the computation and parameter count of CITC and simplify its network structure, an improved algorithm built on CITC is proposed: a lightweight deep inference scene text detection algorithm (LDIS) based on an improved MobileNet-A0 network. First, the MobileNetV2 structure is simplified: the redundant convolution and pooling layers at the bottom are removed to simplify the network and reduce computation. Second, a deep feature extraction module based on depthwise separable convolution convolves the feature map separately along the spatial and channel dimensions, further reducing the number of parameters. Finally, a channel mixing mechanism is introduced to make effective use of the feature information of different channels at the same spatial location, compensating for the feature extraction ability lost in the lighter model.

To verify its effectiveness, LDIS is applied to ID card text detection and business license text detection. For ID cards, it alleviates to some extent the image quality problems caused by illumination intensity and shooting angle, and it improves detection accuracy. For business licenses, it addresses the effects of densely packed adjacent text regions and text diversity, and it improves detection accuracy.
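The adaptive edge weighting that the abstract attributes to the graph attention mechanism can be illustrated with a minimal pure-Python sketch. This is not the thesis's implementation: the function names (`attention_weights`, `aggregate`), the attention vectors `a_src` and `a_dst`, and the toy features are all hypothetical, and the learned linear transform of a full graph attention layer is omitted for brevity.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU, the nonlinearity commonly applied to raw attention scores."""
    return x if x > 0 else slope * x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_weights(features, neighbors, a_src, a_dst):
    """Return {i: {j: alpha_ij}}: softmax-normalized weights over each node's neighbors.

    Assumption: the score of edge (i, j) is LeakyReLU(a_src . h_i + a_dst . h_j),
    with no learned feature transform (hypothetical simplification).
    """
    weights = {}
    for i, nbrs in neighbors.items():
        scores = {
            j: leaky_relu(dot(a_src, features[i]) + dot(a_dst, features[j]))
            for j in nbrs
        }
        top = max(scores.values())  # subtract the max for numerical stability
        exps = {j: math.exp(s - top) for j, s in scores.items()}
        total = sum(exps.values())
        weights[i] = {j: e / total for j, e in exps.items()}
    return weights


def aggregate(features, weights):
    """Update each node as the attention-weighted sum of its neighbors' features."""
    return {
        i: [sum(w[j] * features[j][d] for j in w) for d in range(len(features[i]))]
        for i, w in weights.items()
    }


# Toy graph: three text-component nodes, fully connected with self-loops.
features = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.0, 1.0]}
neighbors = {0: [0, 1, 2], 1: [0, 1, 2], 2: [0, 1, 2]}

alpha = attention_weights(features, neighbors, a_src=[0.0, 0.0], a_dst=[1.0, 0.0])
updated = aggregate(features, alpha)
```

With both attention vectors set to zero, every neighbor receives the same weight and the update reduces to plain neighbor averaging. Non-zero vectors let a node weight similar components more heavily than dissimilar ones, which is the property the abstract relies on to alleviate over-smoothing.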
Keywords/Search Tags: natural scene text detection, graph attention mechanism, depthwise separable convolution, channel mixing mechanism