Research On Text Detection In Natural Scene Based On Deep Learning

Posted on: 2020-09-12
Degree: Master
Type: Thesis
Country: China
Candidate: D Q Wang
GTID: 2428330572489345
Subject: Computer application technology
Abstract/Summary:
Text information in natural scenes has clear semantics, which helps in understanding and analyzing scene content. In recent years, text detection and recognition in natural scenes has become an important research direction in computer vision and has attracted extensive attention from scholars and research institutions at home and abroad. Its results can be widely used in scene classification, automobile driving, robot vision and other fields. Within this pipeline, the quality of text region detection and localization directly affects the accuracy of the subsequent text recognition step. Most existing text detection techniques extract hand-crafted features based on the structural characteristics of the text itself and then apply machine learning methods to detect text regions. Because natural scenes are complex and the characters in them are diverse, such hand-crafted features usually apply only to specific situations, and the overall detection accuracy is low. With the development and maturing of deep neural networks, many scholars and research institutions have designed text detection network models that achieve end-to-end text localization. Compared with traditional methods, detection performance has improved greatly, but the feature learning stage is time-consuming. Therefore, this dissertation studies both traditional and deep network feature extraction methods for natural scenes, combines the two kinds of features, and uses traditional features to guide the extraction of deep network features, which speeds up feature extraction in the deep neural network.

The main contents of this dissertation are as follows. Firstly, data sets were searched and collected to form a text image database that reflects the multi-lingual, multi-directional, multi-scale and multi-morphological characteristics of text in real natural scene images with complex backgrounds. Secondly, the differences between text and background produced by traditional image extraction algorithms were studied, and different text structure features were used to extract text regions. The validity of saliency detection algorithms for scene text images was studied, and the detection effect of a visual attention model on text objects was analyzed. Different traditional feature extraction methods were combined to find the most prominent text region candidates. A machine learning method, chosen for its strong classification ability, was then used to distinguish text regions from non-text regions and so improve detection accuracy. Finally, traditional methods and deep learning methods were combined. The text summary map obtained by traditional feature fusion was used together with a convolutional neural network with strong feature extraction ability to obtain better text features. This enhanced the text feature representation and reduced the redundant information generated by the neural network, and the end-to-end processing mode of the deep detection network was used to locate the text area accurately.

The two scene text detection methods proposed in this dissertation improve the performance of scene text detection to a certain extent. Experimental results show that the proposed method using multi-scale MSER and the Itti model improves the comprehensive performance by 1 to 5 percentage points compared with other traditional methods, and that it can handle multi-scale text and other situations. The proposed method combining traditional features with the Advanced EAST model can locate scene text accurately when the text appears in multiple arrangements and languages. This work combines the advantages of the two approaches: it effectively extracts the salient features of the text area and also reduces the training time of the network. Experimental results on a variety of data sets show that the proposed methods are robust to text in different scenarios.
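The abstract does not say how the traditional feature maps are fused into the text summary map. As a rough illustration only, the sketch below assumes a weighted average of min-max normalized maps followed by a fixed threshold. The function names, the weights, the threshold and the toy values are all hypothetical and are not taken from the dissertation.

```python
def normalize(feature_map):
    """Min-max scale a 2D map (a list of rows) to the range [0, 1]."""
    values = [v for row in feature_map for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant map carries no contrast; return all zeros.
        return [[0.0 for _ in row] for row in feature_map]
    return [[(v - lo) / (hi - lo) for v in row] for row in feature_map]


def fuse_maps(maps, weights):
    """Weighted average of normalized maps that share the same shape.

    Assumption: linear fusion. The thesis may use a different rule.
    """
    if not maps or len(maps) != len(weights):
        raise ValueError("need one weight per map")
    total = sum(weights)
    normed = [normalize(m) for m in maps]
    rows, cols = len(normed[0]), len(normed[0][0])
    return [
        [sum(w * m[r][c] for m, w in zip(normed, weights)) / total
         for c in range(cols)]
        for r in range(rows)
    ]


def candidate_mask(summary_map, threshold=0.5):
    """Mark pixels whose fused score reaches the (hypothetical) threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in summary_map]


# Toy 3x3 inputs standing in for an MSER region mask and a saliency map.
mser_mask = [[0, 1, 1],
             [0, 1, 0],
             [0, 0, 0]]
saliency = [[10, 200, 180],
            [20, 90, 30],
            [5, 10, 15]]

summary = fuse_maps([mser_mask, saliency], weights=[0.5, 0.5])
mask = candidate_mask(summary, threshold=0.5)
```

In this toy case a pixel is kept as a text candidate only when the region mask and the saliency map together push its fused score to the threshold, which is one simple way to let several traditional cues agree on a candidate region before a classifier or network refines it.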
Keywords/Search Tags: text detection, deep learning, visual model, maximally stable extremal region, feature fusion