Research On Text Detection In Natural Scene Images

Posted on: 2020-06-16
Degree: Master
Type: Thesis
Country: China
Candidate: B C Xiong
GTID: 2428330590458205
Subject: Control Science and Engineering
Abstract/Summary:
Text in natural scene images conveys rich information concisely and helps people understand the scene, so text detection in natural scene images is a valuable research topic. This thesis takes the ICDAR-2013 focused scene dataset and the ICDAR-2015 incidental scene dataset as research objects and studies two kinds of text detection algorithm: one based on maximally stable extremal regions (MSER) and one based on deep learning. The main work of this thesis is as follows.

First, in MSER-based text detection, the classification accuracy obtained with a support vector machine and histogram of oriented gradients (HOG) features is not good enough. To address this, a text detection algorithm based on MSER and a residual network is proposed. ResNet-18 is used to classify candidate character regions, which yields better character classification accuracy and better text detection results.

Second, describing text areas as arbitrary quadrilaterals makes the learning method of the single shot detector unsuitable for the task of text detection. A vertex regression method is therefore used, which directly predicts the absolute differences of the coordinates of the four vertices. In addition, a regional spatial similarity measure based on Manhattan distance is proposed. It reduces the time spent on judging positive and negative default boxes in each training iteration from 1 minute 30 seconds to 0.1 seconds, which greatly improves the efficiency of network training and the accuracy of the text detection algorithm.

Third, the text detection network based on vertex regression and Manhattan distance measurement performs poorly on vertical text areas and on text areas at a large angle. To address this, a text detection network based on a multi-kernel rotated module is proposed. Rotated default boxes are used to detect arbitrarily oriented text, and a random rotation cascade and multi-kernel convolution module is used to alleviate the drop in detection accuracy caused by the imbalance between the numbers of horizontal and vertical text areas in natural scenes. At the same time, focal loss replaces the online hard negative mining strategy, which alleviates the imbalance between positive and negative samples and improves detection accuracy.

Finally, to address the over-fitting of the text detection network based on the multi-kernel rotated module, a text detection network based on multi-task learning is proposed. The text detection network based on the multi-kernel rotated module and a fully convolutional network supervise the same VGGNet-16 backbone and learn the location information and semantic segmentation information of text simultaneously, which alleviates over-fitting and improves accuracy. On this basis, a multi-task output fusion algorithm is proposed. It extracts structured information from the semantic segmentation results of the fully convolutional network and fuses it with the output of the text detection network based on the multi-kernel rotated module to further improve accuracy.

The experimental results show that the text detection network based on multi-task learning achieves an F1 score of 0.78 on the ICDAR-2015 dataset, which is 3 percentage points higher than the SegLink algorithm, 2 percentage points higher than the EAST algorithm with a VGGNet-16 backbone, and 1 percentage point higher than the SSTD algorithm.
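The abstract says focal loss replaces online hard negative mining to ease the imbalance between positive and negative samples. A minimal sketch of the standard binary focal loss shows why it has that effect. The function names and the values gamma = 2.0 and alpha = 0.25 are assumptions for illustration; the abstract does not state the settings used in the thesis.

```python
import math


def cross_entropy(p, y):
    """Plain binary cross-entropy for one default box.

    p: predicted probability that the box contains text (0 < p < 1)
    y: ground-truth label, 1 for text and 0 for background
    """
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).

    gamma and alpha are assumed illustrative values,
    not settings taken from the thesis.
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) samples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # An easy background box (p close to 0) and a hard one (p close to 1)
    for p in (0.01, 0.9):
        print(p, cross_entropy(p, 0), focal_loss(p, 0))
```

Because the modulating factor suppresses the many easy background boxes, every default box can contribute to the loss without an explicit mining step to select hard negatives.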
Keywords/Search Tags: Natural Scene, Text Detection, Semantic Segmentation, Multi-task Learning