
Deep Learning In Scene Text Detection

Posted on: 2021-12-05 | Degree: Master | Type: Thesis
Country: China | Candidate: D Pan | Full Text: PDF
GTID: 2518306476450184 | Subject: Signal and Information Processing
Abstract/Summary:
Scene text detection has drawn much attention with the rise of deep convolutional neural networks. It has great practical value in a variety of scenarios, such as advertisement filtering, scene understanding, document analysis and robot navigation. However, many challenges remain, including image distortion, extreme illumination, incomplete characters, and large variance in scale, aspect ratio and direction. The main contributions of this thesis are as follows:
1. A text detection algorithm named AC-EAST is proposed as an improvement on EAST, a semantic segmentation method. AC-EAST applies atrous convolution to obtain a large receptive field while keeping the feature map at a reasonable size, and uses an atrous spatial pyramid pooling structure to extract features at different scales. AC-EAST outperforms other leading text detection algorithms, achieving an F-score of 0.826 on the ICDAR 2015 test dataset.
2. A text detection algorithm named ITPN is proposed as an improvement on TextBoxes++, an object detection method. A modified prior box generation mechanism is adopted to detect small text, text of extreme aspect ratio and vertical text. The Inception output layers apply different convolutional kernels to different kinds of prior boxes. Experiments show that ITPN detects small text and text of large aspect ratio with high accuracy. It achieves a recall of 0.838 on the ICDAR 2015 test dataset, outperforming other methods.
3. BLSTD, a text detection algorithm combining semantic segmentation and object detection, is proposed. It combines AC-EAST and ITPN using an attention mechanism and a fused non-maximum suppression algorithm, making it possible to detect general text of all kinds. Experiments show that BLSTD has both high accuracy and high recall. Unlike other methods, BLSTD needs only a single non-maximum suppression step as post-processing, which greatly reduces detection time. A business license text detection system is also built on this algorithm, showing strong results not only on business licenses but also on other licenses and receipts.
Keywords/Search Tags: Deep convolutional neural network, Scene text detection, Semantic segmentation, Object detection, Atrous convolution
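The abstract's claim that atrous convolution gives a large receptive field at a reasonable feature map size can be illustrated with a small calculation. The sketch below is not taken from the thesis: the layer stacks and dilation rates (1, 2, 4) are hypothetical values chosen only for illustration, and the function names are made up for this example. It relies on the standard relation that a kernel of size k with dilation d spans k + (k - 1)(d - 1) input positions.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by a dilated (atrous) kernel on one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers) -> int:
    """Receptive field of a stack of convolution layers along one axis.

    `layers` is a list of (kernel_size, stride, dilation) tuples,
    ordered from the input side to the output side.
    """
    rf = 1    # a single input pixel sees itself
    jump = 1  # distance in input pixels between adjacent outputs
    for kernel_size, stride, dilation in layers:
        rf += (effective_kernel_size(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


# Hypothetical stacks: three 3x3 layers, all stride 1 (no downsampling).
plain = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]
atrous = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]  # illustrative dilation rates

print(receptive_field(plain))   # 7
print(receptive_field(atrous))  # 15
```

Both stacks keep stride 1, so the feature map resolution is unchanged, yet the dilated stack roughly doubles the receptive field with the same number of weights. Which rates AC-EAST actually uses is not stated in the abstract.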