Scene Text Detection Based On Feature Fusion And Pyramid Attention

Posted on: 2021-01-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Feng
GTID: 2518306467957169
Subject: Control Science and Engineering
Abstract/Summary:
With the growing demand for Internet products, more and more applications rely on extracting text from images. Research on text detection in natural scenes is currently dominated by deep learning. However, deep-learning-based algorithms generally lack refinement at the feature level, so even well-designed models cannot be fully exploited. In addition, the convolution operator has a local receptive field, so long-range dependencies can only be captured by stacking multiple convolutional layers. To address these problems, this thesis combines a feature fusion module with a feature pyramid attention module to improve natural scene text detection. The main work of this thesis is as follows:

1. Analyze and summarize the relevant aspects of scene text detection, including deep convolutional networks, deep-learning-based object detection frameworks, semantic segmentation, instance segmentation, and popular natural scene text detection algorithms. These provide the theoretical basis for the two proposed algorithms: scene text detection based on feature fusion, and scene text detection based on feature pyramid attention.

2. Design and implement a feature fusion module on top of the basic feature extraction network (the PixelLink algorithm). In deep networks, the deeper layers contain more semantic information but have lower resolution and a weaker ability to perceive details, while the shallower layers contain more content description, location, and detail information but less semantic information. The proposed feature fusion module combines the feature information of every level, increasing the amount of information in the feature maps and further improving performance. Compared with the PixelLink algorithm, the scene text detection algorithm based on feature fusion improves the F-measure by 0.36% and 3.85% on ICDAR2015 and ICDAR2013, respectively.

3. Design and implement a feature pyramid attention module on top of the basic feature extraction network (the PixelLink algorithm) and feature fusion. The attention network can expand the receptive field without additional computation, and the spatial pyramid structure uses different grid scales or dilation rates to fuse multi-scale feature information. The feature pyramid attention module has three branches: the refined pyramid network, the nonlinear transformation, and the global average pooling. The refined pyramid network adopts a coarse-to-fine strategy so that higher feature layers carry richer information without increasing the number of parameters or the computation. Compared with the PixelLink algorithm, the scene text detection algorithm based on feature fusion and pyramid attention improves the F-measure by 2.91% and 4.04% on ICDAR2015 and ICDAR2013, respectively.
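
The two ideas in the abstract, fusing a low-resolution deep feature map with a high-resolution shallow one and combining three attention branches, can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not say how the maps are fused or how the branches are combined, so nearest-neighbour upsampling with element-wise addition, a 2x2 average pool plus sigmoid as the "pyramid" branch, ReLU as the nonlinear transformation, and the multiply-then-add combination are all assumptions. All function names are hypothetical, and a single-channel map is represented as a nested list to keep the example dependency-free.

```python
import math
from typing import List

Grid = List[List[float]]  # one single-channel feature map


def upsample_nearest(fmap: Grid, factor: int) -> Grid:
    """Repeat every value `factor` times along both axes."""
    out: Grid = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse(shallow: Grid, deep: Grid) -> Grid:
    """Assumed fusion: upsample the deep map, then add element-wise."""
    factor = len(shallow) // len(deep)
    up = upsample_nearest(deep, factor)
    return [[s + d for s, d in zip(srow, drow)]
            for srow, drow in zip(shallow, up)]


def global_average_pool(fmap: Grid) -> float:
    """Mean of all positions in the map."""
    return sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))


def avg_pool2(fmap: Grid) -> Grid:
    """2x2 average pooling with stride 2 (height and width must be even)."""
    return [[(fmap[i][j] + fmap[i][j + 1]
              + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 4.0
             for j in range(0, len(fmap[0]), 2)]
            for i in range(0, len(fmap), 2)]


def pyramid_attention(fmap: Grid) -> Grid:
    """Assumed three-branch combination: transformed * attention + context."""
    # Branch 1 (assumption): coarse view of the map, upsampled back and
    # squashed to (0, 1) to act as a per-position attention weight.
    coarse = upsample_nearest(avg_pool2(fmap), 2)
    attention = [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in coarse]
    # Branch 2 (assumption): ReLU stands in for the nonlinear transformation.
    transformed = [[max(0.0, v) for v in row] for row in fmap]
    # Branch 3: global average pooling, added to every position.
    context = global_average_pool(fmap)
    return [[t * a + context for t, a in zip(trow, arow)]
            for trow, arow in zip(transformed, attention)]


shallow_map = [[1.0] * 4 for _ in range(4)]   # high resolution, fine detail
deep_map = [[1.0, 2.0], [3.0, 4.0]]           # low resolution, more semantic
fused_map = fuse(shallow_map, deep_map)
attended_map = pyramid_attention(fused_map)
```

Here `fused_map` keeps the 4x4 resolution of the shallow map while every position also carries the value of the deep cell that covers it, which is the sense in which fusion "increases the amount of information" per feature map. A real detector would apply learned convolutions on multi-channel tensors in each branch.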
Keywords/Search Tags: Natural Scene Text Detection, Feature Fusion, Feature Pyramid Attention Module, ICDAR2015, ICDAR2013