Research On Key Technologies Of Multi-oriented And Arbitrary-shaped Scene Text Detection

Posted on:2021-03-24

Degree:Master

Type:Thesis

Country:China

Candidate:Y Xiao

Full Text:PDF

GTID:2428330647951060

Subject:Computer Science and Technology

Abstract/Summary:


In recent years, with the emergence of deep learning, represented by Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), research on scene text detection has made new progress. However, scene text detection remains a very challenging task for two reasons. First, images of natural scenes often have complex backgrounds, which can easily interfere with the detection process. Second, the forms of text in natural scenes are very diverse: horizontal and inclined text, as well as straight and curved text, may appear in the same scene image. To better solve the problem of multi-oriented and arbitrary-shaped scene text detection, this thesis studies the key technologies of this problem based on Mask R-CNN and proposes two algorithms. The main contents of this thesis are as follows:

(1) To address the problem that text-like objects in the backgrounds of scene images are easily misclassified as text, this thesis proposes a scene text detection algorithm that combines an attention mechanism with instance segmentation. Based on Mask R-CNN, a new attention module, the Text-context-aware Attention Module (TCAM), is proposed. In the network architecture of this algorithm, TCAM is connected to each level of the feature pyramid of the original Mask R-CNN. TCAM uses channel attention and spatial attention at the same time and combines the two by addition. TCAM can effectively suppress the false-positive detection boxes produced by text-like objects in the background, thus improving detection performance. The proposed algorithm achieves F-measures of 84.60% and 70.20% on the ICDAR2015 and ICDAR2017-MLT datasets, respectively.

(2) To better handle the variation in scale of scene text, this thesis further proposes a scene text detection algorithm based on multi-level feature fusion. Based on Mask R-CNN, a Pyramid Feature Fusion Module and a Multi-layer RoI Feature Fusion Module are proposed to improve how the feature pyramid in the original Mask R-CNN is constructed and used, and thereby improve the algorithm's ability to handle variation in text scale. The Pyramid Feature Fusion Module uses both top-down and bottom-up feature fusion paths, so that information from shallow and deep features is fully exchanged and fused. This module simultaneously enhances the representational capability of shallow features for detecting small text and of deep features for detecting large text, thus improving detection performance on both. The Multi-layer RoI Feature Fusion Module combines all levels of feature maps in the feature pyramid to extract the prediction features for text candidate regions, which enables the extracted features to better highlight the local and global characteristics of text instances, thus further improving the overall detection performance of the algorithm. Finally, this algorithm uses deformable convolution in its backbone network, which further enhances its ability to handle variation in text scale. The proposed algorithm achieves F-measures of 93.01%, 87.80%, 76.39% and 84.15% on the ICDAR2013, ICDAR2015, ICDAR2017-MLT and SCUT-CTW1500 datasets, respectively.
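The abstract states only that TCAM applies channel attention and spatial attention together and combines them by addition; it gives no layer details. As a rough illustration of that one idea, the sketch below computes a per-channel gate and a per-location gate on a small feature map and adds them before rescaling the features. Everything in it is an assumption made for illustration: the function names are hypothetical, the gates are parameter-free (average pooling followed by a sigmoid) rather than learned, and it is not the thesis's actual module.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_gate(x):
    """One weight per channel: sigmoid of the channel's global average.

    x is a feature map stored as nested lists, indexed x[channel][row][col].
    A real module would normally pass the pooled vector through learned
    layers; that step is omitted here (assumption).
    """
    return [sigmoid(sum(map(sum, ch)) / (len(ch) * len(ch[0]))) for ch in x]


def spatial_gate(x):
    """One weight per location: sigmoid of the mean over channels."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    return [
        [sigmoid(sum(x[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]


def add_fused_attention(x):
    """Rescale x by the SUM of the channel gate and the spatial gate.

    x * (g_c + g_s) equals x * g_c + x * g_s, i.e. the two attended
    feature maps added together.
    """
    c, h, w = len(x), len(x[0]), len(x[0][0])
    g_c = channel_gate(x)
    g_s = spatial_gate(x)
    return [
        [[x[k][i][j] * (g_c[k] + g_s[i][j]) for j in range(w)] for i in range(h)]
        for k in range(c)
    ]


if __name__ == "__main__":
    # Toy 2-channel, 2x2 feature map.
    feature = [[[0.0, 0.0], [0.0, 0.0]], [[2.0, 2.0], [2.0, 2.0]]]
    for channel in add_fused_attention(feature):
        print(channel)
```

In a detector of the kind the abstract describes, a module like this would presumably be applied separately to each level of the feature pyramid, with the output keeping the same shape as its input so that the rest of the network is unchanged.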

Keywords/Search Tags:

scene text detection, convolutional neural network, attention mechanism, feature fusion, deformable convolution


Related items

1	Scene Text Detection Algorithm Based On Convolutional Neural Network
2	Research On Text Detection Method In Natural Scene Image
3	Image Quality Assessment Based On Deformable Convolutional Neural Networks With Gradient Fusion And Bilinear Attention Mechanism
4	Research On Scene Text Detection Technology Based On Multi-Scale Information Fusion
5	Research On Scene Text Detection Algorithm Combining Dual Attention Mechanism And Dilated Convolution
6	Research On Text Detection Method Based On Graph Neural Network
7	The Research Based On Convolutional Neural Network For Text Detection In Natural Scene Images
8	Research On Personalized Recommendation Based On Double Attentional Deformable Convolutional Network
9	Text Detection In Complex Scene Based On Multi-scale Information Preservation
10	Natural Scene Text Detection Based On Attention And Feature Enhancement