Research On Scene Text Detection Based On Deep Learning

Posted on: 2019-04-24

Degree: Master

Type: Thesis

Country: China

Candidate: M Y En

Full Text: PDF

GTID: 2428330593950238

Subject: Software engineering

Abstract/Summary:

Text in natural scene images is an important source of information, carrying rich and precise high-level semantics. Detecting and recognizing scene text therefore has great application value and has attracted much research interest over the last two decades. Early detection and recognition methods relied on hand-crafted text features. With the revival of deep learning, however, deep neural networks have shown a strong ability to learn features, and research based on deep neural networks, especially convolutional neural networks (CNNs), has become the mainstream of this field. Against this backdrop, the main task of this thesis is to study scene text detection based on deep convolutional networks.

To address multi-scale scene text detection, especially small text detection, we propose a new detection framework called the feature pyramid based scene text detector. The framework builds on the state-of-the-art object detection framework SSD and introduces a feature pyramid mechanism. Through top-down feature fusion, features from different depths of the CNN are combined into new features, forming a feature pyramid in which features have both high-level semantics and fine local details. Detecting on the newly built features improves performance on multi-scale and small text detection. On the ICDAR2013 benchmark, the proposed method achieves an F-score of 87.6%.

Most current state-of-the-art scene text detection methods need a large amount of data with bounding-box-level or pixel-level ground truth to train deep models, but obtaining such data requires expensive manual annotation. We explore a weakly supervised method that trains a deep CNN model with text localization ability on datasets that have only image-level annotations. Given an input image, the model produces a 2-D class activation map (CAM) in which the value of each pixel denotes the confidence that the pixel belongs to a text region. With the help of the CAM, most background areas in the input image can be filtered out, leaving the areas where text may exist. On this basis, text proposals can be generated by MSER-based methods. The proposed weakly supervised method achieves a recall rate comparable to some fully supervised methods on the ICDAR2013 and ICDAR2015 benchmarks.
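The CAM-based filtering step might be sketched as follows. This is only an illustration, not the thesis's implementation: it assumes the common class activation mapping formulation (a weighted sum of the final convolutional feature maps using the classifier weights of the "text" class), which the abstract does not confirm. It also assumes the CAM has already been resized to the image resolution. The function names, the 0.5 threshold, the overlap ratio, and the box format are all hypothetical.

```python
from typing import List, Tuple

Grid = List[List[float]]
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), half-open pixel ranges (assumed format)


def class_activation_map(feature_maps: List[Grid], class_weights: List[float]) -> Grid:
    """Weighted sum of K feature maps (each H x W) using K weights of the 'text' class."""
    if len(feature_maps) != len(class_weights):
        raise ValueError("need exactly one weight per feature map")
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


def normalize(cam: Grid) -> Grid:
    """Rescale the map to [0, 1] so a fixed threshold can be applied."""
    lowest = min(min(row) for row in cam)
    highest = max(max(row) for row in cam)
    span = highest - lowest
    if span == 0:
        return [[0.0 for _ in row] for row in cam]
    return [[(value - lowest) / span for value in row] for row in cam]


def text_mask(cam: Grid, threshold: float = 0.5) -> List[List[bool]]:
    """Mark pixels whose normalized confidence reaches the (assumed) threshold."""
    return [[value >= threshold for value in row] for row in normalize(cam)]


def keep_candidates(boxes: List[Box], mask: List[List[bool]],
                    min_overlap: float = 0.5) -> List[Box]:
    """Keep candidate boxes (e.g. from an MSER detector) mostly covered by the mask."""
    kept = []
    for x0, y0, x1, y1 in boxes:
        area = (x1 - x0) * (y1 - y0)
        covered = sum(mask[y][x] for y in range(y0, y1) for x in range(x0, x1))
        if area > 0 and covered / area >= min_overlap:
            kept.append((x0, y0, x1, y1))
    return kept


if __name__ == "__main__":
    # Two toy 2x3 feature maps and their hypothetical 'text' class weights.
    maps = [[[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
            [[0.0, 0.0, 2.0], [0.0, 0.0, 0.0]]]
    mask = text_mask(class_activation_map(maps, [1.0, 0.5]))
    print(keep_candidates([(0, 0, 1, 1), (1, 0, 2, 2), (2, 0, 3, 1)], mask))
```

In this sketch the mask only discards candidate regions that fall on background. The candidates themselves would come from a separate region detector such as MSER, which is not shown here.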

Keywords/Search Tags:

scene text, convolutional neural networks, weak supervision, deep learning

Related items

1	Research And Application Of Text Detection In Natural Scene Images Based On Deep Learning
2	Research On Text Detection And Recognition In Natural Scenes Based On Deep Learning
3	Research On Natural Scene Text Detection Algorithms Based On Deep Learning
4	Research On Deep Learning Based Scene Text Understanding
5	Research And Application Of Image Knowledge Extraction Technology Based On Deep Learning
6	Deep Learning In Scene Text Detection
7	Deep Learning-Based Methods For Text Detection And Recognition In Natural Images
8	Anomaly Detection With Weak Supervision In Surveillance Video Scene
9	Research On End-to-end Scene Text Recognition Method Based On Deep Learning
10	Studies Of Scene Text Detection And Recognition Based On Deep Learning