
Scene Text Detection With The Text Statistical Characteristics And Deep Neural Network

Posted on: 2018-03-29
Degree: Master
Type: Thesis
Country: China
Candidate: L Lin
Full Text: PDF
GTID: 2428330515453637
Subject: Computer Science and Technology
Abstract/Summary:
As a common visual element in images, text carries rich and accurate high-level semantic information. Because it expresses the visual content of a scene so well, text helps people describe and understand an image more specifically and accurately. Scene text detection and recognition has a wide range of applications, including geo-location, navigation for the blind, image retrieval, and human-computer interaction. This thesis focuses on scene text detection, which is the foundation of any text detection and recognition system. Detecting text in natural scene images remains an extremely challenging problem, owing to interference from background clutter, noise, blur, occlusion, and non-uniform illumination, as well as variability in text appearance, layout, font, language, and style. Although the field has made progress, recent research says little about end-to-end scene text detection, and there is still no good solution for multi-granularity, multilingual, and multi-oriented text detection. The research contents of this thesis are as follows.

Firstly, a deep text detector based on the Text Statistical Characteristics is proposed. Most existing text detection approaches are not end-to-end but consist of multiple stages; they are time-consuming and leave considerable room for improvement. In this thesis, text detection is treated as a special case of object detection. Under the framework of the Single Shot MultiBox Detector (SSD), a text detector is designed based on the Text Statistical Characteristics. Furthermore, to improve robustness to text size, a voting multi-scale fusion algorithm is developed. Experiments on standard benchmarks demonstrate the effectiveness of these improvements: the detector achieves F-measures of 86.16% and 86.83% on the ICDAR 2011 and ICDAR 2013 benchmarks, respectively. The approach is computationally efficient, taking 0.09 s/image for the single-scale version and 0.27 s/image for the multi-scale version, and it surpasses state-of-the-art results in both performance and speed.

Secondly, a multi-granularity and multilingual scene text detection method is proposed, built on the deep text detector. Most existing methods handle only one specified granularity (such as a character, a word, or a text line), cannot be directly applied to other granularities, and are designed for a specific language (mostly English). By analysing the characteristics of text at different granularities, the proposed method is extended under the same framework to detect multi-granularity, multilingual text in scene images. Experimental results on three benchmarks demonstrate the extensibility of the approach.

Thirdly, two multi-oriented scene text detection approaches are proposed. Most existing methods do not handle multi-oriented text in scene images well, so the deep text detector is extended from horizontal to multi-oriented text using two different detection strategies. The first adopts the conventional bottom-up pipeline. The second is based on a new robust multi-oriented fusion algorithm. Both approaches are evaluated on the MSRA-TD500 dataset and achieve results competitive with state-of-the-art methods.
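
The F-measure figures quoted in the abstract combine precision and recall into a single score. As a minimal sketch, assuming the standard balanced F-measure (the harmonic mean of precision and recall), it can be computed as below. The function name `f_measure` and the sample precision and recall values are illustrative assumptions, not figures from the thesis, and the sketch does not reproduce the rules the ICDAR evaluation protocols use to match detections against ground truth.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    Both inputs are fractions in [0, 1]. Returns 0.0 when both are zero
    to avoid dividing by zero.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical values chosen only to illustrate the formula.
    p, r = 0.90, 0.80
    print(f"precision={p:.2f} recall={r:.2f} F={f_measure(p, r):.4f}")
```

Because the harmonic mean is dominated by the smaller of its two inputs, a detector cannot reach a high F-measure by trading away recall for precision or the reverse.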
Keywords/Search Tags: Scene Text Detection, Text Statistical Characteristics, SSD