
Research On Multi-orientation Scene Text Extraction

Posted on: 2019-08-14  Degree: Master  Type: Thesis
Country: China  Candidate: Q L Lei  Full Text: PDF
GTID: 2428330566477951  Subject: Information and Communication Engineering
Abstract/Summary:
Text is one of the most important carriers of information in human civilization because of its intuitive expressive power. Text is ubiquitous in natural scenes, carries high-level semantic information about the surrounding environment, and plays an important role in many practical applications. With the popularity of camera-equipped mobile devices and the rapid development of the Internet, extracting text from natural scenes has become an important way to understand and retrieve image content, and it makes it possible for machines to simulate the interaction between humans and their environment. Because of uncertainties such as complex environments, different languages, changing illumination, and image degradation, scene text extraction still faces enormous challenges. At present, most scene text extraction methods rely on detecting individual characters and then aggregating the initial set of character candidates into words using spatial or dictionary constraints. However, single-character detection algorithms cannot handle many cases, such as curved text, dot-matrix text, low-resolution text, and complex backgrounds whose structure resembles text, which makes this approach more difficult to implement and less stable.
This thesis analyzes the difficult parts of the text detection task in natural scenes in depth and, in particular, studies a region proposal method based on the distinctive characteristics of text. Motivated by the shift from traditional text detectors to general object proposal techniques in text recognition systems, as well as by the significance of saliency for image analysis, we filter and merge the over-segmented regions generated by MSER (maximally stable extremal regions) and efficiently output a hierarchy of text hypotheses by combining the characteristics of text with a saliency map predicted by a fully convolutional network. Tests of different configurations and comparisons with other methods on multiple datasets, including complex-background and multi-language scenarios, show that our method provides a high recall rate even when the background is complicated.
To address the arbitrary orientation of text in natural scenes, we propose a novel method that combines text region proposals with convolutional neural networks. Building on the high-quality candidate regions generated by the text region proposal method, multi-scale ROIPooling is configured according to the geometric features of text objects, and the features extracted from different layers are merged. Finally, after a multi-task stage consisting of classification and regression, inclined non-maximum suppression filters the redundant outputs to produce the final results. Experiments show that the proposed method is effective and robust for multi-orientation text detection in natural scenes.
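The abstract names inclined non-maximum suppression as the final filtering step but does not give its details. The following is a minimal sketch of one common way to do it, not the thesis's own implementation: greedy suppression driven by the IoU of rotated rectangles, with the overlap computed by Sutherland-Hodgman polygon clipping. The box format `(cx, cy, w, h, angle_deg)`, the function names, and the default threshold of 0.3 are all assumptions made for illustration.

```python
import math

def box_corners(cx, cy, w, h, angle_deg):
    """Corners of a rotated rectangle, ordered counter-clockwise."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    half = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    return [(cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
            for dx, dy in half]

def polygon_area(poly):
    """Absolute polygon area via the shoelace formula."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def clip_polygon(subject, clip):
    """Sutherland-Hodgman: clip convex `subject` by convex CCW `clip`."""
    output = subject
    for i in range(len(clip)):
        if not output:
            break
        ax, ay = clip[i]
        bx, by = clip[(i + 1) % len(clip)]

        def side(p):
            # Positive when p lies to the left of edge a->b (inside for CCW).
            return (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)

        points, output = output, []
        for j in range(len(points)):
            p, q = points[j], points[(j + 1) % len(points)]
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if (sp > 0 and sq < 0) or (sp < 0 and sq > 0):
                t = sp / (sp - sq)  # where edge p->q crosses the clip line
                output.append((p[0] + t * (q[0] - p[0]),
                               p[1] + t * (q[1] - p[1])))
    return output

def rotated_iou(box_a, box_b):
    """Intersection over union of two rotated rectangles."""
    pa, pb = box_corners(*box_a), box_corners(*box_b)
    inter_poly = clip_polygon(pa, pb)
    inter = polygon_area(inter_poly) if len(inter_poly) >= 3 else 0.0
    union = polygon_area(pa) + polygon_area(pb) - inter
    return inter / union if union > 0 else 0.0

def inclined_nms(boxes, scores, iou_threshold=0.3):
    """Greedy NMS: keep a box only if it overlaps no higher-scoring kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(rotated_iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

# Hypothetical detections: a near-duplicate pair, a perpendicular box, a far box.
boxes = [(50, 50, 100, 20, 0), (52, 51, 100, 20, 5),
         (50, 50, 100, 20, 90), (300, 300, 80, 20, 30)]
scores = [0.9, 0.8, 0.7, 0.95]
kept = inclined_nms(boxes, scores)
```

Using rotated rather than axis-aligned overlap matters for inclined text: the perpendicular box above shares its center with the first box yet covers little of it, so it survives, while the slightly rotated near-duplicate is suppressed.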
Keywords/Search Tags: Natural scene text, Region proposal, Fully convolutional network, Multi-orientation text extraction