
Localizing And Extracting Text In Natural Scene Images

Posted on: 2017-10-01    Degree: Master    Type: Thesis
Country: China    Candidate: L Xiong    Full Text: PDF
GTID: 2348330515458400    Subject: Software engineering
Abstract/Summary:
In recent years, internet and information technology have developed rapidly. With the popularity of portable devices such as mobile phones and digital cameras, people can easily capture images and upload them to the internet at any time. Text, as a medium of human communication, is also an important carrier of information, yet extracting text from natural scenes remains a difficult problem, for several reasons. First, text is an artificial structure that exhibits different characteristics across languages: East Asian languages such as Chinese, Japanese and Korean, for example, have large character sets, complicated character structures and varied fonts, so no single simple method can detect text in all languages. Second, during image acquisition, scene text detection can be affected by many factors, such as uneven lighting and complex backgrounds. Scene text extraction therefore remains an active research topic. Text localization is the key step of text information extraction, and its performance directly affects the subsequent OCR stage. In this thesis, we design a coarse-to-fine, multi-resolution framework for localizing and extracting horizontal English text lines in natural scene images.

In the coarse stage, we use a Gaussian pyramid to convert every image into three resolution levels, which helps detect characters of different sizes. We then train a convolutional neural network (CNN) to classify candidate regions. Candidate regions are obtained in two ways: the first is based on maximally stable extremal region (MSER) extraction, and the second on the stroke width transform (SWT). Experiments show that the CNN effectively detects character regions and that the performance of this stage depends mainly on the completeness of the candidate set; MSER extraction recovers more character regions than the SWT method.

In the refinement stage, we design a set of rules to merge the multi-resolution results of the coarse stage. These rules use texture measures derived from the gray level co-occurrence matrix (GLCM) to merge detections across resolutions. The merged results are then fed into an AdaBoost classifier built on histogram of oriented gradients (HOG) features to remove false positives. Experiments show that this stage effectively improves localization accuracy. Within this framework, the localization results can be further binarized for segmentation and then passed directly to an OCR program to recognize the text lines.
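To make the coarse stage concrete, the following is a minimal Python sketch of how a three-level Gaussian pyramid, MSER candidate extraction and a pre-trained character CNN could be combined, assuming OpenCV for the image operations and a Keras-style model; the names char_cnn and CNN_INPUT_SIZE and the 0.5 decision threshold are illustrative assumptions, not values taken from the thesis.

# Hypothetical sketch of the coarse stage: Gaussian pyramid, MSER candidates,
# and a pre-trained CNN character/non-character classifier.
import cv2
import numpy as np

CNN_INPUT_SIZE = (32, 32)  # assumed input resolution of the character CNN

def gaussian_pyramid(image, levels=3):
    """Return `levels` progressively downsampled copies of the image."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(cv2.pyrDown(pyramid[-1]))
    return pyramid

def mser_candidates(gray):
    """Extract candidate character regions as bounding boxes with MSER."""
    mser = cv2.MSER_create()
    _, bboxes = mser.detectRegions(gray)
    return bboxes  # each box is (x, y, w, h)

def classify_candidates(image, char_cnn):
    """Run the CNN over MSER candidates at every pyramid level."""
    detections = []
    for level, scaled in enumerate(gaussian_pyramid(image)):
        gray = cv2.cvtColor(scaled, cv2.COLOR_BGR2GRAY)
        scale = 2 ** level  # factor mapping boxes back to the original resolution
        for (x, y, w, h) in mser_candidates(gray):
            patch = cv2.resize(gray[y:y + h, x:x + w], CNN_INPUT_SIZE)
            patch = patch.astype(np.float32)[None, :, :, None] / 255.0
            # char_cnn is an assumed pre-trained model returning a text score
            if char_cnn.predict(patch)[0, 0] > 0.5:
                detections.append((x * scale, y * scale, w * scale, h * scale))
    return detections

An SWT-based candidate generator could be substituted for mser_candidates without changing the rest of the pipeline, which mirrors the comparison between MSER and SWT described above.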
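Similarly, a hedged sketch of the refinement stage is given below, assuming scikit-image for the GLCM and HOG features and scikit-learn's AdaBoostClassifier; the chosen texture properties, the merging threshold and the trained classifier adaboost are assumptions for illustration, not the exact rules used in the thesis.

# Hypothetical sketch of the refine stage: GLCM texture measures guide merging
# of multi-resolution detections; a HOG + AdaBoost classifier prunes false positives.
import numpy as np
from skimage.feature import graycomatrix, graycoprops, hog  # greycomatrix/greycoprops in older scikit-image
from sklearn.ensemble import AdaBoostClassifier

def glcm_features(gray_patch):
    """Contrast, homogeneity and energy from a gray level co-occurrence matrix."""
    glcm = graycomatrix(gray_patch, distances=[1],
                        angles=[0, np.pi / 2], levels=256,
                        symmetric=True, normed=True)
    return np.array([graycoprops(glcm, p).mean()
                     for p in ("contrast", "homogeneity", "energy")])

def similar_texture(patch_a, patch_b, threshold=0.25):
    """Example merging rule: merge two boxes if their GLCM features are close."""
    fa, fb = glcm_features(patch_a), glcm_features(patch_b)
    return np.linalg.norm(fa - fb) / (np.linalg.norm(fa) + 1e-6) < threshold

def remove_false_positives(patches, adaboost):
    """Keep only patches the HOG-based AdaBoost classifier labels as text.
    Patches are assumed to have been resized to a common size beforehand."""
    feats = [hog(p, orientations=9, pixels_per_cell=(8, 8),
                 cells_per_block=(2, 2)) for p in patches]
    keep = adaboost.predict(np.array(feats))
    return [p for p, k in zip(patches, keep) if k == 1]

# The classifier is assumed to be trained elsewhere on labeled patches, e.g.
# adaboost = AdaBoostClassifier(n_estimators=100).fit(train_hog_feats, labels)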
Keywords/Search Tags:Convolutional Neural Network, Maximally Stable Extremal Region, Stroke Width Transform, Gray Level Co-occurrence Matrix, Histogram of Oriented Gradient