Research On Text Extraction In Natural Scene

Posted on: 2019-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: S Dai
Full Text: PDF
GTID: 2428330545990123
Subject: Computer Science and Technology
Abstract/Summary:
Text extraction from natural scene images is one of the most valuable research topics in image processing. It supports image analysis and understanding, and it matches the current demand for machine vision applications in industry, agriculture, transportation, security, and other sectors. However, the problem is far from solved, mainly because scenes and texts are so diverse, and extracting Chinese text from natural scenes remains a challenging task. Building on the many text detection and extraction methods published in recent years, this thesis studies the extraction of Chinese text from natural scene images and proposes two Chinese text extraction algorithms.

The first algorithm is based on edge-enhanced Maximally Stable Extremal Regions (MSER). First, an edge-enhanced MSER detector produces the candidate MSERs. Constraints on the major and minor axes, the area, and the number of holes then efficiently filter out obvious non-text MSERs and provide a preliminary verification of the candidate text. Because a Chinese character in an image tends to be split into several MSERs, the thesis proposes an aggregation method that merges the candidate regions into single candidate Chinese text components. These components are then analyzed with machine learning to select the correct text.

The second algorithm is based on the Iterative Self-Organizing Data Analysis Technique Algorithm (ISODATA). First, an improved Niblack algorithm performs a preliminary segmentation of the foreground from the image. The image is then segmented by the clustering algorithm, using Lab color space information and stroke width information as features. Next, connected components are extracted and filtered with constraints on their geometric features. The remaining connected components are aggregated so that scattered strokes form candidate Chinese text. To verify the text further, candidate texts are linked into lines according to text clustering rules, and candidates whose stroke features and spatial features do not satisfy the constraints are rejected. The line-level features of the text are then analyzed, and an SVM classifies the correct text lines and the corresponding correct text.

Finally, experiments were performed on a dataset of natural scene images created for Chinese text, containing various scenes and different conditions in real environments. The experimental results show that the proposed methods extract text information from scene images effectively, with satisfactory accuracy and recall.
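As a rough illustration of the front end of the second algorithm, the sketch below chains local thresholding, connected-component extraction, and geometric filtering on a toy grayscale grid. It is a minimal sketch under stated assumptions, not the thesis's implementation: it uses the classic Niblack rule (threshold = local mean + k * local standard deviation) rather than the improved variant, which the abstract does not specify; it omits the clustering on Lab color and stroke width; and the function names, window radius, k, and filter thresholds are all hypothetical.

```python
from collections import deque
from math import sqrt


def niblack_binarize(img, radius=1, k=-0.2):
    """Classic Niblack rule (not the thesis's improved variant): a pixel is
    foreground (dark text) when its value is below mean + k * std of the
    surrounding window. radius and k are illustrative values."""
    h, w = len(img), len(img[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            mean = sum(window) / len(window)
            std = sqrt(sum((v - mean) ** 2 for v in window) / len(window))
            mask[y][x] = 1 if img[y][x] < mean + k * std else 0
    return mask


def connected_components(mask):
    """Group foreground pixels into 8-connected components via BFS."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, pixels = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def passes_geometry(pixels, min_area=3, max_aspect=5.0):
    """Hypothetical geometric constraints: minimum pixel count and a bound
    on the bounding-box aspect ratio."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    aspect = max(height, width) / min(height, width)
    return len(pixels) >= min_area and aspect <= max_aspect


# Toy 7x9 image: bright background, one thin dark stroke, one noise speck.
img = [[200] * 9 for _ in range(7)]
for y in range(1, 6):
    img[y][2] = 30          # vertical stroke
for x in range(2, 5):
    img[3][x] = 30          # horizontal bar crossing the stroke
img[1][7] = 30              # isolated speck

mask = niblack_binarize(img)
components = connected_components(mask)
kept = [c for c in components if passes_geometry(c)]
print(len(components), "components found,", len(kept), "kept")
```

On real photographs the classic Niblack rule tends to produce noise in flat background regions, which is one plausible motivation for using an improved variant and for the later clustering and filtering stages.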
Keywords/Search Tags:text extraction, Maximally Stable Extremal Regions, Chinese aggregation, support vector machine, text area verification