Research On Natural Scene Text Detection Based On MSER And Color Clustering

Posted on: 2020-07-28
Degree: Master
Type: Thesis
Country: China
Candidate: M Li
GTID: 2428330620462249
Subject: Electronic Science and Technology
Abstract/Summary:
Text detection in natural scenes is the basis of information extraction from natural scene images. It has wide application value and research significance in fields such as license plate recognition, real-time translation, and image retrieval. Component-based methods are the most common approach to natural scene text detection, and among them the Maximally Stable Extremal Regions (MSER) algorithm and color clustering are widely used. To address the limitations of the traditional MSER and color clustering algorithms, this thesis proposes a natural scene text detection algorithm based on image-enhanced MSER and improved color clustering. The main contents are as follows:

(1) Image preprocessing. Natural scene images come from real-life scenes, so their quality is uneven and they contain considerable noise. The original image is bilaterally filtered, and the color image is scaled and converted to grayscale. To improve how well MSER extracts the initial text regions, an image contrast enhancement method based on the dark channel prior and illumination averaging is proposed.

(2) Candidate region extraction. MSER is used to extract the initial text regions. Because background pixels of similar color strongly interfere with color clustering, the Stroke Width Transform (SWT) and an angle feature are used to select the stable text pixels within the initial text regions. To avoid setting the initial K value of K-means manually, a two-level grouping strategy is proposed to group the stable text pixels and determine the initial color centers. Candidate character regions are then obtained by multi-scale color clustering. Compared with the traditional color clustering algorithm, the proposed algorithm achieves higher text coverage.

(3) Non-text filtering. The extracted candidate regions contain a large number of non-text regions. Geometric, stroke, corner, and texture features are extracted from the candidate regions and combined with a support vector machine (SVM) to verify character regions. An improved scheme is proposed to address the false detection points produced by Harris corner detection. In addition, to improve character recall and reduce misclassified text, a misclassification recall strategy based on the positional relationships of text is adopted.

(4) Text line aggregation and validation. Multi-scale color clustering causes regions with stable color changes to be extracted repeatedly, so the bounding rectangle position and the color change rate of each candidate region are used to de-duplicate regions. In addition, words carry semantic information and reflect the meaning of an image better than single characters. Candidate character regions are therefore aggregated into text lines based on the similarity of characters in the same line, and the text lines are segmented into words using vertical contour statistics, which completes text detection.

The algorithm is evaluated on the ICDAR2013 public dataset. The experimental results show that the proposed algorithm detects text well under the varied conditions found in natural scene images. Its recall and precision are 73.4% and 82.3%, respectively. Compared with other text detection algorithms, the proposed algorithm achieves better detection performance.
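The word segmentation step in (4) can be illustrated with a small sketch. It assumes that "vertical contour statistics" refers to a vertical projection profile, that is, the count of text pixels in each column of a binarized text line. The abstract does not confirm this reading, and the function names and the min_gap threshold below are hypothetical, not taken from the thesis.

```python
from typing import List, Tuple


def column_profile(mask: List[List[int]]) -> List[int]:
    """Count foreground (text) pixels in every column of a binary mask."""
    return [sum(column) for column in zip(*mask)]


def split_words(mask: List[List[int]], min_gap: int = 3) -> List[Tuple[int, int]]:
    """Split a binarized text line into words.

    Returns (start, end) column ranges with end exclusive. A run of at
    least min_gap empty columns is treated as a word boundary; shorter
    runs are treated as spacing between characters of the same word.
    min_gap is an illustrative assumption, not a value from the thesis.
    """
    words: List[Tuple[int, int]] = []
    start = None      # first column of the word being built
    last_ink = None   # most recent column that contained text pixels
    for x, count in enumerate(column_profile(mask)):
        if count == 0:
            continue
        if start is None:
            start = x
        elif x - last_ink - 1 >= min_gap:
            # The empty run since the last text column is wide enough.
            words.append((start, last_ink + 1))
            start = x
        last_ink = x
    if start is not None:
        words.append((start, last_ink + 1))
    return words


# A toy two-row text line: two characters, a wide gap, then one more.
line = [
    [1, 1, 0, 1, 0, 0, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 0, 0, 1, 0, 0],
]
print(split_words(line, min_gap=3))  # [(0, 4), (7, 9)]
```

In practice the gap threshold would probably be set relative to character height or the distribution of gaps within the line rather than fixed, since character spacing scales with font size.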
Keywords/Search Tags: text detection and localization, maximally stable extremal regions, color clustering, feature extraction, support vector machine