
Research On Convolutional Neural Network-based Scene Text Detection And Multi-orientational Character Recognition

Posted on: 2017-08-27; Degree: Doctor; Type: Dissertation
Country: China; Candidate: A N Zhu; Full Text: PDF
GTID: 1318330503958159; Subject: Control Science and Engineering
Abstract/Summary:
With the rapid development of intelligent transportation, blind navigation and intelligent logistics applications, detecting and recognizing scene text on signboards, billboards, license plates, books and product packaging has become a popular research direction in computer vision. Scene text images suffer from low resolution, uneven illumination, blur and affine distortion, and they also contain complex background textures such as trees, bricks and glass. The text itself varies in color, font, size, orientation and arrangement. Applying existing optical character recognition technology to this problem yields low precision and poor adaptability to different environments. Therefore, detecting and recognizing scene text quickly, accurately and robustly is still an open problem.
From observation, we found that although background texture interference in scene text images is complex, the texture features of character stroke regions are relatively constant. Based on this, we use a convolutional neural network to extract the texture features of text and combine them with geometrical features or scene context features, which suppresses background texture interference and improves accuracy and adaptability. In addition, to improve the robustness of multi-orientation character recognition, we propose a character recognition model that uses texture features and structure features, with a bag-of-words model and an SVM for classification. The research covers two aspects: scene text detection and character recognition.
Our main contributions are as follows.
First, because the layered structure of a convolutional neural network captures rich high-level information and can therefore extract object features despite complex background interference, we propose a new scene text detection method that uses a convolutional neural network to extract the texture features of text, together with a support vector machine classifier based on texture and geometrical features to suppress non-character regions. In addition, to accurately detect multi-orientation text regions, we propose a two-layer mechanism: the first layer computes similarity scores between regions to filter out background, and the second layer verifies the rectified candidate text regions with a HOG feature-based SVM classifier. The experimental results show that this algorithm is robust in detecting text of multiple scales, multiple orientations and different intensities. It suppresses background texture effectively and improves the accuracy and adaptability of scene text detection.
Second, using a scene segmentation model, we propose a text detection method that combines a convolutional neural network with scene context features. For text versus non-text classification, most methods focus on extracting textness features such as dense edges, stroke width and gradient distribution, so background objects that resemble text are often misclassified. We therefore propose using scene context features to aid the text detection task. First, we use TextonBoost and a fully connected conditional random field to estimate pixel-level probabilities for 14 scene classes, such as tree, signboard, wall and sky. Meanwhile, we extract maximally stable extremal regions and extend them. Then, for each region, the scene class probabilities of all the pixels it contains are averaged, and the result is taken as the region's scene context feature.
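The region-level averaging step described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function name `region_context_feature`, the nested-list layout `prob_map[y][x][k]`, and the toy 2x2 map with 3 classes (instead of 14) are all assumptions made for illustration, and the per-pixel probabilities are hand-written here rather than produced by TextonBoost and a conditional random field.

```python
from typing import List, Sequence, Tuple


def region_context_feature(
    prob_map: Sequence[Sequence[Sequence[float]]],
    region_pixels: Sequence[Tuple[int, int]],
) -> List[float]:
    """Average per-pixel scene class probabilities over one candidate region.

    prob_map[y][x][k] is the (assumed) probability that pixel (y, x)
    belongs to scene class k; region_pixels lists the (y, x) pixels of
    one extracted region.
    """
    if not region_pixels:
        raise ValueError("region must contain at least one pixel")
    num_classes = len(prob_map[0][0])
    totals = [0.0] * num_classes
    for y, x in region_pixels:
        for k, p in enumerate(prob_map[y][x]):
            totals[k] += p
    n = len(region_pixels)
    return [t / n for t in totals]


# Toy example: a 2x2 probability map with 3 hypothetical scene classes.
prob_map = [
    [[0.8, 0.1, 0.1], [0.6, 0.3, 0.1]],
    [[0.2, 0.7, 0.1], [0.1, 0.1, 0.8]],
]
region = [(0, 0), (0, 1)]  # the two pixels of the top row
feature = region_context_feature(prob_map, region)
print(feature)  # approximately [0.7, 0.2, 0.1]
```

Because every pixel's probabilities sum to one, the averaged vector also sums to one, so it can be read as the region's own distribution over scene classes.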
The scene context feature is then fed into a convolutional neural network followed by an SVM classifier. Finally, a hierarchical grouping method groups characters into text regions using context features, geometrical information and color. The experimental results show that our method reduces false alarms in scenes where text is unlikely to appear, and thus improves text detection accuracy.
Third, to recognize characters in different orientations, we propose a character recognition model that uses texture features and structure features. Currently, most character recognition techniques are designed for horizontal characters, and there is no established pattern for multi-orientation characters. To better recognize multi-orientation characters, we propose an approach that exploits the relative orientation of each pair of sampled points as well as their relative location. First, keypoints are sampled uniformly in the normalized images. We compute the orientation-removed histogram of gradient feature for every pair of sampled points, and we record their location relationship, which represents the structure feature. These two features are then both projected onto a bag-of-features vocabulary and classified by an SVM classifier. The resulting character feature is rotation-invariant, so the model can process text in different orientations. Evaluation on both standard horizontal character data sets and our collected multi-orientation character data sets shows that the method achieves high recognition accuracy.
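The abstract does not give the exact formulation of the pairwise feature, so the following is only a sketch of one common way to make a two-point descriptor rotation-invariant: measure each point's gradient orientation relative to the line joining the pair. The functions `pair_feature` and `rotate`, and the use of a single gradient angle per point instead of a full histogram, are simplifying assumptions for illustration.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def wrap(angle: float) -> float:
    """Map an angle in radians to the range [0, 2*pi)."""
    return angle % (2.0 * math.pi)


def pair_feature(
    p: Point, q: Point, theta_p: float, theta_q: float
) -> Tuple[float, float, float]:
    """Describe a point pair by baseline-relative orientations and distance.

    theta_p and theta_q are the gradient orientations at p and q. Subtracting
    the direction of the line from p to q removes the global rotation.
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    baseline = math.atan2(dy, dx)
    return (
        wrap(theta_p - baseline),
        wrap(theta_q - baseline),
        math.hypot(dx, dy),
    )


def rotate(pt: Point, angle: float) -> Point:
    """Rotate a point about the origin by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * pt[0] - s * pt[1], s * pt[0] + c * pt[1])


# Rotating the whole character by alpha rotates both the point positions
# and the gradient orientations, and leaves the pair feature unchanged.
p, q = (1.0, 0.0), (3.0, 1.0)
theta_p, theta_q = 1.0, 2.0
alpha = 1.1
before = pair_feature(p, q, theta_p, theta_q)
after = pair_feature(
    rotate(p, alpha), rotate(q, alpha), theta_p + alpha, theta_q + alpha
)
print(before)
print(after)
```

The distance term is invariant to rotation but not to scale, which is consistent with sampling keypoints from size-normalized images as described above.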
Keywords/Search Tags: Text detection, Character recognition, Natural scene images, Texture feature, Convolutional neural network, Skewness correction, Scene context, Rotational invariance