Deep Learning Based Scene Text Recognition

Posted on:2017-01-16

Degree:Master

Type:Thesis

Country:China

Candidate:P Huang

Full Text:PDF

GTID:2308330482981823

Subject:Computer application technology

Abstract/Summary:

PDF Full Text Request

Different from common visual elements, text in natural scenes conveys rich information of high level semantics, which plays a key role in the scene understanding. In addition, in the industrial areas, there is a strong demand of the technique of text recognition in natural scene. Up to now, text recognition has been widely used in many fields, such as virtual reality, human-computer interaction, image retrieval, unmanned, license plate recognition, industrial automation. Traditional optical character recognition(OCR) technique mainly works on the high-quality document images. OCR technique has already achieved a high recognition rate under the circumstance that the background of the image is clean and the characters in the image are simple and neatly arranged. Different from document image based character recognition, recognizing characters in the natural scene image is more challenging. Because the natural scene image has many difficult elements, such as a complex background, the low resolution, the diverse fonts and the randomly distribution. Therefore, conventional OCR cannot be applied to the natural scene image to do text recognition. As a fundamental work, the continuous development and breakthrough of natural scene based text recognition has far-reaching significance and practical value. This paper proposes an approach of text recognition in natural scenes image by combining deep learning. The main work is as follows:(1) A context-aware image decoding method based on CNN and BiRNN is proposed. Taking advantage of CNN, the high-level visual features can be obtained from the underlying pixels. And exploiting the characteristics of local perceptual of CNN, the positional relationship between the high-level features and the underlying pixels is built. Then making use of BiRNN, the global information of the image can be obtained. Experiments show that this decoding method has a good ability of expression.(2) The text decoding method based on ARSG is proposed, in which character alignment and recognition are completed at the same time. ARSG uses RNN to perform the sequence labeling task. In the per-character classification process, we align each character by finding out the attention points of current RNN state. At the same time, heuristic rules and delay generation technique are used to improve the efficiency and accuracy. Experiments show that this method can lead to better character alignments and recognition results.(3) One efficient deep learning framework is achieved. This framework can support a variety of neural network architecture, and provide a series of effective training strategies. The availability of our approach has been proved by this framework. Experiments show that, compared to earlier works, our approach has higher accuracy and better generalization ability.

Keywords/Search Tags:

Character Recognition, Natural Scene Image, Deep Learning, Image Understanding, Semantic

PDF Full Text Request

Related items

1	Research On Key Techniques Of Video Semantic Understanding Based On Dynamic Scene Understanding
2	Research On Optical Character Detection And Recognition Of Natural Scene Based On Deep Learning
3	Research On Scene Semantic Understanding And Its Key Technologies In Complex Environment
4	Scene Graph Generation For Image Semantic Understanding And Represention
5	Research On Scene Character Recognition Technology In Image
6	Research On Key Technologies Of Scene Understanding Based On Machine Learning
7	Environment Recognition And Semantic Understanding For Mobile Robots
8	Design And Implementation Of A Natural Scene Text Recognition System Based On Deep Learning Algorithm
9	A Research Of Natural English Character Recognition Based On Deep Learning
10	Research On Image Semantic Understanding Based On Deep Learning