
Research On Deep Learning Based Scene Text Detection

Posted on: 2019-08-25
Degree: Master
Type: Thesis
Country: China
Candidate: C P Pei
GTID: 2428330545971543
Subject: Engineering
Abstract/Summary:
With the rapid development of big data and the growing popularity of mobile cameras, people are increasingly eager to extract and recognize the text that appears in natural scene images. The detection and recognition of text in natural scenes, an important branch of computer vision, has attracted significant research attention. At the same time, the large volume of natural scene images containing text provides a data foundation for this task. Text is an important tool for recording and transmitting information, and extracting and recognizing words in images through artificial intelligence can make people's lives more convenient and automated.

Traditional OCR technology can only handle images with relatively clean backgrounds, such as documents captured by scanners and digital cameras, and it achieves good results when the text in the image is printed and aligned. Images of natural scenes, however, often suffer from uneven illumination, occlusion of the text area by the background, inconsistent fonts, and irregular text arrangement, all of which hinder detection and recognition. Detecting Chinese text in natural scenes is more difficult than detecting English text. Whereas English has only 26 letters, the GB 2312 standard contains 6,763 Chinese characters, of which more than 3,500 are in common use, and Chinese fonts are more diverse than English ones. Chinese natural scene datasets are also scarcer. These problems make the detection and recognition of Chinese characters in natural scenes challenging.

This thesis studies text detection in natural scenes. It designs a new natural scene text detector based on deep learning and verifies its performance on English and Chinese datasets. To address the shortage of natural scene text datasets, English natural scene data are artificially generated for model training, and Chinese natural scene data from real scenarios are collected and consolidated.

The detector uses the object detection algorithm SSD (Single Shot MultiBox Detector) as its basic framework. Experiments showed that the unmodified SSD algorithm performs poorly on text detection in natural scenes, so several improvements were made. First, the multi-category SSD algorithm is reduced to two categories for text detection, namely text and background. Second, because the default aspect ratios in SSD are designed for general object detection, the K-means algorithm is applied to the training set to obtain five aspect ratios suited to natural scene text detection. Third, a batch normalization layer is added after each convolutional layer of the SSD so that the feature map is normalized after each convolution; the data distribution is normalized to zero mean and a variance of 1, which accelerates training and improves accuracy. Finally, an FCN (Fully Convolutional Network) pre-processes the image to be detected and generates a probability map of the text regions; the probability map and the original image are then combined into a new image that is passed to the detector. Experiments show that these steps greatly increase the detection accuracy of the detector.

For training the natural scene text detector, text regions are labeled with rectangular boxes that enclose them. ICDAR2013, COCO-Text, RCTW-17, and MSRA-TD500 are used for training and testing. The three modifications (adding batch normalization layers, clustering aspect ratios, and adding the region probability map) are applied separately, and the influence of each on the accuracy of the algorithm is evaluated. Compared with similar algorithms, the proposed algorithm performs well on the focused natural scene text image datasets (ICDAR2013, MSRA-TD500). Three metrics are used to evaluate the accuracy of the algorithm: Recall, Precision, and F-measure. On the ICDAR2013 dataset, the algorithm is evaluated under the ICDAR2013 Standard, DetEval, and IoU protocols.

This thesis also studies scene text recognition. It improves the CRNN algorithm to recognize text fields, with a focus on Chinese characters. Because Chinese text recognition datasets are scarce and the output of the detection step is a text line, the model is trained on a Chinese text-line recognition dataset artificially generated from a Chinese corpus.

In summary, this thesis improves the SSD object detection algorithm to form a new natural scene text detection algorithm, collects natural scene Chinese datasets, performs large-scale experimental analysis of the model, and compares it with several mainstream natural scene text detection algorithms. Experimental results show that the algorithm achieves good detection results on a variety of datasets. For recognition, a CNN+CTC architecture is used, and removing the pooling layer was considered. This architecture was verified through a large number of experiments on the artificially generated dataset, where it also achieved good text recognition results.
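The abstract does not describe how K-means is applied to the training set. As a minimal sketch, assuming the ground-truth boxes are available as (width, height) pairs and that clustering is run on the one-dimensional width/height ratio, five aspect ratios could be derived as follows. The function names `kmeans_1d` and `aspect_ratio_priors` and the quantile-based initialization are illustrative assumptions, not the thesis implementation.

```python
def kmeans_1d(values, k, iters=100):
    """Plain Lloyd's k-means on a list of floats; returns the sorted centers."""
    distinct = sorted(set(values))
    if k > len(distinct):
        raise ValueError("k exceeds the number of distinct values")
    # Deterministic initialization: evenly spaced picks from the distinct values.
    n = len(distinct)
    centers = [distinct[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            # Assign each value to its nearest center.
            j = min(range(k), key=lambda c: abs(v - centers[c]))
            clusters[j].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers)


def aspect_ratio_priors(boxes, k=5):
    """Cluster width/height ratios of (width, height) boxes into k aspect ratios."""
    ratios = [w / h for w, h in boxes if w > 0 and h > 0]
    return kmeans_1d(ratios, k)


if __name__ == "__main__":
    # Hypothetical box sizes in pixels, for illustration only.
    boxes = [(10, 10), (20, 20), (30, 30), (20, 10), (40, 20),
             (50, 10), (100, 20), (100, 10), (200, 10)]
    print(aspect_ratio_priors(boxes, k=5))
```

Text lines tend to be much wider than they are tall, so ratios learned from text annotations would be expected to skew toward larger values than defaults chosen for general objects.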
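The normalization performed by a batch normalization layer can be illustrated on a single list of activations. This is a simplified sketch: the learnable scale `gamma`, shift `beta`, and stabilizer `eps` are the standard parameters of batch normalization, while the function name `batch_normalize` and the flat-list input are assumptions made for illustration. In a convolutional network the statistics are computed per channel over a mini-batch.

```python
import math


def batch_normalize(xs, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize values to zero mean and unit variance, then scale and shift."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in xs]


if __name__ == "__main__":
    out = batch_normalize([1.0, 2.0, 3.0, 4.0])
    print(out)
```

With `gamma=1` and `beta=0` the output has a mean of zero and a variance very close to 1; `eps` only guards against division by zero when the variance is tiny.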
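The three metrics can be made concrete with a small example based on the IoU criterion. This is a sketch under stated assumptions: boxes are axis-aligned `(x1, y1, x2, y2)` tuples, a detection counts as correct when its IoU with a not-yet-matched ground-truth box is at least 0.5, and matching is greedy and one-to-one. The ICDAR2013 Standard and DetEval protocols use their own matching rules, which are not reproduced here, and the function names are illustrative.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def evaluate(detections, ground_truth, threshold=0.5):
    """Return (precision, recall, f_measure) using greedy one-to-one IoU matching."""
    matched = set()
    true_positives = 0
    for det in detections:
        best_j, best_score = None, threshold
        for j, gt in enumerate(ground_truth):
            if j in matched:
                continue
            score = iou(det, gt)
            if score >= best_score:
                best_j, best_score = j, score
        if best_j is not None:
            matched.add(best_j)
            true_positives += 1
    precision = true_positives / len(detections) if detections else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical boxes, for illustration only.
    gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
    det = [(0, 0, 10, 10), (100, 100, 110, 110)]
    print(evaluate(det, gt))
```

Precision is the fraction of detections that match a ground-truth box, recall is the fraction of ground-truth boxes that are detected, and the F-measure is their harmonic mean.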
Keywords/Search Tags: Natural scene text detection, text recognition, OCR, CNN, FCN