
Natural Scene Text Detection Based On Fully Convolutional Network

Posted on: 2019-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
Full Text: PDF
GTID: 2428330545986978
Subject: Photogrammetry and Remote Sensing
Abstract/Summary:
Text is an abstract expression of human wisdom, and it appears in every corner of our lives. In recent years, text detection has attracted a surge of research interest in computer vision. It is widely used in smart cities, artificial intelligence, scene interpretation and other areas, and shows great market potential. The scenes in which text must be detected are moving from simple to complex and from single to multiple, yet traditional text detection methods mainly target simple text scenes. This thesis summarizes previous research results, combines them with the latest text detection techniques, and proposes an efficient and stable deep learning based text detection algorithm for multiple natural scenes. The main contents are as follows:

i) Natural scene text synthesis. To meet the training data requirements of deep learning methods, a large amount of data close to real data is synthesized from scene depth maps and segmentation maps, and exported in a variety of data formats for use with different training networks.

ii) Rough extraction of text regions in panoramic images. The characteristics of panoramic images are used to cut out the invalid part of the image. Image noise is smoothed with a Gaussian filter and a median filter, and the image is enhanced with a guided filter. Grayscale edge gradient detection, image density detection and contour extraction are then applied to the panoramic images to find text proposal regions. Finally, the candidate regions are clipped from the panoramic image and fed into the detection network.

iii) Design of a multi-task text detection network based on a fully convolutional neural network. First, the input images are scaled and cropped to expand the input data. A ResNet50 network is used that merges feature maps layer by layer, so that abstract information and detail information are taken into account at the same time. The network outputs a probability score map and a geometric score map, and a balanced cross-entropy loss function is used to balance positive and negative text samples. The candidate regions are then regressed to obtain polygonal text detection boxes. The residual network is introduced into the training process to improve detection accuracy and efficiency.

iv) Image post-processing. Inherent geometric features such as the aspect ratio, area and angle range of signboards, together with topological constraints in image space, are used to filter the candidate regions of panoramic images. Local non-maximum suppression is used to filter the candidate boxes produced by the detection network and to remove pseudo-text regions. Pixel merging is used to reduce the time complexity of post-processing.

Network training and testing are carried out on three kinds of datasets: a standard dataset, a Chinese dataset and a panoramic image dataset. The accuracy reaches 84%, 69% and 71.2% on these datasets respectively, which is 5% to 7% higher than other methods, and 13% higher than the traditional Adaboost method on panoramic images. The time efficiency is 2.5 times higher than that of the VGG16 network and 10% to 50% higher than that of the other detection networks overall.
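The abstract does not give the loss formula, so the following is only a minimal sketch of how a balanced cross-entropy over a score map is commonly written for fully convolutional text detectors. It assumes the class-balanced form in which the text term is weighted by the fraction of background pixels; the function name `balanced_cross_entropy` and the `eps` clamp are illustrative choices, not taken from the thesis.

```python
import math


def balanced_cross_entropy(pred, target, eps=1e-7):
    """Class-balanced cross-entropy over a flattened score map.

    pred:   predicted text probabilities in [0, 1], one per pixel
    target: ground-truth labels, 1 for text pixels and 0 for background
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and of equal length")
    n = len(target)
    # beta is the fraction of background pixels. Text pixels are usually
    # rare, so beta is close to 1 and up-weights the text (positive) term.
    beta = 1.0 - sum(target) / n
    total = 0.0
    for p, y in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += (-beta * y * math.log(p)
                  - (1.0 - beta) * (1 - y) * math.log(1.0 - p))
    return total / n


if __name__ == "__main__":
    # One text pixel among four: the single positive carries weight 0.75,
    # each of the three negatives carries weight 0.25.
    print(balanced_cross_entropy([0.9, 0.1, 0.1, 0.1], [1, 0, 0, 0]))
```

With this weighting, the one text pixel in the example contributes as much to the loss as the three background pixels combined, which is the balancing effect between positive and negative samples that the abstract refers to.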
Keywords/Search Tags: Deep learning, Natural scenes, Text detection, Fully convolutional network, Panoramic images