
Rotated Text Detection Algorithm Based On Deep Learning

Posted on: 2020-04-16
Degree: Master
Type: Thesis
Country: China
Candidate: M Liu
GTID: 2518305732977229
Subject: Electronics and Communications Engineering
Abstract/Summary:
In computer vision, text detection is a classic and difficult problem, and it is the prerequisite for text extraction. Text detection aims to mark the position of text in an image. Text in images generally falls into two categories: artificial text, which is neatly arranged and simple to detect, and scene text, which is more complex in font, color, shape, and orientation. Traditional text detection algorithms rely on manually designed features and are not robust enough for scene text. For characters with varying aspect ratio and pose, end-to-end detection frameworks use fixed positions and aspect ratios and cannot obtain accurate detection information during training and testing. The TextBoxes algorithm uses a simple deep neural network to build an end-to-end text detection system. The fully convolutional regression network (FCRN) uses synthesized images for text detection. Although these algorithms are useful, they are suitable only for horizontal or slightly slanted text. In natural scenes, however, the direction of text varies, which calls for a detection algorithm that adapts to any direction.

This thesis proposes a rotation-invariant text detection algorithm based on deep learning. The algorithm follows the U-Net idea of combining high-level and low-level features to handle text lines of different scales. The model can be decomposed into three parts: a feature extraction network, a merge layer, and an output layer. The feature extraction network adopts a five-layer convolution-pooling structure in which each convolutional layer is followed by a pooling layer. The merge layer uses join (concatenation) operations to merge feature maps of the same size. The output layer uses INMS (Inclined Non-Maximum Suppression) to select candidate boxes, each described by a score and four corner coordinates.
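The requirement that merged feature maps have the same size can be illustrated with a small shape walk-through. This is only a sketch under assumptions the abstract does not state: each pooling layer is taken to halve the spatial size (2x2, stride 2), and the merge path is taken to upsample by a factor of 2 before concatenation. The function names `encoder_sizes` and `merge_plan` are hypothetical.

```python
# Hypothetical shape walk-through. The 2x2 stride-2 pooling and the 2x
# upsampling in the merge path are assumptions, not details from the thesis.

def encoder_sizes(height, width, stages=5):
    """Spatial size after each conv+pool stage, assuming size-preserving
    convolutions and pooling that halves each side (floor division)."""
    sizes = []
    h, w = height, width
    for _ in range(stages):
        h, w = h // 2, w // 2
        sizes.append((h, w))
    return sizes


def merge_plan(sizes):
    """Walk from the deepest map upward: upsample by 2 and check whether
    the result matches the shallower map it would be concatenated with."""
    plan = []
    current = sizes[-1]
    for skip in reversed(sizes[:-1]):
        upsampled = (current[0] * 2, current[1] * 2)
        plan.append((upsampled, skip, upsampled == skip))
        current = skip
    return plan


for side in (512, 500):
    sizes = encoder_sizes(side, side)
    print(f"input {side}x{side}: encoder sizes {sizes}")
    for upsampled, skip, same in merge_plan(sizes):
        print(f"  upsampled {upsampled} vs skip {skip}: same size = {same}")
```

Under these assumptions, a 512x512 input merges cleanly at every level, while a 500x500 input produces a mismatch at the deepest merge (30 versus 31), so a real implementation that accepts arbitrary sizes would need padding, cropping, or resizing at that step.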
The thesis makes three improvements.

(1) The basic feature extraction network is improved to accept inputs of any size. A fully convolutional network (FCN) based on VGG16 is adopted, so the image needs no preprocessing and input images of any size can be handled. This also avoids repeated storage and repeated detection, making detection convenient and efficient.

(2) The multi-direction text strategy is improved to reduce algorithm complexity. Many small 3×3 and 1×1 convolution kernels are used: the 1×1 kernels help with dimension reduction and add nonlinearity, while the 3×3 kernels enlarge the spatial receptive field with fewer parameters. An image augmentation (pre-enhancement) strategy based on affine transformation makes the training set more complete and improves model performance, and an appropriate candidate-box rotation strategy is set for multi-directional text.

(3) The NMS candidate-box strategy is improved to adapt to multi-direction detection. The INMS (Inclined Non-Maximum Suppression) candidate-box selection strategy adapts to text in multiple directions and works better on tilted text lines.

After repeated tests on multiple samples, the proposed rotated text detection algorithm shows high generalization performance, fast and accurate detection, and compatibility with small and multi-scale targets, and it can reduce labor and detection costs. The algorithm also generalizes well: it performs well on both natural scene datasets and industrial printing images, meeting the detection requirements of natural scene text and industrial scene text.
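The idea behind inclined non-maximum suppression can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes each candidate is a convex quadrilateral given by four corner points, computes overlap with Sutherland-Hodgman polygon clipping, and uses a greedy keep-highest-score loop. The function names, the helper `rotated_box`, and the IoU threshold of 0.3 are all hypothetical choices.

```python
import math


def signed_area(poly):
    """Shoelace formula; positive for counter-clockwise vertex order."""
    n = len(poly)
    return 0.5 * sum(
        poly[i][0] * poly[(i + 1) % n][1] - poly[(i + 1) % n][0] * poly[i][1]
        for i in range(n)
    )


def ccw(poly):
    """Return the polygon with counter-clockwise vertex order."""
    return poly if signed_area(poly) >= 0 else poly[::-1]


def clip(subject, clipper):
    """Sutherland-Hodgman: clip a convex polygon by a convex CCW polygon."""
    output = subject
    n = len(clipper)
    for i in range(n):
        if not output:
            break
        a, b = clipper[i], clipper[(i + 1) % n]

        def side(p):
            # Positive when p lies to the left of edge a->b (inside for CCW).
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

        inp, output = output, []
        for j in range(len(inp)):
            p, q = inp[j], inp[(j + 1) % len(inp)]
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if (sp > 0 > sq) or (sp < 0 < sq):
                t = sp / (sp - sq)  # where segment p->q crosses the edge line
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output


def iou(a, b):
    """Intersection over union of two convex polygons."""
    a, b = ccw(a), ccw(b)
    inter = abs(signed_area(clip(a, b)))
    union = abs(signed_area(a)) + abs(signed_area(b)) - inter
    return inter / union if union > 0 else 0.0


def inclined_nms(boxes, scores, iou_threshold=0.3):
    """Greedy NMS on inclined boxes; returns indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


def rotated_box(cx, cy, w, h, angle_deg):
    """Hypothetical helper: four corners of a w x h box rotated about its center."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    corners = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in corners]


boxes = [
    rotated_box(50, 50, 80, 20, 30),   # a tilted text line
    rotated_box(52, 51, 80, 20, 32),   # near-duplicate of the first box
    rotated_box(50, 50, 80, 20, -60),  # a perpendicular line crossing the first
]
scores = [0.9, 0.8, 0.7]
print(inclined_nms(boxes, scores))
```

In this example the near-duplicate is suppressed while the perpendicular box survives, because its overlap with the first box is only the small crossing region. An axis-aligned NMS working on bounding rectangles of these tilted boxes would see far larger overlaps, which is the motivation for measuring IoU on the inclined boxes themselves.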
Keywords/Search Tags: target detection, rotation invariance, deep learning, text detection