Research On Scene Text Detection Method Based On Anchor-free Network

Posted on: 2022-06-09
Degree: Master
Type: Thesis
Country: China
Candidate: S S Ma
GTID: 2518306332977479
Subject: Software engineering
Abstract/Summary:
Scene text detection is the task of locating text regions in an image. It is widely used in image retrieval, robot navigation, industrial automation, and real-time translation, and has broad research and application value. Text in natural scenes varies greatly in size, orientation, and contrast, and is further affected by noise, shooting angle, and illumination changes. Because of these factors, traditional text detection methods perform poorly on scene text and cannot be applied to complex scenes. With the rise of deep learning, scene text detection methods based on deep learning have made major breakthroughs, and many effective methods have emerged.

However, most deep-learning-based text detection methods still have the following problems. First, when detecting long text, predictions may fail or break into fragments. Second, there is a severe imbalance between positive and negative samples, together with insensitivity to scale, which reduces the training efficiency and detection accuracy of the model. Third, accuracy and speed are poorly balanced: some methods are highly accurate, but their detection speed is too slow for a real production environment.

To address these problems, this thesis proposes a scene text detection method based on an anchor-free network, built on FCOS. Starting from FCOS, the method outputs arbitrary quadrilaterals instead of rectangular boxes, which lets the detector follow text edges accurately. Darknet-53 is used as the backbone network to strengthen basic feature extraction. The loss function is also improved. DR Loss is used as the text classification loss to mitigate the imbalance between positive and negative samples. For position regression, a vertex regression method directly computes the absolute differences between the four vertices of the predicted region and those of the ground-truth region. In addition, a diagonal adjustment factor is proposed, which brings the predicted box closer to the text instance; experiments show that this factor improves the accuracy of position regression. A center-ness loss for arbitrary quadrilaterals is also proposed, which reduces the weight of low-quality bounding boxes far from the center point.

To further improve detection accuracy, this thesis proposes a scene text detection method based on an attention mechanism and context extraction. The CSP structure is introduced and CSPDarknet-53 is used as the backbone network, which further improves the backbone's basic feature extraction ability. Inspired by AC-FPN, a context extraction module (CEM) is used to make feature fusion more thorough. An attention mechanism is also introduced, which highlights important features and weakens the interference of irrelevant information with the detection results. Ablation experiments show that this attention mechanism substantially improves the accuracy of the text detection model.

Finally, experiments and analysis are carried out on the ICDAR2015, MSRA-TD500, and ICDAR2013 data sets. On ICDAR2015, the precision, recall, and F value of the method are 87.9%, 83.1%, and 85.4%, respectively, and 8.7 images can be processed per second. The experimental results show that the proposed method significantly improves the precision and recall of text detection in natural scenes and is highly practical.
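The vertex regression term and the reported F value can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names `vertex_l1_loss` and `f_measure`, the example coordinates, and the choice of an unnormalized sum are assumptions made for illustration. The diagonal adjustment factor and the quadrilateral center-ness loss are left out because the abstract does not define them, and the F value is assumed to be the usual harmonic mean of precision and recall.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
Quad = Sequence[Point]  # four (x, y) vertices, listed in the same order for both boxes


def vertex_l1_loss(pred: Quad, target: Quad) -> float:
    """Sum of absolute coordinate differences over the four vertices.

    Assumption: a plain, unnormalized L1 sum; the thesis may weight or
    normalize this term differently.
    """
    if len(pred) != 4 or len(target) != 4:
        raise ValueError("each quadrilateral needs exactly four vertices")
    return sum(
        abs(px - tx) + abs(py - ty)
        for (px, py), (tx, ty) in zip(pred, target)
    )


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of the F value)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical predicted and ground-truth quadrilaterals (pixel coordinates).
pred = [(10.0, 10.0), (110.0, 12.0), (108.0, 40.0), (9.0, 38.0)]
gt = [(12.0, 11.0), (112.0, 10.0), (110.0, 41.0), (10.0, 40.0)]

print(vertex_l1_loss(pred, gt))            # 13.0
print(round(f_measure(87.9, 83.1), 1))     # 85.4
```

Under that assumed definition, the harmonic mean of the reported precision (87.9%) and recall (83.1%) is about 85.4%, which agrees with the F value stated for ICDAR2015.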
Keywords/Search Tags: Scene text detection, Deep learning, Convolutional neural network, Attention mechanism