
Research On Scene Text Detection Based On Deep Learning

Posted on: 2021-02-26
Degree: Master
Type: Thesis
Country: China
Candidate: H J Shen
Full Text: PDF
GTID: 2428330614471426
Subject: Computer technology
Abstract/Summary:
Text in natural scenes carries rich semantic information that is of great significance for scene understanding. A growing number of smart devices make use of text in natural scenes, so scene text detection has received extensive attention in computer vision in recent years. However, owing to the diversity of text in font, language, and orientation, the complexity of backgrounds, and other external factors, text detection in natural scenes still faces many difficulties. This thesis divides text detection methods into those based on position regression and those based on semantic segmentation. The main contents are as follows.

(1) Text in natural scenes is multi-oriented. In addition, some non-Latin scripts such as Chinese show large differences in aspect ratio and are localized at text-line granularity. To address these problems, this thesis designs the GVHBRC model. The model does not need to preset multiple combinations of anchor boxes. Instead, it performs cascade refinement starting from one anchor at each location to obtain efficient region proposals that cover the size range of text targets. To align features with anchors, the anchor is used as input and feature extraction is performed under its guidance. During feature extraction, spatial context information at multiple scales is injected through residual connections into the highest-level features of the FPN, reducing the channel information loss of those features. Experiments show that the model reaches F values of 87.5% and 70.3% on the MSRA-TD500 and RCTW-17 datasets, respectively.

(2) In natural scenes, some text is curved, low-contrast text is hard to distinguish from the background, and small word-level text is prone to missed detections and false detections. To address these problems, this thesis designs the PSEFGB model, which is based on semantic segmentation. Classifying pixels allows curved text to be located effectively. During feature extraction, an attention mechanism reconstructs the features, selectively enhancing the information that is useful for the current text detection task and reducing false detections. Integrated, balanced semantic information is used to strengthen the multi-level features, which makes them more discriminative and reduces missed detections. Experiments show that the algorithm reaches F values of 85.16%, 83.30%, and 72.47% on the SCUT-CTW1500, Total-Text, and MLT2017 datasets, respectively.
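
The abstract does not define the F value it reports. A minimal sketch follows, assuming it is the usual harmonic mean of precision and recall used in text detection benchmarks. The function name `f_measure` and the counts in the usage line are hypothetical and are not taken from the thesis; how a detection is matched to a ground-truth region (for example, by an overlap threshold) depends on each dataset's protocol and is left out here.

```python
def f_measure(true_positives, num_detections, num_ground_truth):
    """Return (precision, recall, F) for one detection run.

    Assumption: F is the harmonic mean of precision and recall.
    The matching rule that produces `true_positives` is dataset
    specific and is not modeled in this sketch.
    """
    # Precision: fraction of detections that match a ground-truth text.
    precision = true_positives / num_detections if num_detections else 0.0
    # Recall: fraction of ground-truth texts that were detected.
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f


# Hypothetical counts for illustration only.
p, r, f = f_measure(true_positives=8, num_detections=10, num_ground_truth=16)
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

Under this definition, reducing false detections raises precision and reducing missed detections raises recall, so both effects described for the PSEFGB model would raise the F value.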
Keywords/Search Tags:Scene Text Detection, Feature Enhancement, Cascade Region Proposal Network, Attention Mechanism