
Research On Text Detection And Localization Based On Natural Scene

Posted on: 2020-06-20
Degree: Master
Type: Thesis
Country: China
Candidate: Z Li
GTID: 2428330620462261
Subject: Information and Communication Engineering
Abstract/Summary:
With the rise of deep learning and computer vision in recent years, scene text detection and recognition have advanced considerably. The technology has broad application prospects in scene recognition, navigation for the blind, cross-modal retrieval, and autonomous driving. However, scene text detection still faces many difficulties, such as complex backgrounds, text diversity, and imaging uncertainty. This thesis focuses on the scene text detection task, divides it into two parts (text region saliency detection and word-level instance localization), and establishes a dual-task learning model. The main contributions are as follows:

(1) A saliency detection algorithm for scene text regions based on multi-scale feature fusion is proposed. The algorithm has two main parts. First, to handle the diversity of text scales in scene images, a multi-scale Features Fusion Layers-by-Layers Model (F²L²M) is established. This model combines the semantic information of high-level features with the detail information of low-level features through upsampling, standardization, and element-wise fusion, which improves the recall of small-scale scene text regions. Second, to address the high false detection rate caused by extreme sample imbalance, an Unbalanced Sample Learning Strategy (USLS) is designed and applied to the text saliency detection task. This strategy adds weight modulating factors to the cross-entropy loss function; the factors adjust dynamically during training, so the model concentrates on learning the features of hard-to-classify samples, which reduces the false detection rate.

(2) A multi-oriented scene text instance localization algorithm based on Location Sensitive Regression (LSR) is proposed. The LSR algorithm improves on direct regression using the idea of geographically weighted regression and applies it to the text instance localization task. The basic idea of LSR is that the objective of instance localization should not only make the predicted vertex coordinate offsets as small as possible, but also make the intersection-over-union between the predicted bounding box and the ground truth as large as possible, so that the farther a text pixel is from the regression target, the smaller its contribution weight to that target. The experimental results show that LSR is suitable for regressing text with a large aspect ratio. Finally, for multi-oriented and densely packed targets such as scene text, an Advanced Non-Maximum Suppression (ANMS) algorithm is proposed to select the best bounding box for each target, which further improves the accuracy of the scene text detection results.

(3) A Scene Text Detection and Recognition (STDR) system is designed and implemented. It provides two main functions: use of the STDR service and collection of STDR data. The system verifies the feasibility of the dual-task learning model for scene text detection and localization proposed in this thesis, and also offers a solution to problems in current public scene text datasets, such as inaccurate labeling and small size.
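
The abstract does not give the exact form of the USLS loss. The following is a minimal sketch of one common way to add a dynamically adjusting modulating factor to cross-entropy, in the style of focal loss. The function name and the parameters `gamma` and `alpha` are illustrative assumptions and are not taken from the thesis.

```python
import math


def modulated_cross_entropy(p, y, gamma=2.0, alpha=0.25):
    """Binary cross-entropy for one pixel, scaled by a modulating factor.

    p     -- predicted probability that the pixel belongs to a text region
    y     -- ground-truth label: 1 for text, 0 for background
    gamma -- assumed focusing exponent; larger values down-weight easy pixels more
    alpha -- assumed class-balance weight applied to text pixels
    """
    eps = 1e-7
    p = min(max(p, eps), 1.0 - eps)        # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p         # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks toward 0 as the prediction becomes confident
    # and correct, so the weight changes as the model improves during training.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average the modulated loss over a flat list of pixels."""
    total = sum(modulated_cross_entropy(p, y, gamma, alpha)
                for p, y in zip(probs, labels))
    return total / len(probs)
```

Under these assumptions, a background pixel that is already classified confidently (for example `p = 0.05`) contributes far less to the loss than a background pixel wrongly scored as text (for example `p = 0.9`), which is the behavior the abstract attributes to USLS. Setting `gamma = 0` recovers an ordinary class-weighted cross-entropy.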
Keywords/Search Tags: Scene image, Multi-oriented text detection, Dual-task learning, Deep learning