
Research On Natural Scene Text Detection Method Based On Instance Segmentation

Posted on: 2022-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Liu
Full Text: PDF
GTID: 2518306314962639
Subject: Software engineering
Abstract/Summary:
As the most direct way to convey information, words have a history of 6,000 years. It is no exaggeration to say that words are the cornerstone of human civilization and the ladder of human progress. With the development of Internet technology and the popularity of mobile devices, image-based multimedia has gradually become the main medium of information dissemination, and more and more text is embedded in images for propagation. Detection and recognition of text in images has attracted wide attention because of its broad application prospects, such as automatic driving, information retrieval, and prescription reading. As a very challenging direction, text detection in natural scene images long developed slowly, and the advent of the deep learning era has given it a strong impetus.

Today's scene text detection is dominated by segmentation-based methods, which can be subdivided into two categories: semantic segmentation based methods and instance segmentation based methods. Although both categories have achieved promising results on arbitrarily shaped text, many pressing problems remain. In particular, few researchers pay attention to the false positive problem in instance segmentation based methods. Such methods generally use a region proposal to locate a text instance first, and then perform segmentation within the proposal to generate a candidate region mask. When deciding whether a proposal should be retained, most existing methods use the classification score of the proposal as the only criterion. In fact, the quality of a proposal is not strictly positively correlated with its classification score. As a result, such methods produce three types of false positives: those caused by classification, those caused by regression, and those caused by segmentation.

To solve this false positive problem, this thesis proposes a strategy for scoring the quality of the mask obtained by the segmentation process. It argues that the quality of a proposal should be determined by both its classification accuracy and the quality of its corresponding mask, so instance segmentation based approaches should predict two scores for each proposal: a classification score and a mask score, where the mask score evaluates the quality of the mask. Following the PASCAL evaluation criterion commonly used for most natural scene text datasets, this thesis defines the mask score as the intersection over union (IoU) between the region formed by a mask and the ground-truth text region. Based on this strategy, a new instance segmentation based text detection method is proposed that can detect arbitrarily shaped text, including horizontal, tilted, and curved text. The model consists of four parts: a backbone network, a region proposal network, a box head, and a mask head. The first three parts follow the framework of classical instance segmentation methods. The mask head contains two workflows: the mask generation flow performs segmentation to generate a mask for the region enclosed by a proposal, and the mask score flow predicts the mask score. To let the mask score flow better perceive the mask generation process and predict the mask score more accurately, this thesis also designs a Mask Attention Module that connects the two workflows. Whether a proposal is ultimately retained is jointly determined by its classification score and its mask score.

Ablation experiments, comparison experiments, and time consumption experiments on four natural scene text datasets (ICDAR2015, ICDAR2017 MLT, CTW1500, and Total-Text) demonstrate that the proposed model can suppress all three types of false positives in a simple, fast, and uniform manner, and that its performance is comparable to that of some state-of-the-art methods.
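The mask score definition and the joint retention rule can be illustrated with a minimal sketch. The abstract only states that the mask score is an IoU against the real text region and that retention is decided "jointly" by the two scores; it does not say how they are combined. The product rule, the threshold value, and the names `mask_iou` and `keep_proposals` below are assumptions made for illustration, not the thesis's implementation. In the thesis the mask score is predicted by the mask score flow, whereas this sketch computes the IoU target directly from binary masks.

```python
from typing import List, Sequence

# A binary mask as rows of 0/1 values (hypothetical representation).
Mask = Sequence[Sequence[int]]


def mask_iou(pred: Mask, gt: Mask) -> float:
    """IoU between a predicted mask and a ground-truth text region."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    # Two empty masks have no overlap to measure; treat the score as 0.
    return inter / union if union else 0.0


def keep_proposals(
    cls_scores: Sequence[float],
    mask_scores: Sequence[float],
    threshold: float = 0.5,
) -> List[int]:
    """Return indices of proposals whose combined score passes the threshold.

    Assumption: the two scores are combined by multiplication. The abstract
    only says retention is "jointly determined" by both scores.
    """
    return [
        i
        for i, (c, m) in enumerate(zip(cls_scores, mask_scores))
        if c * m >= threshold
    ]


if __name__ == "__main__":
    pred = [[1, 1, 0], [1, 1, 0]]
    gt = [[0, 1, 1], [0, 1, 1]]
    print(mask_iou(pred, gt))
    # The second proposal has a high classification score but a poor mask.
    print(keep_proposals([0.95, 0.9, 0.6], [0.9, 0.3, 0.9]))
```

Under this assumed rule, a proposal with a confident classification but a low-quality mask is dropped, which is the kind of false positive that a classification-only criterion would let through.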
Keywords/Search Tags:Natural scene text detection, Arbitrary-shaped text, Instance segmentation, Proposal quality, Mask score