
Scene Text Detection Based On Deep Learning

Posted on: 2020-11-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z D Liu
GTID: 1368330578482981
Subject: Information and Communication Engineering
Abstract/Summary:
Scene text detection is an important research direction in computer vision and document analysis. It has a wide range of applications, such as license plate recognition, unmanned supermarkets, geolocation, reading assistance for the blind, and autonomous driving. Years of research have produced many advances in scene text detection algorithms. However, scene text varies widely in language, layout, scale, font, appearance, and direction, and scene images have complex and diverse backgrounds, which makes the task difficult. At present, detecting scene text of arbitrary direction and shape, as well as text instances in close or adjacent positions, remains challenging.

In recent years, deep learning has achieved widespread success in many computer vision problems. Building on deep learning, this thesis aims at efficient scene text detection. It focuses on effective, novel, and robust feature acquisition methods, designs network models, and proposes solutions to open problems in scene text detection. The main work and innovations of this thesis are as follows:

(1) This thesis proposes a scene text detection method based on a text region information prediction model to address the detection of scene text with arbitrary direction. The method splits a text instance into two components: the text stroke and the text center block. The text stroke region and the text center block region are each predicted by fully convolutional neural networks that share the same structure. A text bounding box generation algorithm then combines the two components. Experimental results show that the proposed method can detect multi-scale scene text with arbitrary direction and can also handle multilingual scene text. In addition, the proposed method does not need to regress the orientation of the scene text explicitly.

(2) This thesis proposes a scene text detection method based on attention and a bidirectional LSTM model to address the detection of scene text with arbitrary shape. In this method, a multi-scale context-aware feature extraction module (MCFE) is designed to extract features with rich context information, which improves precision. A bidirectional LSTM module (BLSTM) is designed to improve precision further by exploiting the spatial sequence relationship between characters. An attention module (Attention) is designed to estimate the importance of features from different layers and recombine them, which improves recall. The contour of the text area is used to represent scene text regions of arbitrary shape. In addition, a label generation algorithm for irregularly shaped text center blocks is proposed. Experimental results show that the proposed method can detect multilingual scene text with arbitrary shape.

(3) This thesis proposes a scene text detection method based on a multi-level feature enhanced cumulative network to address the adhesion between scene text instances in close or adjacent positions. In this method, the multi-level feature enhanced cumulative (MFEC) module is designed to detect multi-scale and irregularly shaped scene text. A spatial attention module and a channel attention module are introduced to improve the cumulative enhancement ability of the atrous convolution feature representation. A multi-level feature fusion module is designed to integrate MFEC features from different levels and to encode scene text information adaptively. Experimental results show that the proposed method can detect multilingual scene text with irregular shape, overcome the adhesion between close or adjacent scene text instances, and perform well on several public datasets.
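
Contribution (1) states that the method does not need to regress the orientation of scene text explicitly. As a minimal illustration of why a predicted region can be enough, the sketch below recovers a dominant direction and an oriented box from the pixels of a region using second-order moments. This is not the thesis's bounding box generation algorithm, which the abstract does not describe; the function names principal_angle and oriented_box and the pixel-list input format are assumptions made for this example.

```python
import math


def principal_angle(pixels):
    """Dominant direction of a pixel set, in radians within (-pi/2, pi/2].

    `pixels` is a list of (x, y) coordinates (a hypothetical input format).
    The angle is measured in image coordinates from the x axis.
    """
    n = len(pixels)
    if n < 2:
        raise ValueError("need at least two pixels")
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    # Second-order central moments (entries of the covariance matrix).
    sxx = sum((x - mean_x) ** 2 for x, _ in pixels) / n
    syy = sum((y - mean_y) ** 2 for _, y in pixels) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pixels) / n
    # Closed-form angle of the principal axis of a 2x2 covariance matrix.
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)


def oriented_box(pixels):
    """Return (angle, corners) of a box aligned with the principal axis."""
    theta = principal_angle(pixels)
    ux, uy = math.cos(theta), math.sin(theta)   # unit vector along the text
    vx, vy = -uy, ux                            # unit vector across the text
    along = [x * ux + y * uy for x, y in pixels]
    across = [x * vx + y * vy for x, y in pixels]
    a0, a1 = min(along), max(along)
    c0, c1 = min(across), max(across)
    # Map the four extreme (along, across) pairs back to image coordinates.
    corners = [
        (a * ux + c * vx, a * uy + c * vy)
        for a, c in ((a0, c0), (a1, c0), (a1, c1), (a0, c1))
    ]
    return theta, corners


if __name__ == "__main__":
    # A thin diagonal run of pixels standing in for a predicted text region.
    region = [(i, i) for i in range(10)]
    angle, box = oriented_box(region)
    print(math.degrees(angle), box)
```

The design point is that the direction falls out of the geometry of the predicted pixels, so no separate angle output is required from the network. A moment-based box is only a rough stand-in; it would not separate touching text instances or follow curved text.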
Keywords/Search Tags: Scene Text Detection, Deep Learning, Attention Mechanism, Feature Enhancement, Feature Fusion, Instance Segmentation