Research On Text Detection Method Of Natural Scene Based On Deep Learning

Posted on:2021-08-09

Degree:Doctor

Type:Dissertation

Country:China

Candidate:X H Wang

Full Text:PDF

GTID:1528307100474514

Subject:Electronic Science and Technology

Abstract/Summary:

The understanding and analysis of natural scenes has long been an active topic in image processing, pattern recognition, computer vision, and related fields. As a special visual element, text in natural scenes often carries rich high-level semantic information that describes and supplements the scene content more precisely. Text detection in natural scenes therefore has high academic research value. It also has broad application prospects in fields such as autonomous driving, assistance for the blind, geographic information annotation, and robot automation, and can produce substantial social benefits.

Text detection in natural scene images and videos faces many challenges. First, unlike document text, text lines in natural scenes can appear in arbitrary directions. Second, multilingual text at different scales requires a more robust detection algorithm. Finally, unfavorable factors during video capture, such as occlusion, uneven lighting, and violent jitter of the equipment, break up text regions, distort color, and blur the image, which degrades the performance of the detection algorithm. Building on deep learning, this thesis studies these main challenges of text detection in natural scenes. The main research work and innovations are as follows:

1. To address multi-orientation text detection, this thesis proposes an arbitrary-orientation text detection algorithm based on a convolutional network with coarse-to-fine supervision, which takes the structural characteristics of the text region into account. The algorithm follows the idea of separation and combination: starting from a coarse prediction of the text region, it obtains a fine character shape segmentation and a text centerline prediction. Segmentation results that share the same text centerline are grouped bottom-up to form the final detection results. This method locates the text region accurately, and predicting the centerline attribute allows the algorithm to detect text in arbitrary directions. To improve semantic segmentation performance, the network adopts a multi-scale feature pyramid structure, in which high-level features are up-sampled and combined with shallow features layer by layer to enrich semantic information. In addition, multi-level supervised learning is used to improve the generalization ability of the network, with the supervision information selected according to each learning task. Experiments show that the algorithm can effectively detect arbitrary-orientation text in complex scenes.

2. To address multilingual text detection at different scales, this thesis proposes a multilingual text detection algorithm that combines precise text region segmentation with scale estimation. For precise segmentation, it proposes a new text region representation based on the text boundary, which can accurately separate small text objects and describe multilingual text of arbitrary direction and shape. Based on the relationship between image resolution and text region scale, the method also uses image pyramid input to strengthen the multi-scale feature representation of text regions, and estimates the scale of each text region in order to merge the detection results from the differently scaled input images. In the network design, a parallel multi-scale feature fusion structure provides high-resolution feature representations, and a residual pooling module further enriches the background context information. The results show that the algorithm can effectively detect multilingual text in natural scenes and has certain advantages over mainstream algorithms.

3. To address the performance degradation of existing text detection algorithms caused by occlusion, uneven lighting, violent jitter of the equipment, and similar factors in video capture, this thesis proposes a new video text detection algorithm based on layout constraint tracking. The algorithm uses a detection-and-tracking framework: it detects text in each video frame, tracks the detected text across frames, and exploits the temporal redundancy of text in video to eliminate false detections and improve detection performance. To improve single-frame text detection in video, a new fast text detection network combined with semantic segmentation is proposed, which locates text regions accurately by enhancing the semantic information in the extracted features. To improve multi-text tracking, this thesis proposes a text tracking algorithm based on layout constraints, in which a new data association cost function uses the layout similarity between multiple text regions to model their relative positions. The tracking results are obtained by optimizing this function. The experimental results demonstrate the effectiveness of the proposed method for scene video text detection and tracking.
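
The layout-constrained data association in the third contribution can be illustrated with a minimal Python sketch. It is not the cost function of the thesis, which the abstract does not specify. It assumes that each text region is reduced to a center point, that the cost is the sum of per-region displacement and the change in pairwise offsets between regions, and that the best assignment is found by exhaustive search. The names `association_cost`, `associate`, and `layout_weight` are hypothetical.

```python
from itertools import permutations
from math import hypot


def association_cost(prev, curr, assignment, layout_weight=1.0):
    """Cost of matching prev[i] to curr[assignment[i]].

    Assumed form (not taken from the thesis):
      unary term  - how far each region's center moved
      layout term - how much the offset between each pair of regions changed
    """
    unary = sum(
        hypot(curr[j][0] - prev[i][0], curr[j][1] - prev[i][1])
        for i, j in enumerate(assignment)
    )
    layout = 0.0
    n = len(assignment)
    for a in range(n):
        for b in range(a + 1, n):
            pa, pb = prev[a], prev[b]
            ca, cb = curr[assignment[a]], curr[assignment[b]]
            # Difference between the pair's offset after and before matching.
            dx = (cb[0] - ca[0]) - (pb[0] - pa[0])
            dy = (cb[1] - ca[1]) - (pb[1] - pa[1])
            layout += hypot(dx, dy)
    return unary + layout_weight * layout


def associate(prev, curr, layout_weight=1.0):
    """Return the minimum-cost assignment by brute force (small inputs only)."""
    if len(prev) > len(curr):
        raise ValueError("this sketch needs at least as many detections as tracks")
    best = min(
        permutations(range(len(curr)), len(prev)),
        key=lambda p: association_cost(prev, curr, p, layout_weight),
    )
    return list(best)


# Three tracked text regions; the next frame shifts all of them by (+20, +5)
# and lists the detections in a different order.
prev = [(10, 10), (50, 10), (10, 40)]
curr = [(30, 45), (30, 15), (70, 15)]
print(associate(prev, curr))  # [1, 2, 0]
```

Because all three regions move by the same offset, the layout term of the correct assignment is zero, so a global shift such as camera jitter is not penalized by that term. Exhaustive search is only practical for a handful of regions; a real tracker would need a proper optimizer for this kind of pairwise cost.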

Keywords/Search Tags:

Natural Scene Text Detection, Deep Learning, Arbitrary Orientation, Multilingual Text, Video Text Detection and Tracking

Related items

1	Research And Implementation Of Natural Scene Text Detection Method Based On Deep Learning
2	Research On Arbitrary Orientation And Scale Text Detection Algorithm Based On Deep Learning
3	Research On Multi-orientation Text Detection Algorithm In Natural Scene Based On MSER
4	A Research On Detection Methods For Scene Text In Natural Videos
5	Research On Scene Text Detection
6	Research On Detecting And Identifying Scene Texts Of Arbitrary Distribution Based On CNN
7	Deep Learning-Based Methods For Natural Scene Text Detection
8	Research On Text Detection Technology In Natural Scene Pictures
9	Research And Application Of Arbitrary Shape Text Detection Algorithm In Natural Scenes
10	Research On Deep-Learning-Based Scene Text Detection And End-to-End Recognition