Research On Deep-Learning-Based Scene Text Detection And End-to-End Recognition

Posted on: 2021-10-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y L Liu
GTID: 1488306464981999
Subject: Information and Communication Engineering
Abstract/Summary:
Text is not only an important carrier of human thought and communication, but also an important means of passing on civilization. As a crystallization of human wisdom, text is found throughout the living environment of modern society in the form of symbols. In recent years, driven by the development of deep learning, scene text detection and recognition has come to play an important role in many fields such as daily life, education, medical treatment, navigation, human-computer interaction, autonomous driving, finance and retrieval, and it is an active research direction in pattern recognition and computer vision. Precisely localizing text is a very challenging research topic. On the one hand, text in natural scenes varies from place to place and may come with complex backgrounds, diverse aspect ratios, multiple scales, multiple orientations, irregular arrangement and arbitrary shape. On the other hand, the randomness of manual image capture, the number of samples, changes in viewing angle, the clarity of characters, lighting and shadow, and ambiguous spacing between text instances also pose great challenges to the detection and recognition of scene text. This thesis studies deep-learning-based detection and end-to-end recognition of text in natural scenes, and focuses on exploring flexible, efficient and novel ways of modeling text shape in order to overcome these difficulties. The research is carried out in the following three aspects:

(1) Characters in natural scenes may be stacked and arranged irregularly. Previous scene text detection methods were limited to localization with rectangular boxes, so text that is not rectangular could not be handled robustly. An arbitrary-quadrilateral detection algorithm for modeling text shape is proposed. Based on the characteristics of natural scene text, a quadrilateral sliding window method is proposed to recall text, together with a sequence-independent discrete bounding box detection structure. Experiments show that this method locates text of arbitrary size and multiple orientations more flexibly in various scene images, reduces interference from background noise, and robustly locates stacked and boundary text.

(2) Natural scene text may also appear in any shape, such as curved or wavy text. Because previous scene text detection methods could only handle straight text, a detection system that can locate scene text of arbitrary shape is proposed. Unlike previous methods, it uses a mask proposal approach to locate text regions at the pixel level and reconstructs a compact polygonal bounding box. Experiments show that this method can detect arbitrarily shaped text in different scenes and effectively reduces false detections. In addition, to evaluate different methods on arbitrarily shaped scene text detection more effectively, a dataset containing a large number of curved and arbitrarily shaped scene text instances is collected, and, to address the shortcomings of existing evaluation criteria, a robust evaluation protocol suited to arbitrarily shaped text localization is proposed.

(3) To address the low efficiency of existing end-to-end scene text detection and recognition methods, an end-to-end detection and recognition system capable of processing arbitrarily shaped scene text in real time is proposed. Compared with previous methods, this method uses Bezier curves to connect the detection and recognition features, reduces the number of parameters in the detection process, and aligns irregular shapes into a regular form that can be easily recognized through a smooth parametric boundary. The proposed method achieves the best performance on multiple datasets and has a great advantage over previous methods in terms of speed.
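To make the idea of a smooth parametric boundary in aspect (3) more concrete, the following is a minimal sketch, not the thesis implementation. It assumes that the top and bottom edges of a text region are each described by a cubic Bezier curve with four control points, and that a regular sampling grid is obtained by interpolating linearly between the two curves. The abstract only states that Bezier curves are used; the curve degree, the function names (`cubic_bezier`, `sampling_grid`) and the interpolation scheme are all illustrative assumptions.

```python
# Hypothetical sketch: describing a curved text region with two cubic Bezier
# curves and sampling a regular grid between them. The degree (cubic) and all
# names here are assumptions for illustration, not taken from the thesis.

def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    s = 1.0 - t
    # Bernstein basis polynomials of degree 3.
    b0, b1, b2, b3 = s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


def sampling_grid(top_ctrl, bottom_ctrl, n_rows, n_cols):
    """Return n_rows x n_cols points spanning the region between two curves.

    top_ctrl and bottom_ctrl are each a sequence of four (x, y) control
    points. Both n_rows and n_cols must be at least 2.
    """
    grid = []
    for r in range(n_rows):
        v = r / (n_rows - 1)  # 0 on the top curve, 1 on the bottom curve
        row = []
        for c in range(n_cols):
            t = c / (n_cols - 1)  # position along the text line
            tx, ty = cubic_bezier(*top_ctrl, t)
            bx, by = cubic_bezier(*bottom_ctrl, t)
            # Linear interpolation between the two boundary points.
            row.append(((1 - v) * tx + v * bx, (1 - v) * ty + v * by))
        grid.append(row)
    return grid


if __name__ == "__main__":
    # An upward-arching text region: eight control points in total.
    top = [(0, 10), (10, 0), (20, 0), (30, 10)]
    bottom = [(0, 20), (10, 10), (20, 10), (30, 20)]
    for row in sampling_grid(top, bottom, n_rows=3, n_cols=5):
        print([(round(x, 2), round(y, 2)) for x, y in row])
```

Under these assumptions, a curved region is described by eight control points (sixteen numbers) rather than a long polygon, and features sampled at the grid points form a rectangular map that a recognizer can consume directly.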
Keywords/Search Tags: Multi-oriented, Text detection, Text recognition, Curve text, End-to-end, Arbitrary shape, Evaluation protocol