
Research On Scene Text Detection And Recognition Based On Deep Learning

Posted on: 2022-06-09 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Song | Full Text: PDF
GTID: 2518306509454404 | Subject: Computer technology
Abstract/Summary:
Scene text detection and recognition is an active research topic in computer vision. Its goal is to locate text regions in natural scene images and recognize the character sequences they contain. Scene text often carries rich semantic information that helps in understanding the scene, and its detection and recognition can be applied in many fields, such as autonomous driving, intelligent transportation, instant translation, and visual search. Although detection and recognition of traditional printed text is relatively mature, scene text still poses many difficulties, such as complex backgrounds, variable text forms, and perspective distortion and curvature of the text in the image. This thesis conducts an in-depth study of scene text detection and recognition. The specific contributions are as follows:

(1) To address the problem that small-scale scene text is missed during detection, this thesis proposes a scene text detection model named Bi-DBNet. Bi-DBNet improves the original DBNet by adding a weighted bidirectional feature pyramid network (BiFPN). The original DBNet uses a Feature Pyramid Network (FPN) for feature fusion, but FPN is limited to a one-way information flow and cannot fuse multi-scale features effectively, which reduces detection accuracy. The BiFPN adopted by Bi-DBNet covers all feature scales, provides cross-scale feature connections, and effectively fuses low-level and high-level features, so the model attends better to small scene text and detection accuracy improves. Experimental results show that Bi-DBNet improves on the original DBNet on both the ICDAR 2015 and MSRA-TD500 datasets.

(2) To address perspective distortion and irregular text shapes in scene text, this thesis proposes a Combination Rectification Network (CRN), a rectification-based model for scene text recognition. CRN combines the pixel-level multi-object rectification network (MORN) with thin-plate spline (TPS) geometric rectification, which relaxes the geometric constraints and effectively improves the performance of the rectification network. An attention-based sequence-to-sequence recognition model is used to recognize scene text more accurately. Experimental results show that the proposed CRN model outperforms existing rectification-based scene text recognition models on the SVT and SVTP datasets.
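
The weighted fusion that distinguishes BiFPN from a plain FPN can be illustrated with a small sketch. The Python below assumes the "fast normalized fusion" rule described in the EfficientDet paper that introduced BiFPN: each input feature map gets a learnable non-negative weight, and the weighted sum is divided by the sum of the weights. The abstract does not state which fusion rule Bi-DBNet uses, so the function name, the eps value, and the toy single-channel maps are illustrative assumptions rather than the thesis's implementation.

```python
from typing import List, Sequence

# A single-channel 2D feature map as nested lists (illustration only;
# a real model would use multi-channel tensors).
FeatureMap = List[List[float]]


def fast_normalized_fusion(
    features: Sequence[FeatureMap],
    weights: Sequence[float],
    eps: float = 1e-4,  # assumed small constant to avoid division by zero
) -> FeatureMap:
    """Fuse same-sized feature maps as sum(w_i * F_i) / (eps + sum(w_i)).

    Weights are clamped to be non-negative (ReLU) before normalization,
    so each input contributes in proportion to its learned importance.
    """
    if not features or len(features) != len(weights):
        raise ValueError("need one weight per feature map")

    rows, cols = len(features[0]), len(features[0][0])
    for f in features:
        if len(f) != rows or any(len(row) != cols for row in f):
            raise ValueError("all feature maps must share the same shape")

    clamped = [max(0.0, w) for w in weights]  # ReLU on the raw weights
    norm = eps + sum(clamped)

    return [
        [
            sum(w * f[r][c] for w, f in zip(clamped, features)) / norm
            for c in range(cols)
        ]
        for r in range(rows)
    ]


if __name__ == "__main__":
    # Hypothetical inputs: a high-resolution map and an upsampled coarser
    # map, already resized to the same 2x2 shape.
    fine = [[1.0, 2.0], [3.0, 4.0]]
    coarse_upsampled = [[3.0, 4.0], [5.0, 6.0]]
    print(fast_normalized_fusion([fine, coarse_upsampled], [0.7, 0.3]))
```

In a full network the weights would be learned parameters, the inputs would first be resized to a common resolution, and the fused map would typically pass through a convolution. The sketch covers only the normalization step that lets the network weigh low-level against high-level features.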
Keywords/Search Tags: Scene text detection, Scene text recognition, Sequence-to-sequence model, Attention mechanism, Feature pyramid network