Research On Deep Learning Based Text Detection And Recognition Technology In Natural Scenes

Posted on: 2022-11-16  Degree: Master  Type: Thesis
Country: China  Candidate: K X Zhang  Full Text: PDF
GTID: 2518306746496274  Subject: Automation Technology
Abstract/Summary:
Text is everywhere in the natural environment and is one of the main ways people transmit information and communicate with each other. In recent years, text detection and recognition in scene images has become a research focus in computer vision, natural language processing, instant translation and other fields, and has received strong attention from both academia and industry. However, because text in complex scenes varies in scale and orientation and is affected by illumination conditions, detecting and recognizing text in natural scenes remains challenging. To further improve the accuracy and robustness of scene text detection and recognition, this thesis studies both problems and improves the text detection algorithm and the text recognition algorithm separately. The specific research contents and results are as follows:

(1) Text in natural scenes is usually arranged irregularly and oriented in arbitrary directions. To handle this, the thesis designs an arbitrary-orientation scene text detection algorithm based on image semantic segmentation. First, a convolutional neural network (CNN) extracts feature maps containing rich semantic information from the natural scene image. Then, following the idea of semantic segmentation, each pixel region of the image is predicted from these features to be text or non-text, and the bounding boxes are regressed directly. The regression yields the parameters of all text boxes in the image, which give the text detection result.

(2) Text in complex scene images is difficult to recognize. To address this, the thesis proposes a text recognition algorithm that combines an attention mechanism with connectionist temporal classification (CTC). Natural scene text recognition is recast as a sequence labeling problem, and the correlation between the image and the text sequence is used to avoid explicit character segmentation. First, a CNN generates an ordered feature sequence from the whole word image. Then, the feature sequence is encoded by a bidirectional long short-term memory (Bi-LSTM) network. Finally, an integrated CTC and attention module decodes the encoded sequence and outputs the text sequence. This design addresses both the lack of constraints in attention-based decoding and the difficulty of predicting long text sequences effectively with a CTC-based structure alone.

The thesis reports text detection, text recognition and end-to-end experiments on ICDAR 2013, ICDAR 2015, MSRA-TD500, SVT, IIIT5K and other public datasets, together with a comprehensive analysis and verification. Compared with the relevant classical algorithms, the proposed algorithms achieve competitive results.
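As a rough illustration of the CTC part of the recognizer described above, the following sketch shows the standard greedy CTC decoding rule: take the best label per frame, merge consecutive repeats, and drop blanks. It is not the thesis's integrated CTC and attention module, which the abstract does not specify. The function name `ctc_greedy_decode`, the choice of index 0 as the blank label, and the toy alphabet are assumptions made only for this example.

```python
from typing import List, Sequence

BLANK = 0  # assumption: label index 0 is the CTC blank symbol


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy CTC decoding over per-frame label scores.

    frame_scores[t][k] is the score of label k at time step t, where
    label 0 is the blank and label k > 0 maps to alphabet[k - 1].
    """
    # Step 1: pick the highest-scoring label at every time step.
    best_path: List[int] = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]

    # Step 2: merge consecutive repeats, then drop blanks.
    chars: List[str] = []
    previous = BLANK
    for label in best_path:
        if label != BLANK and label != previous:
            chars.append(alphabet[label - 1])
        previous = label
    return "".join(chars)
```

In a full recognizer, `frame_scores` would come from the Bi-LSTM encoder's per-step outputs; here any list of score rows works. A blank between two identical labels is what lets the decoder emit a doubled character.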
Keywords/Search Tags: deep learning, text detection, text recognition, scene text, end-to-end