Research On Document Text Detection And Recognition Based On Deep Learning

Posted on: 2022-01-06
Degree: Master
Type: Thesis
Country: China
Candidate: G B Wang
GTID: 2518306566991099
Subject: Computer technology
Abstract/Summary:
In text area detection, repeated convolution and down-sampling cause small text areas in the original image to shrink or even disappear in the subsequent feature maps, resulting in low detection accuracy for small text areas. Moreover, detecting single characters requires every character to be labelled, which makes building the training set very laborious. When bill images contain a large number of characters, labelling each one is especially costly. In text recognition, typical OCR models based on deep neural networks are designed to recognize text in natural scenes. Because such text varies in form and angle, these models need a complex structure and a large number of parameters to ensure good recognition accuracy, which makes training difficult and recognition slow. Based on this, this paper conducts research on the following three issues:

(1) To deal with the problem that small text areas are easily missed in text detection, this paper proposes a character-based text detection model, LW-Char Net (Light Wave Character Network). The model optimizes the network structure of the CRAFT (Character Region Awareness For Text detection) model: it combines the feature map without down-sampling with the feature maps that have undergone one or more down-samplings, and increases the number of up-sampling layers. This makes the features of small text areas more prominent and improves accuracy on small text areas. The paper compares LW-Char Net with five baseline text detection models, including CRAFT, and verifies its effectiveness by comparing precision, recall and hmean values.

(2) To deal with the high labelling cost of character-based text detection models, the paper proposes a data labelling solution built on LW-Char Net. Following the idea of transfer learning, the LW-Char Net model is first trained on existing datasets with character-level labels. The converged model is then used to detect the character areas of the bill texts, which yields pre-labelling results for the character areas. Finally, the pre-labelling results are manually corrected, which greatly reduces the labelling cost.

(3) Considering that typical text recognition models are more complicated than bill text recognition requires, which makes model training difficult and recognition slow, this paper proposes a text recognition model, CBF-Net (Convolutional Bidirectional-LSTM Fully-connected Network). The model simplifies the network structure of CRNN (Convolutional Recurrent Neural Network). CBF-Net consists of a single-layer Bidirectional-LSTM (Bidirectional Long Short Term Memory Network) and a fully connected RNN (Recurrent Neural Network) module. It has fewer parameters than the original model, which helps to simplify model training and improve recognition speed. By comparing accuracy and recognition speed against the baseline models CRNN and CNN (Convolutional Neural Network) in experiments, the paper finds that CBF-Net achieves high accuracy and fast recognition speed on bill datasets.
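
The abstract evaluates detection models by precision, recall and hmean. As a minimal sketch of how these three values relate, assuming hmean denotes the harmonic mean of precision and recall (the usual convention in text detection benchmarks; the abstract does not define it), the computation could look like the following. The function name and the count-based inputs are hypothetical and are not taken from the thesis.

```python
def detection_scores(true_positives, num_detections, num_ground_truth):
    """Return (precision, recall, hmean) from raw match counts.

    Assumption: hmean is the harmonic mean of precision and recall.
    How a detection is matched to a ground-truth box (e.g. an IoU
    threshold) is outside this sketch; only the counts are used.
    """
    # Fraction of detected boxes that match a ground-truth box.
    precision = true_positives / num_detections if num_detections else 0.0
    # Fraction of ground-truth boxes that were detected.
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    # Harmonic mean; defined as 0 when both precision and recall are 0.
    total = precision + recall
    hmean = 2 * precision * recall / total if total else 0.0
    return precision, recall, hmean


# Illustrative counts only, not results from the thesis.
precision, recall, hmean = detection_scores(8, 10, 16)
```

Because the harmonic mean is pulled toward the lower of its two inputs, a model that misses many small text areas (low recall) gets a low hmean even when its precision is high, which is why the metric suits the small-text problem described in issue (1).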
Keywords/Search Tags:Deep learning, OCR, Text detection, Text recognition, Bill recognition