
Design And Implementation Of A Document Image Analysis And Recognition System Based On Deep Learning

Posted on: 2023-12-29
Degree: Master
Type: Thesis
Country: China
Candidate: H Li
Full Text: PDF
GTID: 2568306914977549
Subject: Computer technology
Abstract/Summary:
Document image analysis and recognition has important practical value in image information extraction, document retrieval, and other fields, and is an important research topic in image processing. It is challenging because document image layouts are flexible and complex and the shapes of document targets vary widely. This thesis studies deep-learning-based algorithms for document image analysis and recognition, covering layout analysis, text recognition, and table recognition, and develops a document image analysis and recognition system. The main contributions of this thesis are as follows:

For layout analysis, we construct CDLA, a Chinese document image layout analysis dataset that is split into a training set and a test set and contains ten categories of document targets. We also propose a fine-grained annotation generation method to reduce the boundary error of manual annotation. The Mask R-CNN framework is then trained on the CDLA training set to perform layout analysis on Chinese document images, and it is verified on the CDLA test set.

For text recognition, we propose a lightweight text recognition model based on a convolutional recurrent neural network (CRNN), which achieves a character-level recognition accuracy of more than 98% on our text line recognition test set. The model introduces an approximate nonlinear activation function based on MobileNet to enhance its fitting ability, and uses the squeeze-and-excitation (SE) module to weight the network channels. In addition, we propose a method for synthesizing a text line dataset to serve as the training set for the text recognition model.

For table recognition, we combine a table text detection model based on semantic segmentation, a text line recognition model, and a table structure recognition model based on an encoder-decoder structure to convert table images into HTML code sequences. The combined pipeline is verified on our table recognition test set. We also construct TabTextDet, a text detection dataset for tabular image scenes, to achieve accurate tabular text detection and recognition.

Finally, we develop a document image analysis and recognition system based on the above algorithms, and the effectiveness of these algorithms is verified.
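The text recognition contribution mentions an approximate nonlinear activation and SE channel weighting without giving their exact form. The sketch below illustrates one common reading, assuming the activation is the hard-swish used in MobileNetV3 and the SE gate is a hard-sigmoid. Both choices, along with the function names `hard_swish` and `se_reweight` and the tiny weight matrices, are illustrative assumptions and not the thesis's implementation.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def hard_sigmoid(x: float) -> float:
    """Piecewise-linear approximation of the sigmoid: ReLU6(x + 3) / 6."""
    return min(max(x + 3.0, 0.0), 6.0) / 6.0


def hard_swish(x: float) -> float:
    """Hard-swish activation: x * ReLU6(x + 3) / 6 (assumed MobileNetV3 form)."""
    return x * hard_sigmoid(x)


def _dense(weights: Matrix, vec: Vector) -> Vector:
    """Bias-free fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def se_reweight(feature_map: Matrix, w_reduce: Matrix, w_expand: Matrix) -> Matrix:
    """Squeeze-and-excitation over a feature map with flattened channels.

    feature_map: C rows, each holding the H*W activations of one channel.
    w_reduce:    (C/r) x C weights of the bottleneck layer (hypothetical values).
    w_expand:    C x (C/r) weights that restore the channel dimension.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(channel) / len(channel) for channel in feature_map]
    # Excitation: bottleneck with ReLU, then a gate in [0, 1] per channel.
    hidden = [max(h, 0.0) for h in _dense(w_reduce, squeezed)]
    gates = [hard_sigmoid(g) for g in _dense(w_expand, hidden)]
    # Scale: multiply every activation in a channel by that channel's gate.
    return [[gate * value for value in channel]
            for gate, channel in zip(gates, feature_map)]


if __name__ == "__main__":
    features = [[1.0, 3.0], [2.0, 2.0]]   # two channels, two positions each
    reduce_w = [[0.5, 0.5]]               # 2 channels -> 1 hidden unit
    expand_w = [[1.5], [-1.5]]            # 1 hidden unit -> 2 channel gates
    print(se_reweight(features, reduce_w, expand_w))
    print([hard_swish(x) for x in (-3.0, 0.0, 3.0)])
```

In this toy input the first channel receives a gate of 1 and is kept unchanged, while the second receives a gate of 0 and is suppressed, which is the channel-weighting effect the abstract attributes to the SE module.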
Keywords/Search Tags: Document image analysis and recognition, Layout analysis, Text recognition, Table recognition