Research On Layout Analysis And Text Line Extraction Of Document Image

Posted on:2020-09-03

Degree:Master

Type:Thesis

Country:China

Candidate:Q Zhang

GTID:2428330590494383

Subject:Computer Science and Technology

Abstract/Summary:

Document digitization has broad applications. With Optical Character Recognition (OCR) technology, the required data can be extracted directly from an image, which greatly facilitates the storage, processing, and retrieval of information and reduces the burden of manual input. Accurate text box extraction is an important prerequisite for successful text recognition. A number of deep learning models, such as CNN+LSTM+CTC, have been proposed that effectively solve the problem of end-to-end text character recognition, but the performance of text row extraction is still far from satisfactory. This dissertation therefore focuses on extracting text rows from original images more effectively and accurately.

Because of issues such as skew and complex backgrounds, a document image usually contains a lot of noise or invalid information, which greatly affects the final recognition performance. To address these issues, we first introduce preprocessing methods for skew correction and denoising. Then, to detect text objects in a document image accurately, this dissertation presents a deep-learning-based method for object detection and semantic segmentation. This method addresses the weaknesses of traditional methods, which have difficulty extracting page features and generalize poorly. The general algorithm is refined and multi-scale feature fusion is used. To verify the performance of the proposed method, mAP at IoU thresholds of 0.6 and 0.8 is evaluated on the ICDAR 2017 page object detection dataset; the results improve from 0.787 and 0.637 to 0.865 and 0.752, respectively.

Because some preprocessing methods optimized for specific document objects may not be suitable for other objects, processing should be applied according to the area of each page object in order to reduce the loss of information. For example, offline processing is applied in the table area, and seals are removed in the seal area by separating color channels. Based on how text is distributed in pure text pages and in table pages, different text box extraction algorithms are designed in this thesis. Text box extraction for pure text pages combines the deep-learning-based CTPN algorithm with the projection method, which effectively solves the problem of text box extraction against complex page backgrounds. Designing the text extraction algorithm around different page features yields a better text extraction algorithm.

By combining the text detection algorithms with an OCR engine, a complete document recognition system is implemented. Experiments are conducted on a corpus constructed for a real application, and the results show that the system achieves good results in image denoising, page object detection, and text box extraction, and that the whole system reaches satisfactory performance for the real application.
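
The projection method mentioned for pure text pages can be illustrated with a minimal sketch. It is not the thesis implementation: it covers only the projection step (not CTPN), and it assumes the page has already been deskewed and binarized so that 1 marks an ink pixel and 0 marks background. The function name `text_row_spans` and the parameter `min_ink` are hypothetical names chosen for this example.

```python
from typing import List, Tuple


def text_row_spans(binary: List[List[int]], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return inclusive (top, bottom) row indices of each text row.

    `binary` is a 2-D list in which 1 marks an ink pixel and 0 marks
    background. `min_ink` (an assumed tuning parameter) is the smallest
    number of ink pixels for an image row to count as part of a text row.
    """
    # Horizontal projection profile: ink pixel count of every image row.
    profile = [sum(row) for row in binary]

    spans: List[Tuple[int, int]] = []
    start = None  # top index of the text row currently being traced
    for y, count in enumerate(profile):
        if count >= min_ink and start is None:
            start = y                      # a text row begins here
        elif count < min_ink and start is not None:
            spans.append((start, y - 1))   # the text row ended on the previous line
            start = None
    if start is not None:
        # The last text row touches the bottom edge of the image.
        spans.append((start, len(profile) - 1))
    return spans


# Toy 7x6 page with two text rows separated by blank lines.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(text_row_spans(page))  # [(1, 2), (5, 5)]
```

A plain projection like this depends on clean gaps between rows, so it degrades on skewed or noisy pages. That is consistent with the abstract's choice to pair it with preprocessing and a learned detector.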

Keywords/Search Tags:

page object detection, image preprocessing, complex layout analysis, text detection

Related items

1	Research Of Layout Analysis On Complex Chinese Document Images
2	The Research And Application Of Segmentation Method Between Image And Text In Layout Analysis
3	Research On Document Image Layout Analysis And Text Extraction
4	Research And Implementation On Key Technology Of Web Text Collection And Analysis
5	Research On Text And Specific Object Detection Algorithm In Images And Videos
6	Research Of Mixed Text Detection In Natural Scene Image
7	The Research Of Infrared Image Preprocessing And Small Target Detection Under Complex Background
8	Text Detection And Recognition In Complex Scene Of Image And Video
9	Extraction And Analysis Of Formula And Text In The Document Image With Complex Layout
10	The Research Of Complex Background Image Text Detection On Interaction Platform And Its Applications