
Research On Document Warping Rectification Based On Deep Learning

Posted on: 2021-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: X L Huang
GTID: 2428330614972160
Subject: Computer Science and Technology
Abstract/Summary:
Document digitization is an important means of preserving physical printed documents and gives users access to them at any time and from any place. With mobile cameras now ubiquitous, photographing a document has become one of the most convenient ways to digitize it, alongside flat-bed scanning. Once digitized, the images feed subsequent high-level visual tasks such as text localization and recognition, which play a significant role in content analysis and information extraction. However, physical documents are often photographed under poor conditions, so the acquired images can contain degradations such as folds, and these degradations are a serious obstacle to digital document preservation. In recent years, image warping rectification based on deep learning has become a prevalent research topic. Using rectification techniques and deep learning to correct warped document images quickly and accurately, in support of the high-level visual tasks that follow, is of great significance for saving costs and digitizing documents.

Most existing methods for document image warping rectification rely primarily on modifying and upgrading hardware: auxiliary cameras and light sources are added to scanning equipment such as flat-bed scanners to ensure accurate, flat scans, which greatly raises the price of the equipment and reduces its portability. In addition, after correcting a warped document, the majority of studies consider only whether the quality of the image as a whole has improved, without considering whether the result helps subsequent high-level visual tasks of content analysis and information extraction, such as text localization and recognition. As a consequence, even when the overall quality of the document image improves, the text in the document may be blurred, weakened, or lost altogether.

To address these issues, the main work of this thesis is as follows:

(1) An image quality assessment index based on high-level visual tasks is proposed. It aims to improve overall image quality as far as possible, under the premise that the localization and recognition accuracy of the text in the document image remains essentially unchanged, so that the warping of the document image is reduced to a minimum. Guided by this index, a large document image dataset is generated. For each sample it contains the original image, the warped image, and the position coordinates and corresponding content of the text in the document image, which makes quantitative comparison of text localization and recognition accuracy convenient.

(2) A Doc GAN (Document Generative Adversarial Network) framework for high-level visual tasks is proposed, based on U-Net and GAN (Generative Adversarial Network), with three innovations. First, a region adaptive strategy is applied to the decoder part of the generator during training: the image is divided into text and background regions, and different loss functions are designed for the two regions so that subsequent high-level visual tasks can proceed normally. Second, a binary mask is generated for the original document image before training, so that the weights of the loss function for the text and background regions are determined automatically during training. Third, an adversarial network is added to make the rectified document image much clearer and to improve overall image quality.

Experiments on the collected dataset show that the proposed Doc GAN framework markedly improves overall image quality, both qualitatively and quantitatively, while keeping text localization and recognition accuracy essentially unchanged. Further experiments verify the effectiveness of the proposed strategies.
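The region adaptive strategy and the binary mask described above can be illustrated with a small sketch. The abstract does not give the actual loss functions or the weighting rule, so everything in the block is an assumption made for illustration: the function name `region_adaptive_loss`, the choice of an L1 term on text pixels and an L2 term on background pixels, and weights derived from the mask as inverse area fractions. It is written with NumPy for readability and is not the thesis's implementation.

```python
import numpy as np


def region_adaptive_loss(pred, target, text_mask):
    """Hypothetical region-weighted reconstruction loss.

    pred, target : arrays of the same shape (rectified and ground-truth images)
    text_mask    : binary array of the same shape, 1 on text pixels, 0 on background

    Assumption: L1 on the text region, L2 on the background region, and
    weights taken from the mask itself. None of this comes from the thesis.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mask = np.asarray(text_mask, dtype=bool)

    n_text = int(mask.sum())
    n_bg = mask.size - n_text
    if n_text == 0 or n_bg == 0:
        # Only one region is present: fall back to a plain L1 loss.
        return float(np.abs(pred - target).mean())

    # Weights determined by the mask: the smaller region gets the larger
    # weight, so sparse text is not drowned out by the background.
    w_text = n_bg / mask.size
    w_bg = n_text / mask.size

    diff = pred - target
    text_loss = np.abs(diff)[mask].mean()   # L1 over text pixels
    bg_loss = (diff ** 2)[~mask].mean()     # L2 over background pixels
    return float(w_text * text_loss + w_bg * bg_loss)


if __name__ == "__main__":
    target = np.zeros((4, 4))
    mask = np.zeros((4, 4), dtype=int)
    mask[:2, :2] = 1                         # a 2x2 block of "text"

    blurred_text = target.copy()
    blurred_text[:2, :2] = 0.5               # error only on text pixels
    noisy_background = target.copy()
    noisy_background[mask == 0] = 0.5        # error only on background pixels

    print(region_adaptive_loss(blurred_text, target, mask))
    print(region_adaptive_loss(noisy_background, target, mask))
```

With this weighting, an error of the same magnitude is penalized more heavily on the text region than on the background, which is one way to keep a generator from trading text legibility for overall smoothness. In a GAN setting a term like this would be added to the adversarial loss of the generator, in a framework that supports automatic differentiation.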
Keywords/Search Tags:Deep neural network, Image warping rectification, Generative Adversarial Network, Sample generation and selection