Research On Image Correction Of Deformed Documents

Posted on: 2022-07-29
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhou
GTID: 2518306500955769
Subject: Master of Engineering
Abstract/Summary:
The era of big data has made digitized information more important. People commonly photograph books and other paper documents with smartphones and similar devices for digital storage and use. Digitizing documents makes information available promptly and allows resources to be shared quickly. However, photographing books and documents with a smartphone introduces varying degrees of deformation. For pages of a thick book, two kinds occur: page bending deformation caused by the thickness of the book, and perspective deformation caused by the shooting angle. A deformed document image is harder to read, and it also seriously hinders later content recognition, layout analysis and format processing of the digitized document. Correcting deformed document images is therefore necessary. This thesis studies the deformation problem from two aspects. The main work is as follows.

First, to correct deformed document images, an algorithm is proposed for extracting deformed text line information. The algorithm extracts the region of interest (ROI) of the document image, extracts text fields from the ROI, and then merges multiple text fields into text lines according to a scoring rule designed for such document images. Once the merged deformed text lines have been extracted, principal component analysis (PCA) is used to extract the effective feature information. During correction, the page of the deformed document is treated as a generalized cylinder: a polynomial curve is fitted by the least squares method, and the curved surface model is reconstructed by translating the fitted curve. Finally, the deformed image is corrected by interpolation mapping.

Second, to correct deformed document images that contain few text lines, a correction method based on an auxiliary grid is proposed. This method corrects the image by means of an auxiliary grid and an edge shape matching algorithm. An auxiliary grid information database is first built to cover different situations. When an image is corrected, the matching degree between the image and each grid in the database is calculated, and the grid with the highest matching degree is selected for extracting deformation information. After the corresponding deformed grid information is extracted, the grid image information with the highest matching degree is used as the basic information of the image to be corrected. In this way, deformed document images with little or no text line information can be corrected.

The algorithm for extracting deformed text line information is evaluated on the public data set CBDAR2007, and the auxiliary-grid correction method is evaluated on a man-made data set. The results corrected on the basis of text lines reach 96.8% in OCR accuracy, 0.44 in multi-scale structural similarity and 47% in matching percentage, all higher than the values reported in the compared literature. The projection errors and geometric deformation measurements of the corrected images are both lower than those of the deformed images before correction.
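
The curve-fitting step described above (a polynomial fitted to a text line by least squares, then used to undo the bend) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `fit_polynomial`, `evaluate` and `straighten` are hypothetical, the input is assumed to be a list of (x, y) points sampled along one text line with x normalized to roughly [0, 1], and a simple vertical shift stands in for the generalized-cylinder reconstruction and interpolation mapping.

```python
def fit_polynomial(xs, ys, degree=3):
    """Least-squares polynomial fit; returns coefficients, lowest degree first.

    Solves the normal equations (V^T V) c = V^T y, where V is the
    Vandermonde matrix of xs. Assumes xs are normalized (e.g. to [0, 1])
    so the system stays reasonably well conditioned.
    """
    n = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, n):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, n):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]

    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(ata[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aty[i] - tail) / ata[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def straighten(points, coeffs):
    """Shift each point vertically so the fitted curve becomes a flat line.

    The flat line is placed at the mean height of the fitted curve.
    This is a stand-in for the interpolation mapping, not a surface model.
    """
    fitted = [evaluate(coeffs, x) for x, _ in points]
    baseline = sum(fitted) / len(fitted)
    return [(x, y - (f - baseline)) for (x, y), f in zip(points, fitted)]


if __name__ == "__main__":
    # Hypothetical bent text line sampled at 11 normalized positions.
    xs = [i / 10 for i in range(11)]
    ys = [2.0 + 0.5 * x - 0.3 * x * x for x in xs]
    coeffs = fit_polynomial(xs, ys, degree=2)
    print(straighten(list(zip(xs, ys)), coeffs))
```

The normal equations are used here only to keep the sketch dependency-free; for pixel-scale coordinates or higher degrees, a QR-based or library least-squares solver would be the more robust choice.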
Keywords/Search Tags: bending deformation, PCA, model reconstruction, matching algorithm, document image correction