Research and Implementation of Document Skew Detection and Correction Method

Posted on: 2021-06-07
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Chen
Full Text: PDF
GTID: 2518306131983349
Subject: Electronic Science and Technology
Abstract/Summary:
With the increasing use of electronic documents, a large number of historical documents need to be scanned into electronic form. During scanning, however, a document may become skewed because of human operation or other reasons. A skewed document complicates subsequent text segmentation and text recognition and is inconvenient for readers. Many scholars have therefore studied document skew and proposed many skew angle detection algorithms, including Hough transform based methods, projection profile based methods, and nearest neighbor based methods. However, some of these algorithms detect the skew angle with high accuracy but at a large computational cost, while others sacrifice detection performance to gain speed. To address these disadvantages, two different skew angle detection algorithms are proposed in this paper.

The first proposed method is based on bounding boxes, a probability model, and the Q test. First, bounding boxes are used to pick out the eligible connected components (ECC); then several candidate document slope values are calculated by referring to the probability model; finally, the Q test and the projection profile method are used to select the optimal skew angle from these slope values, and the skewed document is corrected. The second proposed method is based on finding the axis-parallel bounding box with the minimum area for the whole document, combined with the projection profile method. This method first preprocesses and downscales the input image; then calculates an approximate skew angle using the minimum-area axis-parallel bounding box of the whole document; and finally finds the optimal skew angle by applying the projection profile method near the approximate angle, after which the skewed document is corrected.

Experimental results and analysis show that both proposed methods achieve very high accuracy. Compared with other existing methods, the first proposed method keeps the average deviation of the detected angle to only 0.081° and runs at least 5 times faster, but it is not as effective as the second algorithm on images without text. Although the second algorithm is slightly slower than the first, its average deviation of the detected angle is only 0.067°, and it handles some images without text well, which makes it more robust than the first algorithm. In summary, the two proposed methods achieve high speed and high accuracy simultaneously, and both have practical value.
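
The refinement step shared by both methods, a projection profile search near an approximate angle, might be sketched as follows. This is a minimal illustration in Python, not the thesis implementation. The function names (`profile_score`, `refine_skew`, `synthetic_page`), the sum-of-squares score, the search range and step, and the synthetic test page are all assumptions made for the example. The approximate angle, which the thesis obtains from bounding boxes, is simply passed in as a number here.

```python
import math


def profile_score(points, angle_deg):
    """Sharpness of the horizontal projection profile at a candidate angle.

    Each foreground point is rotated by -angle_deg and binned by its rounded
    y-coordinate. The score is the sum of squared bin counts, which is largest
    when the points collapse onto a few rows (text lines are horizontal).
    The sum-of-squares criterion is an assumption of this sketch.
    """
    t = math.radians(angle_deg)
    s, c = math.sin(t), math.cos(t)
    rows = {}
    for x, y in points:
        r = round(-x * s + y * c)  # y-coordinate after undoing the rotation
        rows[r] = rows.get(r, 0) + 1
    return sum(n * n for n in rows.values())


def refine_skew(points, approx_deg, half_range=2.0, step=0.1):
    """Search [approx_deg - half_range, approx_deg + half_range] for the best angle."""
    steps = int(round(2 * half_range / step))
    candidates = [approx_deg - half_range + i * step for i in range(steps + 1)]
    return max(candidates, key=lambda a: profile_score(points, a))


def synthetic_page(skew_deg, lines=8, width=400, spacing=30):
    """Hypothetical test data: straight text lines rotated by skew_deg."""
    t = math.radians(skew_deg)
    s, c = math.sin(t), math.cos(t)
    pts = []
    for k in range(lines):
        y0 = k * spacing
        for x0 in range(width):
            pts.append((x0 * c - y0 * s, x0 * s + y0 * c))
    return pts


# Usage: a page skewed by 3 degrees, with a rough initial guess of 2.5 degrees.
page = synthetic_page(3.0)
estimate = refine_skew(page, approx_deg=2.5)
```

Restricting the search to a narrow window around the approximate angle is what keeps the cost low in this sketch: only a few dozen candidate angles are scored, rather than the full range of possible skews.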
Keywords/Search Tags: document skew detection, probability model, Q test, bounding box, projection profile method