Research On Degraded Document Image Binarization

Posted on: 2018-10-19
Degree: Master
Type: Thesis
Country: China
Candidate: J J Xu
GTID: 2348330536457749
Subject: Navigation, guidance and control
Abstract/Summary:
Document image binarization plays an important role in document analysis and recognition (DAR) technology, and its performance directly affects the DAR system. However, degraded document images are complex and the degradation factors are diverse (ink bleed-through, page stains, uneven illumination, background texture, and so on), so the binarization of degraded document images has long been both a focus and a difficulty of research. The main work of this thesis is as follows:

(1) To address the low contrast of degraded document images, this thesis proposes a binarization algorithm based on local contrast enhancement and the stroke width transform. First, the min-average method is adopted for grayscale preprocessing, which improves the contrast between foreground and background and reduces intensity variation within the text class. A nonlinear bilateral filter is then used to remove noise and ink in the background while preserving the edges between foreground and background. Next, the contrast image obtained by local contrast enhancement has a clearly bimodal histogram, so the global Otsu threshold is applied to it to detect high-contrast pixels as seed pixels. Finally, the stroke width estimated by the stroke width transform determines the size of the neighborhood window used for precise local binarization. The results show that the proposed algorithm both preserves stroke details and suppresses the background.

(2) To address the complex backgrounds of degraded document images, this thesis presents a method based on background estimation and energy minimization. The method starts with the same preprocessing (min-average color-to-gray conversion and bilateral filtering). Then, guided by a visual test and using morphological closing, it simulates an object receding from the viewer's eyes to obtain an estimated background image. Finally, a Laplacian energy function and a graph are constructed for the image from which the estimated background has been removed, and the energy function is minimized by max-flow/min-cut. Experimental results show that removing the estimated background clearly improves image contrast and that the segmentation between foreground and background becomes more accurate.

(3) The two proposed algorithms are compared with seven classical binarization algorithms, using the evaluation metrics of the Document Image Binarization Contest (DIBCO) as performance measures. Experimental results show that the proposed algorithms outperform the other classical document binarization methods in both binary image quality and the performance measures, by 0.5 and 3 percent, respectively.
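As a rough illustration of two ingredients named in the abstract, the sketch below implements Otsu's global threshold and two measures commonly reported in DIBCO evaluations (F-measure and PSNR). It is a minimal pure-Python sketch, not the thesis implementation: it applies Otsu directly to a flat list of gray values rather than to a locally contrast-enhanced image, it assumes dark text on a light background, and the function names (`otsu_threshold`, `binarize`, `f_measure`, `psnr`) are hypothetical. The abstract does not state which DIBCO metrics were actually used.

```python
import math


def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of intensities in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break     # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, t):
    """Label dark pixels (value <= t) as foreground (1), the rest as 0.

    Assumes dark text on a light background.
    """
    return [1 if p <= t else 0 for p in pixels]


def f_measure(pred, truth):
    """Harmonic mean of precision and recall on foreground pixels."""
    tp = sum(1 for p, g in zip(pred, truth) if p and g)
    fp = sum(1 for p, g in zip(pred, truth) if p and not g)
    fn = sum(1 for p, g in zip(pred, truth) if not p and g)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def psnr(pred, truth):
    """Peak signal-to-noise ratio in dB for 0/1 images (peak value 1)."""
    mse = sum((p - g) ** 2 for p, g in zip(pred, truth)) / len(truth)
    return math.inf if mse == 0 else 10 * math.log10(1.0 / mse)


# Toy example: four dark "ink" pixels followed by four bright "paper" pixels.
gray = [10, 12, 11, 13, 200, 205, 198, 202]
truth = [1, 1, 1, 1, 0, 0, 0, 0]
t = otsu_threshold(gray)
pred = binarize(gray, t)
print(t, f_measure(pred, truth), psnr(pred, truth))
```

On a real page the input would be the flattened gray image, and, following the abstract, the threshold would be computed on the contrast image to select seed pixels rather than to produce the final binary result.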
Keywords/Search Tags:degraded document image binarization, contrast, stroke width transform, background estimation, Laplacian energy function