A Variational Model For Document Image Binarization

Posted on: 2018-08-04
Degree: Master
Type: Thesis
Country: China
Candidate: S Feng
Full Text: PDF
GTID: 2348330533960995
Subject: Computational Mathematics
Abstract/Summary:
In image processing, document image binarization has significant research value and a wide range of applications, especially in improving the accuracy and efficiency of OCR (Optical Character Recognition) systems that recognize text in document images. However, because of shake blur, non-uniform illumination, and partial occlusion, which arise from the capturing equipment and environment, document image binarization remains a challenging problem. How to binarize document images quickly and accurately has therefore received extensive attention from researchers.

Existing document image binarization algorithms can be categorized into two groups: threshold-based and PDE (partial differential equation)-based. The first group comprises three representative kinds of algorithm: the global threshold method, the local adaptive threshold method, and the hybrid threshold method. In general, an optimal threshold value is determined first, and thresholding is then applied to binarize the document image. The main difference among these methods is the image information they use to calculate the threshold. The global threshold method uses information from the whole image. The local adaptive threshold method uses information from each pixel and its neighborhood. The hybrid method combines local and global information.

In recent years, a growing number of researchers have introduced partial differential equations into document image processing and achieved encouraging performance; the key ingredients of this success are the well-developed theoretical framework of PDEs and the clear mathematical interpretation it gives to document image binarization. The basic idea of these methods is to find a partial differential equation that serves as the evolution equation of the original image; a sequence of evolved images is then generated, which finally converges to the expected binary image.

In this thesis, we propose a novel PDE-based method for document image binarization. First, we design a variational model for the original image whose minimizer is the desired result. Then, the variational principle is used to derive the Euler-Lagrange equation of the model. Finally, we employ a finite difference scheme to discretize the Euler-Lagrange equation. The proposed variational model consists of three terms. The first is a data fidelity term, which keeps the resulting binary image as close as possible to the original image. The second is a binary classification term, which drives the pixel values of the foreground region toward -1 and those of the background region toward 1. The third is an H^2 regularization term, which prevents oscillation in the resulting binary image. Experimental results on camera-captured text images demonstrate, qualitatively and quantitatively, that the proposed variational model achieves comparable or even better performance for document image binarization. Moreover, the method shows strong robustness against noise.
Keywords/Search Tags: image processing, document image binarization, variational model, partial differential equation
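
The abstract names the three terms of the variational model but does not give the functional itself. As a rough illustration of the pipeline it describes (energy, Euler-Lagrange equation, finite difference discretization), the sketch below assumes one plausible energy, E(u) = (lam/2) * sum (u - f)^2 + (mu/4) * sum (u^2 - 1)^2 + (alpha/2) * sum (Lap u)^2, and minimizes it by explicit gradient descent. The double-well form of the classification term, the weights `lam`, `mu`, `alpha`, the step size `dt`, and the function names `laplacian` and `binarize` are all assumptions made for this example, not the thesis's actual model or parameters.

```python
# Illustrative sketch only: the energy, weights, and step size are assumptions,
# not the model or parameters used in the thesis.

def laplacian(u):
    """5-point Laplacian (grid spacing 1) with replicated borders."""
    h, w = len(u), len(u[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            up = u[max(i - 1, 0)][j]
            down = u[min(i + 1, h - 1)][j]
            left = u[i][max(j - 1, 0)]
            right = u[i][min(j + 1, w - 1)]
            out[i][j] = up + down + left + right - 4.0 * u[i][j]
    return out


def binarize(gray, lam=1.0, mu=1.0, alpha=0.02, dt=0.05, steps=200):
    """Explicit gradient descent on the assumed energy

        E(u) = lam/2   * sum (u - f)^2     data fidelity
             + mu/4    * sum (u^2 - 1)^2   double well, pushes u toward -1 or 1
             + alpha/2 * sum (Lap u)^2     H^2-type smoothness

    gray: list of rows of intensities in [0, 255].
    Returns a list of rows with entries -1 (dark) or 1 (light).
    """
    # Rescale intensities to [-1, 1] so the two wells match dark and light.
    f = [[2.0 * p / 255.0 - 1.0 for p in row] for row in gray]
    u = [row[:] for row in f]  # start the evolution from the input image
    for _ in range(steps):
        # Discrete biharmonic operator: the Laplacian applied twice.
        bilap = laplacian(laplacian(u))
        for i, row in enumerate(u):
            for j, v in enumerate(row):
                # Discrete Euler-Lagrange residual at pixel (i, j).
                grad = (lam * (v - f[i][j])
                        + mu * (v ** 3 - v)
                        + alpha * bilap[i][j])
                row[j] = v - dt * grad
    # Read off the binary image from the sign of the evolved image.
    return [[-1 if v < 0 else 1 for v in row] for row in u]
```

Calling `binarize` on a small synthetic image, for example five identical rows of `[210, 230, 20, 230, 210]`, should label the dark middle column as -1 and the unevenly lit background as 1. The explicit scheme is used here only because it is short; the fourth-order term limits the stable step size, so larger `alpha` values would need a smaller `dt` or an implicit scheme.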