
Segmentation For Uneven Lighting Document Image

Posted on: 2014-06-17
Degree: Master
Type: Thesis
Country: China
Candidate: T J Zheng
GTID: 2298330422960912
Subject: Control Science and Engineering
Abstract/Summary:
Document image binarization is a key pre-processing step in optical character recognition (OCR) systems and directly affects overall system performance. Document images are degraded by uneven illumination, shadows, highlights, and many other factors, all of which seriously impair binarization. Binarization of such images therefore remains an active and unsolved problem in document image processing. This thesis analyzes the main causes of quality degradation in document images and focuses on how to binarize a document image that exhibits highlights, shadows, and an uneven background. First, we study document binarization algorithms based on local image blocks. To address their weakness, namely the fixed sub-image size, we propose an improved algorithm that chooses the sub-image size adaptively. The method first detects the edge points of character strokes with a micro operator, then obtains the stroke edge regions with an extremum filter. Finally, we calculate the height of each text line, binarize each text line with the Otsu algorithm, and assign the background lines to the background. Experiments comparing the method with the traditional block-partition algorithm show that the proposed block-based binarization algorithm not only segments character information effectively but also generates less noise. Because the Niblack and Sauvola algorithms are computationally expensive and time-consuming, the thesis also proposes an improved algorithm based on the Sauvola model.
Experimental comparison shows that the improved algorithm matches the Sauvola algorithm in visual segmentation quality while eliminating the standard deviation calculation and using a fast algorithm to compute the mean, which markedly reduces segmentation time and improves efficiency. To deal with the effects of uneven illumination, we propose a text image segmentation algorithm based on gradient correction. It uses the Laplacian operator to extract the second-order gradient at stroke edges, adaptively corrects the gradient values, and finally applies a threshold segmentation algorithm to the corrected image. Compared with traditional segmentation algorithms for unevenly illuminated text images, this algorithm improves segmentation considerably, especially for text images with small fonts. The thesis also describes quality evaluation methods for image segmentation and assesses the segmentation algorithms objectively through OCR performance analysis. Finally, we apply the algorithms to text images with local shadows and highlights and analyze their advantages and disadvantages on these two problems.
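The idea of a Sauvola-style local threshold that drops the standard deviation term and computes the window mean quickly can be illustrated with a summed-area table (integral image), a common way to get any window mean in constant time per pixel. The abstract does not state the thesis's actual threshold formula or its fast mean method, so the rule T = mean * (1 - k), the function names, and the parameters radius and k below are assumptions chosen for illustration only.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column at the top-left."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def window_mean(sat, y, x, radius, h, w):
    """Mean of the window centred on (y, x), clipped at the image border."""
    y0, y1 = max(0, y - radius), min(h, y + radius + 1)
    x0, x1 = max(0, x - radius), min(w, x + radius + 1)
    total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
    return total / ((y1 - y0) * (x1 - x0))


def binarize_local_mean(img, radius=1, k=0.1):
    """Mark a pixel as text (0) when it is darker than mean * (1 - k).

    Assumed rule for this sketch: no standard deviation is computed,
    only the local mean taken from the summed-area table.
    """
    h, w = len(img), len(img[0])
    sat = integral_image(img)
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            mean = window_mean(sat, y, x, radius, h, w)
            row.append(0 if img[y][x] < mean * (1 - k) else 255)
        out.append(row)
    return out


# Hypothetical 3x6 test image: brightness falls from left to right,
# with one dark "stroke" pixel in the bright part and one in the dim part.
demo = [
    [200, 180, 160, 140, 120, 100],
    [200,  60, 160, 140,  40, 100],
    [200, 180, 160, 140, 120, 100],
]
result = binarize_local_mean(demo, radius=1, k=0.1)
```

On this small image a single global threshold low enough to keep the dim right-hand background white still separates the two stroke pixels, so the example shows the mechanics of the local rule and not a case where global thresholding fails. Real document images need a much larger window than radius=1, and the window size and k would have to be tuned.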
Keywords/Search Tags: document image, binarization, sub-block image, local threshold, gradient correction, segmentation quality