Image Processing: A Study Of The Problems In Probability Based OCR System

Posted on: 2005-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: X S Zhang
Full Text: PDF
GTID: 2168360122480337
Subject: Computational Mathematics
Abstract/Summary:
As a major part of pattern recognition, Optical Character Recognition (OCR) plays an important role in areas such as information processing, office automation, postal systems and banking systems. This thesis focuses on the probability-based extraction method, covering word, text line and text block extraction, used in our Optical Character Recognition system. For the pre-processing stage of OCR, several methods for two problems, image binarisation and word deslanting, are also discussed, and suitable methods are selected. The outline of the thesis is as follows.

First, the author introduces the processing pipeline of our OCR system and briefly describes the extraction method. He then argues that extraction should be based on a mathematical model rather than an empirical one.

Second, the probability model and its algorithm are presented, with emphasis on the detailed steps of the algorithm. The projection method is briefly introduced so that its experimental results can be compared with ours. The comparison shows that the probability-based extraction method performs better than the projection method. The application of our algorithm to block extraction is also introduced. A "Cell Count" method is introduced to compute the probabilities used in the thesis.

Third, based on the model presented, the word extraction method is discussed and its experimental results are analysed.

Finally, we discuss several methods of image binarisation. We describe a binarisation method designed specifically for OCR of low-quality images: Background Surface Thresholding. This method is robust and produces images with very little noise and consistent stroke width. A method based on chain-code statistics of stroke contours is described for deslanting words.
Keywords/Search Tags: optical character recognition (OCR), probability based, cell count, binarisation, deslant
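
For readers unfamiliar with the projection method that the abstract uses as its baseline, the following is a minimal sketch of projection-based text line extraction. It is not taken from the thesis: the function names, the `min_ink` threshold and the toy `page` image are illustrative assumptions. It assumes a binarised image stored as a list of rows in which 1 marks an ink pixel and 0 marks background.

```python
def row_profile(image):
    """Horizontal projection: number of ink pixels (value 1) in each row."""
    return [sum(row) for row in image]


def extract_text_lines(image, min_ink=1):
    """Return (start_row, end_row) pairs, end exclusive, one per text line.

    A text line is taken to be a maximal run of consecutive rows whose
    projection value is at least min_ink (a hypothetical threshold).
    """
    lines = []
    start = None
    for y, ink in enumerate(row_profile(image)):
        if ink >= min_ink and start is None:
            start = y                      # first inked row of a new line
        elif ink < min_ink and start is not None:
            lines.append((start, y))       # blank row closes the current line
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(image)))
    return lines


# Toy binary page: two text lines separated by blank rows.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
print(extract_text_lines(page))  # [(1, 3), (5, 6)]
```

A fixed threshold of this kind is an empirical choice, which is the sort of rule the abstract contrasts with an extraction method based on a probability model.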