Degraded text recognition using visual and linguistic context

Posted on: 1996-01-19
Degree: Ph.D
Type: Thesis
University: State University of New York at Buffalo
Candidate: Hong, Tao
Full Text: PDF
GTID: 2468390014986958
Subject: Computer Science
Abstract/Summary:
To improve the performance of an OCR system on degraded images of text, postprocessing techniques are critical. The objective of postprocessing is to correct errors or to resolve ambiguities in OCR results by using contextual information. Depending on the extent of context used, there are different levels of postprocessing. In current commercial OCR systems, word-level postprocessing methods, such as dictionary lookup, have been applied successfully. However, many OCR errors cannot be corrected by word-level postprocessing. To overcome this limitation, passage-level postprocessing, in which global contextual information is utilized, is necessary.

This thesis addresses problems in degraded text recognition and discusses potential solutions through passage-level postprocessing. The objective is to develop a postprocessing methodology from a broader perspective. In this work, two classes of inter-word contextual constraints, visual constraints and linguistic constraints, are exploited extensively.

Given a text page with hundreds of words, many word images are visually similar to one another. Formally, six types of visual inter-word relations are defined. Relations at the image level must be consistent with the relations at the symbolic level if word images in the text have been interpreted correctly. Because OCR results often violate this consistency, methods of visual consistency analysis are designed to detect and correct OCR errors.

Linguistic knowledge sources such as lexicography, syntax, and semantics can also be used to detect and correct OCR errors. Here, we focus on the word candidate selection problem: the OCR system provides several alternatives for each word, and the objective of postprocessing is to choose the correct one among them. Two approaches to linguistic analysis, statistical and structural, are proposed for candidate selection: a word-collocation-based relaxation algorithm and a probabilistic lattice parsing algorithm.

Some OCR errors are not easily recoverable by either visual consistency analysis or linguistic consistency analysis. Integration of image analysis and language-level analysis provides a natural way to handle difficult words.
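The candidate selection problem described above can be illustrated with a small relaxation-style sketch in Python. This is not the algorithm from the thesis, whose details the abstract does not give. The candidate scores, the collocation table, the `smoothing` constant, the update rule, and the function name `relax` are all assumptions made for illustration. Each word position holds several OCR alternatives with initial scores, and every iteration rescales each alternative by how strongly it collocates with the alternatives at neighboring positions.

```python
from typing import Dict, List, Tuple

# One {word: score} dict per word position (hypothetical OCR output format).
Candidates = List[Dict[str, float]]
# Collocation strength for an ordered (left word, right word) pair (hypothetical).
Collocations = Dict[Tuple[str, str], float]


def relax(candidates: Candidates, collocation: Collocations,
          iterations: int = 5, smoothing: float = 0.1) -> List[str]:
    """Pick one word per position after iterative neighbor-based rescoring."""
    scores = [dict(position) for position in candidates]
    for _ in range(iterations):
        updated = []
        for i, position in enumerate(scores):
            rescored = {}
            for word, score in position.items():
                support = 0.0
                # Collect support from the left and right neighbor positions.
                for j in (i - 1, i + 1):
                    if 0 <= j < len(scores):
                        for other, other_score in scores[j].items():
                            pair = (word, other) if j > i else (other, word)
                            support += other_score * collocation.get(pair, 0.0)
                # Smoothing keeps unsupported candidates from dropping to zero.
                rescored[word] = score * (smoothing + support)
            total = sum(rescored.values())
            if total > 0:
                rescored = {w: s / total for w, s in rescored.items()}
            else:
                rescored = dict(position)
            updated.append(rescored)
        scores = updated
    # The highest-scoring alternative at each position is the final decision.
    return [max(position, key=position.get) for position in scores]


# Illustrative data: the OCR system slightly prefers "past" over "post".
example_candidates = [
    {"the": 1.0},
    {"post": 0.45, "past": 0.55},
    {"office": 0.6, "offers": 0.4},
]
example_collocation = {
    ("post", "office"): 0.9,
    ("the", "post"): 0.2,
    ("the", "past"): 0.2,
}

best = relax(example_candidates, example_collocation)
print(best)
```

In this toy input the top OCR choice for the second word is "past", but the strong collocation between "post" and "office" shifts the decision to "post" once neighboring context is taken into account. A real system would estimate collocation strengths from a text corpus rather than from a hand-written table.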
Keywords/Search Tags: OCR, Linguistic, Text, Postprocessing, Degraded, Visual, Consistency analysis, Image