Detection of text strings from mixed text/graphics images

Posted on: 2001-11-23

Degree: Ph.D.

Type: Dissertation

University: Case Western Reserve University

Candidate: Tsai, Chien-Hua

GTID: 1468390014453747

Subject: Computer Science

Abstract/Summary:

An algorithm for separating text strings from mixed text/graphics images is presented. It differs from previously proposed text-string extraction methods in several respects. Union-find operations are employed to detect text regions in digital images efficiently; this scheme is also called region growing.

The method starts with some seed objects and then moves from one object to the next. The Euclidean distance serves as the distance measure and, together with some threshold values, defines in geometric terms what the next right neighbor should be. These patterns are then merged block by block until the blocks are considered optimal, with maximum neighborhood connectivity. The maximal blocks, i.e., potential text regions, are obtained after several union and find operations, at which point the union-find process terminates. In image processing or computer vision terms, block labeling thus locates a segment at the same time as the union-find operation detects a text-string region.

The algorithm is thus able to separate text from graphics and adapts to changes in document type, language category (e.g., English, Chinese, and Japanese), text font style and size, and text string orientation within images. It also tolerates the document skew that often occurs in the initial or scanned image, with no skew correction required before discrimination, whereas previously proposed methods such as projection profiles or run-length coding are not always suitable under that condition. The method has been tested on a variety of printed documents from different origins using one common set of parameters. Experimental results on several test images from the evaluation demonstrate the algorithm's computational efficiency.

The primary contribution is therefore to support fast and effective analysis, both by software that works on the extracted text information and by graphics interpretation systems that operate after separation.
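
The merging idea described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes the image has already been reduced to axis-aligned bounding boxes of connected components, and it uses a single hypothetical threshold, `max_gap`, on the Euclidean distance between boxes in place of the neighbor rules and threshold values the abstract mentions but does not specify. The names `UnionFind`, `box_gap`, and `group_boxes` are illustrative only.

```python
import math
from collections import defaultdict


class UnionFind:
    """Disjoint-set forest with path halving and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True


def box_gap(a, b):
    """Euclidean distance between boxes (x0, y0, x1, y1); 0 if they overlap."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return math.hypot(dx, dy)


def group_boxes(boxes, max_gap):
    """Merge boxes whose gap is at most max_gap (assumed threshold).

    Returns the bounding block of each merged group, sorted.
    """
    uf = UnionFind(len(boxes))
    # Pairwise check for clarity; a spatial index would avoid the O(n^2) scan.
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if box_gap(boxes[i], boxes[j]) <= max_gap:
                uf.union(i, j)

    groups = defaultdict(list)
    for i in range(len(boxes)):
        groups[uf.find(i)].append(i)

    blocks = []
    for members in groups.values():
        blocks.append((
            min(boxes[i][0] for i in members),
            min(boxes[i][1] for i in members),
            max(boxes[i][2] for i in members),
            max(boxes[i][3] for i in members),
        ))
    return sorted(blocks)


if __name__ == "__main__":
    # Three character-sized boxes on one line plus one distant graphic.
    components = [(0, 0, 10, 10), (12, 0, 22, 10), (24, 1, 34, 11),
                  (100, 100, 150, 150)]
    print(group_boxes(components, max_gap=5))
    # [(0, 0, 34, 11), (100, 100, 150, 150)]
```

Because the grouping depends only on distances between neighboring components, not on row or column projections, a sketch like this does not require the text line to be horizontal. Deciding which merged blocks are text rather than graphics would need further criteria that the abstract does not detail.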

Keywords/Search Tags:

Text, Images, Algorithm, Terms

Related items

1	Research On Terms Co-occurrence Based Models And Algorithms For Text Mining
2	Information Extraction of cyber security related terms and concepts from unstructured text
3	Improvement And Application To Weighting Terms Based On Text Classification
4	Extended SBN Retrieval Model Based On Ontology Terms Relationship
5	On Suppressing Cross-terms In WVD Via Thresholding Superimposition Of Multiple Spectrograms
6	Research On Text Detection Algorithm In Natural Scene Images
7	Research On The Location And Segmentation Of Unconstrained Text In Images
8	Research On Overlay Text Extraction From Images With Complex Background
9	Techniques for improved LSI text retrieval
10	The Research Of Word Similarity Calculation Based On Web Text And Automatic Generation Technology Of Traffic Terms