Computer vision for scene text analysis

Posted on:2005-05-18

Degree:Ph.D

Type:Dissertation

University:University of Maryland, College Park

Candidate:Zandifar, Ali

GTID:1458390008997962

Subject:Computer Science

Abstract/Summary:

The motivation of this dissertation is to develop a 'Seeing-Eye' video-based interface that gives the visually impaired access to environmental text information. We are concerned with the daily activities of low-vision people that involve interpreting 'environmental text' or 'scene text', e.g., reading a newspaper, can labels, and street signs.

First, we discuss the development of such a video-based interface. In this interface, the processed image of scene text is read by off-the-shelf OCR software and converted to speech by Text-to-Speech (TTS) software. Our challenge is to feed a high-quality image of the scene text to off-the-shelf OCR software under a general pose of the surface on which the text is printed. To achieve this, various problems related to feature detection, mosaicing, auto-focus, zoom, and systems integration were solved in the development of the system, and these are described.

We employ the video-based interface for the analysis of video of lectures and posters. In this application, the text is assumed to lie on a plane. Automatic analysis of video content requires additional modules such as enhancement, text segmentation, video preprocessing, and metric rectification. We provide qualitative results to justify the algorithm and system integration.

For more general classes of surfaces on which text is printed, such as bent or warped paper, we develop a novel method for 3D structure recovery and unwarping. Deformed paper is isometric with a plane, and the Gaussian curvature vanishes at every point on the surface. We show that these constraints lead to a closed set of equations that allow the recovery of the full geometric structure from a single image. We prove that these partial differential equations can be reduced to the Hopf equation that arises in non-linear wave propagation, and that deformations of the paper can be interpreted in terms of the characteristics of this equation. A new exact integration of these equations relates the 3D structure of the surface to an image of the paper. In addition, we can generate such surfaces using the underlying equations. This method uses only information derived from the image of the boundary.

Furthermore, we employ the shape-from-texture method as an alternative to the method above to infer the 3D structure of the surface. We show that, for the normal vector field to be consistent, extra conditions based on the surface model must be added. These conditions are isometry and zero Gaussian curvature of the surface. (Abstract shortened by UMI.)
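
The zero-Gaussian-curvature condition named in the abstract can be illustrated numerically. The sketch below is not the dissertation's method: it assumes the surface is given as a height map z = f(x, y) and applies the standard formula K = (f_xx f_yy - f_xy^2) / (1 + f_x^2 + f_y^2)^2 with central finite differences. The function names `gaussian_curvature`, `bent_sheet`, and `saddle` are hypothetical, chosen only for this example.

```python
import math


def gaussian_curvature(f, x, y, h=1e-4):
    """Approximate the Gaussian curvature of the graph z = f(x, y) at (x, y).

    Uses central finite differences with step h (an illustrative choice).
    """
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (
        f(x + h, y + h) - f(x + h, y - h) - f(x - h, y + h) + f(x - h, y - h)
    ) / (4 * h**2)
    return (fxx * fyy - fxy**2) / (1 + fx**2 + fy**2) ** 2


def bent_sheet(x, y):
    # A sheet bent along one direction only (a generalized cylinder).
    # Such a surface is developable, so its Gaussian curvature is zero.
    return math.sin(x)


def saddle(x, y):
    # A saddle cannot be formed by bending flat paper without stretching;
    # its Gaussian curvature is negative.
    return x * y


if __name__ == "__main__":
    print("bent sheet:", gaussian_curvature(bent_sheet, 0.3, 0.7))
    print("saddle:    ", gaussian_curvature(saddle, 0.0, 0.0))
```

The bent sheet gives a curvature of zero at any sample point, while the saddle gives a value close to -1 at the origin, which is why a surface of the second kind is excluded by the paper model described above.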

Keywords/Search Tags:

Text, Video-based interface, 3D structure, Surface

Related items

1	Surface, Interface And Dislocation Behaviors Of CdZnTe Single Crystals
2	Studies Of Metal/Conjugated Polymer Surface And Interface Using Photoemission Spectroscopy
3	Effects Of Point Defects On The Magnetic And Optical Properties Of Polar ZnO:La/Y Surface And GaN/ZnO Interface
4	Study On Method To Automatically Analyze The Text Structure Based On The Relevancy Computing Of Text Content
5	Research On Video Text Extraction And The Application In Virtual Karaoke
6	Research On Video OCR
7	Research On Video Text Information Extraction Based On Features Integration
8	Text Extraction In Video
9	Research On Method Of Video Structure Mining Based On Content
10	Research On The Technology Of Video Text Information Extraction