
Computer vision for scene text analysis

Posted on: 2005-05-18
Degree: Ph.D.
Type: Dissertation
University: University of Maryland, College Park
Candidate: Zandifar, Ali
Full Text: PDF
GTID: 1458390008997962
Subject: Computer Science
Abstract/Summary:
The motivation of this dissertation is to develop a 'Seeing-Eye' video-based interface that lets the visually impaired access environmental text information. We are concerned with the daily activities of low-vision people that involve interpreting 'environmental text' or 'scene text', e.g., reading a newspaper, can labels, and street signs.

First, we discuss the development of such a video-based interface. In this interface, the processed image of scene text is read by off-the-shelf OCR and converted to speech by Text-to-Speech (TTS) software. Our challenge is to supply the off-the-shelf OCR software with a high-quality image of the scene text under a general pose of the surface on which the text is printed. To achieve this, various problems related to feature detection, mosaicing, auto-focus, zoom, and systems integration were solved in the development of the system, and these are described.

We employ the video-based interface for the analysis of video of lectures and posters. In this application, the text is assumed to lie on a plane. Automatic analysis of the video content requires additional modules such as enhancement, text segmentation, preprocessing of the video content, and metric rectification. We provide qualitative results to justify the algorithm and the system integration.

For more general classes of surfaces on which text is printed, such as bent or worked paper, we develop a novel method for 3D structure recovery and unwarping. Deformed paper is isometric with a plane, and the Gaussian curvature vanishes at every point of the surface. We show that these constraints lead to a closed set of equations that allow the recovery of the full geometric structure from a single image. We prove that these partial differential equations can be reduced to the Hopf equation that arises in non-linear wave propagation, and that deformations of the paper can be interpreted in terms of the characteristics of this equation. A new exact integration of these equations relates the 3D structure of the surface to an image of the paper. In addition, we can generate such surfaces using the underlying equations. This method uses only information derived from the image of the boundary.

Furthermore, we employ the shape-from-texture method as an alternative to the method above to infer the 3D structure of the surface. We show that, for the normal vector field to be consistent, extra conditions based on the surface model must be added. These conditions are isometry and zero Gaussian curvature of the surface. (Abstract shortened by UMI.)...
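
For readers unfamiliar with the terms, the two surface constraints and the Hopf equation named in the abstract can be written in their standard textbook forms. This is a sketch only: the parametrization r(u, v), the fundamental-form coefficients, and the generic variables w, x, t are assumptions chosen for illustration. The abstract does not state which geometric quantities play these roles in the dissertation.

```latex
% Assumed notation: \mathbf{r}(u,v) is the paper surface, \mathbf{n} its unit normal.

% Isometry with a plane: for suitable coordinates (u,v), the first
% fundamental form is Euclidean.
E = \mathbf{r}_u \cdot \mathbf{r}_u = 1, \qquad
F = \mathbf{r}_u \cdot \mathbf{r}_v = 0, \qquad
G = \mathbf{r}_v \cdot \mathbf{r}_v = 1

% Zero Gaussian curvature, with second fundamental form coefficients
% L = \mathbf{r}_{uu} \cdot \mathbf{n},  M = \mathbf{r}_{uv} \cdot \mathbf{n},
% N = \mathbf{r}_{vv} \cdot \mathbf{n}.
K = \frac{LN - M^2}{EG - F^2} = 0

% Hopf (inviscid Burgers) equation in generic variables w(x,t).
\frac{\partial w}{\partial t} + w \, \frac{\partial w}{\partial x} = 0

% Its characteristics: w is constant along each curve x(t).
\frac{dx}{dt} = w, \quad \frac{dw}{dt} = 0
\;\Longrightarrow\;
x = x_0 + w_0(x_0)\, t
```

Because w is constant along each characteristic, the characteristics of the Hopf equation are straight lines in the (x, t) plane, each fixed by the initial value w_0 at its starting point x_0.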
Keywords/Search Tags: Text, Video-based interface, 3D structure, Surface