A common representation for multimedia documents

Posted on:2003-01-22

Degree:Ph.D

Type:Dissertation

University:University of North Texas

Candidate:Jeong, Ki Tai

GTID:1468390011984093

Subject:Information Science

Abstract/Summary:

Multimedia documents combine multiple file formats, such as image and text, image and sound, or image, text, and sound. The type of multimedia document determines the form of analysis used for knowledge architecture design and retrieval methods. Over the last few decades, theories of text analysis have been proposed and applied effectively. In recent years, theories of image and sound analysis have been proposed to work with text retrieval systems, and they have progressed quickly, due in part to rapid gains in computer processing speed. Retrieval of multimedia documents was formerly divided into two categories: image and text, and image and sound. While the standard retrieval process begins from text only, methods are being developed that allow retrieval to use text and image simultaneously.

Although image processing for feature extraction and text processing for term extraction are well understood, there are no prior methods that combine these two kinds of features into a single data structure. This dissertation introduces a common representation format for multimedia documents (CRFMD) composed of both images and text.

For image and text analysis, two techniques are used: the Lorenz Information Measurement and the Word Code. A new process named Jeong's Transform is demonstrated for extraction of text and image features, combining the two previous measurements to form a single data structure. Finally, this single data structure is analyzed using multidimensional scaling, which allows multimedia objects to be represented as vectors on a two-dimensional graph. The distance between vectors represents the magnitude of the difference between multimedia documents.

This study shows that image classification on a given test set is dramatically improved when text features are encoded together with image features. This effect appears to hold even when the available text is diffuse and does not align uniformly with the image features. The retrieval system works by representing a multimedia document as a single data structure. CRFMD is applicable to other areas of multimedia document retrieval and processing, such as medical image retrieval, World Wide Web searching, and museum collection retrieval.
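
The general idea of placing image and text features in one vector and then projecting documents onto a two-dimensional graph can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not define the Lorenz Information Measurement, the Word Code, or Jeong's Transform, so the sketch substitutes plain numeric vectors (for example, an intensity histogram and term counts) and simple concatenation. The function names `combine_features` and `classical_mds` are hypothetical, and classical (Torgerson) multidimensional scaling is used here as one common variant; the dissertation may use a different one.

```python
import numpy as np


def combine_features(image_vec, text_vec):
    """Join image and text features into one vector.

    Assumption: plain concatenation after scaling each part to unit length.
    This is a stand-in, not the dissertation's Jeong's Transform.
    """
    def unit(v):
        v = np.asarray(v, dtype=float)
        norm = np.linalg.norm(v)
        return v / norm if norm > 0 else v

    return np.concatenate([unit(image_vec), unit(text_vec)])


def classical_mds(X, dims=2):
    """Classical (Torgerson) MDS: embed the rows of X in `dims` dimensions."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Squared Euclidean distances between every pair of documents.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    # Double centering turns squared distances into a Gram matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ d2 @ J
    # Keep the eigenvectors with the largest eigenvalues.
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:dims]
    vals = np.clip(vals[order], 0.0, None)  # guard against tiny negatives
    return vecs[:, order] * np.sqrt(vals)


# Hypothetical toy documents: (image features, text term counts).
docs = [
    combine_features([0.9, 0.1, 0.0], [1, 0, 0, 2]),
    combine_features([0.8, 0.2, 0.0], [1, 0, 0, 1]),
    combine_features([0.0, 0.1, 0.9], [0, 3, 1, 0]),
]
coords = classical_mds(docs, dims=2)  # one 2-D point per document
```

In this toy data the first two documents share similar image and text features, so their points land closer together than either does to the third, which mirrors the abstract's reading of distance as the magnitude of difference between documents.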

Keywords/Search Tags:

Multimedia, Image, Text, Retrieval, Single data structure, Processing

Related items

1	Research On Key Techniques Of Content-Based Medical Image Retrieval
2	Study Of Content-Based Image Retrieval Methods In Multimedia Data
3	Issues in multimedia databases: Coding for content-based image retrieval and digital copyright protection
4	Research On Web Image Retrieval Based On Web Information And Image Feature
5	Study On Some Key Technologies Of Integrated Multimedia Information Retrieval
6	The Research Of Multimedia Image Retrieval Based On Content
7	Semantic-aware data processing: Towards cross-modal multimedia analysis and content-based retrieval in distributed and mobile environments
8	Multimedia data mining and retrieval for multimedia databases using associations and correlations
9	Research Of Layout Structure-based Document Image Retrieval
10	Image Retrieval Based On Geometric Partitioning For Edge Structure