Multimedia Cross-reference Retrieval And Semantic Annotation

Posted on: 2006-10-09
Degree: Master
Type: Thesis
Country: China
Candidate: C M Wu
GTID: 2208360152470045
Subject: Computer applications
Abstract/Summary:
The explosive growth of multimedia data in large-scale information repositories such as digital libraries (DLs) and the Web poses new challenges to conventional multimedia information retrieval technologies. A typical DL holds a huge amount of heterogeneous data and serves a large number of non-expert users. A retrieval tool that supports semantic-level queries across a variety of multimedia data types is therefore highly desirable for DLs, but it is beyond the capability of current multimedia retrieval technologies. The work presented in this thesis extends conventional multimedia retrieval research by proposing three key techniques for DLs: a cross-media retrieval framework, an automatic image annotation mechanism, and multi-modal video analysis based on the Maximum Entropy model. These techniques provide critical building blocks for retrieval facilities in DLs and similar information repositories.

We begin with an overview of the background, motivation, and basic approaches of our research.

In Chapter 2, we review the research on multimedia retrieval and automatic annotation techniques, including the fundamental approaches, related work, and typical systems. We also introduce and discuss the research issues posed by information retrieval in DLs that contain large amounts of multimedia information.

In Chapter 3, we propose a novel cross-media retrieval framework. Its most important feature is that it integrates multi-modal data into a single, seamless retrieval system. The semantic relationships among media objects are extracted from multimedia documents and represented by a cross-media graph. The cross-media search engine, built on this graph, calculates the similarity between media objects and the query at both the semantic level and the content level. It can also adjust the cross-media graph based on relevance feedback from users.

In Chapter 4, we propose a novel algorithm for automatic image annotation. Manual annotation has traditionally been used for linguistic indexing to support semantic-based image management and retrieval. However, this approach is prone to subjectivity and requires a huge amount of human effort, especially for large image collections. Our automatic image annotation method employs machine learning and statistical learning techniques. First, a support vector machine (SVM) is used to classify images automatically. Then a statistical learning model is used to select the most appropriate keywords for an incoming image on the basis of the annotated image collections.

In Chapter 5, we propose a novel algorithm for multi-modal video analysis based on the Maximum Entropy model. Video contains rich semantic information that can be represented by multi-modal features, including textual, visual, and audio features. Here, the Maximum Entropy model is applied to multi-modal video analysis to annotate video semantics.

We conclude the thesis in Chapter 6 with a brief discussion of application prospects and future research directions.
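
As a rough illustration of the Maximum Entropy idea mentioned for Chapter 5, the sketch below trains a small conditional Maximum Entropy classifier (equivalent to multinomial logistic regression) on binary features drawn from several modalities. It is not the thesis's implementation: the feature names (`text:goal`, `visual:grass`, `audio:cheering`, and so on), the labels, the toy training shots, and the training settings are all hypothetical and chosen only to make the example self-contained.

```python
import math

def predict_proba(weights, feats):
    """Return P(label | feats) under the maxent model (softmax of summed weights)."""
    scores = {c: sum(w.get(f, 0.0) for f in feats) for c, w in weights.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}

def train_maxent(samples, labels, epochs=100, lr=0.5):
    """Fit per-label feature weights by stochastic gradient ascent on the log-likelihood.

    samples: list of sets of active binary feature names (assumed representation)
    labels:  list of semantic labels, one per sample
    """
    weights = {c: {} for c in sorted(set(labels))}
    for _ in range(epochs):
        for feats, gold in zip(samples, labels):
            probs = predict_proba(weights, feats)
            for c, w in weights.items():
                # Gradient per active feature: observed count minus expected count.
                grad = (1.0 if c == gold else 0.0) - probs[c]
                for f in feats:
                    w[f] = w.get(f, 0.0) + lr * grad
    return weights

# Hypothetical video shots, each described by text, visual, and audio cues.
shots = [
    {"text:goal", "visual:grass", "audio:cheering"},
    {"text:match", "visual:grass", "audio:cheering"},
    {"text:weather", "visual:map", "audio:speech"},
    {"text:election", "visual:studio", "audio:speech"},
]
shot_labels = ["sports", "sports", "news", "news"]

model = train_maxent(shots, shot_labels)
probs = predict_proba(model, {"visual:grass", "audio:cheering"})
best_label = max(probs, key=probs.get)
print(best_label, probs)
```

Encoding each modality as prefixed binary features lets one model weigh evidence from text, image, and sound together, which is the general appeal of Maximum Entropy models for multi-modal annotation; a real system would use far richer features and regularization.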
Keywords/Search Tags: multimedia, cross-media retrieval, relevance feedback, automatic annotation, support vector machine, statistical learning, multi-modal, maximum entropy