Cross-modal Multimedia Information Retrieval

Posted on:2016-09-06

Degree:Master

Type:Thesis

Country:China

Candidate:L X Shi

GTID:2308330461968118

Subject:Computer software and theory

Abstract/Summary:

Content-based multimedia retrieval has been an active topic in multimedia information retrieval since the early 1990s, and it is also an attractive research direction in computer vision. It typically integrates techniques from statistical analysis, pattern recognition, machine learning and human-computer interaction. Its main purpose is to overcome the limitations of traditional text-only approaches, including laborious and time-consuming manual annotation and inconsistency among annotators' subjective choices. In addition, traditional retrieval methods can handle only a single media type, such as image, video or audio, and cannot retrieve objects across different media types. As the volume and variety of multimedia data grow, a retrieval method that works across different modalities is urgently needed. This thesis studies cross-modal retrieval, which can flexibly process and query different forms of multimedia data.

Most existing retrieval methods for images and videos rely on searching relevant text. For example, Google returns images according to a set of keywords that are mainly derived from the text associated with the image or from its manual annotation. However, because annotators differ in cultural background and professional knowledge, the textual information is sometimes confusing and unreliable. It is difficult to find an effective and accurate description that characterizes the content of images and videos. Hence traditional retrieval methods, with their relatively low precision, struggle to meet users' needs.

This thesis first surveys the relevant technologies of multimedia information retrieval and summarizes four typical approaches to cross-modal retrieval: linear iteration and mapping, nonlinear manifolds, probabilistic models and heterogeneous data analysis.
The thesis then proposes three methods for cross-modal information retrieval, each of which can generalize the patterns of different multimedia data. Canonical Correlation Analysis (CCA) is used to learn and model the latent correlation between different types of multimedia data so as to achieve better performance. The first method is cross-modal multimedia retrieval based on doc2vec and ITQ. The second is cross-modal multimedia retrieval based on the LDA model and ITQ. The third is a cross-modal retrieval method based on fusing multiple features, for which two different fusion schemes are put forward. All three methods aim, in different ways, to bridge the different modalities (image, text, video, audio) of multimedia information.

The effectiveness of these approaches is evaluated on two cross-modal retrieval tasks: retrieving text with an image query and retrieving images with a text query. Two corpora are used in the experiments: the English Wikipedia data (EG-wikipedia) and the Chinese Wikipedia data (CH-wikipedia). Empirical results demonstrate that the proposed methods achieve better retrieval performance.
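
As a rough illustration of how CCA and ITQ can be combined for cross-modal retrieval, the following is a minimal sketch and not the thesis's implementation. It assumes two paired feature matrices (synthetic stand-ins here for image features and for doc2vec or LDA text features), fits linear CCA with NumPy, learns an ITQ rotation on the projected data, and ranks images by Hamming distance to a text query. The function names `fit_cca`, `fit_itq` and `hamming_rank`, the regularization value and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def fit_cca(X, Y, dim, reg=1e-3):
    """Linear CCA: return means, projections and canonical correlations."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # Regularized covariance matrices (reg is an assumed small ridge term).
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    # Whiten with Cholesky factors: T = Lx^{-1} Cxy Ly^{-T}.
    Lx, Ly = np.linalg.cholesky(Cxx), np.linalg.cholesky(Cyy)
    T = np.linalg.solve(Ly, np.linalg.solve(Lx, Cxy).T).T
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U[:, :dim])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :dim])
    return mx, my, Wx, Wy, s[:dim]

def fit_itq(V, n_iter=50, seed=0):
    """Iterative quantization: learn a rotation R for zero-centered V."""
    rng = np.random.default_rng(seed)
    c = V.shape[1]
    R, _ = np.linalg.qr(rng.standard_normal((c, c)))  # random orthogonal start
    for _ in range(n_iter):
        B = np.where(V @ R >= 0, 1.0, -1.0)   # fix R, update binary codes
        U, _, Vt = np.linalg.svd(B.T @ V)     # fix B, solve Procrustes for R
        R = Vt.T @ U.T
    return R

def hamming_rank(query_code, db_codes):
    """Return database indices sorted by Hamming distance to the query."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable")

# Synthetic paired data: both "modalities" share a 4-dimensional latent factor.
rng = np.random.default_rng(0)
Z = rng.standard_normal((300, 4))
img = Z @ rng.standard_normal((4, 10)) + 0.1 * rng.standard_normal((300, 10))
txt = Z @ rng.standard_normal((4, 8)) + 0.1 * rng.standard_normal((300, 8))

mx, my, Wx, Wy, corr = fit_cca(img, txt, dim=4)
P_img, P_txt = (img - mx) @ Wx, (txt - my) @ Wy   # shared CCA space
R = fit_itq(np.vstack([P_img, P_txt]))            # rotation learned on both
codes_img, codes_txt = (P_img @ R) >= 0, (P_txt @ R) >= 0
ranking = hamming_rank(codes_txt[0], codes_img)   # text query 0 -> images
```

With only four bits per item, many images tie at the same Hamming distance, so a practical system would use a larger code length and real features. The sketch is only meant to show the order of the steps: project both modalities with CCA, rotate with ITQ, binarize, then rank.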

Keywords/Search Tags:

Cross-modal, Cross-modal retrieval, Image retrieval, Canonical Correlation Analysis

Related items

1	Cross-modal Multimedia Information Retrieval With CCA And Adaboost
2	Research On Single-modal And Cross-modal Retrieval By Hashing Technology
3	Design And Implementation Of DCGAN-based Image-text Cross-modal Retrieval System
4	Cross-modal Music Retrieval Based On Canonical Correlation
5	Cross-modal Retrieval Research Based On Correlation Analysis And Structure Preserving
6	Outdoor Mobile Robot Radar-image Cross-modal Retrieval Technology
7	Research On Cross-modal Hashing Algorithm Based On Kernel Canonical Correlation Analysis And Neural Network
8	Research On Social-Sensed Cross-Modal Retrieval
9	Research On Information Retrieval Based On Cross-modal Association Analysis
10	Research On Cross-Modal Image-Text Retrieval Techniques Based On Semantics And Common Sense