
Correlation Mining Based Cross-media Retrieval

Posted on: 2008-04-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Zhang
Full Text: PDF
GTID: 1118360215993958
Subject: Computer Science and Technology
Abstract/Summary:
In the past twenty years, content-based multimedia retrieval (CBMR) has been a hot research issue in the area of computer vision. Psychological research finds that the human brain processes different senses of information, such as visual and auditory information, synchronously. Multimedia retrieval likewise needs to process and retrieve different types of multimedia data, such as images and audio. This dissertation aims to provide a new cross-media retrieval mechanism that retrieves different types of multimedia data seamlessly.

Intrinsically, the fundamental challenge in cross-media retrieval lies in the heterogeneity of different low-level features. First, the canonical correlation between media objects of different modalities is explored. An isomorphic subspace is constructed from the analysis of both visual and auditory features, and the mapping preserves the initial correlations as far as possible. Polar coordinates are used to measure the general distance between media objects of different modalities in the subspace. Since the full set of semantic correlations is unlikely to be learned from limited training samples, users' relevance feedback is used to refine cross-media similarities. How to map new media objects into the learned subspace is also discussed, so that any new media object can be used as a query example.

Many studies have shown that manifold structure is more powerful than Euclidean space for data representation. This dissertation proposes nonlinear multimodal semantic understanding methods to discover multimodal correlations in semantics. The method computes a multimodal geodesic distance matrix, from which geodesic basis vectors are derived to build a semantic subspace. Long-term and short-term strategies are designed to refine cross-media retrieval and update the manifold structure.

A cross-media correlation reasoning approach is proposed to solve the problem of measuring correlation between multimedia data from web pages. Multimedia correlations are represented and quantified in a cross-media correlation graph. A relevance feedback technique is developed to update the knowledge of cross-media correlations by learning from user behavior, and to enhance retrieval performance progressively.

Latent semantic indexing is introduced to analyze the underlying co-occurrence correlations between features while reducing dimensionality. An iterative optimization algorithm is described to improve the clustering quality of both image and audio datasets in the subspace. Active learning strategies are designed within relevance feedback to exploit unlabeled data, so that cross-media retrieval performance remains encouraging even when the query is outside the database.

This dissertation also discusses grid-based data storage and service management in a typical application, digital libraries. Because multimedia data is massive, heterogeneous in low-level content, and stored in distributed locations, a grid-based system architecture and framework is described to improve multimedia data sharing. Simulation experiments are carried out and typical results are given.
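
The canonical correlation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes plain linear CCA computed with NumPy, paired image and audio feature matrices with one row per training pair, and the angle (cosine) between projected vectors as a stand-in for the polar-coordinate distance. The names `fit_cca`, `rank_by_angle`, and the `reg` parameter are hypothetical.

```python
import numpy as np

def fit_cca(X, Y, n_components=2, reg=1e-6):
    """Linear CCA on paired rows of X (e.g. image features) and Y (e.g. audio features).

    Returns the feature means, the two projection matrices, and the
    canonical correlations. `reg` is a small ridge term (an assumption of
    this sketch) that keeps the covariance matrices invertible.
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    # Whiten both views with Cholesky factors: Cxx = Lx Lx^T, Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    # M = Lx^{-1} Cxy Ly^{-T}; its singular values are the canonical correlations.
    A = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, A.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U[:, :n_components])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return mx, my, Wx, Wy, s[:n_components]

def rank_by_angle(query, database, mq, Wq, md, Wd):
    """Rank database objects of one modality against a query of another modality.

    Both are projected into the shared subspace and compared by cosine of
    the angle between them (largest cosine first).
    """
    q = (query - mq) @ Wq
    D = (database - md) @ Wd
    cos = (D @ q) / (np.linalg.norm(D, axis=1) * np.linalg.norm(q) + 1e-12)
    return np.argsort(-cos)

# Synthetic stand-in data: two views generated from a shared 2-D latent factor.
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 2))
images = z @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(200, 6))
audio = z @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(200, 4))

m_img, m_aud, W_img, W_aud, corr = fit_cca(images, audio)
ranking = rank_by_angle(images[0], audio, m_img, W_img, m_aud, W_aud)
```

Here an image feature vector is the query and the audio clips are ranked against it. Swapping the arguments gives audio-to-image retrieval. A new media object can be projected with the same means and projection matrix, which is one simple reading of mapping new objects into the learned subspace.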
Keywords/Search Tags: computer vision, machine learning, cross-media retrieval, image retrieval, audio retrieval, correlation analysis, manifold learning, active learning, correlation reasoning