
Multimodal Feature Correlation Analysis For Image Understanding

Posted on: 2016-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Gu
Full Text: PDF
GTID: 2348330503994679
Subject: Control Science and Engineering
Abstract/Summary:
Multimodal features describe the properties of objects in multiple forms. In image recognition and media retrieval, multimodal features are widely used because they provide more discriminative and robust representations of objects than unimodal features. This thesis applies multimodal features, built from multiple visual features and from image-text cross-modal features, to two tasks: image annotation and cross-modal hashing. The main contributions can be summarized in the following three parts:

1. Image annotation based on latent community detection and multiple kernel learning: The proposed method detects latent semantic communities from semantic information. A multi-kernel support vector machine is adopted to learn discriminative features for each semantic community. Intra-community and inter-community annotation strategies are then used to make the final tagging decision. Experiments on the NUS-WIDE dataset demonstrate the good performance of the proposed method.

2. Automatic image annotation exploiting textual and visual saliency: The model performs image annotation based on visual and textual saliency. The visually salient region of an image is first extracted to generate a dual-layer bag of salient features. The textual saliency is then discovered in accordance with the visually salient region. Experiments on the NUS-WIDE dataset demonstrate the performance of the method.

3. Cross-modality hashing with partial correspondence: The main contribution of this model is to capture cross-modal correlation without full correspondence information. By effectively preserving local smoothness with an anchor graph, data without full correspondence are put to use to enhance hashing performance. For well-corresponded modalities, the objects are mapped into Hamming spaces in which the modalities are represented by the same or similar binary codes. Experiments on the NUS-WIDE and Wiki datasets demonstrate the performance of the proposed hashing strategy.
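The idea behind the third contribution, that corresponding objects from different modalities receive the same or similar binary codes, can be illustrated with a minimal Python sketch of retrieval in Hamming space. This is not the thesis's method: the sign-based binarization, the function names (`binarize`, `hamming_distance`, `rank_by_hamming`), and all numeric values are assumptions chosen only for illustration, and the anchor-graph learning step is not shown.

```python
from typing import List, Sequence


def binarize(projection: Sequence[float]) -> int:
    """Pack the signs of a real-valued projection into an integer code.

    Assumption for illustration: each modality already has a learned
    projection; non-negative entries become bit 1, negative entries bit 0.
    """
    code = 0
    for value in projection:
        code = (code << 1) | (1 if value >= 0 else 0)
    return code


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code: int, database_codes: List[int]) -> List[int]:
    """Return database indices ordered from nearest to farthest code."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )


# Hypothetical projected features: one text query and three images.
text_query = binarize([0.7, -0.2, 0.1, -0.9])
image_codes = [
    binarize([0.5, -0.1, 0.3, -0.4]),   # same sign pattern as the query
    binarize([-0.6, 0.2, -0.3, 0.8]),   # opposite sign pattern
    binarize([0.4, 0.3, 0.2, -0.1]),    # differs from the query in one bit
]
ranking = rank_by_hamming(text_query, image_codes)
```

With these made-up values, the image whose code matches the text query exactly is ranked first and the image with the opposite sign pattern last, which is the behavior a cross-modal hashing scheme aims for when the two modalities are well corresponded.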
Keywords/Search Tags:Multimodal Features, Image Annotation, Crossmodal Hashing, Correlation Analysis, Saliency