
Multimedia Content Analysis And Understanding Based On Social Media

Posted on: 2020-12-31
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q L Liu
GTID: 1368330623958208
Subject: Software engineering
Abstract/Summary:
With the development of digital media technologies and the growing popularity of intelligent devices and social networks, more and more information is presented in multimedia form, and the volume of multimedia content is growing explosively. Multimedia data typically includes text, images, and video, and is characterized by rich information, large variation in appearance, and massive scale. How to effectively analyze and understand large-scale multimedia data with respect to these properties has therefore become a hot research topic as well as a challenging problem. Social multimedia carries rich social information, such as user-provided tags and attribute information of visual objects, which is helpful for multimedia content analysis and understanding. At the same time, multimedia content analysis and understanding involves the representation of multimedia data, image-tag correlation prediction, effective indexing, and other problems. This thesis therefore studies how to learn attribute features, predict image-tag correlations, and learn effective indexes by exploiting social media information. It covers three topics: discriminative supplementary representation learning for images, tag refinement for social images, and deep hashing for indexing. The main contributions are summarized as follows.

(1) Supplementary attribute learning-based image content representation. To handle the limited number and incompleteness of manually defined attributes, we propose a new supplementary feature learning method, which learns discriminative supplementary features that automatically expand the semantic attribute representations of images. To learn discriminative supplementary features for few-shot image classification, the proposed method jointly learns the supplementary features and the classifiers for the novel categories on top of the manually defined attributes. Column-sparsity constraints are imposed during this joint learning so that the representations and classifiers are optimally compatible and the learned features are discriminative. Results on few-shot classification show the effectiveness of the learned supplementary attribute features.

(2) Projective nonnegative matrix factorization-based social image tag refinement. To deal with noisy and missing tags on social images, we propose a new projective nonnegative matrix factorization model, which predicts the correlations between images and tags in order to refine image tags. Building on nonnegative matrix factorization, the latent representation of an image is assumed to be projected from its original feature representation by an orthogonal transformation matrix, which addresses the out-of-sample problem. In addition, to remove irrelevant visual features, the method introduces a row-sparse regularization term that selects suitable features when learning the transformation matrix. Preservation of local geometry in the image space and in the tag space is imposed as a constraint so that image similarity and tag correlation, respectively, remain consistent between the original space and the corresponding latent space. The method is applied to social image retrieval, where its superior search performance shows that it effectively improves image-tag correlations and hence the tag quality of social images.

(3) Multilevel similarity learning-based deep hashing. To index images efficiently and effectively, we propose a deep cross-modal hashing method based on multilevel similarity learning. It explores the multilevel semantic similarities between images to learn compact and discriminative hash codes for large-scale social image retrieval. The proposed method is the first attempt to incorporate deep feature representation learning, hash function learning, and multilevel semantic similarity learning into one unified framework. It learns discriminative and compact binary hash codes with deep neural networks by exploring the multilevel semantic similarity correlations of multimedia data. Specifically, the multilevel semantic similarity is learned by exploiting local structure and semantic label information simultaneously. Bit balance and quantization constraints are also taken into account to make the unified hash codes more compact. Experiments on two widely used multimodal datasets demonstrate that the proposed method achieves state-of-the-art performance on both image-query-text and text-query-image tasks.
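
The bit balance and quantization constraints mentioned in contribution (3), and the way binary codes are used for retrieval, can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names (`binarize`, `quantization_loss`, `bit_balance_loss`, `hamming_rank`) are hypothetical, and the two penalties follow formulations that are common in the deep hashing literature, which the abstract does not spell out. The real-valued inputs stand in for the outputs of a deep network.

```python
import numpy as np

def binarize(features):
    """Map real-valued outputs to {-1, +1} hash codes by taking the sign."""
    return np.where(features >= 0, 1, -1).astype(np.int8)

def quantization_loss(features):
    """Mean squared gap between the real-valued outputs and their binary codes.
    Small values mean little information is lost when taking the sign."""
    return float(np.mean((features - binarize(features)) ** 2))

def bit_balance_loss(features):
    """Mean over bits of the squared per-bit average across the batch.
    It is zero when every bit averages to zero, i.e. is used evenly."""
    return float(np.mean(np.mean(features, axis=0) ** 2))

def hamming_rank(query_code, db_codes):
    """Return database indices sorted by Hamming distance to the query.
    For codes in {-1, +1}: distance = (n_bits - dot product) / 2."""
    n_bits = db_codes.shape[1]
    dots = db_codes.astype(np.int32) @ query_code.astype(np.int32)
    distances = (n_bits - dots) // 2
    return np.argsort(distances, kind="stable")

if __name__ == "__main__":
    # Assumed toy data: 4 database items and 1 query, 8-bit codes.
    rng = np.random.default_rng(0)
    db_features = rng.normal(size=(4, 8))
    query_features = rng.normal(size=8)
    print("quantization loss:", quantization_loss(db_features))
    print("bit balance loss:", bit_balance_loss(db_features))
    print("ranking:", hamming_rank(binarize(query_features), binarize(db_features)))
```

In a cross-modal setting such as image-query-text, the query code would come from the image branch and the database codes from the text branch; because both live in the same Hamming space, the ranking step is unchanged.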
Keywords/Search Tags: Supplementary Representation Learning, User-provided Tag, Nonnegative Matrix Factorization, Deep Cross-modal Hashing, Image Tag Refinement and Retrieval