
Analysis Of Multi-Modal Social Media Based On Graph Model

Posted on: 2017-09-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Q Wang
GTID: 1318330518496019
Subject: Communication and Information System

Abstract/Summary:
With the development of the mobile Internet, social media has become the main place for people to access and exchange information. Analyzing and understanding these multimodal social media data has wide and important application potential. Social media data exhibits two typical characteristics: heterogeneous low-level content features and consistent high-level semantics. On the one hand, a single modality is insufficient to describe a thing completely; multiple modalities provide different levels of semantic information and complement each other, so multimodal data must be organized effectively to truly reflect social media content. On the other hand, multimodal features lie in heterogeneous spaces and exhibit diverse correlations, so their interconnection and sharing mechanisms must be studied in order to establish correlations between multi-modal data. From the viewpoint of semantic complementarity and spatial heterogeneity between multi-modal data, this dissertation discusses the problems of multi-modal social media analysis and retrieval. The work of the dissertation is summarized as follows; minimal code sketches of the main building blocks are given after the four-item summary.

1. Multimodal social media representation. In social media, images provide visual features, while context information offers semantic features, and one modality can supply correlated information that complements another. Representing social media data as a fused feature vector and analyzing it with traditional machine learning methods is one of the most direct solutions. We first study multi-modal fusion at the feature level, aiming at a content-based fused representation for social media: context information is used as a regularization term that constrains the image visual features in a nonnegative matrix factorization, searching for a latent space that fuses context information into the image features (sketched after this summary). Experimental results show that content-based and context-based fusion approaches are complementary, and that combining the two as the social media representation achieves good performance in social media analysis.

2. Transductive social media analysis combining multi-modal fusion and multi-label correlations. Multi-label is another prominent characteristic besides multi-modal. Multiple labels reflect the co-occurrence of objects in an image, while multi-modal features describe images from different viewpoints; together they characterize social images from two different aspects. For these two characteristics, the dissertation proposes a transductive hypergraph learning algorithm that seamlessly integrates multi-modal features and multi-label correlations (sketched after this summary). More specifically, we first propose a new feature fusion strategy that integrates multi-modal features into a unified hypergraph, and construct an efficient multimodal hypergraph (EMHG) to reduce the high computational complexity of this fusion scheme. Secondly, we construct a multi-label correlation hypergraph (LCHG) to model the complex associations among labels. Moreover, an adaptive learning algorithm is adopted to learn the label scores and hyperedge weights simultaneously on the combination of the two hypergraphs. Experiments on real-world social image datasets demonstrate the superiority of the proposed method over representative transductive baselines.
3. Social media relevance analysis based on user-generated tags. User-generated tags are a typical characteristic of social media, and their irregularity and subjectivity make it unreliable to access relevant social media content directly through them. In the dissertation, we propose a hypergraph with correlated hyperedges (CHH), which introduces high-order relationships among hyperedges into hypergraph learning. To address wrong and missing user tags, a pairwise visual-textual correlation hypergraph (VTCH) model based on CHH is used for social media relevance analysis. To cope with the large number of newly generated hybrid hyperedges, a bagging-based method is adopted to balance accuracy and speed (sketched after this summary). Finally, an adaptive hyperedge learning method is used to obtain the relevance scores for social image search. Experiments conducted on MIR Flickr show the effectiveness of the proposed method. On the premise of rapid hypergraph construction, the approach reduces the influence of noise hidden in visual and textual words; optimizing the hyperedge weights reduces the influence of ambiguous hybrid hyperedges, while the bagging-based random hyperedge selection mechanism addresses the computational issue caused by the excessive number of hyperedges. The experimental results show superiority over traditional hypergraph methods on the two tasks of tag-based social media retrieval and tag assignment.

4. Heterogeneous high-order preserving for cross-modal retrieval. Noisy and sparse user-generated tags lead to an asymmetric cross-modal correlation problem between tags and images. To solve this problem, the dissertation proposes a cross-modal correlation algorithm that combines three ingredients: high-order relationships, semantics, and nonlinearity. On the basis of modeling the intra-pair correlation, a hypergraph is used to describe the high-order relationships among social media items and to construct complex inter-pair cross-modal correlations; the final correlation is obtained by balancing the intra- and inter-pair correlations. The proposed method resembles a mesh topology: it emphasizes the role of inter-pair correlations, which can be regarded as indirectly enlarging the training set and thereby alleviating the noisy and sparse tags that plague cross-modal correlation. For hypergraph construction, both supervised and unsupervised scenarios are discussed. To exploit semantic information in the cross-modal correlation, a random neighborhood selection is proposed for hypergraph construction based on semantic labels. Moreover, kernel tricks are applied to learn a non-linear projection (a simplified sketch follows this summary). Extensive experiments on three public datasets demonstrate the superiority of the proposed methods over state-of-the-art approaches to cross-modal retrieval.
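The context-regularized factorization in item 1 can be illustrated with a minimal sketch. It assumes the common graph-regularized NMF formulation (minimize ||X − UVᵀ||² + λ·Tr(Vᵀ(D − W)V) with multiplicative updates), which may differ in detail from the exact regularizer used in the dissertation; the function name and default parameters are illustrative.

```python
import numpy as np

def context_regularized_nmf(X, W, k=50, lam=0.1, n_iter=200, eps=1e-10):
    """Sketch of NMF with a context-graph regularizer (assumed formulation).
    X: (d x n) nonnegative visual features of n social media items.
    W: (n x n) nonnegative context-similarity matrix (e.g. from tags/time/location).
    Returns basis U (d x k) and latent fused representation V (n x k)."""
    d, n = X.shape
    rng = np.random.default_rng(0)
    U = rng.random((d, k))
    V = rng.random((n, k))
    D = np.diag(W.sum(axis=1))                      # degree matrix of the context graph
    for _ in range(n_iter):
        # multiplicative update for the visual basis
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        # multiplicative update for V; the context graph pulls related items together
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
    return U, V
```

The rows of V can then serve as the fused feature vectors fed to a conventional classifier, as described in item 1.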
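For the transductive hypergraph learning of item 2, the following minimal sketch uses the standard normalized hypergraph Laplacian and its closed-form label propagation; the dissertation's EMHG/LCHG construction and adaptive hyperedge-weight learning are not reproduced, and the kNN-based hyperedge construction and all names here are illustrative assumptions.

```python
import numpy as np

def knn_incidence(features, k=8):
    """Build a hypergraph incidence matrix: one hyperedge per item, containing the
    item and its k nearest neighbours in this modality's feature space (a common
    construction; per-modality incidence matrices can be stacked with np.hstack
    to obtain a unified multimodal hypergraph)."""
    n = features.shape[0]
    D = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=2)
    H = np.zeros((n, n))
    for j in range(n):
        H[np.argsort(D[:, j])[:k + 1], j] = 1.0     # hyperedge j is centred on item j
    return H

def hypergraph_label_scores(H, w, Y, alpha=0.9, eps=1e-10):
    """Transductive propagation on a weighted hypergraph.
    H: (n x m) incidence matrix, w: (m,) hyperedge weights,
    Y: (n x c) initial labels (zero rows for unlabeled items).
    Returns the (n x c) relevance/label scores."""
    dv = H @ w                                      # weighted vertex degrees
    de = H.sum(axis=0)                              # hyperedge degrees
    Dv_is = np.diag(1.0 / np.sqrt(dv + eps))
    Theta = Dv_is @ H @ np.diag(w) @ np.diag(1.0 / (de + eps)) @ H.T @ Dv_is
    n = H.shape[0]
    # closed-form minimizer of the hypergraph-regularized propagation objective
    return np.linalg.solve(np.eye(n) - alpha * Theta, Y)
```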
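The bagging-based random hyperedge selection of item 3 can be sketched by reusing hypergraph_label_scores from the previous sketch: each bag keeps only a random subset of the (visual-textual) hyperedges, propagation runs on that smaller hypergraph, and the bags' scores are averaged. The bag count and sampling fraction are illustrative, not the dissertation's settings.

```python
import numpy as np

def bagged_hypergraph_scores(H, w, Y, n_bags=10, edge_frac=0.3, alpha=0.9, seed=0):
    """Average hypergraph relevance scores over random hyperedge subsets
    (reuses hypergraph_label_scores defined in the previous sketch)."""
    rng = np.random.default_rng(seed)
    m = H.shape[1]
    k = max(1, int(edge_frac * m))
    F = np.zeros_like(Y, dtype=float)
    for _ in range(n_bags):
        idx = rng.choice(m, size=k, replace=False)  # one random bag of hyperedges
        F += hypergraph_label_scores(H[:, idx], w[idx], Y, alpha=alpha)
    return F / n_bags
```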
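For item 4, the sketch below combines only two of the three ingredients, semantic labels and a hypergraph-Laplacian smoother, in a linear closed-form projection of each modality into the label space, followed by cosine-similarity ranking. It is a simplified stand-in rather than the dissertation's kernelized intra-/inter-pair model; the weights lam and mu and the function names are assumptions, and the Laplacian can be taken as I − Θ from the propagation sketch above.

```python
import numpy as np

def hypergraph_regularized_projection(X, L_sem, Lap_H, lam=1.0, mu=0.1):
    """Learn a projection W for one modality by ridge regression onto the semantic
    label matrix L_sem (n x c), with a hypergraph-Laplacian term Lap_H (n x n)
    that keeps items sharing hyperedges close in the shared space.
    Closed form of: ||X W - L_sem||^2 + lam*||W||^2 + mu*Tr(W' X' Lap_H X W)."""
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d) + mu * (X.T @ Lap_H @ X)
    return np.linalg.solve(A, X.T @ L_sem)

def cross_modal_rank(query_feat, W_query, gallery_feats, W_gallery):
    """Rank items of the other modality for one query by cosine similarity
    in the shared (label-sized) space."""
    q = query_feat @ W_query
    G = gallery_feats @ W_gallery
    q = q / (np.linalg.norm(q) + 1e-10)
    G = G / (np.linalg.norm(G, axis=1, keepdims=True) + 1e-10)
    return np.argsort(-(G @ q))                     # indices, most relevant first
```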
Keywords/Search Tags: social media analysis, multi-modal fusion, cross-modal correlation, hypergraph learning, high-order relationship