Cross-modal Sentiment Analysis Using Supervised Collective Matrix Factorization

Posted on: 2019-06-02
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Chen
Full Text: PDF
GTID: 2428330590992231
Subject: Control Engineering
Abstract/Summary:
With the advent of mobile and communication technologies, social media platforms have accumulated a large volume of user-uploaded images, text and videos. By analyzing users' sentiments across these multimedia data sources, considerable value can be created at both the macro and micro levels: the results can be used in election forecasting, public opinion monitoring, personalized recommendation and so on. This thesis focuses on sentiment analysis of user-uploaded images and their accompanying comments from the social media website Flickr. The main innovations of this work are textual feature extraction using Bag of GloVe Vectors (BoGV), feature fusion using supervised collective matrix factorization, and performance improvement based on ensemble learning. The thesis presents cross-modal sentiment analysis using supervised collective matrix factorization in the following three parts.

Visual and textual feature extraction. For images in social networks, given the complexity of sentiment itself and the lack of reliable sentiment labels, we use pre-trained models that have achieved remarkable results on image classification tasks to extract deep visual features. For the associated text, we use only the comments as the textual data source, because the tags are used to automatically generate weak sentiment labels. We first concatenate all comments and perform stemming. We then map each word to its word embedding and cluster the embedding vectors, exploiting the distance information they carry. From the cosine distance between each word embedding and the cluster centers, we generate the Bag of GloVe Vectors textual feature. Through experiments on three-class sentiment polarity data and six-class sentiment category data, we verify the validity of the visual and textual features themselves and the advantage of the BoGV textual feature over using GloVe vectors directly as textual features.

Visual and textual feature fusion. Visual and textual features draw on different data sources and can be regarded as two different views of the same sentiment. To describe the sentiment better, the two should be combined. Conventional methods for doing so include direct concatenation, canonical correlation analysis and matrix factorization. All of these procedures are unsupervised: the search for associations relies only on the statistics of the data itself and ignores the fact that samples expressing the same sentiment should have similar data distributions. To make better use of the visual and textual features, this thesis introduces a Laplacian matrix that encodes label information into collective matrix factorization, which makes the procedure supervised and improves its performance. It also introduces a histogram intersection kernel support vector machine, which is well suited to the designed features. Through experiments on three-class sentiment polarity data and six-class sentiment category data, we verify the effectiveness of the histogram intersection kernel support vector machine and the utility of the supervised collective matrix factorization method, and show that the approach outperforms existing methods.

Ensemble learning based performance improvement using the above features. Because of the complexity of sentiment, and unlike methods for image classification tasks, all methods for sentiment analysis are effectively weak learners. According to ensemble learning theory, such methods have great potential for further performance improvement. This study follows the idea of stacking: several base learners are trained, and a meta learner is then trained on their first-level outputs, achieving better results than any single method. We use several heterogeneous base learners, including gradient boosting decision trees, random forests, multilayer perceptrons and support vector machines. After training the base learners, we use multilayer perceptrons and logistic regression as meta learners for further training. Through experiments on three-class sentiment polarity data and six-class sentiment category data, we achieve better results than in the previous stages, which validates the effectiveness of this method.
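
The Bag of Glove Vectors step described in the abstract (cluster the word embeddings, then compare each word with the cluster centers by cosine distance) could look roughly like the sketch below. The abstract does not say how the comparison is turned into a feature, so this sketch assumes a hard assignment of each word to its most similar center followed by a normalized count histogram. The function names, the 2-D toy vectors and the three centers are hypothetical; real GloVe vectors have far more dimensions, and the centers would come from a clustering step such as k-means that is not shown here.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def bag_of_vectors(word_vectors, centers):
    """Histogram of word vectors over cluster centers.

    Assumption: each word is assigned to the single center with the highest
    cosine similarity (hard assignment), and counts are normalized to sum to 1.
    """
    counts = [0] * len(centers)
    for vec in word_vectors:
        sims = [cosine_similarity(vec, c) for c in centers]
        nearest = max(range(len(centers)), key=lambda k: sims[k])
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(centers)
    return [c / total for c in counts]


# Hypothetical toy data: three precomputed cluster centers and the
# embeddings of four stemmed words from one image's comments.
centers = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
comment_vectors = [[0.9, 0.1], [0.8, 0.3], [0.1, 1.0], [-0.7, 0.2]]

feature = bag_of_vectors(comment_vectors, centers)
print(feature)  # prints [0.5, 0.25, 0.25]
```

The result is a fixed-length histogram regardless of how many words the comments contain, which is what lets comments of different lengths be compared as feature vectors.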
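
To illustrate why a histogram intersection kernel fits histogram-style features, the following minimal sketch computes the standard kernel, K(x, y) = sum over i of min(x_i, y_i), and builds a Gram matrix from it. The helper names and the three toy histograms are hypothetical, and nothing here reproduces the thesis's actual features or SVM configuration.

```python
def histogram_intersection(x, y):
    """Histogram intersection kernel: sum of element-wise minima."""
    if len(x) != len(y):
        raise ValueError("histograms must have the same length")
    return sum(min(a, b) for a, b in zip(x, y))


def gram_matrix(rows_a, rows_b):
    """Kernel value for every pair (row of rows_a, row of rows_b)."""
    return [[histogram_intersection(a, b) for b in rows_b] for a in rows_a]


# Hypothetical normalized histograms for three samples.
histograms = [
    [0.5, 0.25, 0.25],
    [0.25, 0.5, 0.25],
    [0.0, 0.0, 1.0],
]

K = gram_matrix(histograms, histograms)
for row in K:
    print(row)
# prints:
# [1.0, 0.75, 0.25]
# [0.75, 1.0, 0.25]
# [0.25, 0.25, 1.0]
```

For normalized histograms the kernel value is 1.0 for identical samples and shrinks as the bins overlap less. A Gram matrix like `K` can be handed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.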
Keywords/Search Tags: Cross-modal sentiment analysis, Bag of GloVe Vectors, Supervised Collective Matrix Factorization, ensemble learning