Study And Implementation On Fine-Grained Sentiment Analysis For Microblog Based On Multi-instance Multi-label Learning

Posted on: 2016-02-26
Degree: Master
Type: Thesis
Country: China
Candidate: M Q Wang
GTID: 2428330542454646
Subject: Computer software and theory
Abstract/Summary:
As a typical social media platform, the microblog carries a large amount of user-generated content that embeds users' rich emotions. Analyzing the sentiments and emotions in microblogs plays a significant role in public opinion monitoring and decision making. Most existing sentiment analysis research focuses on single-label sentiment orientation classification. However, a single microblog commonly contains several fine-grained emotions, such as happy, like, sad, and surprise, and traditional sentiment analysis methods can hardly handle this case effectively. This thesis therefore studies the emotion identification problem and the multi-label fine-grained sentiment analysis problem, and makes the following contributions:

(1) For the emotion identification problem, the word2vec word vector model is first used to obtain a vector representation of each word in a microblog, which captures the deep semantic information embedded in the microblog. The thesis then proposes a multi-instance learning based method for emotion identification and compares it with a traditional emotion lexicon based algorithm and with machine learning algorithms based on Boolean features and on TF-IDF features. The experimental results show that the proposed method performs better than the traditional methods, which validates the effectiveness of the multi-instance based emotion identification method.

(2) For the multi-label fine-grained microblog sentiment analysis problem, the thesis proposes a multi-label learning method based on Calibrated Label Ranking (CLR), with TF-IDF chosen as the feature weight. The experimental results show that the CLR based method outperforms the emotion lexicon based method, which validates the effectiveness of the CLR based method in the microblog multi-label fine-grained sentiment analysis task.

(3) Building on the proposed multi-label learning algorithm, and in order to represent the underlying semantic relationships between the words in microblog short text, the thesis further puts forward a multi-instance multi-label learning (MIML) based fine-grained sentiment analysis method for microblogs. The main idea is to degenerate the MIML problem into a multi-label learning problem and then apply an existing multi-label method to perform the multi-label sentiment analysis. Specifically, the thesis proposes two degeneration algorithms, one based on constructive clustering and one based on word2vec semantic composition, and derives two MIML based methods for fine-grained microblog sentiment analysis from them. Finally, a multi-view learning based algorithm is put forward that combines the TF-IDF word features and the word vector semantic features into the final feature vectors and then uses CLR to solve the multi-label sentiment analysis problem. The experimental results demonstrate that the MIML based method achieves better performance than the other algorithms in the multi-label fine-grained sentiment analysis task. Compared with other sentiment analysis methods, the proposed MIML based method can effectively detect the underlying semantic information, which benefits fine-grained sentiment analysis.

(4) Based on these research results, a prototype system for Chinese microblog multi-label fine-grained sentiment analysis is designed and implemented, which provides multi-label fine-grained microblog sentiment analysis functions and various display interfaces.

In summary, given the multiple and complex characteristics of human emotions, and in order to detect the emotions that co-exist in one microblog, this thesis studies the fundamental problem of multi-label fine-grained sentiment analysis and focuses on multi-instance based emotion identification, multi-label learning based fine-grained sentiment analysis, MIML based fine-grained sentiment analysis for microblogs, and prototype system construction. Theoretical analysis and experiments show that these approaches are efficient and effective. The proposed algorithms and prototype system could further contribute to online public opinion monitoring and analysis.
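The degeneration idea in contribution (3) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not say how word vectors are composed, so plain averaging is assumed here, and the names `compose_bag`, `clr_predict`, and `VIRTUAL`, the two-dimensional toy embeddings, and the toy label scores are all hypothetical. The first function collapses a microblog's bag of word vectors (the instances) into one feature vector, which turns the MIML problem into an ordinary multi-label problem. The second shows the voting step of Calibrated Label Ranking, in which a virtual label separates relevant labels from irrelevant ones.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence

VIRTUAL = "__virtual__"  # hypothetical name for CLR's calibration label


def compose_bag(tokens: Sequence[str],
                embeddings: Dict[str, List[float]],
                dim: int) -> List[float]:
    """Collapse a bag of word vectors into one vector by averaging.

    Averaging is an assumed composition operator; tokens without an
    embedding are skipped, and an all-unknown bag maps to zeros.
    """
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def clr_predict(labels: Sequence[str],
                prefers: Callable[[str, str], bool]) -> List[str]:
    """Voting step of Calibrated Label Ranking.

    `prefers(a, b)` stands in for a trained pairwise classifier and
    returns True when `a` is ranked above `b`. Labels that collect more
    votes than the virtual label are returned, most-voted first.
    """
    candidates = list(labels) + [VIRTUAL]
    votes = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        winner = a if prefers(a, b) else b
        votes[winner] += 1
    relevant = [label for label in labels if votes[label] > votes[VIRTUAL]]
    return sorted(relevant, key=lambda label: votes[label], reverse=True)


# Toy data, invented for illustration only.
toy_embeddings = {"good": [1.0, 0.0], "day": [0.0, 1.0]}
features = compose_bag(["good", "day", "unknownword"], toy_embeddings, dim=2)

toy_scores = {"happy": 0.9, "like": 0.6, "sad": 0.2, VIRTUAL: 0.5}
predicted = clr_predict(["happy", "like", "sad"],
                        lambda a, b: toy_scores[a] > toy_scores[b])
```

In a real pipeline, `prefers` would be backed by pairwise classifiers trained on the composed feature vectors (optionally concatenated with TF-IDF features, as in the multi-view variant) rather than by fixed scores.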
Keywords/Search Tags:Microblog Sentiment Analysis, Fine-grained Sentiment, Multi-instance Learning, Multi-label Learning, Word Vector Model