
Research On Sentiment Analysis For Hot Topic Microblog

Posted on: 2014-03-28
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhang
Full Text: PDF
GTID: 2268330422951700
Subject: Computer technology
Abstract/Summary:
With the rapid development of information technology, users face massive amounts of data that contain the information they need. Microblogging is a popular, fast-growing Internet application. Because of its large user base, topics discussed on microblogs often become hot topics, and the posts users submit about them are often subjective. To capture public opinion on hot topics, this thesis studies sentiment analysis for hot topic microblogs. The research covers the following four aspects:

Firstly, the thesis analyzes the characteristics of hot topic microblogs and discusses supervised methods for microblog subjectivity and polarity classification, in particular the choice of classifier and features. Experiments show that, with an SVM, the best performance on the subjectivity task is obtained by combining textual and non-textual features, selecting textual features with the χ² statistic, and using emoticons. On the polarity task, combining word-level, sentence-level, and emoticon features gives the best performance. The experiments demonstrate the effectiveness of the method.

Secondly, because annotated corpora for sentiment analysis on hot topic microblogs are scarce, the thesis introduces a semi-supervised strategy. Using a Transductive SVM with unlabeled data and the same features yields higher performance, which demonstrates the method's effectiveness.

Thirdly, the thesis introduces an opinion target extraction strategy and finds that nouns, noun phrases, and hashtags account for the vast majority of targets. Because the extracted targets are diverse, the thesis also clusters them and evaluates the clusters, giving a clearer view of each topic.

Finally, a complete hot topic microblog sentiment analysis system is designed and implemented. The system integrates the methods studied in the thesis and consists of four subsystems: microblog data acquisition, hot topic microblog sentiment analysis, result data storage, and result visualization.
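
The abstract says textual features are selected by a statistic before being passed to the SVM. Assuming that statistic is the chi-square score commonly used for text feature selection, the idea can be sketched as follows. The function names, the toy posts, and the "subjective"/"objective" labels are hypothetical illustrations, not the thesis's actual code or data. Emoticons are treated here as ordinary tokens.

```python
from collections import Counter


def chi_square_scores(docs, labels, positive_label):
    """Score each term with the chi-square statistic of its 2x2 term/class table.

    docs: list of token lists; labels: one class label per document.
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == positive_label)
    n_neg = n - n_pos

    df_pos = Counter()  # positive-class documents containing the term
    df_neg = Counter()  # other documents containing the term
    for tokens, y in zip(docs, labels):
        target = df_pos if y == positive_label else df_neg
        target.update(set(tokens))  # count each term once per document

    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # term present, positive class
        b = df_neg[term]   # term present, other class
        c = n_pos - a      # term absent, positive class
        d = n_neg - b      # term absent, other class
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        # A term that occurs in every document carries no class information.
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def top_k_terms(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy corpus: two subjective posts and two objective posts.
docs = [
    ["good", "phone", ":)"],
    ["bad", "phone", ":)"],
    ["phone", "released"],
    ["phone", "price"],
]
labels = ["subjective", "subjective", "objective", "objective"]

scores = chi_square_scores(docs, labels, positive_label="subjective")
print(top_k_terms(scores, 3))
```

In this toy corpus the emoticon separates the two classes perfectly and so receives the highest score, while "phone" occurs in every post and scores zero. The top-ranked terms would then serve as the textual feature vocabulary for the classifier.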
Keywords/Search Tags: hot topic microblog, sentiment analysis, semi-supervised, classification, clustering