
Research On Sentiment Classification Of Weibo Based On Word2vec And SVM

Posted on: 2019-03-17
Degree: Master
Type: Thesis
Country: China
Candidate: L L Jiang
Full Text: PDF
GTID: 2428330566978001
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Web 2.0, Weibo has become an important channel for people to obtain information, express opinions, and exchange feelings. More and more users share their opinions and advice on hot topics on Weibo. This data is growing explosively and contains a great deal of valuable information for applications such as public opinion analysis and hot spot detection. The key questions of this thesis are how to automatically identify the opinions that users express in Weibo texts and how to accurately judge the affective information those texts convey.

Text sentiment classification methods fall into two types: supervised and unsupervised. This thesis introduces the main steps and background knowledge of both kinds of methods and uses them to judge the emotional tendency of Weibo texts. In the unsupervised method based on a sentiment lexicon, the problem is that existing public sentiment lexicons have low coverage of the sentiment words used on Weibo. In the supervised method based on machine learning, the problem is that traditional models consider only the lexical features of words in a sentence and ignore semantic features during feature extraction. To address these problems, the main research work is as follows:

(1) This thesis constructs a sentiment lexicon for the Weibo domain and proposes a method to calculate the sentiment polarity of candidate sentiment words. Constructing a Weibo sentiment lexicon has both research significance and practical value. Because existing sentiment lexicons cover few of the sentiment words used on Weibo, this thesis puts forward an SO-PMI algorithm based on Good-Turing smoothing to extend the Weibo sentiment lexicon on the basis of HowNet and the Dalian University of Technology Emotional Ontology. A rule-based method is then used to judge the emotional tendency of the experimental data. Experimental results show that the method classifies sentiment effectively and accurately.

(2) This thesis proposes a Weibo text sentiment classification method based on the word2vec and SVMperf tools. First, word2vec is used to train high-dimensional word vectors for the words in the corpus, and the distance method of word2vec is used to calculate the similarity between words, which clusters related words and further expands the sentiment lexicon; features are then selected based on the sentiment lexicon. Next, the KPCA method reduces the dimension of the high-dimensional word vectors to obtain a one-dimensional feature vector, which is finally fed into the SVMperf classifier for training and classification. Experimental results show that the method classifies sentiment effectively and accurately.
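As a rough illustration of the lexicon-expansion step in (1), the Python sketch below scores candidate words with SO-PMI and applies the basic Good-Turing count adjustment r* = (r + 1) · N_{r+1} / N_r to the candidate-seed co-occurrence counts, so that unseen pairs no longer produce log(0). The abstract does not give the thesis's exact formulation, so everything here is an assumption made for illustration: the function names `good_turing_counts` and `so_pmi`, the use of a whole post as the co-occurrence window, the toy English tokens standing in for segmented Weibo text, and the 0.5 fallback count used when no pair occurs exactly once.

```python
import math
from collections import Counter
from itertools import product


def good_turing_counts(pair_counts, all_pairs, unseen_floor=0.5):
    """Adjust counts with basic Good-Turing: r* = (r + 1) * N_{r+1} / N_r.

    N_r is the number of pairs observed exactly r times. When N_{r+1} is 0
    the raw count is kept; unseen pairs then get `unseen_floor`
    (an assumed value, not taken from the thesis).
    """
    raw = {pair: pair_counts.get(pair, 0) for pair in all_pairs}
    freq_of_freq = Counter(raw.values())
    adjusted = {}
    for pair, r in raw.items():
        n_r = freq_of_freq[r]
        n_next = freq_of_freq.get(r + 1, 0)
        if n_next > 0:
            adjusted[pair] = (r + 1) * n_next / n_r
        else:
            adjusted[pair] = r if r > 0 else unseen_floor
    return adjusted


def so_pmi(docs, candidates, pos_seeds, neg_seeds):
    """Return {candidate: SO-PMI score}; positive scores suggest positive polarity."""
    n_docs = len(docs)
    doc_sets = [set(doc) for doc in docs]
    # Document frequency of each word (a post counts once per word).
    doc_freq = Counter(word for doc in doc_sets for word in doc)

    seeds = list(pos_seeds) + list(neg_seeds)
    all_pairs = list(product(candidates, seeds))
    pair_counts = Counter()
    for doc in doc_sets:
        for word, seed in all_pairs:
            if word in doc and seed in doc:
                pair_counts[(word, seed)] += 1
    smoothed = good_turing_counts(pair_counts, all_pairs)

    def pmi(word, seed):
        if doc_freq[word] == 0 or doc_freq[seed] == 0:
            return 0.0  # no evidence for a word that never occurs
        p_joint = smoothed[(word, seed)] / n_docs
        p_word = doc_freq[word] / n_docs
        p_seed = doc_freq[seed] / n_docs
        return math.log2(p_joint / (p_word * p_seed))

    return {
        word: sum(pmi(word, s) for s in pos_seeds)
        - sum(pmi(word, s) for s in neg_seeds)
        for word in candidates
    }


# Toy corpus: each inner list is one already-tokenized post (hypothetical data).
docs = [
    ["good", "awesome"],
    ["good", "awesome"],
    ["good", "nice"],
    ["bad", "awful"],
    ["bad", "awful"],
    ["bad", "nasty"],
]

if __name__ == "__main__":
    scores = so_pmi(docs, ["awesome", "nice", "awful", "nasty"], ["good"], ["bad"])
    for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{word}: {score:+.2f}")
```

Candidates whose score clears a chosen positive or negative threshold would be added to the corresponding side of the lexicon. The adjusted counts are not renormalized here, and raw Good-Turing estimates are unstable when the frequency-of-frequency table is sparse, so a real implementation would likely smooth the N_r values first.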
Keywords/Search Tags: Sentiment Classification, Sentiment Lexicon, Good-Turing, word2vec, SVMperf