
Research And Implementation Of Sentiment Classification And Opinion Mining In Micro-Blog Text

Posted on: 2019-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: M Q Li
GTID: 2348330563453997
Subject: Computer application technology
Abstract/Summary:
In the Internet age, people often express personal opinions about events and products on social networks. By analyzing large volumes of user text with data mining and text sentiment analysis methods, we can obtain people's opinions and evaluations of hot topics and products, which helps to track online public opinion and to collect product feedback. The study of sentiment analysis for Chinese micro-blogs is therefore of high practical significance and commercial value. The language of Chinese micro-blogs is simple and changeable, and it lacks a dedicated sentiment dictionary; these are the main problems facing current sentiment analysis. In response, this thesis studies three aspects: sentiment dictionary expansion for Chinese short text, sentiment classification of micro-blog text, and opinion mining of micro-blog text. The research content and work of this thesis include:

1) A method for expanding the sentiment dictionary is proposed that combines two kinds of word vectors with a maximum-similarity rule. The method, 2E-SM (2 Embedding and Similarity Maximum), starts from an existing sentiment dictionary and uses GloVe and Word2Vec to obtain a set of candidate words by calculating similarity among words; combining the two embeddings captures both local and global similarity. The sentiment of each candidate word is then determined by the sentiment category to which it is most similar. Experiments show that this method achieves higher accuracy than SO-PMI.

2) Based on CNN and LSTM, several deep learning models built on mixed word vectors are proposed for micro-blog sentiment classification. For convolutional networks, to take both global and local semantic information into account, Word2Vec and GloVe are used as two channels to construct a double word vector convolutional network, 2E-CNN (2 Embedding CNNs). On the basis of 2E-CNN, the word embeddings are then spliced with vectors of shallow word features to construct a double fusion word embedding convolutional network, 2ESF-CNN (2 Embedding mixed with Simple Feature CNNs). This model improves accuracy by about 1.8% and has the advantage of a short training time. For the bidirectional long short-term memory network (BLSTM), shallow word features are similarly spliced with word embeddings and combined with an attention mechanism, giving a model named ESF-BLSTM-ATT. With these improvements, accuracy increases by 1.6% compared with previous models, while the model needs only one fused word embedding.

3) For opinion mining of micro-blog text, to ensure that the method generalizes, syntactic dependency relations are combined with the extended sentiment dictionary. First, a set of grammar rules is extracted from the dependency relations. Then, using the sentiment dictionary and taking sentiment words and evaluation targets as the center, evaluation targets are paired with their evaluation words. Finally, a clearer opinion is obtained through fuzzy induction matching.
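The dictionary expansion step in 1) can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not say how the two similarities are combined or how candidates are filtered, so the averaging rule, the `threshold` parameter, the function names and the tiny hand-made vectors (standing in for trained Word2Vec and GloVe embeddings) are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def combined_similarity(w1, w2, emb_a, emb_b):
    """Average the similarity from two embedding spaces (assumed combination rule)."""
    return 0.5 * (cosine(emb_a[w1], emb_a[w2]) + cosine(emb_b[w1], emb_b[w2]))

def expand_lexicon(seeds, candidates, emb_a, emb_b, threshold=0.5):
    """Give each candidate the polarity of its most similar seed word.

    Candidates whose best similarity is below `threshold` (a hypothetical
    cut-off) are treated as non-sentiment words and skipped.
    """
    expanded = {}
    for cand in candidates:
        if cand in seeds:
            continue
        best_seed = max(
            seeds, key=lambda s: combined_similarity(cand, s, emb_a, emb_b)
        )
        if combined_similarity(cand, best_seed, emb_a, emb_b) >= threshold:
            expanded[cand] = seeds[best_seed]
    return expanded

# Toy 2-d vectors standing in for trained Word2Vec and GloVe embeddings.
W2V = {"good": [1.0, 0.1], "bad": [-1.0, 0.1], "awesome": [0.9, 0.2],
       "awful": [-0.9, 0.3], "table": [0.0, 1.0]}
GLOVE = {"good": [0.8, 0.2], "bad": [-0.7, 0.3], "awesome": [0.9, 0.1],
         "awful": [-0.8, 0.2], "table": [0.1, 1.0]}

seeds = {"good": "positive", "bad": "negative"}
result = expand_lexicon(seeds, ["awesome", "awful", "table"], W2V, GLOVE)
print(result)
```

With these toy vectors, "awesome" and "awful" join the dictionary with the polarity of their nearest seed word, while "table" falls below the cut-off and is left out. In practice the vectors would come from embeddings trained on a micro-blog corpus, and the cut-off would need tuning.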
Keywords/Search Tags:Dictionary expansion, feature vector fusion, deep learning, sentiment classification, opinion mining