Research On Sentiment Analysis Model For Weibo Text

Posted on: 2020-11-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
Full Text: PDF
GTID: 2428330575952045
Subject: Applied statistics
Abstract/Summary:
With the rapid development of the Internet and mobile communication, people take part in online activities more and more frequently. Weibo generates a large amount of data every day, including users' emotional expressions and their comments on events and products. Extracting sentiment from this information is of great value, so this thesis studies sentiment analysis models for Weibo text. A survey of the domestic and international literature shows that current research on sentiment analysis models falls into three main groups: sentiment dictionary methods, machine learning methods, and deep learning methods. This thesis evaluates all three on crawled Weibo data to find the best sentiment analysis model.

Research based on the traditional sentiment dictionary method. Using the Boson sentiment dictionary, the text is segmented into words, each word is looked up in the dictionary, and the matched weights are combined to obtain the sentiment polarity. The dictionary method is then improved by additionally taking sentiment adverbs into account. The advantage of the sentiment dictionary method is that it is fast and handles sentences with clear subjective sentiment easily. Its disadvantages are that it transfers poorly to different scenarios and that constructing a sentiment dictionary for a specific domain is time-consuming and labor-intensive.

Research based on machine learning methods. The data is first preprocessed, and the segmented text is converted into word vectors with the Skip-gram method of Word2vec. The Tencent open-source word vectors are used as an alternative input for comparison. Mainstream machine learning classifiers (logistic regression, stochastic gradient descent, naive Bayes, support vector machine, random forest, XGBoost) are then trained in a supervised setting, and the test-set confusion matrices of the models are compared. Models trained on the Tencent open-source word vectors perform better than models trained on the word vectors produced by Word2vec. Among the classifiers, ensemble methods such as random forest and XGBoost are far superior to the single classification models. Although the accuracy of the machine learning models is much higher than that of the traditional sentiment dictionary, each classifier involves a large number of tuning parameters, the models do not transfer well to different business scenarios, and the machine learning approach has reached a bottleneck.

Research based on deep learning methods. Comparison experiments are run on a classical multi-layer perceptron, a recurrent neural network, a convolutional neural network, and a self-attention mechanism. The accuracy of every deep learning model is higher than that of the sentiment dictionary and machine learning methods. Among them, the self-attention model reaches an accuracy of 91.12% on the test set.

Comparing all the models shows that the model trained with the attention mechanism is superior to the others in both training speed and test-set accuracy. It uses self-attention within the sequence to speed up the convergence of the model. The attention-based model is therefore the best model in overall performance for the sentiment analysis task.
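
The sentiment dictionary method described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the lexicon entries and weights are toy placeholders rather than values from the Boson dictionary, the adverbs are assumed to act as multipliers on the next sentiment word, the input is assumed to be already segmented into tokens, and the function names `score` and `polarity` are hypothetical. It is not the thesis's implementation.

```python
# Toy sentiment lexicon: word -> polarity weight.
# Illustrative placeholder values, NOT taken from the Boson dictionary.
LEXICON = {"good": 1.0, "happy": 2.0, "bad": -1.0, "terrible": -2.0}

# Assumed adverb multipliers that scale the next sentiment word.
ADVERBS = {"very": 2.0, "slightly": 0.5}


def score(tokens):
    """Sum lexicon weights over tokens, scaling each by preceding adverbs."""
    total = 0.0
    multiplier = 1.0
    for tok in tokens:
        if tok in ADVERBS:
            # Stack consecutive adverbs, e.g. "very very good".
            multiplier *= ADVERBS[tok]
        elif tok in LEXICON:
            total += multiplier * LEXICON[tok]
            multiplier = 1.0
        else:
            # Any other word breaks the adverb's scope.
            multiplier = 1.0
    return total


def polarity(tokens):
    """Map the weighted score to a polarity label."""
    s = score(tokens)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"
```

For example, `polarity(["slightly", "bad"])` yields `"negative"` with a score of -0.5. The sketch also shows why the method transfers poorly: any word missing from the lexicon contributes nothing, so a new domain needs new dictionary entries.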
Keywords/Search Tags: Sentiment Analysis, Word Vector, Sentiment Dictionary, Machine Learning, Deep Learning