
Research On Weibo Sentiment Analysis Based On NLP

Posted on: 2022-04-25
Degree: Master
Type: Thesis
Country: China
Candidate: J K Fang
Full Text: PDF
GTID: 2518306335461334
Subject: Applied Statistics
Abstract/Summary:
In recent years, with China's economic and social development, mobile Internet adoption has risen and the number of users of portable mobile terminals such as smartphones has grown steadily, providing the basis for information-sharing platforms such as Weibo. The rapid growth of microblogging has generated a large amount of text data carrying subjective emotion. Sentiment analysis of this microblog text is of great practical significance for business and society, so a growing number of researchers in Chinese natural language processing are working on it.

Sentiment analysis based on traditional methods requires sentiment dictionaries that researchers formulate themselves. This work not only consumes a lot of human effort but also cannot scale to the volume of microblog text data today. With the rapid development of deep learning, sentiment analysis based on deep learning has been widely used. However, general deep learning models do not extract semantic information from text comprehensively enough, which limits the effectiveness of sentiment analysis. Moreover, Chinese sentiment analysis text datasets are relatively scarce and of low quality. To address these problems, the main tasks of the research are as follows.

First, to address the low quality of Chinese sentiment analysis text datasets, a higher-quality Chinese microblog sentiment analysis text dataset is established by resolving data duplication and content formatting problems, reclassifying sentiment polarity, and manually correcting the sentiment polarity labels.

Second, the ALBERT model is applied to the sentiment analysis task. The [CLS] position output of the ALBERT model is concatenated with the pre-trained paragraph vector generated by the Doc2Vec model to form a new text representation vector. The resulting model is compared with the traditional mean Word2Vec word vector model and with a model that simply uses the [CLS] position output of the ALBERT model as the sentence vector. Both models that apply ALBERT improve in Accuracy and F1-score, demonstrating the effectiveness of the ALBERT model for the sentiment analysis task. The model using the new text representation vector performs best among the three, showing that the Doc2Vec paragraph vector extracts semantic information from the microblog text from a perspective different from that of the ALBERT model, and that Doc2Vec enhances the performance of the ALBERT model on this task.

Finally, to address the problem that the [CLS] position output of the ALBERT model cannot fully extract the semantic information of the microblog text, two types of bidirectional recurrent neural network structures are combined with the ALBERT model to propose the ALBERT-BL-D model and the ALBERT-BG-D model. Word2Vec+CNN, Word2Vec+Bi-LSTM and Word2Vec+Bi-GRU are used as baseline models for comparison. Compared with the three baselines, the ALBERT-BL-D and ALBERT-BG-D models show significant improvement in Accuracy and F1-score. Of the two newly proposed models, ALBERT-BG-D performs better on the sentiment analysis task, showing that the bidirectional gated recurrent unit structure gives the stronger enhancement to the semantic feature extraction of the ALBERT model. The combined use of the ALBERT model, the bidirectional gated recurrent unit structure and the Doc2Vec pre-trained paragraph vector can effectively improve the results of the microblog sentiment analysis task.
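The vector concatenation step and the two evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the toy vectors and the binary (single positive class) form of the F1-score are assumptions made for illustration, and the placeholder lists stand in for real ALBERT and Doc2Vec outputs.

```python
from typing import List, Sequence


def fuse_representations(cls_vec: Sequence[float],
                         doc_vec: Sequence[float]) -> List[float]:
    """Concatenate a [CLS] output vector with a paragraph vector.

    Hypothetical helper: in practice cls_vec would come from ALBERT and
    doc_vec from a pre-trained Doc2Vec model.
    """
    return list(cls_vec) + list(doc_vec)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the gold labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int],
             positive: int = 1) -> float:
    """Binary F1 for one positive class (an assumption; the abstract
    does not say how F1 is averaged over sentiment classes)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy placeholder values, not real model outputs.
cls_vec = [0.1, -0.3, 0.7, 0.2]  # stand-in for the ALBERT [CLS] output
doc_vec = [0.5, 0.0, -0.2]       # stand-in for the Doc2Vec paragraph vector
fused = fuse_representations(cls_vec, doc_vec)

# Toy labels: 1 = positive sentiment, 0 = negative sentiment.
gold = [1, 0, 1, 1]
pred = [1, 0, 0, 1]
acc = accuracy(gold, pred)
f1 = f1_score(gold, pred)
```

The fused vector simply has the dimensions of both inputs placed side by side, so a downstream classifier sees the ALBERT features and the Doc2Vec features together.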
Keywords/Search Tags: Sentiment Analysis, Natural Language Processing, Weibo Text Dataset, ALBERT, Doc2Vec