Online Public Opinion Text Sentiment Analysis Based On Machine Learning

Posted on: 2020-03-27
Degree: Master
Type: Thesis
Country: China
Candidate: W H Fan
GTID: 2428330596975087
Subject: Information security
Abstract/Summary:
Text sentiment analysis is one of the main research directions in natural language processing. Research in this area centers on word vector representation, text feature extraction, and model construction. As the Internet's influence on people's lives deepens, sentiment analysis of Internet text has both research significance for natural language processing and practical value in real life. In recent years, text sentiment research at home and abroad has produced many results, but most current work is limited to particular languages, and few valid text sentiment datasets are available, so sentiment analysis of Chinese text still faces many limitations. Targeting sentiment analysis of Chinese text, this thesis studies improvements to the word vector representation model, a text data augmentation strategy, and the design of a Chinese text sentiment classification model. It also designs an Internet public opinion monitoring system based on text sentiment analysis. The main work of this thesis is as follows:

(1) Two improved word vector representation models are proposed: CME (Concatenation Meta-embedding) and AME (Average Meta-embedding). The two models integrate Word2vec and GloVe vectors in different ways. Experiments show that applying the improved word vectors improves the performance of the text classification model.

(2) Because labeled sentiment datasets are scarce in Chinese corpora, this thesis proposes a text data augmentation strategy based on EDA (Easy Data Augmentation), which generates new text from the original dataset through synonym substitution and random word insertion. Experiments show that the effect of the data augmentation strategy is more significant when the dataset is smaller.

(3) The HCRNN (Hybrid Convolution and Recurrent Neural Network) model is proposed as the text sentiment classification model. It integrates CNN, BiLSTM, and BiGRU networks, takes word vectors as input, and extracts both local and contextual features. An attention mechanism is then added so that the model attends more to the parts of the text that contribute most to sentiment orientation, and TF-IDF features of the text are extracted as auxiliary features. Finally, the model is tested on the NLPCC2013 dataset and an open-source Weibo sentiment analysis dataset from GitHub. Experiments show that the HCRNN model improves the accuracy of text sentiment classification.

(4) In response to practical needs for text sentiment classification, this thesis designs an Internet public opinion monitoring system based on text sentiment analysis. It describes the design of the system's core functional modules and presents the prototype system through its web interfaces.
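
The abstract does not spell out how CME and AME combine the Word2vec and GloVe vectors, so the following is only a minimal sketch of one common reading of the two names: concatenation of the normalized source vectors, and element-wise averaging after zero-padding the shorter vector. The function names, the L2-normalization step, and the toy vectors are assumptions for illustration and are not taken from the thesis.

```python
import math
from typing import Dict, List

Vector = List[float]


def l2_normalize(vec: Vector) -> Vector:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0.0 else [x / norm for x in vec]


def concat_meta_embedding(a: Vector, b: Vector) -> Vector:
    """CME-style sketch: normalize each source vector, then concatenate."""
    return l2_normalize(a) + l2_normalize(b)


def average_meta_embedding(a: Vector, b: Vector) -> Vector:
    """AME-style sketch: normalize, zero-pad the shorter vector, then average."""
    a, b = l2_normalize(a), l2_normalize(b)
    dim = max(len(a), len(b))
    a = a + [0.0] * (dim - len(a))
    b = b + [0.0] * (dim - len(b))
    return [(x + y) / 2.0 for x, y in zip(a, b)]


# Hypothetical toy lookups standing in for trained Word2vec and GloVe tables.
word2vec_toy: Dict[str, Vector] = {"good": [3.0, 4.0]}
glove_toy: Dict[str, Vector] = {"good": [0.0, 5.0, 0.0]}

cme = concat_meta_embedding(word2vec_toy["good"], glove_toy["good"])
ame = average_meta_embedding(word2vec_toy["good"], glove_toy["good"])

# Concatenation grows the dimension (2 + 3); averaging keeps the larger one (3).
print(len(cme), len(ame))
```

Under this reading, the concatenated vector keeps every component from both sources at the cost of a larger input dimension, while the averaged vector stays compact but mixes the two spaces component by component.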
Keywords/Search Tags:Sentiment Classification, Word Vector, Data Augmentation, Deep Learning, Public Opinion Monitoring