Research On Text Sentiment Analysis Based On Improved Dictionary And Ensemble Learning

Posted on: 2020-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Yang
Full Text: PDF
GTID: 2428330578952411
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid growth of Internet use and the popularity of online shopping and social media such as microblogs, the volume of text on the Internet has exploded. Mining this mass of text has potential value for public opinion analysis, for commerce, and for society. For example, sentiment analysis of microblog posts can help track and guide public opinion on trending events and emergencies, and predict policy and political choices. Analysis of product reviews can reveal consumers' attitudes and positions toward products and merchants, giving platforms a reference and merchants a basis for improving their products. At present, many problems in text sentiment analysis remain unsolved. For example, emotion dictionaries suffer from scarce resources and poor timeliness, a single classifier generalizes poorly, and ensemble learning on large-scale text classification hits a time bottleneck. This thesis addresses these problems. Its main work and contributions are as follows:

(1) To improve the quality of the emotion dictionary, this thesis addresses two shortcomings: the weakness of existing methods for selecting baseline words, and the fact that the difference between the positive and negative reference categories is not taken into account when the polarity of sentiment words is calculated. It proposes a center vector method with outlier removal for selecting baseline words, and improves the formula for calculating the polarity of sentiment words. In the selection of baseline words, outliers are first identified with a proximity-based technique and deleted. The reference vector of each emotion category is then calculated with the center vector method, which dilutes the error introduced by individual baseline words. Finally, the emotion polarity of a new word is calculated from the similarity between its word vector and the center vectors, and the word is added to the emotion dictionary. In the calculation of lexical semantic orientation, two parameters, one for the positive and one for the negative baseline vector, are introduced to improve the semantic orientation formula. Experimental results show that this method, combined with the improved semantic orientation formula, reduces the loss of accuracy caused by the difference in size between the positive and negative reference categories, and significantly improves the accuracy of dictionary-based sentiment classification.

(2) To address the problems that classifiers are highly sensitive to emotional features and that a single classifier generalizes poorly, this thesis proposes an ensemble learning method based on emotion feature optimization. First, based on the improved dictionary, the emotional features used by multiple classifiers are optimized with rules of Chinese expression. The ensemble learning method is then used to combine the multiple models. Experiments on several standard datasets, such as NLPCC, show that ensemble learning with optimized features greatly improves classification performance.

(3) To address the time bottleneck of large-scale ensemble learning in text classification experiments, this thesis designs and implements a model-parallel ensemble learning algorithm on the Spark distributed computing framework. The algorithm takes full advantage of the computing performance of the cluster and, while keeping the text sentiment classification metrics unchanged, greatly shortens the running time of ensemble-learning text classification. Experiments show that the algorithm scales well and offers a new solution for ensemble learning on massive text data.
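
The dictionary-expansion procedure in contribution (1) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not give the outlier rule or the improved polarity formula, so the k-nearest-neighbour cutoff (`k`, `ratio`), the weights `alpha` and `beta`, and the toy 2-D "word vectors" are all hypothetical.

```python
import math
from statistics import median


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def euclidean(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def remove_outliers(vectors, k=2, ratio=2.0):
    """Proximity-based outlier removal (assumed rule, not from the thesis).

    Each vector is scored by its mean distance to its k nearest neighbours;
    vectors scoring above ratio * median(score) are dropped.
    """
    if len(vectors) <= k:
        return list(vectors)
    scores = []
    for i, v in enumerate(vectors):
        dists = sorted(euclidean(v, w) for j, w in enumerate(vectors) if j != i)
        scores.append(sum(dists[:k]) / k)
    cutoff = ratio * median(scores)
    return [v for v, s in zip(vectors, scores) if s <= cutoff]


def center(vectors):
    """Center vector: the component-wise mean of the baseline word vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def polarity(word_vec, pos_center, neg_center, alpha=1.0, beta=1.0):
    """Weighted similarity difference; alpha and beta are hypothetical
    stand-ins for the two baseline-vector parameters named in the abstract."""
    return alpha * cosine(word_vec, pos_center) - beta * cosine(word_vec, neg_center)


# Toy baseline words; the last positive seed is deliberately far from the rest.
pos_seeds = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0], [-1.0, 0.9]]
neg_seeds = [[-1.0, 0.1], [-0.9, 0.2], [-1.0, 0.0]]

pos_center = center(remove_outliers(pos_seeds))
neg_center = center(remove_outliers(neg_seeds))

score = polarity([0.8, 0.1], pos_center, neg_center)
label = "positive" if score > 0 else "negative"
print(label, round(score, 3))
```

Removing the stray seed before averaging keeps one mislabeled or atypical baseline word from dragging the center vector toward the opposite category. In practice the vectors would come from a trained word-embedding model rather than hand-written lists.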
Keywords/Search Tags:Word Embedding, Emotion Dictionary, Ensemble Learning, Model Parallelism, Text Classification