Sentiment Analysis Based On The Combination Of Dictionary And Machine Learning

Posted on: 2018-09-25
Degree: Master
Type: Thesis
Country: China
Candidate: W Ding
GTID: 2348330512989625
Subject: Distributed parallel computing
Abstract/Summary:
Emotion is a characteristic of human intelligence. It not only reflects changes in a person's physical state but can also be expressed through text. Most corpus resources for sentiment analysis are derived from user comments, and review text has become an important reference for consumers deciding what to buy. To obtain sentiment information from text, semantic features must first be extracted and classified. However, information-rich comments cannot be extracted in a timely manner, and sentiment features obtained from a dictionary-based method alone or a machine learning method alone are too one-sided, so existing approaches do not assist consumer decisions well. It is therefore important to extract the sentiment features of review text and to classify reviews as subjective or objective.

Dictionary-based research relies on sentiment dictionaries. Because of new vocabulary and the large number of out-of-vocabulary words, constructing a sentiment dictionary is difficult, and the sentiment of its words is not quantified. Machine learning methods, for their part, cannot resolve the sentiment ambiguity that arises when a text contains multiple sentiment words. This thesis proposes a sentiment analysis method that combines a dictionary with machine learning to obtain a combination of sentiment features that improves the accuracy of subjective/objective classification. Taking mobile phone review text as the object of study, it quantifies sentiment features with three methods: a combination of dictionary and topic model, machine learning, and a synthesis of dictionary and machine learning. Experiments compare the effect of the quantified sentiment features on subjective/objective classification. The research work is as follows:

(1) Research on dictionary expansion and polarity calculation. Current general-purpose sentiment dictionaries cannot meet the requirements of sentiment analysis in specific domains. Based on the SO-PMI algorithm, this thesis constructs a dedicated sentiment lexicon for the mobile phone domain, composed of a general dictionary, an extended dictionary and a domain-specific dictionary, and quantifies the corresponding sentiment features by combining sentiment word extraction with a feature model representation. Experiments show that combining the dictionary with a topic model further improves the quantitative representation of sentiment features compared with dictionary-only methods.

(2) Sentiment feature mining with machine learning. Feature selection and combination, feature dimensionality and the choice of classification algorithm are optimized to maximize the accuracy of sentiment classification. On mobile phone reviews, Bayesian, Logistic Regression and Support Vector Machine classifiers are compared; the Bayesian classifier performs best. Five feature combinations are compared: all words; two-word collocations (bigrams); all words plus two-word collocations; information-rich words; and information-rich words plus two-word collocations. The experiments find that information-rich words plus two-word collocations at 1000 dimensions give the best result.

(3) Research on feature selection and classification algorithms. The sentiment weights, means and standard deviations from the dictionary-based method, together with the positive and negative sentiment probabilities from the machine learning method, serve as candidate sentiment features. These are combined with information features, attribute features and language features to build a random forest, and the random forest classifier is used to study how different combinations of sentiment features affect the prediction of whether a review is subjective or objective. The combined sentiment features, obtained from the sentiment analysis method that combines the dictionary with machine learning, give the highest accuracy, and the random forest classification algorithm is much more accurate than the support vector machine and Bayesian classification algorithms.
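
The SO-PMI algorithm named in item (1) scores a candidate word by how much more strongly it co-occurs with positive seed words than with negative ones. The abstract does not give the thesis's exact formulation, so the following is only a minimal sketch under stated assumptions: co-occurrence is counted at the review level, a small additive constant `eps` avoids taking the logarithm of zero, and the seed words and toy corpus are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def build_counts(docs):
    """Count, per word and per unordered word pair, how many reviews contain it."""
    word_df = Counter()
    pair_df = Counter()
    for tokens in docs:
        vocab = sorted(set(tokens))              # each word counted once per review
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))   # pairs stored as (a, b) with a < b
    return word_df, pair_df, len(docs)


def pmi(w1, w2, word_df, pair_df, n_docs, eps=0.01):
    """Pointwise mutual information (base 2) with additive smoothing (assumption)."""
    key = tuple(sorted((w1, w2)))
    p_joint = (pair_df[key] + eps) / n_docs
    p1 = (word_df[w1] + eps) / n_docs
    p2 = (word_df[w2] + eps) / n_docs
    return math.log2(p_joint / (p1 * p2))


def so_pmi(word, pos_seeds, neg_seeds, word_df, pair_df, n_docs):
    """Semantic orientation: PMI with positive seeds minus PMI with negative seeds."""
    pos = sum(pmi(word, s, word_df, pair_df, n_docs) for s in pos_seeds if s != word)
    neg = sum(pmi(word, s, word_df, pair_df, n_docs) for s in neg_seeds if s != word)
    return pos - neg


# Hypothetical tokenized phone reviews and seed words, for illustration only.
docs = [
    ["screen", "clear", "good"],
    ["battery", "durable", "good"],
    ["screen", "laggy", "bad"],
    ["battery", "laggy", "bad"],
    ["camera", "clear", "good"],
]
word_df, pair_df, n_docs = build_counts(docs)
for candidate in ("clear", "laggy"):
    score = so_pmi(candidate, ["good"], ["bad"], word_df, pair_df, n_docs)
    print(candidate, "positive" if score > 0 else "negative")
```

On this toy corpus "clear" receives a positive score and "laggy" a negative one. In a lexicon-expansion setting, candidates whose score exceeds a chosen threshold in either direction would be added to the extended dictionary with that polarity; the threshold is a tuning choice not specified in the abstract.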
Keywords/Search Tags: Emotional dictionary, machine learning, emotional analysis, random forest, subjective and objective classification