Research On Text Emotion Classification Based On Improved Feature Selection Method

Posted on: 2018-10-02
Degree: Master
Type: Thesis
Country: China
Candidate: Q Y Wang
Full Text: PDF
GTID: 2348330533463558
Subject: Engineering
Abstract/Summary:
As more and more people participate in the Internet, it has produced a massive amount of text information suitable for research. How to mine the emotion in this text effectively is one of the current research hot spots. Based on an analysis of research on text emotion classification at home and abroad, this paper improves the existing feature selection algorithm, designs a feature selection method that combines an emotional dictionary with improved information gain, and uses an optimized support vector machine classifier to distinguish positive from negative emotion in text. Firstly, existing information gain feature selection considers only the document frequency of a feature word and ignores the influence of corpus balance, which reduces the effectiveness of feature selection. To address this, information gain is improved by adding a distribution factor, which captures the distribution of word frequency across the classes, and an equilibrium probability, with the aim of improving classification performance. Secondly, because existing information gain ignores emotional factors when it is applied to text emotion classification, the emotional dictionary is combined with information gain feature selection: emotion words are matched in the text and information gain is computed only for those words, which reduces the feature dimensionality and reflects the importance of emotion words in text emotion classification. Thirdly, given the characteristics of the corpus used in this paper, using only the matched emotion words makes the text data sparse and degrades classification performance. Therefore the emotion words are weighted and information gain feature selection is performed over all words, which both reflects the importance of emotion and improves classification performance. Finally, to further improve the classification results, the support vector machine classifier is optimized. Three optimization algorithms are used to tune the support vector machine parameters, and a hybrid kernel function is also used to optimize the classifier. The optimization method that performs best in the experimental comparison is chosen for the final optimized classifier model.
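
For readers unfamiliar with the baseline that the abstract sets out to improve, the following is a minimal sketch of standard information gain computed from document presence, followed by a ranking step restricted to a candidate word list. It is not the improved method of the thesis: the abstract does not define the distribution factor, the equilibrium probability, or the emotion word weights, so none of them are modeled here. The function names, the toy corpus, and the two-word lexicon are all hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a class distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Standard IG(t) = H(C) - P(t) H(C|t) - P(not t) H(C|not t).

    Only the presence of `term` in a document is used (document frequency),
    which is the limitation the abstract attributes to plain information gain.
    """
    classes = sorted(set(labels))
    with_term = Counter()
    without_term = Counter()
    for tokens, label in zip(docs, labels):
        if term in tokens:
            with_term[label] += 1
        else:
            without_term[label] += 1
    n = len(docs)
    n_with = sum(with_term.values())
    n_without = n - n_with
    h_class = entropy([with_term[c] + without_term[c] for c in classes])
    h_with = entropy([with_term[c] for c in classes])
    h_without = entropy([without_term[c] for c in classes])
    return h_class - (n_with / n) * h_with - (n_without / n) * h_without


def select_features(docs, labels, candidates, k):
    """Rank candidate terms by information gain and keep the top k."""
    scored = [(information_gain(docs, labels, t), t) for t in candidates]
    # Sort by descending score, breaking ties alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:k]]


# Hypothetical toy corpus: each document is a set of tokens.
docs = [{"good", "film"}, {"good", "plot"}, {"bad", "film"}, {"bad", "plot"}]
labels = ["pos", "pos", "neg", "neg"]

# Hypothetical emotion lexicon, standing in for the emotional dictionary.
emotion_lexicon = {"good", "bad"}

# Second step of the abstract: score only words matched in the lexicon.
vocabulary = set().union(*docs)
emotion_candidates = sorted(vocabulary & emotion_lexicon)
selected = select_features(docs, labels, emotion_candidates, k=2)
```

In this toy corpus the word "good" separates the two classes perfectly while "film" appears equally often in both, so the first carries the full one bit of class entropy and the second carries none. Restricting the candidates to the lexicon shrinks the feature space, which is also why the abstract reports sparsity when only matched emotion words are kept.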
Keywords/Search Tags: text emotional classification, feature selection, support vector machine, information gain, emotional dictionary, parameter optimization, mixed kernel function