Research And Implementation Of Short Negative Text Classification Based On Support Vector Machine

Posted on: 2018-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: J Li
Full Text: PDF
GTID: 2428330569498734
Subject: Computer technology
Abstract/Summary:
With the development and popularization of the Internet and mobile Internet terminals, the Internet has become an important medium for people to get information and express opinions. In the current social climate, many Internet users pay particular attention to negative comments, some of which are malicious or even disparaging. Such negative comments can easily dominate public opinion and have a harmful effect on society. For this reason, identifying and classifying negative comment text is more important than doing so for positive comment text. Previous studies of sentiment analysis mainly focus on distinguishing commendatory from derogatory text, which is a relatively coarse-grained approach. In reality, Internet users are more likely to be attracted by negative text. Sentiment analysis therefore needs fine-grained identification and classification; only in this way can we grasp what the speaker really means and provide a reference for decision makers. Based on the above analysis, it is necessary to examine negative text at a finer granularity. In this paper, we focus on the recognition and classification of short negative text. Based on a survey of domestic and foreign work on sentiment analysis, this paper studies the methods, steps, feature weight algorithms, text representations, and classifier selection used in emotion classification. Firstly, we analyze in detail the advantages and disadvantages of the TFIDF algorithm for feature weight calculation. On this basis, and in order to improve performance, we introduce the probability ratio and information entropy and create the TFORH algorithm, which is based on the TFIDF algorithm. Theoretical analysis shows that the TFORH algorithm addresses the problem that the TFIDF algorithm does not consider the influence of inter-class and intra-class distribution on feature weights. Then, the paper introduces Doc2Vec, a vector representation tool based on a neural probabilistic language model, to identify and classify negative comments. We found that this neural network learning method does not perform well enough in the emotion analysis, because of information compression and the short length of the texts in the corpus. Next, we run experiments with the TF and TFIDF weight algorithms and the TFORH weight algorithm proposed in this paper; the experimental results demonstrate the effectiveness of the TFORH algorithm. Finally, the paper implements a simple system for recognizing and classifying short negative text. The system uses news comments on the Nie Shubin case as its corpus source, trains a number of binary classifiers, and performs a fine-grained classification of the negative text in the corpus. In testing, the system achieved good results.
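The TFIDF limitation mentioned in the abstract can be illustrated with a small sketch. The code below is not the TFORH algorithm, whose formula is not given here; it only computes plain TF-IDF on a hypothetical toy corpus (the tokens and class labels are invented for illustration) and shows that a term concentrated in one class receives the same weight as a term spread evenly across classes.

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """Return one {term: weight} dict per tokenized document (plain TF-IDF)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

# Hypothetical toy corpus: (tokens, class label).
# The labels are never passed to tfidf_weights, which is the point.
corpus = [
    (["unfair", "verdict"], "negative"),
    (["unfair", "trial"], "negative"),
    (["news", "verdict"], "other"),
    (["news", "trial"], "other"),
]
weights = tfidf_weights([tokens for tokens, _ in corpus])

# "unfair" occurs only in the negative class; "verdict" occurs once per class.
unfair_weight = weights[0]["unfair"]
verdict_weight = weights[0]["verdict"]
print(unfair_weight, verdict_weight)
```

Both terms occur in two of the four documents, so plain TF-IDF cannot tell them apart, even though "unfair" is far more informative for the negative class. A class-aware weighting scheme would be expected to separate the two.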
Keywords/Search Tags: Support Vector Machines, Short Negative Text, Recognition, Classification