
Research On Sentiment Classification For Web Reviews

Posted on: 2018-08-31
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Wang
Full Text: PDF
GTID: 2348330569486436
Subject: Computer Science and Technology
Abstract/Summary:
Recently, sentiment analysis has received a great deal of attention from scholars as a way to quickly and effectively extract sentiment information from the vast number of Web reviews. Sentiment classification, an important part of sentiment analysis, is a very challenging research topic. Thanks to the efforts of experts and scholars, research on sentiment classification has produced a series of achievements, but many problems remain. For example, in monolingual sentiment classification, traditional supervised term weighting schemes have a shortcoming: the weights they compute cannot accurately measure the importance of a term in a text. In cross-language sentiment classification, it is common to use a single text representation scheme, but a single representation cannot fully express the content of a text. To improve the performance of sentiment classification, this thesis addresses these two problems as follows.

In the study of term weighting, traditional supervised term weighting schemes use only the local distribution and the global distribution of terms in the document set to compute term weights. Related studies show that the distribution of a term across different classes also helps determine its sentiment tendency. Hence, this thesis proposes a term weighting scheme that combines the class contribution, local distribution, and global distribution of feature terms. The scheme enriches the information carried by the term weights, overcomes the shortcomings of traditional weighting schemes, and improves sentiment classification. Experimental results show that, compared with a series of other term weighting schemes, the proposed scheme significantly improves classification performance.

In the study of cross-language sentiment classification, the traditional text representation method is the bag-of-words (BOW) model. The BOW model treats words as independent of one another, ignoring grammar and word order, so some semantic information is lost when a text is converted into a document vector. This thesis uses two text representation models, BOW and Doc2Vec, to represent the text. Combining them compensates for the deficiencies of either method alone and represents the text from multiple angles, so that as much of its semantic information as possible is preserved. In the model training stage, this thesis adopts a cooperative training algorithm to train the classifier; representing the text in multiple ways lets the cooperative training algorithm make better use of its strengths. Compared with several other cross-language sentiment classification schemes, the proposed multi-representation scheme significantly improves classification performance.
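The abstract does not give the thesis's weighting formula, so the following is only a minimal sketch of the general idea of combining a local factor, a global factor, and a class-based factor into one term weight. The function name `term_weights`, the use of relative term frequency as the local factor, an IDF-style global factor, and the smoothed max/min class-rate ratio as the class factor are all assumptions made for illustration, not the scheme proposed in the thesis.

```python
import math
from collections import Counter


def term_weights(docs, labels):
    """Weight each term by local x global x class factors (illustrative only).

    docs   -- list of token lists, one per document
    labels -- parallel list of class labels (e.g. "pos" / "neg")
    Returns one dict per document mapping term -> weight.
    """
    n_docs = len(docs)
    classes = sorted(set(labels))
    class_size = Counter(labels)

    # Document frequency of each term, overall and per class.
    df = Counter()
    df_class = {c: Counter() for c in classes}
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            df[term] += 1
            df_class[label][term] += 1

    def class_factor(term):
        # Assumed class factor: how unevenly the term is spread over classes.
        # Smoothed per-class document rates avoid division by zero.
        rates = [(df_class[c][term] + 1) / (class_size[c] + 2) for c in classes]
        return 1.0 + math.log(max(rates) / min(rates))

    weighted = []
    for tokens in docs:
        tf = Counter(tokens)
        weighted.append({
            term: (count / len(tokens))           # local: relative frequency
            * math.log(1 + n_docs / df[term])     # global: IDF-style factor
            * class_factor(term)                  # class: distribution skew
            for term, count in tf.items()
        })
    return weighted


if __name__ == "__main__":
    docs = [["good", "movie"], ["good", "plot"], ["bad", "movie"], ["bad", "plot"]]
    labels = ["pos", "pos", "neg", "neg"]
    for doc_weights in term_weights(docs, labels):
        print(doc_weights)
```

In this toy corpus, "good" appears only in positive documents while "movie" appears equally in both classes, so the class factor raises the weight of "good" above that of "movie" even though their local and global factors are identical.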
Keywords/Search Tags: sentiment classification, term weighting calculation, text representation, cross-language sentiment classification