Sentiment Analysis For Product Reviews Based On Active Learning And Self-training

Posted on: 2018-10-08
Degree: Master
Type: Thesis
Country: China
Candidate: B Q Chu
GTID: 2348330515469304
Subject: Computer system architecture
Abstract/Summary:
With the rapid development of information technology, the network is flooded with massive amounts of text. Sentiment analysis is a family of methods for analyzing and processing large numbers of opinion documents, and the study of product reviews is a significant and popular branch of it. By processing review text, we can effectively extract opinion information.

In product reviews, different people have different needs, different expectations of the product, and different writing habits, so the information conveyed in the text is very complex. This is especially true of book and movie reviews, which often evaluate the director, the cast, the special effects, or the story structure, since such reviews recount specific experiences; these passages have a great impact on the classification result during sentiment analysis. If only the data of the sample itself is used, many objective descriptions are taken into account, which seriously degrades classification accuracy. The first issue of this study is therefore how to extract the subjective opinions.

In addition, to achieve higher accuracy, most studies rely on a large number of labeled samples to train the classifier. However, most of the data that can easily be obtained in practice is unlabeled or only sparsely labeled. How to select only the useful data for labeling, and how to improve classifier performance with few labeled samples, is therefore another main objective of this study.

To address these problems, the main work of this thesis is as follows:

1. A feature extraction algorithm named "Topic-Sentiment" is proposed, which combines an emotion ontology ("emotional noumenon") with the information in the data. Unlike traditional machine learning methods, which use only the information in the text, it extracts opinion words through their topic. This extracts subjective sentiment effectively and improves classification accuracy in the later phase.

2. A sentiment classification method based on active learning and self-training is proposed. While training the classifier, active learning and self-training are used together to select "useful" samples. After the selected samples are labeled manually, they are added to the training set and the classifier is retrained iteratively, achieving higher classification accuracy with fewer labeled samples.

Experiments were conducted on four categories of product reviews: books, electronic products, kitchen utensils, and movies. The proposed method achieves an average accuracy of 79.2%, with a highest accuracy of 94.126%, at an average labeled rate of 23%. Compared with traditional machine learning methods, the new method uses less than 57% of the labeled samples to reach a higher accuracy.
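The iterative procedure described in the abstract, combining active learning with self-training, can be illustrated with a minimal sketch. This is not the thesis's implementation: the classifier here is a small naive Bayes stand-in over raw tokens rather than "Topic-Sentiment" features, and the function names, the `oracle` callback (standing in for the human annotator), the `query_size` parameter, and the `self_threshold` value are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_nb(labeled):
    """Fit a two-class multinomial naive Bayes model from (tokens, label) pairs."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for tokens, label in labeled:
        class_counts[label] += 1
        word_counts[label].update(tokens)
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def prob_positive(model, tokens):
    """Return the model's estimate of P(label = 1 | tokens), with add-one smoothing."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for c in (0, 1):
        # One extra slot in the denominator covers words never seen in training.
        denom = sum(word_counts[c].values()) + len(vocab) + 1
        score = math.log((class_counts[c] + 1) / (total + 2))
        for t in tokens:
            score += math.log((word_counts[c][t] + 1) / denom)
        log_scores[c] = score
    top = max(log_scores.values())
    exp = {c: math.exp(s - top) for c, s in log_scores.items()}
    return exp[1] / (exp[0] + exp[1])


def active_self_training(seed, pool, oracle, rounds=5, query_size=1,
                         self_threshold=0.9):
    """Grow the labeled set by querying uncertain samples and self-labeling confident ones.

    seed:   list of (tokens, label) pairs labeled up front
    pool:   list of unlabeled token lists
    oracle: hypothetical callback returning the manual label for a token list
    """
    labeled = list(seed)
    pool = list(pool)
    n_queries = 0
    for _ in range(rounds):
        if not pool:
            break
        model = train_nb(labeled)
        # Rank pool samples from most to least uncertain (closest to 0.5 first).
        scored = sorted(((prob_positive(model, x), x) for x in pool),
                        key=lambda s: abs(s[0] - 0.5))
        # Active learning: send the most uncertain samples to the annotator.
        for _, x in scored[:query_size]:
            labeled.append((x, oracle(x)))
            n_queries += 1
        # Self-training: accept the model's own label where it is confident.
        pool = []
        for p, x in scored[query_size:]:
            if max(p, 1 - p) >= self_threshold:
                labeled.append((x, int(p >= 0.5)))
            else:
                pool.append(x)
    return train_nb(labeled), n_queries


if __name__ == "__main__":
    seed = [(["great", "love"], 1), (["terrible", "hate"], 0)]
    pool = [["great", "book"], ["terrible", "plot"], ["love", "book"],
            ["hate", "plot"], ["book", "plot"]]
    positive_words = {"great", "love"}
    model, n_queries = active_self_training(
        seed, pool, lambda toks: int(any(t in positive_words for t in toks)),
        rounds=3, self_threshold=0.75)
    print("manual labels requested:", n_queries)
    print("P(positive | 'great'):", prob_positive(model, ["great"]))
```

In this sketch the manual labeling budget is bounded by `rounds * query_size`, which is one simple way to keep the labeled rate low; how the thesis actually selects and budgets samples is not specified in the abstract.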
Keywords/Search Tags:Sentiment analysis, Product reviews, Emotional noumenon, Active learning, Self-training