
Research On Internet Review Text Sentiment Analysis

Posted on: 2016-12-26
Degree: Master
Type: Thesis
Country: China
Candidate: L C Cui
Full Text: PDF
GTID: 2308330461484241
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of information technology and the Internet, more and more people share their experiences and write reviews on various websites. These reviews express the opinions and emotions of their authors. People have become accustomed to obtaining information from online reviews, and they sometimes base decisions on it. By analyzing reviews published on the Internet, we can trace how reviewers' emotions evolve, and mining this latent information has great application value. To cope with the information explosion, we urgently need automated tools that extract exactly the information we want from the large volume of content on the Internet. Recognizing the sentiment of comments and mining latent information from a huge mass of comment text is a highly challenging research topic in Natural Language Processing. It has also become a focus in the field of business intelligence and has attracted many researchers. Sentiment analysis of Internet reviews emerged in response to this need.

Research on sentiment analysis of Internet reviews is a comprehensive field that involves multiple disciplines. The main research methods fall into two types: sentiment analysis using unsupervised learning and sentiment analysis using supervised learning. The unsupervised approach estimates the sentiment orientation of a review from the information carried by its sentiment words. The supervised approach applies machine learning algorithms to sentiment classification. Generally, the review dataset is divided into a training set and a test set; the reviews then go through word segmentation, stop-word removal, and feature selection, and each review is represented as a text vector. Finally, a classifier is trained and the results are analyzed.

In this thesis, we used NTUSD and the HowNet dictionary as the basic sentiment lexicon and performed unsupervised sentiment analysis with the help of the Bing Search API provided by Microsoft Azure. We also improved the Python library so that it supports Chinese and caches query result sets. On this basis, we put forward the SO-PMI-Lexicon algorithm, which improves on the SO-PMI-IR algorithm. It improves classification accuracy by adjusting the SO-PMI threshold so that only words with an obvious sentiment orientation are retained. For sentiment analysis using supervised learning, we obtained a baseline by applying mainstream supervised learning algorithms (SVM, Naive Bayes, Decision Tree) to an e-commerce review corpus with the help of Weka, a data mining tool. Finally, we tried using the regularized SO-PMI value of each sentiment word in the corpus as its weight in the document vector. The experimental results show that this approach works.
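The threshold idea described above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes a Turney-style SO-PMI-IR estimate computed from search hit counts, replaces the Bing Search API with a small hard-coded table of made-up counts, and uses hypothetical names throughout (so_pmi, build_lexicon, classify_review, document_vector, the seed words, and the threshold value 1.0). The last function shows one possible reading of "regularized SO-PMI as document vector weight", namely scaling each value by the largest absolute value in the lexicon; the abstract does not specify the exact scheme.

```python
import math

# Toy hit counts standing in for search-engine results (invented numbers).
HITS = {"good": 1000, "bad": 1000}
COHITS = {
    ("fast", "good"): 80, ("fast", "bad"): 10,
    ("broken", "good"): 5, ("broken", "bad"): 90,
    ("phone", "good"): 50, ("phone", "bad"): 48,
}

def hits(term):
    """Hypothetical stand-in for a cached hit-count query."""
    return HITS.get(term, 0)

def cohits(word, seed):
    """Hypothetical stand-in for a co-occurrence (word NEAR seed) query."""
    return COHITS.get((word, seed), 0)

def so_pmi(word, pos_seeds, neg_seeds, smoothing=0.01):
    """Turney-style SO-PMI-IR estimate: positive values lean positive."""
    pos_co = sum(cohits(word, s) for s in pos_seeds) + smoothing
    neg_co = sum(cohits(word, s) for s in neg_seeds) + smoothing
    pos_total = sum(hits(s) for s in pos_seeds) + smoothing
    neg_total = sum(hits(s) for s in neg_seeds) + smoothing
    return math.log2((pos_co * neg_total) / (neg_co * pos_total))

def build_lexicon(candidates, pos_seeds, neg_seeds, threshold):
    """Keep only words whose |SO-PMI| reaches the threshold."""
    scores = {w: so_pmi(w, pos_seeds, neg_seeds) for w in candidates}
    return {w: s for w, s in scores.items() if abs(s) >= threshold}

def classify_review(tokens, lexicon):
    """Sum the SO-PMI values of the retained sentiment words in a review."""
    score = sum(lexicon.get(t, 0.0) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def document_vector(tokens, lexicon, vocab):
    """Weight each lexicon word present in the review by its scaled SO-PMI."""
    max_abs = max(abs(v) for v in lexicon.values())
    present = set(tokens)
    return [lexicon[w] / max_abs if w in present else 0.0 for w in vocab]

lexicon = build_lexicon(["fast", "broken", "phone"], ["good"], ["bad"], 1.0)
vocab = sorted(lexicon)
print(sorted(lexicon))
print(classify_review(["phone", "fast"], lexicon))
print(document_vector(["phone", "fast"], lexicon, vocab))
```

With these toy counts, "phone" co-occurs almost equally with both seed words, so its SO-PMI is close to zero and the threshold removes it from the lexicon, while "fast" and "broken" are retained. Raising or lowering the threshold trades lexicon coverage against the reliability of the retained words.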
Keywords/Search Tags: sentiment analysis, sentiment lexicon, pointwise mutual information, machine learning