Research On Opinion Mining Technology For Product Reviews Of B2C Websites

Posted on: 2015-02-10 | Degree: Master | Type: Thesis
Country: China | Candidate: X X Li | Full Text: PDF
GTID: 2298330431987187 | Subject: Computer technology
Abstract/Summary:
As the B2C market grows, the number of online product reviews written by consumers is increasing explosively. Identifying and applying the valuable information hidden in these reviews accurately and efficiently can bring substantial economic benefits and open up new markets, which has made product review mining and analysis an active topic in recent years. This thesis studies sentiment classification and sentiment polarity analysis of product reviews, taking phone reviews collected from the B2C website Jingdong Mall as the research object. The main work is as follows:

Classifying the sentiment of product reviews with Support Vector Machine (SVM) and Naive Bayes (NB) methods. First, the elements of the training set are manually selected from online product reviews. Second, the corpus is pre-processed with the NLPIR word segmentation system, and the weights of the feature words are computed with the TF-IDF formula. Finally, the feature selection methods MI, IG and CHI are compared and analyzed on the SVM and NB classifiers respectively. The experimental results show that the classification performance of CHI exceeds 80% on both SVM and NB. Furthermore, every feature selection method performs better on SVM than on NB, and the accuracy of CHI reaches 83%.

Analyzing the fine-grained sentiment polarities of product reviews by applying the nearest-neighbor principle based on a bidirectional iterative method. First, the seed set of sentiment words is built with the PMI-IR algorithm, and the relationships between feature words and sentiment words are obtained by applying the nearest-neighbor principle based on the bidirectional iterative method. Second, a triple sentiment lexicon named Tri-HowNet, which is based on HowNet, is created. Finally, the polarities of sentiment words are identified in experiments based on HowNet and Tri-HowNet respectively. The results show that polarity identification based on Tri-HowNet performs better when the sentiment words have multiple meanings.

Designing a review mining system based on SSH. The system is composed of five modules: dictionary maintenance, data collection, sentiment classification, sentiment analysis and visualization. First, using the interface provided by the open-source Java crawler Crawler4j, the reviews are collected by simulating a login. Second, the product reviews are analyzed from two aspects: text sentiment classification and sentiment analysis. Finally, the results are stored in the database and can be displayed as 3D bar graphs to make querying easier for users.
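
The TF-IDF weighting and CHI feature selection steps mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the toy reviews, the labels, the function names (`tf_idf`, `chi_square`, `select_features`) and the particular TF-IDF variant (term frequency times log(N/df)) are hypothetical and are not taken from the thesis, which may use different formulas. The tokens are assumed to be already segmented, so the NLPIR step is not reproduced.

```python
import math
from collections import Counter

# Toy, hand-made tokenized reviews (hypothetical data, not from the thesis).
DOCS = [
    (["screen", "clear", "good"], "pos"),
    (["battery", "good", "fast"], "pos"),
    (["screen", "broken", "bad"], "neg"),
    (["battery", "bad", "slow"], "neg"),
]

def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens, _ in docs for term in set(tokens))
    weights = []
    for tokens, _ in docs:
        tf = Counter(tokens)
        weights.append(
            {t: (c / len(tokens)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights

def chi_square(docs, term, label):
    """CHI statistic of a term for one class, from the 2x2 contingency table."""
    a = sum(1 for toks, y in docs if term in toks and y == label)      # term, class
    b = sum(1 for toks, y in docs if term in toks and y != label)      # term, other
    c = sum(1 for toks, y in docs if term not in toks and y == label)  # no term, class
    d = sum(1 for toks, y in docs if term not in toks and y != label)  # no term, other
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0

def select_features(docs, label, k):
    """Keep the k terms with the highest CHI score (ties broken alphabetically)."""
    vocab = sorted({t for toks, _ in docs for t in toks})
    ranked = sorted(vocab, key=lambda t: (-chi_square(docs, t, label), t))
    return ranked[:k]

if __name__ == "__main__":
    print(select_features(DOCS, "pos", 2))
    print(tf_idf(DOCS)[0])
```

In this toy corpus, terms that occur in only one class ("good", "bad") receive the highest CHI score, while terms spread evenly across both classes ("screen", "battery") score zero. The selected terms, weighted by TF-IDF, would then form the feature vectors passed to a classifier such as SVM or NB.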
Keywords/Search Tags: Product Reviews, Opinion Mining, Sentiment Classification, SVM, Sentiment Analysis