
Review Spam Detection Based On User Evaluation

Posted on: 2016-12-19
Degree: Master
Type: Thesis
Country: China
Candidate: J Liu
GTID: 2308330503976823
Subject: Software engineering
Abstract/Summary:
The growth of social networks has shortened the distance between people and lets them publish information on all kinds of websites. However, some of this information consists of unhelpful spam reviews. In e-commerce especially, many users post meaningless product reviews. To address this problem, this thesis implements a spam detection system that automatically extracts comments from web pages and assesses the credibility of the reviews. To collect comment data, we integrate a Firefox Add-on into the detection system. To obtain better detection results, we use machine learning methods to distinguish spam reviews. The contents of this thesis include:
1. By analyzing the differences between Firefox extensions and Add-ons, we study how each is implemented and design a web browser plugin based on the Add-on SDK. The Add-on analyzes the JavaScript code on Taobao.com pages and uses jQuery to extract the review data accurately. The data is then preprocessed and saved to a local file.
2. We implement an SVM classifier, a naive Bayes classifier, and a logistic regression classifier for detecting spam reviews, recasting spam detection as the problem of recognizing review authenticity. From the collected review data we construct a sample review data set for testing and extract the comment attributes that serve as indicators of reliability.
3. We analyze the requirements of the system and design the functional relationships among its modules. On this basis we implement the system for detecting spam reviews.
4. By comparing the experimental results of SVM and naive Bayes, we analyze the effect of the size of the training data set and of the choice of review attributes. The results show that both methods can largely improve detection accuracy and speed, but the SVM method still achieves good test results with a small training data set. Moreover, the accuracy of the SVM classifier varies with the kernel function.
To identify spam reviews, this thesis designs a review reliability detection system for products on e-commerce sites. It consists of an Add-on built on the Firefox Add-on SDK for collecting comments and a Qt-based review analysis tool, which applies the SVM and naive Bayes classifiers separately to test the comments. The experimental results show that the system is feasible for both review collection and review honesty detection, and that it helps users judge the authenticity of reviews without manual identification.
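As an illustration of the naive Bayes step described above, the following is a minimal sketch in Python rather than the thesis's Qt-based tool. It assumes a plain bag-of-words representation and add-one smoothing. The abstract does not specify the comment attributes actually used, so the tokenizer, the `NaiveBayes` class, the `TRAIN` data, and the labels "spam" and "genuine" are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)            # reviews per class
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        self.total_docs = len(labels)
        return self

    def log_scores(self, text):
        """Return log P(class) + sum of log P(word | class) for each class."""
        scores = {}
        vocab_size = len(self.vocab)
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / self.total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    count = self.word_counts[c][word]
                    score += math.log((count + 1) / (total_words + vocab_size))
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical toy reviews; not data from the thesis.
TRAIN = [
    ("good good good buy now", "spam"),
    ("best best cheap cheap click", "spam"),
    ("nice nice nice nice", "spam"),
    ("the fabric is soft but the sleeves run short", "genuine"),
    ("battery lasts two days and the screen is sharp", "genuine"),
    ("arrived late but the shoes fit well", "genuine"),
]

model = NaiveBayes().fit([t for t, _ in TRAIN], [y for _, y in TRAIN])
```

Calling `model.predict("cheap cheap good buy")` labels that text as spam under this toy data, because its words occur only in the spam examples. A real system would replace the word counts with whatever review attributes prove informative and would evaluate on held-out data.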
Keywords/Search Tags:Spam review, Add-on SDK, Naive Bayes, SVM, Logistic Regression, Qt