Feature Analysis And Detection Of Review Spam Based On WEB Quality Feature Model

Posted on:2018-07-19

Degree:Master

Type:Thesis

Country:China

Candidate:X T Liu

GTID:2348330521950783

Subject:Computer Science and Technology

Abstract/Summary:

With the growth of e-commerce, online shopping has become popular, and customer reviews, the most important form of customer feedback, are large in scale and increasing explosively. For fairness and interaction, e-commerce platforms usually make reviews public, so that, besides helping manufacturers improve their products and services, the reviews serve as a useful reference for potential buyers. Well-rated products attract more buyers, while poorly rated products sell worse. As a result, some unscrupulous merchants post deceptive positive reviews to raise their own reputation or deceptive negative reviews to discredit their competitors.

This thesis focuses on the differences between spam reviews and truthful reviews. Inspired by the Web Quality Model (WebQM), it analyzes features along multiple dimensions and extracts features in three dimensions: review source, review content, and review expression. Based on these highly discriminative features, two different algorithms are provided for review spam detection.

Two real-world datasets are used. For the gold-standard dataset, we focus on the differences between truthful reviews and spam reviews, extract features from the review content and review expression, and propose a modified PU-learning method for review spam detection. The results show that the proposed PU-learning method outperforms the original machine learning approaches and achieves an F1 score of 86%.

For the Amazon dataset, we labeled the data using Simhash and constructed an experimental dataset of 3,000 reviews. Based on the special properties of the Amazon data, we extract review source features and expand the review content and review expression features. On this basis, we apply the gradient boosting decision tree (GBDT) algorithm to Amazon review spam detection, verify its feasibility, and achieve an F1 score of 88%.
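The abstract says the Amazon data were labeled using Simhash but does not describe the procedure. As a rough illustration only, the sketch below shows a generic Simhash fingerprint and a near-duplicate check of the kind sometimes used to flag copied reviews. The tokenization, the MD5-based token hash, the 64-bit width, the `threshold` value, and all function names are assumptions made for this example, not details taken from the thesis.

```python
import hashlib
import re

BITS = 64  # fingerprint width (assumed; not stated in the abstract)


def simhash(text: str) -> int:
    """Compute a 64-bit Simhash fingerprint over lowercase word tokens."""
    weights = [0] * BITS
    for token in re.findall(r"\w+", text.lower()):
        # Hash each token to a stable 64-bit integer (first 8 bytes of MD5).
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        # Each bit of the token hash votes +1 or -1 for that bit position.
        for i in range(BITS):
            weights[i] += 1 if (h >> i) & 1 else -1
    # A bit of the fingerprint is set when its accumulated vote is positive.
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: str, b: str, threshold: int = 3) -> bool:
    """Treat two reviews as near-duplicates if their fingerprints are close.

    The threshold of 3 bits is a hypothetical choice for illustration.
    """
    return hamming_distance(simhash(a), simhash(b)) <= threshold
```

In a labeling pipeline of this kind, pairs of reviews for which `is_near_duplicate` returns `True` could be marked as candidate spam, while the remaining reviews would be treated as candidates for the truthful class. Whether the thesis follows this scheme is not stated in the abstract.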

Keywords/Search Tags:

review spam detection, multi-dimension features, PU-learning, GBDT

Related items

1	Research On Deceptive Spam Review Detection For Combining Multi-features
2	Fuse Multi-features To Identify Product Review Spam
3	Research On Review Spam Detection Based On Hierarchical Neural Network And Multivariate Features
4	Research And Realization Of Review SPAM Detection System Under Big Data Environment
5	Design And Implementation Of Spam Review Detection System Based On Deep Learning Algorithm
6	Research On Spam Review Detection Based On Integrated Multi-feature
7	Research On Review Spam Detection Method Based On CNN Network Optimization
8	Research On Review Spam Detection Algorithm
9	Research On Spam Review Detection Of Logistics Front-end Trading Platform
10	A Research On Handling User Cold-Start Problem In Review Spam Detection