Identifying Spam Reviews Based On Hidden Variable Model

Posted on: 2018-05-01 | Degree: Master | Type: Thesis
Country: China | Candidate: M M Xue | Full Text: PDF
GTID: 2428330518455133 | Subject: Computer technology
Abstract/Summary:
With the rapid development of the Internet, e-commerce has risen rapidly. E-commerce websites such as Taobao, JD and Public Remark have become indispensable assistants in people's daily lives. Reading online reviews of a product has become a routine step before buying it. Objective and impartial reviews guide consumers toward rational decisions, which in turn brings economic benefits to businesses. By contrast, unreasonable, exaggerated or slanderous reviews not only trouble consumers and businesses but also pollute the entire e-commerce environment. It is therefore particularly important to discover malicious evaluation behavior in user evaluation data, a task known as spam review detection.

In this thesis, taking the user evaluation data of e-commerce applications as the background, we first propose a definition of abnormal evaluation. After extracting the characteristics of abnormal evaluation, we adopt a Bayesian network (BN) with hidden variables, a classical probabilistic graphical model, and construct the BN to reflect users' abnormal evaluation behavior. We then detect abnormal evaluations using the BN's probabilistic inference mechanism. As a result, we can recover the genuine evaluation data, which helps build a sounder e-commerce environment. The main work is as follows:

(1) Because users' evaluation data are not always truthful and the evaluation content is often too subjective, we select and define four representative characteristics of abnormal evaluation: deviation of the user's score from the actual score, user account anomalies, store characteristics and emotional density.

(2) To identify spam reviews, we define a probabilistic graphical model with hidden variables, called HSRBN (Hidden variables of Spam Reviews Bayesian Network). For structure learning, we give a method for inserting hidden variables based on mutual-information grouping and use the BIC scoring function to select the best hidden variable model. For parameter learning, we use the EM algorithm.

(3) To quantify abnormal evaluation, we use an exact inference algorithm based on variable elimination, with the abnormal evaluation as the query variable. We then take the most probable value of the spam variable as the evaluation result, which avoids the exponential growth in computational complexity as the number of variables increases.

(4) Using data from the Public Remark site, this thesis implements and tests the construction and approximate inference methods of HSRBN and verifies the feasibility of the model. We also designed and developed spam review identification software to illustrate the proposed methods.
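The variable-elimination step described in item (3) can be sketched on a toy network. This is only an illustrative sketch, not the HSRBN from the thesis: the structure (S -> H, H -> D, H -> A), the variable names and every probability below are assumptions invented for the example, whereas the thesis learns its structure and parameters from data. S stands for the spam query variable, H for a hidden variable, and D and A for two observed binary features (score deviation and account anomaly).

```python
from itertools import product

# A factor is (variables, table): `table` maps a tuple of 0/1 values
# (one per variable, in order) to a non-negative number.

def cpt(child, parent, p_one):
    """Factor for P(child | parent); p_one[u] = P(child=1 | parent=u)."""
    table = {}
    for u in (0, 1):
        table[(1, u)] = p_one[u]
        table[(0, u)] = 1.0 - p_one[u]
    return (child, parent), table

def multiply(f, g):
    """Pointwise product of two factors over the union of their variables."""
    fv, ft = f
    gv, gt = g
    out_vars = fv + tuple(v for v in gv if v not in fv)
    table = {}
    for vals in product((0, 1), repeat=len(out_vars)):
        a = dict(zip(out_vars, vals))
        table[vals] = ft[tuple(a[v] for v in fv)] * gt[tuple(a[v] for v in gv)]
    return out_vars, table

def sum_out(f, var):
    """Marginalize `var` out of factor f."""
    fv, ft = f
    i = fv.index(var)
    table = {}
    for vals, p in ft.items():
        key = vals[:i] + vals[i + 1:]
        table[key] = table.get(key, 0.0) + p
    return fv[:i] + fv[i + 1:], table

def restrict(f, var, value):
    """Fix `var` to an observed value and drop it from the factor."""
    fv, ft = f
    if var not in fv:
        return f
    i = fv.index(var)
    table = {v[:i] + v[i + 1:]: p for v, p in ft.items() if v[i] == value}
    return fv[:i] + fv[i + 1:], table

def posterior(factors, query, evidence, order):
    """P(query | evidence) by eliminating the variables in `order`."""
    for var, val in evidence.items():
        factors = [restrict(f, var, val) for f in factors]
    for var in order:
        related = [f for f in factors if var in f[0]]
        if not related:
            continue
        rest = [f for f in factors if var not in f[0]]
        prod = related[0]
        for f in related[1:]:
            prod = multiply(prod, f)
        factors = rest + [sum_out(prod, var)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    out_vars, table = result
    assert out_vars == (query,)
    z = sum(table.values())
    return {vals[0]: p / z for vals, p in table.items()}

# Hypothetical parameters, chosen only for illustration.
factors = [
    (("S",), {(0,): 0.8, (1,): 0.2}),   # P(S)
    cpt("H", "S", {0: 0.1, 1: 0.8}),    # P(H | S)
    cpt("D", "H", {0: 0.2, 1: 0.7}),    # P(D | H)
    cpt("A", "H", {0: 0.1, 1: 0.6}),    # P(A | H)
]

post = posterior(factors, query="S", evidence={"D": 1, "A": 1}, order=["H"])
label = max(post, key=post.get)  # most probable value of the spam variable
print(post, label)
```

Here the hidden variable H is summed out before the factors over S are combined, so no table over all variables at once is ever built; choosing the most probable value of S then gives the detection result.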
Keywords/Search Tags:Online reviews, Bayesian network, Hidden variable model, Variable elimination, Identifying spam reviews