Review Spammers Based On E-Business Review

Posted on: 2017-05-27
Degree: Master
Type: Thesis
Country: China
Candidate: J B Wang
GTID: 2308330485460341
Subject: Computer technology
Abstract/Summary:
With the growth of e-commerce, online shopping has become an important part of everyday life. Choosing suitable goods from a huge selection is a common problem, and consulting buyers' comments is a good way to address it. However, some dishonest merchants hire spammers to post comments, which makes it hard for consumers to judge products reliably. The purpose of this thesis is to identify such spammers and thereby make shopping easier for consumers. It presents a new spammer identification model with two main parts: data acquisition and data analysis. The data acquisition part describes a scheme for collecting e-commerce comments; the data analysis part combines text features and behavior features to identify spammers. The main work is as follows:
(1) Cross-site data collection. This thesis proposes collecting comment data across e-commerce sites. Because spammers are mostly hired by merchants on these platforms, cross-platform data acquisition makes it easier to find deeply hidden spammers.
(2) Text classification. Unlike traditional text corpora, e-commerce review data is large in volume while each individual comment is short, so traditional classification algorithms do not identify spammers well. This thesis therefore adopts a topic-based text classification method built on the LDA model, which performs semantic recognition and semantic clustering to generate a topic model of the text. The LDA model is used to determine the comment topics for each item, and a comment that deviates further from those topics is considered more likely to come from a spammer.
(3) User behavior. This thesis formally defines an e-commerce review behavior model and, based on the behavioral characteristics of spammers, proposes five spammer behaviors: commenting on the same item many times, commenting on the same group of items many times, commenting early, giving scores that are too high or too low, and having few comment replies. A score is calculated for each of the five behaviors for every comment, and these scores are used to judge the probability that the user is a spammer. Each comment thus has a topic deviation degree and a behavior score.
(4) Result integration. This thesis uses linear regression. Each parameter is first assigned a value based on experience. After the linear regression, high-scoring comments are verified by manual labeling; if a comment was identified incorrectly, the scoring is adjusted, and the process is iterated until the spammer recognition model meets the requirements.
Finally, comparative experiments are used to verify the identification model. The experiments show that this method identifies hidden spammers more effectively.
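The integration step described in (4) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the feature names, the equal starting weights, the learning rate, and the sample data are all assumptions made for illustration, and a plain least-squares gradient update stands in for the manual adjust-and-iterate procedure. The topic deviation is assumed to be computed beforehand (for example from an LDA model) and scaled to [0, 1], as are the five behavior scores.

```python
from dataclasses import dataclass

# Hypothetical feature names: one topic-deviation score plus the five
# behavior scores listed in the abstract, each assumed to lie in [0, 1].
FEATURES = (
    "topic_deviation",
    "repeat_same_item",
    "repeat_same_item_group",
    "early_comment",
    "extreme_rating",
    "few_replies",
)


@dataclass
class Comment:
    features: dict  # feature name -> score in [0, 1]
    is_spam: float  # manual label: 1.0 for spammer, 0.0 for genuine


def spam_score(weights, bias, features):
    """Linear combination of the feature scores."""
    return bias + sum(weights[name] * features[name] for name in FEATURES)


def refine(weights, bias, labelled, lr=0.1, epochs=200):
    """Adjust hand-set weights with least-squares gradient steps.

    This update rule is an assumption; the abstract only says that the
    scoring is adjusted iteratively after manual labeling.
    """
    weights = dict(weights)  # leave the caller's initial weights untouched
    for _ in range(epochs):
        for comment in labelled:
            error = spam_score(weights, bias, comment.features) - comment.is_spam
            for name in FEATURES:
                weights[name] -= lr * error * comment.features[name]
            bias -= lr * error
    return weights, bias


# Weights "assigned from experience": here simply equal weights.
initial = {name: 1.0 / len(FEATURES) for name in FEATURES}

# Made-up labelled comments, not data from the thesis.
labelled = [
    Comment(dict(zip(FEATURES, (0.9, 0.8, 0.7, 0.9, 1.0, 0.8))), 1.0),
    Comment(dict(zip(FEATURES, (0.1, 0.0, 0.1, 0.2, 0.0, 0.1))), 0.0),
    Comment(dict(zip(FEATURES, (0.8, 0.1, 0.9, 0.7, 0.9, 0.9))), 1.0),
    Comment(dict(zip(FEATURES, (0.2, 0.1, 0.0, 0.1, 0.2, 0.3))), 0.0),
]

weights, bias = refine(initial, 0.0, labelled)
```

After refinement, `spam_score(weights, bias, comment.features)` can be compared against a chosen threshold (0.5 is a natural choice for 0/1 labels) to flag likely spammers. In practice the labelled set would come from the manual verification of high-scoring comments.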
Keywords/Search Tags: spammers, e-commerce, review, LDA