Research On Review Spam Identification Methods Based On Bi-GRU

Posted on: 2021-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Li
GTID: 2428330626454364
Subject: Applied statistics
Abstract/Summary:
The rapid development of the mobile Internet lets people post, transmit, and check information on mobile devices at any time. As a platform for sharing data, the Internet has become a core part of people's culture, daily life, and entertainment. Online shopping websites, review websites, and social networking sites have gradually become common platforms for everyday consumption, and these sites encourage consumers to post their shopping experiences. As a result, more and more people consult reviews before making a purchase. However, review spam, such as fake reviews, malicious reviews, and irrelevant reviews, seriously distorts people's judgment and harms the consumer market, so identifying review spam is of great practical value to consumers. It has been found that features of reviewing behavior help a model identify review spam. Studying review behavior requires observing, summarizing, and designing abnormal-behavior features, and this gives rise to the cold-start problem: a single review from a cold-start user is not representative of that user's review behavior. The cold-start problem of review spam therefore needs to be studied. This thesis combines review text features with review behavior features. A BERT model produces word embeddings that serve as the text features, and four abnormal-behavior features are extracted to represent review behavior. The cold-start problem is also considered: a KNN-based method is used to supply the behavior features that cold-start users lack. The review spam recognition model is built on a deep-learning Bi-GRU with an attention mechanism. A comparison of model results shows that, with an SVM, the method based on both text and behavior features improves precision by 11.68% over the method based on text features alone. For the cold-start problem, with a Random Forest model, the detection performance is 85.39% before the cold-start problem is handled and 89.46% after. Finally, the results show that BiGRU+Attention achieves the best review spam detection accuracy, 93.78%, compared with Random Forest and LSTM.
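The abstract does not spell out how KNN fills in the missing behavior features, so the following is only a minimal sketch under stated assumptions: neighbors are found by cosine similarity between review text embeddings, and the cold-start user's behavior features are taken as the mean of the neighbors' features. The function names, the toy two-dimensional embeddings, and the feature values are all hypothetical and are not taken from the thesis.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def impute_behavior_features(query_embedding, known_embeddings, known_behaviors, k=3):
    """Estimate behavior features for a cold-start review (hypothetical helper).

    Assumption: the k reviews whose text embeddings are most similar to the
    query come from users with similar behavior, so their behavior feature
    vectors are averaged component-wise.
    """
    if not known_embeddings:
        raise ValueError("at least one review with known behavior features is required")
    if len(known_embeddings) != len(known_behaviors):
        raise ValueError("embeddings and behavior vectors must be aligned")

    k = min(k, len(known_embeddings))
    # Rank known reviews by similarity to the query, most similar first.
    ranked = sorted(
        range(len(known_embeddings)),
        key=lambda i: cosine_similarity(query_embedding, known_embeddings[i]),
        reverse=True,
    )
    nearest = ranked[:k]

    n_features = len(known_behaviors[0])
    return [
        sum(known_behaviors[i][j] for i in nearest) / k
        for j in range(n_features)
    ]


# Toy data: 2-d stand-ins for text embeddings and four behavior features per review.
known_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
known_behaviors = [
    [1.0, 0.0, 0.5, 0.25],
    [0.0, 1.0, 0.5, 0.75],
    [9.0, 9.0, 9.0, 9.0],
]
cold_start_embedding = [1.0, 0.05]

imputed = impute_behavior_features(
    cold_start_embedding, known_embeddings, known_behaviors, k=2
)
```

In this toy case the two nearest neighbors are the first two reviews, so the third review's outlying values do not affect the estimate. The imputed vector could then be concatenated with the text embedding before classification, which is one plausible reading of how the text and behavior features are combined.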
Keywords/Search Tags: Spam Comments, Word Embedding, Review Behavior Characteristics, Deep Learning, Cold-start Problem