
Research On Deceptive Opinion Spam Recognition & Review Usefulness Analysis

Posted on: 2019-02-05
Degree: Master
Type: Thesis
Country: China
Candidate: B Pang
Full Text: PDF
GTID: 2428330566496872
Subject: Computer technology
Abstract/Summary:
With the continuing spread and development of Internet technologies, more and more people rely on the Internet for everyday needs such as food, clothing, housing, and transportation. To help users make better purchasing decisions, most e-commerce platforms let users leave comments and star ratings, and dedicated review platforms such as Yelp have emerged. The information that consumers and reviewers share indirectly influences other users. Businesses therefore pay increasing attention to online reviews, and some shops even hire people to post large numbers of favorable comments or high star ratings for their own benefit. This causes many problems: it reduces the value of the review platform and deceives consumers. In addition, the volume of reviews has grown dramatically. Even if every review were genuine, users would still have to spend a lot of time picking out the ones that are useful to them. Both problems harm the user experience.

This thesis focuses on identifying fake reviews and analyzing review usefulness. Fake review identification is studied from four aspects.

(1) Users, reviews, and store information are mapped into a graph structure. By capturing the latent relationships among the three, iterative algorithms are used to judge the authenticity of each review. The results show that the choice of prior probability strongly affects the performance of the graph model, and that applying the MRF (Markov random field) energy function improves it.

(2) Four kinds of features (TF-IDF, unigram, LDA, and POS) are extracted and evaluated with machine learning models. Further experiments combine these features with the behavioral features extracted for the graph model, confirming that behavioral features outperform textual features and that combining the two outperforms either one alone.

(3) From the perspective of text semantics, classical models such as CNN, LSTM, and GRU, combined models built from these classical models, and the newer VDCNN model are evaluated on fake review detection.

(4) A variety of semi-supervised models are applied to fake review detection. The experiments show that the Co-training model reaches its best accuracy, 74.38%, when text features and behavioral features are combined.

Review usefulness is studied both as a classification problem and as a regression problem. The experimental results show that the joint training model performs better than a single model, and that the SVR model gives the best results for review usefulness.
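
The graph-based idea in aspect (1) can be illustrated with a small sketch. This is not the thesis's model: the abstract does not give the update rules or the MRF energy function, so the code below substitutes a much simpler averaging scheme in which each review's belief is pulled toward the mean belief of its author's and its store's reviews. The function name `propagate`, the mixing weight `alpha`, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def propagate(reviews, priors, alpha=0.5, iterations=20):
    """Iteratively smooth fake-review beliefs over a user-review-store graph.

    reviews: dict mapping review id -> (user id, store id)
    priors:  dict mapping review id -> prior probability that the review is fake
    alpha:   weight kept on the prior (hypothetical parameter, not from the thesis)
    Returns a dict mapping review id -> belief in [0, 1].
    """
    by_user = defaultdict(list)
    by_store = defaultdict(list)
    for rid, (user, store) in reviews.items():
        by_user[user].append(rid)
        by_store[store].append(rid)

    belief = dict(priors)
    for _ in range(iterations):
        # Score each user and store by the mean belief of its reviews.
        user_score = {u: sum(belief[r] for r in rids) / len(rids)
                      for u, rids in by_user.items()}
        store_score = {s: sum(belief[r] for r in rids) / len(rids)
                       for s, rids in by_store.items()}
        # Mix each review's prior with the average of its two neighbours.
        belief = {
            rid: alpha * priors[rid]
            + (1 - alpha) * 0.5 * (user_score[u] + store_score[s])
            for rid, (u, s) in reviews.items()
        }
    return belief


# Toy data (invented): r3 and r4 share a store and a neutral prior,
# but r3's author also wrote two reviews with a high prior.
reviews = {
    "r1": ("u1", "s1"),
    "r2": ("u1", "s1"),
    "r3": ("u1", "s2"),
    "r4": ("u2", "s2"),
    "r5": ("u2", "s3"),
}
priors = {"r1": 0.9, "r2": 0.9, "r3": 0.5, "r4": 0.5, "r5": 0.1}

if __name__ == "__main__":
    for rid, b in sorted(propagate(reviews, priors).items()):
        print(rid, round(b, 3))
```

In this toy setup the two neutral-prior reviews end up with different beliefs purely because of who wrote them, and setting `alpha=1.0` returns the priors unchanged. That gives a rough feel for why the choice of prior matters so much in graph-based approaches of this kind.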
Keywords/Search Tags: opinion spam recognition, semi-supervised learning, graph model, review usefulness