A Fake Review Detection Algorithm Based on Fused Features

Posted on: 2017-01-31
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Jiao
Full Text: PDF
GTID: 2348330512961287
Subject: Computer application technology
Abstract/Summary:
Product reviews are an important reference for shoppers and play a major role in online shopping. Driven by profit, more and more fake reviews appear on e-commerce platforms, misleading consumers and causing serious losses, so detecting them is of great significance. E-commerce platforms hold a huge volume of review data, and that volume keeps growing with sales. However, current detection methods have clear limitations. Content-based methods struggle to detect fake reviews across different product categories: they depend heavily on domain knowledge and generalize poorly. Behavior-based methods depend on the comments of specific users, and their recognition rate is low. To address these problems, this thesis presents a systematic method for detecting fake reviews on e-commerce platforms, divided into three steps.

First, a method for identifying target products that contain fake reviews is proposed. Because e-commerce platforms host a huge number of reviews across many product categories, the accuracy of current detection methods drops. To improve detection efficiency and accuracy, products likely to contain fake reviews therefore need to be screened out first. Through analysis and comparison of large-scale data, we found that user rating scores follow a specific statistical law. When a product's score distribution deviates from this law, its reviews are more likely to include fake ones. Experimental results show that the proposed method can identify target products containing fake reviews.

Second, a new text similarity measurement algorithm is proposed. Because traditional text similarity algorithms have low accuracy, this thesis builds a tree representation of each review from the content structure of its text and turns the text similarity measurement into a comparison of tree structures. The similarity of each layer of the tree is measured, and the per-layer similarities are then combined according to the layer weights to obtain the overall similarity. Experimental results show that this method is more accurate than existing methods on real text data.

Third, a fake review detection algorithm based on fused features is proposed. Existing detection methods do not make full use of users' historical behavior, and their detection accuracy is not high enough. In this thesis, the dynamic features of user behavior are mined with a time series model and then fused to find suspicious users. Finally, based on the static features and the suspicion probability, a classifier is trained with a PU-Learning strategy to detect fake reviews. Experimental results show that the proposed method outperforms existing methods.
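
The layer-weighted similarity described in the second step can be illustrated with a minimal sketch. The abstract does not say how the tree is built, which per-layer measure is used, or how the weights are chosen, so everything below is an assumption made for illustration: each review is reduced to one token set per tree layer, Jaccard similarity stands in for the per-layer measure, and the function names and weights are hypothetical.

```python
from typing import Iterable, Sequence


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard similarity of two token collections (assumed per-layer measure)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # two empty layers are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def layered_similarity(
    layers_a: Sequence[Iterable[str]],
    layers_b: Sequence[Iterable[str]],
    weights: Sequence[float],
) -> float:
    """Combine per-layer similarities into one score using layer weights.

    layers_a / layers_b: one token collection per tree layer (hypothetical
    representation; the thesis's actual tree construction is not given here).
    weights: one non-negative weight per layer; normalized inside.
    """
    if not (len(layers_a) == len(layers_b) == len(weights)):
        raise ValueError("both reviews and the weights need the same number of layers")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(w * jaccard(x, y) for w, x, y in zip(weights, layers_a, layers_b))
    return score / total


# Two toy reviews with two layers each: topic terms, then opinion terms.
review_a = [{"phone", "battery"}, {"good", "battery", "life"}]
review_b = [{"phone", "battery"}, {"good", "screen"}]
similarity = layered_similarity(review_a, review_b, weights=[0.5, 0.5])
```

Normalizing by the weight sum keeps the score in [0, 1] whatever scale the weights use, which makes scores comparable across different weightings.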
Keywords/Search Tags:fake reviews, similarity measurement, time series analysis, fusion feature, PU-Learning