Research On Fake Comment Recognition Method Based On LDA And PW-Word2vec

Posted on: 2020-05-10  Degree: Master  Type: Thesis
Country: China  Candidate: S H Jia  Full Text: PDF
GTID: 2428330596992654  Subject: Management Science and Engineering
Abstract/Summary:
Mature information technology has created the conditions for e-commerce to boom. Compared with offline shopping, consumers are increasingly inclined toward convenient online shopping. However, false comments prevent consumers from assessing commodities objectively and induce them to make wrong decisions, which infringes on consumers' rights and also harms the interests of honest merchants. Therefore, based on the Yelp online comment data set, this paper uses LDA to handle class imbalance and PW (Probability Weight)-Word2vec to construct comment feature vectors, and proposes a false comment detection model based on LDA and PW-Word2vec. The main research contents are as follows:

(1) Training the word vector dictionary: Based on hotel and restaurant data, the Word2vec model is used to train the word vector dictionary.

(2) Constructing the LDA+Word2vec false comment detection model: Because the numbers of true and false comments in the experimental data are imbalanced, this paper proposes an LDA sampling method for imbalance processing, which makes the numbers of true and false comments equal. Then, the feature vector of each comment is extracted from the experimental data. Finally, the LDA+Word2vec false comment detection model is constructed.

(3) Constructing the LDA+PW-Word2vec false comment detection model: In the LDA+Word2vec modeling process, some of the information in the comment text is lost. Thus, this paper further proposes the LDA+PW-Word2vec model for false comment detection.

(4) Model experiment comparison: To find a false comment detection model with good performance, three groups of comparative experiments were conducted. The first compares LDA sampling imbalance processing with random sampling imbalance processing. The second compares the LDA+Word2vec model with a baseline model. The third compares the LDA+PW-Word2vec model with the LDA+Word2vec model.

Finally, these three groups of experiments verify the validity of the proposed LDA sampling imbalance processing and of the LDA+PW-Word2vec false comment detection model.
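
The abstract does not define how the probability weights in PW-Word2vec are computed, so the following is only a minimal sketch of the general idea of turning word vectors into one comment feature vector, with and without per-word weights. The function name `comment_vector`, the toy two-dimensional vectors, and the weight values are all hypothetical and are not taken from the thesis; in practice the vectors would come from a trained Word2vec dictionary.

```python
from typing import Dict, List, Optional, Sequence

Vector = List[float]

# Toy stand-ins for a trained word vector dictionary (hypothetical values).
TOY_VECTORS: Dict[str, Vector] = {
    "great": [1.0, 0.0],
    "food": [0.0, 1.0],
    "the": [1.0, 1.0],
}

# Hypothetical per-word weights; the thesis's actual weighting is not given
# in the abstract, so these numbers are assumptions for illustration only.
TOY_WEIGHTS: Dict[str, float] = {"great": 0.6, "food": 0.3, "the": 0.1}


def comment_vector(
    tokens: Sequence[str],
    word_vectors: Dict[str, Vector],
    weights: Optional[Dict[str, float]] = None,
) -> Vector:
    """Return the (optionally weighted) average of the tokens' word vectors.

    Tokens missing from `word_vectors` are skipped. With `weights=None`
    every known word counts equally, which is plain averaging.
    Assumes `word_vectors` is non-empty and all vectors share one dimension.
    """
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        vec = word_vectors.get(tok)
        if vec is None:
            continue  # out-of-vocabulary word
        w = 1.0 if weights is None else weights.get(tok, 0.0)
        for i, x in enumerate(vec):
            total[i] += w * x
        weight_sum += w
    if weight_sum == 0.0:
        return total  # no usable words: return the zero vector
    return [x / weight_sum for x in total]


if __name__ == "__main__":
    tokens = ["the", "food", "great"]
    print(comment_vector(tokens, TOY_VECTORS))               # plain average
    print(comment_vector(tokens, TOY_VECTORS, TOY_WEIGHTS))  # weighted average
```

With plain averaging every word contributes equally, so frequent but uninformative words dilute the vector. A weighting scheme lets some words count more than others, which is one plausible reading of how a weighted variant could retain more of the comment's information.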
Keywords/Search Tags: false comment, imbalance processing, text mining, LDA+Word2vec, LDA+PW-Word2vec